Cluster has preemption enabled-checkpoint-False

Vendor: checkpoint

OS: False

Description:
Preemption is generally a bad idea in clustering, although it is sometimes the default setting. Indeni triggers an issue when preemption is enabled.

Remediation Steps:
It is generally best to have preemption disabled. Instead, once this device returns from a crash, you can conduct the failover manually.

How does this work?
The script parses the output of the “cphaprob state” command and the $FWDIR/state/local/FW1/local.cluster_member file. The preempt setting is derived from the “Cluster Mode” line of the command output: a mode of “Primary Up” means preemption is enabled.
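As a rough illustration of the parsing step, the sketch below pulls the mode out of the "Cluster Mode:" line of `cphaprob state` output and treats "Primary Up" as preemption enabled, following the logic of the AWK parser in the script further down. It is written in Python for readability. The function names `parse_ha_mode` and `preemption_enabled` are hypothetical and not part of Indeni, and the sample text is the example output quoted in the script.

```python
import re

# Sample taken from the example output quoted in the script's comments.
SAMPLE_OUTPUT = """\
Cluster Mode:   High Availability (Active Up) with IGMP Membership

Number     Unique Address  Assigned Load   State

1 (local)  192.168.102.34  0%              Standby
2          192.168.102.33  100%            Active
"""


def parse_ha_mode(cphaprob_output):
    """Return the text inside the parentheses on the 'Cluster Mode' line, or None."""
    for line in cphaprob_output.splitlines():
        if "Cluster Mode" in line:
            match = re.search(r"\(([^)]*)\)", line)
            return match.group(1) if match else None
    return None


def preemption_enabled(cphaprob_output):
    """Assumption taken from the parser: 'Primary Up' means preemption is on."""
    return 1 if parse_ha_mode(cphaprob_output) == "Primary Up" else 0


print(parse_ha_mode(SAMPLE_OUTPUT))       # Active Up
print(preemption_enabled(SAMPLE_OUTPUT))  # 0
```

The sketch matches on the parenthesised text rather than on fixed field positions, which is a simplification of what the AWK parser does.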

Why is this important?
When preemption is enabled and the primary firewall fails, the secondary firewall takes the active role and starts forwarding traffic. However, as soon as the primary firewall comes back up, it immediately resumes the active role. If the primary fails and recovers repeatedly within a short period, every cycle causes two failovers, which can hurt performance. Best practice is therefore not to use preemption.
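To make the flapping argument concrete, here is a small toy simulation. It is not a model of ClusterXL itself: it assumes a two-member cluster whose secondary always stays healthy, and the function name `count_role_changes` is hypothetical. It counts how often the active role moves when the primary repeatedly crashes and recovers.

```python
def count_role_changes(primary_health, preempt):
    """Count how often the active role moves between two cluster members.

    primary_health: list of booleans, True when the primary is healthy at that step.
    preempt: if True, the primary reclaims the active role as soon as it is healthy.
    The secondary is assumed to stay healthy throughout (a simplification).
    """
    active = "primary"
    changes = 0
    for healthy in primary_health:
        if active == "primary" and not healthy:
            active = "secondary"  # failover to the secondary
            changes += 1
        elif active == "secondary" and healthy and preempt:
            active = "primary"    # failback triggered by preemption
            changes += 1
    return changes


# A primary that crashes and recovers three times in a row.
flapping = [True, False, True, False, True, False, True]

print(count_role_changes(flapping, preempt=True))   # 6
print(count_role_changes(flapping, preempt=False))  # 1
```

With preemption, every crash and every recovery moves the active role. Without it, the role moves once and stays on the secondary until an administrator fails it back.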

Without Indeni how would you find this?
An administrator can view the setting in Check Point SmartDashboard.

chkp-cphaprob-state-preempt

#! META
name: chkp-cphaprob-state-preempt
description: Determine whether cluster preemption is enabled and list the other cluster members
type: monitoring
monitoring_interval: 10 minutes
requires:
    vendor: checkpoint
    high-availability: "true"
    role-firewall: "true"
    vsx:
        neq: true
    or:
        -
            os.version: "R80.10"
        -
            os.version: "R80.20"

#! COMMENTS
known-devices:
    skip-documentation: true

cluster-preemption-enabled:
    why: |
        When preemption is enabled and the primary firewall fails, the secondary firewall takes the active role and starts forwarding traffic. However, as soon as the primary firewall comes back up, it immediately resumes the active role. If the primary fails and recovers repeatedly within a short period, every cycle causes two failovers, which can hurt performance. Best practice is therefore not to use preemption.
    how: |
        By parsing the output of the "cphaprob state" command and the $FWDIR/state/local/FW1/local.cluster_member file, it is possible to retrieve the preempt setting for the device.
    without-indeni: |
        An administrator can view the setting in Check Point SmartDashboard.
    can-with-snmp: false
    can-with-syslog: false
    vendor-provided-management: |
        An administrator can view the setting in Check Point SmartDashboard.

#! REMOTE::SSH
${nice-path} -n 15 ifconfig | grep inet | awk '/Bcast/ {gsub("addr:","", $2);print "local IP: "$2}'; ${nice-path} -n 15 cat $FWDIR/state/local/FW1/local.cluster_member | awk '/ip_address / {gsub(/^\(/,"",$2);gsub(/\)$/,"",$2); print "object IP: "$2}'; ${nice-path} -n 15 cat $FWDIR/state/local/FW1/local.cluster_member | grep "\: (" | awk '{gsub(/^\(/,"",$2);gsub(/\)$/,"",$2); print "object name: "$2}'; ${nice-path} -n 15 cphaprob state




#! PARSER::AWK

##################################################################################
#                            OUTPUT                                              #
##################################################################################
#local IP: 10.0.2.34
#local IP: 192.168.100.34
#local IP: 192.168.101.34
#local IP: 192.168.102.34
#object IP: 10.0.2.33
#object IP: 10.0.2.34
#object name: GW12
#object name: GW11
#
#Cluster Mode:   High Availability (Active Up) with IGMP Membership
#
#Number     Unique Address  Assigned Load   State
#
#1 (local)  192.168.102.34  0%              Standby
#2          192.168.102.33  100%            Active
#
#Local member is in current state since Sat Sep  8 19:21:00 2018
#
#
#
##################################################################################



BEGIN{
    ip_index = 0
    object_ip_index = 0
    object_name_index = 0
    local_ip_list[1] = ""
    cluster_memebers_ips[1] = ""
    cluster_memebers_name[1] = ""
    is_local[1] = 0
}

#local IP: 10.0.2.34
/^local IP/{
    ip_index++
    eth = $3    # fields: $1 = "local", $2 = "IP:", $3 = the address
    local_ip_list[ip_index] = eth
}

#object IP: 10.0.2.33
/^object IP/{
    object_ip_index++
    cluster_member_ip = $3
    cluster_memebers_ips[object_ip_index] = cluster_member_ip

}
#object name: GW12
/^object name/{
    object_name_index++
    member_name = $3
    cluster_memebers_name[object_name_index] = member_name
}

#Cluster Mode:   High Availability (Active Up) with IGMP Membership
/Cluster Mode/{
    HAmode = $5 " " $6
    sub( /^\(/ , "" , HAmode)
    sub( /\)$/ , "" , HAmode)
}




######## END tasks ########

END {
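    # Mark each cluster member as local if its IP matches one of this device's interface IPs.
    # Member IPs and member names are assumed to be listed in the same order.
    for (i = 1; i <= object_name_index; i++ )  {
        is_local[i] = 0
        for (j = 1; j <= ip_index ; j++ ) {
            if ( cluster_memebers_ips[i] == local_ip_list[j] ) {
                is_local[i] = 1
                break
            }
        }
    }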

    device_index = 1
    for (i = 1; i <= object_name_index; i++ )  {
        if ( is_local[i] == 0) {
            known_devices[device_index, "name"] =  cluster_memebers_name[i]
            known_devices[device_index, "ip"] = cluster_memebers_ips[i]
            wrote_known_devices = 1
            device_index++
        }
    }


    if (wrote_known_devices == 1) {
        writeComplexMetricObjectArray("known-devices", null, known_devices)
    }

    if ( HAmode == "Primary Up") {
        preemption_enabled = 1
    }
    else {
        preemption_enabled = 0
    }
    writeDoubleMetric("cluster-preemption-enabled", null, "gauge", 300, preemption_enabled)
 }
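
The known-devices part of the END block above can be hard to follow in AWK. The following Python sketch shows the same idea under the same assumption the parser makes, namely that member IPs and member names appear in matching order. The function name `remote_cluster_members` is hypothetical, and the input values are copied from the sample output in the script's comments.

```python
def remote_cluster_members(local_ips, member_ips, member_names):
    """Return the cluster members whose IP is not one of this device's own IPs.

    Members are paired by position, which assumes the IP list and the name
    list are in the same order (the AWK parser makes the same assumption).
    """
    local = set(local_ips)
    return [
        {"name": name, "ip": ip}
        for ip, name in zip(member_ips, member_names)
        if ip not in local
    ]


# Values copied from the sample output in the script's comments.
local_ips = ["10.0.2.34", "192.168.100.34", "192.168.101.34", "192.168.102.34"]
member_ips = ["10.0.2.33", "10.0.2.34"]
member_names = ["GW12", "GW11"]

print(remote_cluster_members(local_ips, member_ips, member_names))
# [{'name': 'GW12', 'ip': '10.0.2.33'}]
```

Only the peer member is reported as a known device. The local member is filtered out because one of its interface IPs matches.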


cross_vendor_cluster_preempt

package com.indeni.server.rules.library

import com.indeni.apidata.time.TimeSpan
import com.indeni.ruleengine.expressions.conditions.Equals
import com.indeni.ruleengine.expressions.core._
import com.indeni.ruleengine.expressions.data.{SelectTagsExpression, SelectTimeSeriesExpression, TimeSeriesExpression}
import com.indeni.server.common.data.conditions.True
import com.indeni.server.rules.library.core.PerDeviceRule
import com.indeni.server.rules.{RuleContext, _}
import com.indeni.server.sensor.models.managementprocess.alerts.dto.AlertSeverity


case class ClusterPreemptionEnabledRule(context: RuleContext) extends PerDeviceRule with RuleHelper {

  override val metadata: RuleMetadata = RuleMetadata.builder("cross_vendor_cluster_preempt", "Clustered Devices: Cluster has preemption enabled",
    "Preemption is generally a bad idea in clustering, although sometimes it is the default setting. indeni will trigger an issue if it's on.",
    AlertSeverity.WARN).interval(TimeSpan.fromMinutes(5)).build()


  override def expressionTree: StatusTreeExpression = {
    val inUseValue = TimeSeriesExpression[Double]("cluster-preemption-enabled").last

    StatusTreeExpression(
      // Which objects to pull (normally, devices)
      SelectTagsExpression(context.metaDao, Set(DeviceKey), True),

      StatusTreeExpression(
        // The time-series we check the test condition against:
        SelectTimeSeriesExpression[Double](context.tsDao, Set("cluster-preemption-enabled"), denseOnly = false),

        // The condition which, if true, we have an issue. Checked against the time-series we've collected
        Equals(
          inUseValue,
          ConstantExpression[Option[Double]](Some(1.0)))
      ).withoutInfo().asCondition()

      // Details of the alert itself
    ).withRootInfo(
      getHeadline(),
      ConstantExpression("This cluster member has preemption enabled. This means that it will have priority over other cluster members. If this device reboots or crashes, it'll try to assume priority in the cluster when it finishes its boot process. This may result in it crashing again, and causing a preemption loop."),
      ConditionalRemediationSteps("It is generally best to have preemption disabled. Instead, once this device returns from a crash, you can conduct the failover manually.",
        ConditionalRemediationSteps.VENDOR_PANOS ->
          """|Palo Alto Networks firewalls have a special way of handling preemption loops, review the following article:
             |<a target="_blank" href="https://live.paloaltonetworks.com/t5/Learning-Articles/Understanding-Preemption-with-the-Configured-Device-Priority-in/ta-p/53398">Understanding Preemption with the Configured Device Priority in HA Active/Passive Mode</a>.""".stripMargin,
        ConditionalRemediationSteps.OS_NXOS ->
          """|FHRP preemption and delays features are not required. The vPC will forward traffic as soon as the links become available. Once a device recovers from a crash or reboot, you can conduct the failover manually.
             |Cisco recommends:
             |1. Configuring the FHRP with the default settings and without preempt when using vPC.
             |2. Make the vPC primary switch the FHRP active switch. This is not intended to improve performance or stability. It does make one switch responsible for the control plane traffic. This is a little easier on the administrator while troubleshooting.""".stripMargin,
        ConditionalRemediationSteps.VENDOR_JUNIPER ->
          """1. Generally, it is recommended to have preemption disabled. Instead, once this device returns from a crash, you can conduct the failover manually.
            |2. If preemption is added to a redundancy group configuration, the device with the high priority in the group can initiate a failover to become a master.
            |3. On the device command line interface execute "request chassis cluster failover node"  or  "request chassis cluster failover redundancy-group"  commands to override the priority setting and preemption.
            |4. Review the following article on Juniper TechLibrary for more information: <a target="_blank" href="https://www.juniper.net/documentation/en_US/junos/topics/reference/command-summary/request-chassis-cluster-failover-node.html">Operational Commands: request chassis cluster failover node</a>""".stripMargin
      )
    )
  }
}