Cluster configuration not synced-paloaltonetworks-panos

error
high-availability
panos
paloaltonetworks
#1

Vendor: paloaltonetworks

OS: panos

Description:
For devices that support full configuration synchronization, Indeni will trigger an issue if the configuration is out of sync.

Remediation Steps:
Log into the device and synchronize the configuration across the cluster.

How does this work?
This script uses the Palo Alto Networks API to retrieve the status of the high availability function of this cluster and specifically the status of the config synchronization.
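
As a rough illustration of the check described above, here is a minimal Python sketch. It is not the Indeni script itself. The sample XML is a hand-written, trimmed assumption modelled on the path the script's parser reads (/response/result/group/running-sync), and build_ha_url and config_synced are hypothetical helper names. The host name and API key are placeholders.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Operational command used by the monitoring script.
HA_CMD = "<show><high-availability><all></all></high-availability></show>"

# Hand-written sample response (an assumption, trimmed to the one field used here).
SAMPLE_RESPONSE = """\
<response status="success">
  <result>
    <group>
      <running-sync>synchronized</running-sync>
    </group>
  </result>
</response>
"""


def build_ha_url(host: str, api_key: str) -> str:
    """Build the HTTPS API URL, URL-encoding the XML command and the key."""
    query = urlencode({"type": "op", "cmd": HA_CMD, "key": api_key})
    return f"https://{host}/api?{query}"


def config_synced(xml_text: str) -> float:
    """Return 1.0 if running-sync is 'synchronized', otherwise 0.0."""
    root = ET.fromstring(xml_text)
    # The root element is <response>, so this mirrors
    # /response/result/group/running-sync from the parser section.
    value = root.findtext("result/group/running-sync", default="")
    # Surrounding whitespace is stripped here as a convenience of this sketch.
    return 1.0 if value.strip() == "synchronized" else 0.0


if __name__ == "__main__":
    print(build_ha_url("firewall.example.com", "PLACEHOLDER_KEY"))
    print(config_synced(SAMPLE_RESPONSE))
```

A missing running-sync element is treated as not synced in this sketch, which errs on the side of raising the issue.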

Why is this important?
Normally, two Palo Alto Networks firewalls in a cluster work together to keep their configurations synchronized. Sometimes, due to connectivity or other issues, the configuration sync may be lost. In the event of a failover, the secondary member will take over but will be running a different configuration from the primary (the original active member). This can result in service disruption.

Without Indeni how would you find this?
The status of configuration sync is visible in the web interface, as a widget on the main screen.

panos-show-high-availability-all-monitoring

#! META
name: panos-show-high-availability-all-monitoring
description: Track health of HA
type: monitoring
monitoring_interval: 5 minute
requires:
    vendor: "paloaltonetworks"
    os.name: "panos"
    high-availability: "true"
    product: "firewall"

#! COMMENTS
cluster-member-active:
    why: |
        Tracking the state of a cluster member is important. If a cluster member which used to be the active member of the cluster no longer is, it may be the result of an issue. In some cases, it is due to maintenance work (and so was anticipated), but in others it may be due to a failure in the firewall or another component in the network.
    how: |
        This script uses the Palo Alto Networks API to retrieve the status of the high availability function of the firewall and specifically retrieves the local member's state.
    without-indeni: |
        The status of high availability is visible in the web interface, as a widget on the main screen.
    can-with-snmp: true
    can-with-syslog: true
cluster-state:
    why: |
        Tracking the state of a cluster is important. If a cluster which used to be healthy no longer is, it may be the result of an issue. In some cases, it is due to maintenance work (and so was anticipated), but in others it may be due to a failure in the members of the cluster or another component in the network.
    how: |
        This script uses the Palo Alto Networks API to retrieve the status of the high availability function of the cluster and specifically retrieves the local member's and peer's states.
    without-indeni: |
        The status of high availability is visible in the web interface, as a widget on the main screen.
    can-with-snmp: true
    can-with-syslog: true
cluster-preemption-enabled:
    why: |
        Preemption makes the designated primary member of the cluster always try to become the active member. The risk is that if a primary member with preemption enabled has a critical failure and reboots, the cluster fails over to the secondary and then immediately fails back to the primary once the reboot completes. If the primary crashes again, the cycle can repeat in a loop. Palo Alto Networks firewalls have a means of dealing with this ( https://live.paloaltonetworks.com/t5/Learning-Articles/Understanding-Preemption-with-the-Configured-Device-Priority-in/ta-p/53398 ), but it is generally a good idea to leave the preemption feature disabled.
    how: |
        This script uses the Palo Alto Networks API to retrieve the status of the high availability function of this cluster member and specifically the preemption setting.
    without-indeni: |
        Going into a preemption loop is difficult to detect. Normally an administrator will notice service disruption. Then through manual inspection the administrator will determine there is a preemption loop.
    can-with-snmp: true
    can-with-syslog: true
cluster-config-synced:
    why: |
        Normally, two Palo Alto Networks firewalls in a cluster work together to keep their configurations synchronized. Sometimes, due to connectivity or other issues, the configuration sync may be lost. In the event of a failover, the secondary member will take over but will be running a different configuration from the primary (the original active member). This can result in service disruption.
    how: |
        This script uses the Palo Alto Networks API to retrieve the status of the high availability function of this cluster and specifically the status of the config synchronization.
    without-indeni: |
        The status of configuration sync is visible in the web interface, as a widget on the main screen.
    can-with-snmp: true
    can-with-syslog: true
device-is-passive:
    why: |
        This metric describes whether this device is the passive member. Port-down alerts should not be triggered for a passive device.
    how: |
        This script uses the Palo Alto Networks API to retrieve the active/passive state of the device.
    without-indeni: |
        The active/passive status is visible in the web interface.
    can-with-snmp: true
    can-with-syslog: true
passive-link-state:
    why: |
        This metric describes whether passive-link-state is set to shutdown or auto. When it is set to shutdown, ports in a power-down state are expected behavior, so this metric can be used to avoid triggering alerts for them.
    how: |
        This script uses the Palo Alto Networks API to retrieve the passive-link-state setting of the device.
    without-indeni: |
        The passive-link-state setting can be found via the web interface or the CLI.
    can-with-snmp: true
    can-with-syslog: true

#! REMOTE::HTTP
url: /api?type=op&cmd=<show><high-availability><all></all></high-availability></show>&key=${api-key}
protocol: HTTPS

#! PARSER::XML
_vars:
    root: /response/result
_metrics:
    -
        _tags:
            "im.name":
                _constant: "cluster-member-active"
            "name":
                _constant: "Firewall Clustering"
            "live-config":
                _constant: "true"
            "display-name":
                _constant: "Cluster Member State (this)"
            "im.dstype.displayType":
                _constant: "state"
        _temp:
            state:
                _text: "${root}/group/local-info/state"
        _transform:
            _value.double: |
                {
                    if (temp("state") ~ /^(active|active-primary|active-secondary)/) {
                        print "1"
                    } else {
                        print "0"
                    }
                }
    -
        _tags:
            "im.name":
                _constant: "device-is-passive"
        _temp:
            state:
                _text: "${root}/group/local-info/state"
        _transform:
            _value.double: |
                {
                    if (temp("state") ~ /^(active|active-primary|active-secondary)/) {
                        print "0"
                    } else {
                        print "1"
                    }
                }
    -
        _tags:
            "im.name":
                _constant: "passive-link-state"
        _temp:
            "passivelinkstate":
                _count: "${root}/group/local-info/active-passive/passive-link-state[. = 'shutdown']"
        _transform:
            _value.double: |
                {
                    if (temp("passivelinkstate") > 0) {
                        print "0"
                    } else {
                        print "1"
                    }
                }
    -
        _tags:
            "im.name":
                _constant: "cluster-state"
            "name":
                _constant: "Firewall Clustering"
            "live-config":
                _constant: "true"
            "display-name":
                _constant: "Cluster State"
            "im.dstype.displayType":
                _constant: "state"
        _temp:
            localstate:
                _text: "${root}/group/local-info/state"
            peerstate:
                _text: "${root}/group/peer-info/state"
        _transform:
            _value.double: |
                {
                    if (temp("localstate") != "down" && temp("peerstate") != "down" && temp("peerstate") != "unknown" && temp("peerstate") != "suspended") {
                        print "1"
                    } else {
                        print "0"
                    }
                }
    -
        _tags:
            "im.name":
                _constant: "cluster-config-synced"
            "live-config":
                _constant: "true"
            "display-name":
                _constant: "Cluster Configuration Synced"
            "im.dstype.displayType":
                _constant: "boolean"
        _temp:
            runningsync:
                _text: "${root}/group/running-sync"
        _transform:
            _value.double: |
                {
                    if (temp("runningsync") == "synchronized") {
                        print "1"
                    } else {
                        print "0"
                    }
                }
    -
        _tags:
            "im.name":
                _constant: "cluster-preemption-enabled"
            "live-config":
                _constant: "true"
            "display-name":
                _constant: "Preemption Enabled"
            "im.dstype.displayType":
                _constant: "boolean"
        _temp:
            preemptive:
                _text: "${root}/group/local-info/preemptive"
        _transform:
            _value.double: |
                {
                    if (temp("preemptive") == "yes") {
                        print "1"
                    } else {
                        print "0"
                    }
                }
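
To make the state-based metrics above easier to follow, here is a minimal Python sketch that derives the same five values from a response. It is an illustration only, not part of the script: the sample XML is a hand-written assumption containing just the elements the parser reads, and ha_metrics is a hypothetical helper name. The comparisons mirror the AWK conditions in the parser section.

```python
import re
import xml.etree.ElementTree as ET

# Same pattern as the script: any local state that begins with "active".
ACTIVE_RE = re.compile(r"^(active|active-primary|active-secondary)")

# Hand-written sample response (an assumption, not captured device output).
SAMPLE_RESPONSE = """\
<response status="success">
  <result>
    <group>
      <local-info>
        <state>passive</state>
        <preemptive>no</preemptive>
        <active-passive>
          <passive-link-state>shutdown</passive-link-state>
        </active-passive>
      </local-info>
      <peer-info>
        <state>active</state>
      </peer-info>
    </group>
  </result>
</response>
"""


def ha_metrics(xml_text: str) -> dict:
    """Derive the state-based metric values from an HA status response."""
    root = ET.fromstring(xml_text)
    local = root.findtext("result/group/local-info/state", default="")
    peer = root.findtext("result/group/peer-info/state", default="")
    preemptive = root.findtext("result/group/local-info/preemptive", default="")
    # Count passive-link-state elements whose text is exactly "shutdown".
    link_nodes = root.findall(
        "result/group/local-info/active-passive/passive-link-state"
    )
    shutdown_count = sum(1 for node in link_nodes if node.text == "shutdown")

    is_active = bool(ACTIVE_RE.match(local))
    cluster_ok = local != "down" and peer not in ("down", "unknown", "suspended")
    return {
        "cluster-member-active": 1.0 if is_active else 0.0,
        "device-is-passive": 0.0 if is_active else 1.0,
        # 0 means passive-link-state is "shutdown", 1 means anything else.
        "passive-link-state": 0.0 if shutdown_count > 0 else 1.0,
        "cluster-state": 1.0 if cluster_ok else 0.0,
        "cluster-preemption-enabled": 1.0 if preemptive == "yes" else 0.0,
    }


if __name__ == "__main__":
    for name, value in ha_metrics(SAMPLE_RESPONSE).items():
        print(name, value)
```

Note that, as in the script, cluster-state only checks the local member for "down", while the peer is also checked for "unknown" and "suspended".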

cluster_config_unsynced

https://bitbucket.org/indeni/indeni-knowledge/src/master/rules/sync_core_rules/ClusterConfigNotSyncedRule.scala

#2

:+1: This is a very valuable notification.
