Cluster configuration not synced-f5-False

error
high-availability
false
f5

Vendor: f5

OS: False

Description:
For devices that support full configuration synchronization, indeni will trigger an issue if the configuration is out of sync.

Remediation Steps:
Log into the device and synchronize the configuration across the cluster.
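
On a BIG-IP this is usually done from tmsh. The sketch below only prints the commands for review instead of running them. The helper name `print_sync_commands` and the device group name `MyDeviceGroup` are placeholders, and the exact syntax is an assumption that should be checked against the tmsh reference for the installed version.

```shell
# Hypothetical helper: prints (does not execute) the tmsh commands an
# administrator might run on the unit that holds the newest configuration.
# $1 is the name of the sync-failover device group (placeholder value).
print_sync_commands() {
    device_group="$1"
    # Check the current sync state first.
    echo "tmsh show cm sync-status"
    # Push the local configuration to the other members of the group.
    echo "tmsh run cm config-sync to-group ${device_group}"
}

print_sync_commands "MyDeviceGroup"
```

Printing the commands first leaves the choice of sync direction to the administrator, since pushing from the wrong unit would overwrite the newer configuration.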

How does this work?
This alert logs into the F5 device through SSH and retrieves the current state of the configuration synchronization.
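
As a rough illustration of that retrieval, the snippet below pipes sample `tmsh -q show cm` output (copied from the comments in the f5-show-cm parser) through a small awk filter. It prints 1 when the `CM::Sync Status` section reports `In Sync` and 0 otherwise. The function name `sync_status_metric` is hypothetical, and the filter is a simplification of the parser, not the shipped script.

```shell
# Hypothetical helper: reads "tmsh -q show cm" output on stdin and prints
# 1 if the CM::Sync Status section reports "In Sync", otherwise 0.
sync_status_metric() {
    awk '
        /CM::Sync Status/ { in_sync_section = 1; next }
        /^$/              { in_sync_section = 0 }
        in_sync_section && $1 == "Status" {
            if (($2 " " $3) == "In Sync") {
                print 1
            } else {
                print 0
            }
        }
    '
}

# Sample output taken from the comments in the f5-show-cm parser.
sample='------------------------------------------------------
CM::Sync Status
------------------------------------------------------
Color    green
Status   In Sync
Summary  All devices in the device group are in sync
'
printf '%s\n' "$sample" | sync_status_metric
```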

Why is this important?
It is normally desirable for clusters to have their configuration synced. Otherwise, changes made on one node in a cluster might not be active after a failover, which can disrupt traffic.

Without Indeni how would you find this?
An unsynced cluster could be detected during a failover, when an outdated configuration is suddenly activated and causes issues with application delivery. An administrator could verify the sync state by logging into the web interface of the device and looking in the upper left corner. A synced cluster shows “In Sync”.

f5-show-cm

#! META
name: f5-show-cm
description: Get cluster information
type: monitoring
monitoring_interval: 5 minutes
requires:
    vendor: "f5"
    high-availability: "true"
    product: "load-balancer"
    shell: "bash"

#! COMMENTS
known-devices:
    why: |
        To make it easier to add devices to indeni, the cluster members are extracted.
    how: |
        This alert logs into the F5 device through SSH and extracts the known cluster members.
    without-indeni: |
        This wouldn't be relevant without indeni.
    can-with-snmp: false
    can-with-syslog: false
cluster-state:
    why: |
        Tracking the state of a cluster is important. If a cluster which used to be healthy no longer is, it may be the result of an issue. In some cases, it is due to maintenance work (and so was anticipated), but in others it may be due to a failure in the members of the cluster or another component in the network.
    how: |
        This alert logs into the F5 device through SSH to verify that each traffic group has an active member.
    without-indeni: |
        Problems with the cluster state are generally detected when the units in question stop processing traffic. An administrator could verify that each traffic group has an active member by logging into the device through SSH, entering TMSH and executing the command "show cm". This would bring up details of the cluster state.
    can-with-snmp: true
    can-with-syslog: false
cluster-member-active:
    why: |
        Tracking the state of a cluster member is important. If a cluster member which used to be the active member of the cluster no longer is, it may be the result of an issue. In some cases, it is due to maintenance work (and so was anticipated), but in others it may be due to a failure in the device or another component in the network.
    how: |
        This alert logs into the F5 device through SSH and retrieves the local member's state.
    without-indeni: |
        An unplanned change of a cluster member's state could be detected through traffic disruptions. An administrator could verify the cluster member state by logging into the web interface of the device and looking in the upper left corner. An active device would show "ACTIVE".
    can-with-snmp: true
    can-with-syslog: false
cluster-config-synced:
    why: |
        It is normally desirable for clusters to have their configuration synced. Otherwise, changes made on one node in a cluster might not be active after a failover, which can disrupt traffic.
    how: |
        This alert logs into the F5 device through SSH and retrieves the current state of the configuration synchronization.
    without-indeni: |
        An unsynced cluster could be detected during a failover, when an outdated configuration is suddenly activated and causes issues with application delivery. An administrator could verify the sync state by logging into the web interface of the device and looking in the upper left corner. A synced cluster shows "In Sync".
    can-with-snmp: true
    can-with-syslog: false


#! REMOTE::SSH
tmsh -q show cm

#! PARSER::AWK

BEGIN {
        cluster_state = 0
        iKnown_devices = 0
}

#Detect when we're in the "failover_status" section
#CM::Failover Status
/CM::Failover Status/{
        section = "failover_status"
}

#Detect when we're in the "sync_status" section
#CM::Sync Status
/CM::Sync Status/{
        section = "sync_status"
}

#Detect when we're in the "device_status" section
#CentMgmt::Device: device1.mydomain.local
/CentMgmt::Device:/ {
        section = "device_status"
        known_devicename = $2
}

#Detect when we're in the "traffic_groups" section
#CM::Traffic-Group
/::Traffic-Group/ {
    if (/CM::Traffic-Group/) {
        section = "traffic_groups"
    } else {
        # If it's running v13 or newer, the header starts with CentMgmt::Traffic-Group and the output has a different layout
        section = "traffic_groups"
        version = "13plus"
   }

}

#Match on non-empty lines
/^.+$/ {

    #-------------------------------------------
    #CM::Failover Status
    #-------------------------------------------
    #Color    green
    #Status   ACTIVE
    #Summary  1/1 active
    #Details
    # active for /Common/traffic-group-1
    # active for /Common/traffic-group-2

    if (section == "failover_status" && $1 == "active") {

        #Split the traffic group name on "/" and save the number of elements to "n"
        n=split($NF, trafficGroupArr, "/");

        #Get the traffic group name excluding the partition
        trafficGroup = trafficGroupArr[3];

        #Add the traffic group to the ones that the device is active in
        deviceActivetraffic_groups[trafficGroup] = 1;

    }

    #------------------------------------------------------------------------------------------
    #CM::Sync Status
    #------------------------------------------------------------------------------------------
    #Color    green
    #Status   In Sync
    #Summary  All devices in the device group are in sync
    #Details
    #       /Common/device1.mydomain.local: connected (for 3568561 seconds)
    #       /Common/DeviceGroupName (In Sync): All devices in the device group are in sync
    #       /Common/device_trust_group (In Sync): All devices in the device group are in sync

    #Write metrics for sync_status
    if (section == "sync_status" && $1 == "Status") {

        if (($2 " " $3) == "In Sync") {
                writeDoubleMetric("cluster-config-synced", null, "gauge", 300, 1)
        } else {
                writeDoubleMetric("cluster-config-synced", null, "gauge", 300, 0)
        }
    }

    #------------------------------------------------------------------------------------------
    #CentMgmt::Device: device1.mydomain.local
    #------------------------------------------------------------------------------------------
    #Mgmt Ip               10.20.11.125
    #Configsync Ip         10.253.251.1
    #Hostname              device1.mydomain.local
    #Base Mac              00:01:d7:ef:79:03
    #Mirror Ip             10.253.251.1
    #Mirror Secondary Ip   ::
    #Multicast Interface
    #Multicast IP          0.0.0.0
    #Multicast Port        0
    #Version               11.5.4
    #Product               BIG-IP
    #Edition               Hotfix HF2
    #Build                 2.0.291
    #Marketing Name        BIG-IP vCMP Guest
    #Platform Id           Z101
    #Chassis Id            chs400354s
    #Active Modules        Compression, Unlimited|KCTMAZP-PRTKSMP LTM, Base, C2400|AOGBJCA-PFRYCXK|IPV6 Gateway|Rate Shaping|Ram Cache|50 MBPS COMPRESSION|SSL, 500 TPS Per Core|Cluster Multi-Processing|APM, Limited, Viprion|App Tunnel|Anti-Virus Checks|Base Endpoint Security Checks|Firewall Checks|Machine Certificate Checks|Network Access|Protected Workspace|Remote Desktop|Secure Virtual Keyboard|APM, Web Application|SSL, C2200/C2400|AAM, Core VCMP Enabled, C2400|UNZRQBY-YXIYLBJ
    #Inactive Modules
    #Optional Modules      Acceleration Manager, C2400 ADC, Security Bundle, C2400 Advanced Protocols AFM, C2400 APM, Base, C2400 APM, Max Access Sessions, C2400 APM, Max CCU, C2400 App Mode (TMSH Only, No Root/Bash) ASM, Bundle, VIPRION ASM, PSM to ASM Upgrade ASM, Unlimited, VIPRION Best Bundle, C2200 / C2400 Platforms Better Bundle, C2200 / C2400 Platforms Better to Best Bundle, C2200 / C2400 Platforms CGN, Viprion Client Authentication DNS and GTM (1K QPS), VIPRION DNS and GTM (Unlimited), VIPRION DNS Services, VPR External Interface and Network HSM FIX Low Latency GTM IPI Subscription, 1Yr, C2400 IPI Subscription, 3Yr, C2400 MSM, Unlimited Mailboxes PEM URL Filtering, Subscription, 1Yr, C2400 PEM URL Filtering, Subscription, 3Yr, C2400 PEM, C2400 PEM, Quota Management, C2X00 Performance Extreme, VPR PSM, Base Routing Bundle SDN Services SSL, Forward Proxy SSL, Unlimited, C2400/C4400/C4480 SWG 1Yr, C2200/C2400, 60K URL Sessions SWG 3Yr, C2200/C2400, 60K URL Sessions SWG Subscription, 1Yr, C2200/C2400 SWG Subscription, 3Yr, C2200/C2400 URL Filtering Subscription, 1Yr, C2200/C2400 URL Filtering Subscription, 3Yr, C2200/C2400 URLF 1Yr, C2200/C2400, 60K URL Sessions URLF 3Yr, C2200/C2400, 60K URL Sessions VIPRION, Multicast Routing WBA, Bundle, C2400
    #Time Limited Modules
    #Location
    #Contact
    #Description
    #Comment
    #Time zone             CEST
    #Self device           1
    #Failover State        active

    #Write metrics for device status
    if (section == "device_status") {

        #Version 11 uses "Failover State", Version 12 uses "Device HA State"
        if (match($0, /((Failover State)|(Device HA State))\s+active/)) {
                cluster_state = 1
        }

        #Get the management IP of the device
        #Mgmt Ip               10.20.11.125
        if (match($0, /Mgmt Ip/)) {
                knownDeviceManagementIP = $NF
        }

    }

    if (section == "traffic_groups") {
        #--------------------------------------------------------------------------
        #CentMgmt::Traffic-Group
        #Name                      Device         Status   Next    Previous  Active
        #                                                  Active  Active    Reason
        #--------------------------------------------------------------------------
        #traffic-group-1           device1.local  standby  true    true      -
        #traffic-group-1           device2.local  active   false   false     -
        #traffic-group-local-only  -              -        -       -         -

        # For version 13 and newer:
        if ( version == "13plus" ) {
            if (match($0, /[^\s]+\s+[^\s]+\s+[^\s]+\s+(true|false)+\s+(true|false)+\s+[^\s]$/)) {

                trafficGroupTags["name"] = $1

                #Save the active traffic groups in an array
                #This will be used further down to determine cluster state
                if ($3 == "active") {
                    clusterActivetraffic_groupsArr[$1] = 1
                }

                #Add the traffic group to the list if it's not already there
                if (!($1 in clusterTrafficGroupArr)) {
                    clusterTrafficGroupArr[$1] = 1
                }

                #If the traffic group is in the active traffic groups it means that this member is active, otherwise not
                #Had to write it this way since the command only shows the name of the device that is active, not if it is itself

            }

        #---------------------------------------------------------------
        #CM::Traffic-Group
        #Name                      Device                Status   Next
        #                          Active
        #---------------------------------------------------------------
        #traffic-group-2           device1.mydomain.local  active   false
        #traffic-group-1           device2.mydomain.local  standby  true
        #traffic-group-3           device1.mydomain.local  standby  true
        #traffic-group-3           device2.mydomain.local  active   false
        #traffic-group-1           device1.mydomain.local  active   false
        #traffic-group-1           device2.mydomain.local  standby  true
        #traffic-group-local-only  -                     -        -

        # For versions below version 13:
        } else {
            if (match($0, /[^\s]+\s+[^\s]+\s+[^\s]+\s+(true|false)$/)) {

                trafficGroupTags["name"] = $1

                #Save the active traffic groups in an array
                #This will be used further down to determine cluster state
                if ($3 == "active") {
                    clusterActivetraffic_groupsArr[$1] = 1
                }

                #Add the traffic group to the list if it's not already there
                if (!($1 in clusterTrafficGroupArr)) {
                    clusterTrafficGroupArr[$1] = 1
                }

                #If the traffic group is in the active traffic groups it means that this member is active, otherwise not
                #Had to write it this way since the command only shows the name of the device that is active, not if it is itself

            }

        }

    }

}

#An empty line signifies the end of a section
/^$/{

    #Check which section ended and write metrics if applicable
    if (section == "device_status") {
            iKnown_devices++
            known_devices[iKnown_devices, "name"] = known_devicename
            known_devices[iKnown_devices, "ip"] = knownDeviceManagementIP
    }

    #Reset the section variable
    section = ""
}

END {

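    #Report, per traffic group, whether this device is the active member.
    #Fields such as $1 are not reliable inside END, so the traffic group names collected above are used instead.
    for (trafficGroup in clusterTrafficGroupArr) {
        trafficGroupTags["name"] = trafficGroup
        if (trafficGroup in deviceActivetraffic_groups) {
            writeDoubleMetric("cluster-member-active", trafficGroupTags, "gauge", 300, 1)
        } else {
            writeDoubleMetric("cluster-member-active", trafficGroupTags, "gauge", 300, 0)
        }
    }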



    activetraffic_groups = 1

    #Verify that all traffic groups have an active unit
    for(trafficGroup in clusterTrafficGroupArr) {

        trafficGroupTags["name"] = trafficGroup

        if (trafficGroup in clusterActivetraffic_groupsArr) {
            writeDoubleMetric("cluster-state", trafficGroupTags, "gauge", 300, 1)
        } else {
            writeDoubleMetric("cluster-state", trafficGroupTags, "gauge", 300, 0)
        }
    }

    #Example of the resulting known-devices value:
    #[ {
    #  "ip" : "10.0.155.125",
    #  "name" : "device1.mydomain.local"
    #}, {
    #  "name" : "device2.mydomain.local",
    #  "ip" : "10.0.155.135"
    #} ]

    writeComplexMetricObjectArray("known-devices", null, known_devices)
}
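
The way the parser derives traffic group names from the failover details can be tried in isolation. The snippet below is a minimal sketch, assuming the `CM::Failover Status` layout shown in the parser's comments. The function name `active_traffic_groups` is hypothetical and the filter covers only this one section.

```shell
# Hypothetical helper: prints the traffic groups this device is active for,
# based on the "active for /<partition>/<traffic-group>" detail lines.
active_traffic_groups() {
    awk '
        /CM::Failover Status/ { in_failover = 1; next }
        /^$/                  { in_failover = 0 }
        in_failover && $1 == "active" {
            # "/Common/traffic-group-1" splits into "", "Common", "traffic-group-1"
            split($NF, parts, "/")
            print parts[3]
        }
    '
}

# Sample output taken from the comments in the parser.
failover_sample='CM::Failover Status
Color    green
Status   ACTIVE
Summary  1/1 active
Details
 active for /Common/traffic-group-1
 active for /Common/traffic-group-2
'
printf '%s\n' "$failover_sample" | active_traffic_groups
```

The match on `$1 == "active"` is case-sensitive, so the `Status   ACTIVE` and `Summary  1/1 active` lines are skipped and only the detail lines contribute names.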


cluster_config_unsynced

package com.indeni.server.rules.library

import com.indeni.ruleengine.expressions.conditions.{And, EndsWithRepetition, Equals}
import com.indeni.ruleengine.expressions.core._
import com.indeni.ruleengine.expressions.data.{SelectTagsExpression, SelectTimeSeriesExpression, TimeSeriesExpression}
import com.indeni.server.common.data.conditions.True
import com.indeni.server.rules.library.core.PerDeviceRule
import com.indeni.server.rules.{RuleContext, _}
import com.indeni.server.sensor.models.managementprocess.alerts.dto.AlertSeverity


case class ClusterConfigNotSyncedRule(context: RuleContext) extends PerDeviceRule with RuleHelper {

  override val metadata: RuleMetadata = RuleMetadata.builder("cluster_config_unsynced", "Clustered Devices: Cluster configuration not synced",
    "For devices that support full configuration synchronization, indeni will trigger an issue if the configuration is out of sync.", AlertSeverity.ERROR).build()

  override def expressionTree: StatusTreeExpression = {
    val tsToTestAgainst = TimeSeriesExpression[Double]("cluster-config-synced")
    val activeMemberValue = TimeSeriesExpression[Double]("cluster-member-active").last

    StatusTreeExpression(
      // Which objects to pull (normally, devices)
      SelectTagsExpression(context.metaDao, Set(DeviceKey), True),

      StatusTreeExpression(
        // The time-series we check the test condition against:
        SelectTimeSeriesExpression[Double](context.tsDao, Set("cluster-config-synced", "cluster-member-active"), denseOnly = false),

        // The condition which, if true, we have an issue. Checked against the time-series we've collected
        And(
          EndsWithRepetition(tsToTestAgainst, ConstantExpression(0.0), 3),
          Equals(activeMemberValue, ConstantExpression[Option[Double]](Some(1.0)))
        )
      ).withoutInfo().asCondition()
    ).withRootInfo(
      getHeadline(),
      ConstantExpression("The configuration has been changed on this device, but has not yet been synced to other members of the cluster. This may result in unexpected behavior of other cluster members should this member go down."),
      ConditionalRemediationSteps("Log into the device and synchronize the configuration across the cluster.",
        ConditionalRemediationSteps.OS_NXOS ->
          """|1. Log into the device and review the FHRP configuration across the vPC cluster, if it is configured.
             |2. Execute the "show hsrp brief" command to check the HSRP state and configuration of the cluster.
             |3. Execute the "show vrrp detail" command to check the VRRP state and configuration of the cluster.
             |4. Compare the configuration of the vPC peer switches by reviewing the "show run vpc" command output from both peers, and synchronize it where it differs.
             |5. Execute the "show vpc consistency-parameters" command and review the output. Ensure that the type 1 and type 2 vPC consistency parameters match. If type 1 parameters do not match, vPC is suspended. Type 2 parameters do not have to match on both Nexus 5000 switches for the vPC to be operational.
             |6. Check that there are no unsaved configuration changes by running the "show running-config diff" NX-OS command.
             |7. Log into both peers and save the configuration with the "copy running-config startup-config" NX-OS command.""".stripMargin,
        ConditionalRemediationSteps.VENDOR_JUNIPER ->
          """|1. Run the "show chassis cluster information configuration-synchronization" command to review the configuration synchronization status of a chassis cluster (Junos OS Release 12.1X47-D10 or later).
             |2. Check the activation and last sync status if these options are enabled.
             |3. Check the link connectivity.
             |4. Check the cluster configuration for synchronization.
             |5. Review this article on Juniper TechLibrary: <a target="_blank" href="https://www.juniper.net/documentation/en_US/junos/topics/reference/command-summary/show-chassis-cluster-information-detail-config-sync.html">Operational Commands</a>
             |6. Contact Juniper Networks Technical Assistance Center (JTAC) if further assistance is required.""".stripMargin
      )
    )
  }
}
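
The alert condition in the rule can be approximated outside the rule engine. The sketch below assumes that `EndsWithRepetition(ts, 0.0, 3)` means "the last three samples of the series are all 0", which is an interpretation of the name rather than documented behaviour. `should_alert` is a hypothetical helper that works on plain integer samples.

```shell
# Hypothetical helper: $1 is the latest cluster-member-active value, the
# remaining arguments are cluster-config-synced samples, oldest first.
# Prints "alert" when the device is active and the last three samples are 0.
should_alert() {
    active="$1"; shift
    # Only the active member raises the issue.
    [ "$active" = "1" ] || { echo "ok"; return; }
    # Fewer than three samples cannot satisfy the repetition condition.
    [ "$#" -ge 3 ] || { echo "ok"; return; }
    # Drop everything except the last three samples.
    while [ "$#" -gt 3 ]; do shift; done
    if [ "$1" = "0" ] && [ "$2" = "0" ] && [ "$3" = "0" ]; then
        echo "alert"
    else
        echo "ok"
    fi
}

should_alert 1 1 0 0 0   # prints "alert"
should_alert 1 0 0 1 0   # prints "ok"
should_alert 0 0 0 0     # prints "ok" (standby member)
```

Requiring three consecutive out-of-sync samples avoids alerting on the short window between a configuration change and the sync that normally follows it.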