Cluster configuration not synced-juniper-junos

error
high-availability
junos
juniper

Vendor: juniper

OS: junos

Description:
For devices that support full configuration synchronization, Indeni will trigger an issue if the configuration is out of sync across the cluster.

Remediation Steps:
Log into the device and synchronize the configuration across the cluster.
1. Run the “show chassis cluster information configuration-synchronization” command to review the configuration synchronization status of the chassis cluster (Junos OS Release 12.1X47-D10 or later).
2. Check the activation status and the last sync result, if these options are enabled.
3. Check the link connectivity between the cluster nodes.
4. Check the cluster configuration for synchronization.
5. Review this article on Juniper TechLibrary: Operational Commands
6. Contact Juniper Networks Technical Assistance Center (JTAC) if further assistance is required.

How does this work?
The script runs the “show chassis cluster information configuration-synchronization” command via SSH and retrieves the configuration synchronization status.
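
As a rough illustration of the parsing step, the sketch below reads the "Last sync result" field under each node heading and treats "Succeeded" and "Not needed" as healthy, which matches the values the parser further down accepts. It is a standalone Python sketch, not the Indeni script itself. The sample output is trimmed and illustrative (real devices print more fields), "Failed" is an assumed example of an unhealthy value, and the names `parse_sync_results` and `is_synced` are hypothetical.

```python
import re
from typing import Dict

# Illustrative, trimmed output; "Failed" is an assumed unhealthy value.
SAMPLE = """\
node0:
    Last sync result: Succeeded
node1:
    Last sync result: Failed
"""

# Values the monitoring script treats as "in sync".
HEALTHY_RESULTS = {"Succeeded", "Not needed"}


def parse_sync_results(output: str) -> Dict[str, str]:
    """Map each node heading (node0, node1) to its 'Last sync result' value."""
    results: Dict[str, str] = {}
    current_node = None
    for line in output.splitlines():
        heading = re.match(r"^(node\d+):", line)
        if heading:
            current_node = heading.group(1)
            continue
        # Capture the value without surrounding whitespace.
        field = re.search(r"Last sync result:\s*(.*\S)", line)
        if field and current_node is not None:
            results[current_node] = field.group(1)
    return results


def is_synced(result: str) -> bool:
    """Return True when the last sync result counts as healthy."""
    return result in HEALTHY_RESULTS


if __name__ == "__main__":
    for node, result in parse_sync_results(SAMPLE).items():
        print(node, result, is_synced(result))
```

Keying the results by node name rather than by position makes the sketch tolerant of nodes being printed in a different order, at the cost of depending on the heading format.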

Why is this important?
A configuration synchronization failure can cause the cluster to misbehave when a failover occurs. For example, an interface that should be enabled may remain disabled, or the latest configuration may not be applied to the new active node.

Without Indeni how would you find this?
An administrator could log on to the device to manually run this command to get configuration synchronization status.

junos-show-chassis-cluster-information-configuration-synchronization

#! META
name: junos-show-chassis-cluster-information-configuration-synchronization
description: Get chassis cluster configuration synchronization status
type: monitoring
monitoring_interval: 10 minute
requires:
    vendor: juniper
    os.name: junos
    product: firewall
    high-availability: true

#! COMMENTS
cluster-config-synced:
    why: |
        A configuration synchronization failure can cause the cluster to misbehave when a failover occurs. For example, an interface that should be enabled may remain disabled, or the latest configuration may not be applied to the new active node.
    how: |
        The script runs the "show chassis cluster information configuration-synchronization" command via SSH and retrieves the configuration synchronization status.
    without-indeni: |
        An administrator could log on to the device to manually run this command to get configuration synchronization status.
    can-with-snmp:
    can-with-syslog:
    vendor-provided-management: |
        The configuration synchronization status can be retrieved via the command line.

#! REMOTE::SSH
show chassis hardware node local | match node
show chassis cluster information configuration-synchronization

#! PARSER::AWK
BEGIN {
    node0 = 0
    node1 = 0
    feature_supported = 1
    node_sync_idx = 0
}

#error: syntax error, expecting <command>: informationconfiguration-synchronization
#the firmware is below 12.1X47, so the command is not available
#match both "error:" (as in the sample above) and "errors:"
/^(errors?:\s+syntax\s+error)/ {
    feature_supported = 0
}

#node0:
/^node0/ {
    node0++ 
}

#        Last sync result: Succeeded
#        Last sync result: Not needed 
/(Last sync result:)/ {
    split($0, get_status, ": ")
    if ( get_status[2] == "Succeeded" || get_status[2] == "Not needed" ){ 
        SyncStatus = 1
    } else {
        SyncStatus = 0
    }
    node_sync_status[node_sync_idx] = SyncStatus 
    node_sync_idx++
}

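END {
    if ( feature_supported == 1 ) {
        # The first command prints the local node's heading and the second prints
        # one heading per node, so "node0" is seen twice only when node0 is local.
        if ( node0 == 2) {
            node_sync_idx = 0
            cluster_node["node"] = "node0"
        } else {
            node_sync_idx = 1
            cluster_node["node"] = "node1"
        }
        # Report only the local node's sync status (1 = synced, 0 = not synced).
        writeDoubleMetric("cluster-config-synced", cluster_node, "gauge", 60, node_sync_status[node_sync_idx])
    }
}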
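
The local-node detection in the END block is easy to miss, so here is a hedged, standalone Python sketch of the same idea. It assumes the outputs of both SSH commands are concatenated into one string, that the first command prints the local node's heading, and that the second prints one heading per node. The function names `feature_supported` and `local_node` are hypothetical, and the sample strings are illustrative rather than captured from a device.

```python
def feature_supported(combined_output: str) -> bool:
    """Return False when the device rejected the command with a syntax error."""
    return not any(
        line.startswith("error: syntax error")
        for line in combined_output.splitlines()
    )


def local_node(combined_output: str) -> str:
    """Infer the local node from how often a 'node0' heading appears.

    One heading comes from the per-node sync output. A second one is present
    only when the 'node local' command also printed 'node0'.
    """
    node0_headings = sum(
        1 for line in combined_output.splitlines() if line.startswith("node0")
    )
    return "node0" if node0_headings == 2 else "node1"


# Illustrative inputs: the first line stands in for the 'node local' output.
ON_NODE0 = "node0:\nnode0:\n    Last sync result: Succeeded\nnode1:\n    Last sync result: Succeeded\n"
ON_NODE1 = "node1:\nnode0:\n    Last sync result: Succeeded\nnode1:\n    Last sync result: Succeeded\n"

if __name__ == "__main__":
    print(local_node(ON_NODE0), local_node(ON_NODE1))
```

Counting headings keeps the logic stateless, but it relies on both commands printing node headings in exactly this form.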

junos-show-chassis-cluster-status

#! META
name: junos-show-chassis-cluster-status
description: JUNOS collect clustering status
type: monitoring
monitoring_interval: 1 minute
requires:
    vendor: juniper
    os.name: junos
    product: firewall
    high-availability: true

#! COMMENTS
cluster-member-active:
    why: |
        Tracking the state of a cluster member is important. If a cluster member which used to be the active member of the cluster no longer is, it may be the result of an issue. In some cases, it is due to maintenance work (and so was anticipated), but in others it may be due to a failure in the firewall or another component in the network.
    how: |
        This script logs into the Juniper JUNOS-based device using SSH and retrieves the output of the "show chassis cluster status" command. The output includes the status of all redundancy groups across the cluster.
    without-indeni: |
        The administrator has to run the "show chassis cluster status" command on the device to find out whether the cluster member is active.
    can-with-snmp: true
    can-with-syslog: true
cluster-state:
    why: |
        Tracking the state of a cluster is important. If a cluster which used to be healthy no longer is, it may be the result of an issue. In some cases, it is due to maintenance work (and so was anticipated), but in others it may be due to a failure in the members of the cluster or another component in the network.
    how: |
        This script logs into the Juniper JUNOS-based device using SSH and retrieves the output of the "show chassis cluster status" command. The output includes the status of all redundancy groups across the cluster.
    without-indeni: |
        The administrator has to run the "show chassis cluster status" command on the device to find out whether either cluster node is in the primary state.
    can-with-snmp: true
    can-with-syslog: true
cluster-preemption-enabled:
    why: |
        Preemption is a function in clustering which sets a primary member of the cluster to always strive to be the active member. The trouble with this is that if the active member that is set with preemption on has a critical failure and reboots, the cluster will fail over to the secondary and then immediately fail over back to the primary when it completes the reboot. This can result in another crash and the process would happen again and again in a loop.
    how: |
        This script logs into the Juniper JUNOS-based device using SSH and retrieves the output of the "show chassis cluster status" command. The output includes the status of all redundancy groups across the cluster.
    without-indeni: |
        The administrator has to run the "show chassis cluster status" command on the device to find out whether preemption is enabled, and whether it is configured correctly when one node is expected to always be the primary node.
    can-with-snmp: false
    can-with-syslog: false

#! REMOTE::SSH
show chassis hardware node local | match node
show chassis cluster status

#! PARSER::AWK
BEGIN {
    RG = 0
}

#Node   Priority Status         Preempt Manual   Monitor-failures
/^Node.*Priority/ {
    getColumns(trim($0), "[ \t]+", columns)
}

#Redundancy group: 0 , Failover count: 1
/^Redundancy group/ {
    regroup = $3
    group_state[regroup] = 0
    group_preempt[regroup] = 0
    RG = 1
    node_idx = 0
    cluster_tags["name"] = "redundancy group "regroup
}

#node0  1        primary        no      no       None           
/^node.*/ {
    if (RG == 0) {
       node_local = $1
       if (node_local ~ /node0/){
           myself = 0
       } else {
           myself = 1
       }
    } else {
        node = $getColId(columns, "Node")
        if (node == "node0") {
            # assignment, not comparison
            node_idx = 0
        } else {
            node_idx = 1
        }

        statusDesc = $getColId(columns, "Status")
        monitor_failures = $getColId(columns, "Monitor-failures")

        if ( node_idx == myself ) {
            if ((statusDesc == "primary" && monitor_failures == "None") || (statusDesc == "secondary" && monitor_failures == "None")) {
                node_status[node_idx] = 1
            } else {
                node_status[node_idx] = 0
            }
            writeDoubleMetricWithLiveConfig("cluster-member-active", cluster_tags, "gauge", "60", node_status[myself], "Cluster Member Active", "state", "name")
        }
        node_idx++

        if (statusDesc == "primary" && monitor_failures == "None") {
            # either of nodes is primary, the state for this redundancy group is up
            group_state[regroup] = 1 
        }

        preempt = $getColId(columns, "Preempt")
        if (preempt == "yes") {
            group_preempt[regroup] = 1
        }
    }
}

END {
        for (regroup in group_state) {
            cluster_tags["name"] = "redundancy group "regroup
            writeDoubleMetricWithLiveConfig("cluster-state", cluster_tags, "gauge", "60", group_state[regroup], "Cluster State", "state", "name")
            writeDoubleMetricWithLiveConfig("cluster-preemption-enabled", cluster_tags, "gauge", "60", group_preempt[regroup], "Cluster Preemption Enabled", "boolean", "name")
        }
}
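
For readers who want to trace the per-group logic outside the Indeni runtime, the following standalone Python sketch mirrors it under stated assumptions. A redundancy group counts as up when any node is primary with no monitor failures, and preemption counts as enabled when any node row shows "yes". The sample output is illustrative (including the "lost" status row), and `parse_cluster_status` is a hypothetical name, not an Indeni function.

```python
from typing import Dict, List, Optional

# Illustrative output in the layout the parser expects: one header row,
# then node rows under each "Redundancy group" line.
SAMPLE = """\
Node   Priority Status         Preempt Manual   Monitor-failures

Redundancy group: 0 , Failover count: 1
node0  100      primary        no      no       None
node1  1        secondary      no      no       None

Redundancy group: 1 , Failover count: 3
node0  100      secondary      yes     no       None
node1  1        lost           yes     no       None
"""


def parse_cluster_status(output: str) -> Dict[str, Dict[str, int]]:
    """Return {group: {"state": 0 or 1, "preempt": 0 or 1}} per redundancy group."""
    columns: List[str] = []
    groups: Dict[str, Dict[str, int]] = {}
    current: Optional[str] = None
    for line in output.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "Node" and "Priority" in fields:
            columns = fields  # remember the header so rows can be read by name
        elif line.startswith("Redundancy group"):
            current = fields[2]  # "Redundancy group: 0 , ..." -> "0"
            groups[current] = {"state": 0, "preempt": 0}
        elif fields[0].startswith("node") and current is not None and columns:
            row = dict(zip(columns, fields))
            if row.get("Status") == "primary" and row.get("Monitor-failures") == "None":
                groups[current]["state"] = 1
            if row.get("Preempt") == "yes":
                groups[current]["preempt"] = 1
    return groups


if __name__ == "__main__":
    for group, metrics in parse_cluster_status(SAMPLE).items():
        print("redundancy group", group, metrics)
```

Reading columns by header name, as the AWK does with getColumns and getColId, keeps the sketch working if the column order changes, as long as the header row is present.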

cluster_config_unsynced

https://bitbucket.org/indeni/indeni-knowledge/src/master/rules/sync_core_rules/ClusterConfigNotSyncedRule.scala