Firewall cluster configuration sync problem-fortinet-FortiOS

discobot · July 24, 2019, 8:28pm

Vendor: fortinet

OS: FortiOS

Description:
A problem with the configuration sync status has been identified. The FGCP normally uses a combination of incremental and periodic synchronization to keep the configuration of all cluster units synchronized. In most cases, this means you only have to make a configuration change once for it to be synchronized to all cluster units.

Remediation Steps:
1. Log in to the Fortinet firewall via SSH and run the FortiOS command “diagnose sys ha checksum cluster”. The output lists the configuration checksums of all cluster members. If all cluster units have identical checksums, their configurations are synchronized.
2. If the checksums differ, one possible fix is to recalculate them. The recalculated checksums should match, and the out-of-sync error messages should stop appearing. Use the following command to recalculate the HA checksums: “diagnose sys ha checksum recalculate [<vdom-name> | global]”. Entering the command without options recalculates all checksums. Specify a VDOM name to recalculate only the checksums for that VDOM, or enter global to recalculate only the global checksum.
3. Verify that configuration synchronization has not been manually disabled. The relevant setting is found under “config system ha”; check the value of “set sync-config”.
4. Detailed information and steps to determine which part of the configuration is causing the problem can be found here: https://help.fortinet.com/fos50hlp/54/Content/FortiOS/fortigate-high-availability-52/HA_failoverSyncConfig.htm#HA out
5. Contact Fortinet Technical Support at https://support.fortinet.com/ for further assistance.
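
The checksum comparison in step 1 can also be done programmatically. The following is a minimal Python sketch, not part of the Indeni script. The sample output, including its layout and the serial numbers, is an assumption for illustration and may differ between FortiOS versions. The function names are hypothetical.

```python
import re
from collections import defaultdict

# Assumed sample of "diagnose sys ha checksum cluster" output.
# Serial numbers and checksums are made up and shortened for readability.
SAMPLE_OUTPUT = """\
================== FG100D0000000001 ==================

is_manage_master()=1, is_root_master()=1
debugzone
global: 7e 06 79 02
root: 43 2c ee 2c
all: 9c 7d 58 9f

checksum
global: 7e 06 79 02
root: 43 2c ee 2c
all: 9c 7d 58 9f

================== FG100D0000000002 ==================

is_manage_master()=0, is_root_master()=0
debugzone
global: 7e 06 79 02
root: aa bb cc dd
all: 11 22 33 44

checksum
global: 7e 06 79 02
root: aa bb cc dd
all: 11 22 33 44
"""

UNIT_HEADER = re.compile(r"^=+\s*(\S+)\s*=+$")
CHECKSUM_LINE = re.compile(r"^(\S+):\s*((?:[0-9a-f]{2}\s*)+)$")


def parse_checksums(output):
    """Return {serial: {scope: checksum}} from each unit's 'checksum' block."""
    result = defaultdict(dict)
    serial = None
    in_checksum_block = False
    for raw in output.splitlines():
        line = raw.strip()
        header = UNIT_HEADER.match(line)
        if header:
            serial = header.group(1)
            in_checksum_block = False
            continue
        if line in ("debugzone", "checksum"):
            in_checksum_block = line == "checksum"
            continue
        match = CHECKSUM_LINE.match(line)
        if match and serial and in_checksum_block:
            result[serial][match.group(1)] = match.group(2).strip()
    return dict(result)


def out_of_sync_scopes(checksums):
    """Return the sorted scopes (global, VDOM names, all) that differ between units."""
    scopes = set()
    for per_unit in checksums.values():
        scopes.update(per_unit)
    return sorted(
        scope for scope in scopes
        if len({per_unit.get(scope) for per_unit in checksums.values()}) > 1
    )


print(out_of_sync_scopes(parse_checksums(SAMPLE_OUTPUT)))
```

A scope that appears in the result (for example a VDOM name) is a candidate for the targeted recalculation described in step 2.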

How does this work?
The script runs the FortiOS command “get system ha status” to retrieve HA configuration sync status information.

Why is this important?
This setting shows whether the configurations of all cluster units are synchronized. A configuration that is not synchronized can cause a service outage during a switchover and should be investigated. More details can be found here: https://help.fortinet.com/fos50hlp/54/Content/FortiOS/fortigate-high-availability-52/HA_operating.htm

Without Indeni how would you find this?
An administrator can run the FortiOS command “get system ha status” over an SSH connection to retrieve the same information.
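
As a rough illustration of the manual check, the Python sketch below pulls the per-unit sync state out of captured “get system ha status” output. The sample excerpt, its “Configuration Status:” layout, and the serial numbers are assumptions, and the function names are hypothetical. The actual Indeni parser is an AWK script that is not reproduced here.

```python
import re

# Assumed excerpt of "get system ha status" output; serial numbers are made up.
SAMPLE_OUTPUT = """\
HA Health Status: OK
Model: FortiGate-VM64
Mode: HA A-P
Group: 0
Debug: 0
Cluster Uptime: 0 days 00:34:15
ses_pickup: disable
override: disable
Configuration Status:
    FGVM010000000001(updated 4 seconds ago): in-sync
    FGVM010000000002(updated 3 seconds ago): out-of-sync
System Usage stats:
    FGVM010000000001(updated 4 seconds ago):
        sessions=12, average-cpu-user/nice/system/idle=0%/0%/0%/100%, memory=25%
"""

STATUS_LINE = re.compile(r"^(\S+?)\(updated .*?\):\s*(\S+)$")


def config_sync_status(output):
    """Return {serial: status} from the 'Configuration Status:' section."""
    statuses = {}
    in_section = False
    for raw in output.splitlines():
        if not raw.startswith((" ", "\t")):
            # An unindented line starts a new top-level section.
            in_section = raw.strip() == "Configuration Status:"
            continue
        if in_section:
            match = STATUS_LINE.match(raw.strip())
            if match:
                statuses[match.group(1)] = match.group(2)
    return statuses


def unsynchronized_units(output):
    """Return the sorted serial numbers whose status is anything other than in-sync."""
    return sorted(
        serial for serial, status in config_sync_status(output).items()
        if status != "in-sync"
    )


print(unsynchronized_units(SAMPLE_OUTPUT))
```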

fortios-get-system-ha-status

name: fortios-get-system-ha-status
description: FortiGate Cluster High Availability
type: monitoring
monitoring_interval: 5 minutes
requires:
    vendor: fortinet
    os.name: FortiOS
    product: firewall
    high-availability: true
comments:
    ha-health-status:
        why: |
            Indicates whether all cluster units are operating normally (OK) or whether a problem was detected with the cluster. For example, a message similar to ERROR <serial-number> is lost @ <date> <time> appears if one of the subordinate units leaves the cluster. More details can be found here: https://help.fortinet.com/fos50hlp/54/Content/FortiOS/fortigate-high-availability-52/HA_operating.htm
        how: |
            The script runs the FortiOS command "get system ha status" to retrieve High Availability status information.
        can-with-snmp: true
        can-with-syslog: true
    ha-health-mode:
        why: |
            This metric collects and displays the HA mode of the cluster, for example, HA A-P or HA A-A. This metric should be the same for all the members of the cluster. More details can be found here: https://help.fortinet.com/fos50hlp/54/Content/FortiOS/fortigate-high-availability-52/HA_operating.htm
        how: |
            The script runs the FortiOS command "get system ha status" to retrieve HA mode information.
        can-with-snmp: true
        can-with-syslog: false
    ha-group-id:
        why: |
            This metric captures the configured group ID of the cluster, which should be the same for all members of the cluster. A misconfigured group ID can cause HA problems or dropped traffic. More details can be found here: https://help.fortinet.com/fos50hlp/54/Content/FortiOS/fortigate-high-availability-52/HA_operating.htm
        how: |
            The script runs the FortiOS command "get system ha status" to retrieve group ID information.
        can-with-snmp: true
        can-with-syslog: false
    ha-debug-status:
        why: |
            This metric captures the HA debug status of the cluster. It is recommended to keep HA debugging disabled on all members of the cluster to avoid additional hardware resource consumption. More details can be found here: https://help.fortinet.com/fos50hlp/54/Content/FortiOS/fortigate-high-availability-52/HA_operating.htm
        how: |
            The script runs the FortiOS command "get system ha status" to retrieve debug status information.
        can-with-snmp: false
        can-with-syslog: false
    ha-cluster-uptime:
        why: |
            This metric captures the number of days, hours, minutes, and seconds that the cluster has been operating. Any unexpectedly low uptime should be investigated. More details can be found here: https://help.fortinet.com/fos50hlp/54/Content/FortiOS/fortigate-high-availability-52/HA_operating.htm
        how: |
            The script runs the FortiOS command "get system ha status" to retrieve uptime information.
        can-with-snmp: true
        can-with-syslog: false
    ha-session-pickup:
        why: |
            This metric captures the status of session pickup (enabled or disabled). When session-pickup is enabled, the FGCP synchronizes the primary unit's TCP session table to all cluster units. As soon as a new TCP session is added to the primary unit session table, that session is synchronized to all cluster units. This synchronization happens as quickly as possible to keep the session tables synchronized. If the primary unit fails, the new primary unit uses its synchronized session table to resume all TCP sessions that were being processed by the former primary unit with only minimal interruption. Under ideal conditions all TCP sessions should be resumed. This is not guaranteed though and under less than ideal conditions some TCP sessions may need to be restarted. More details can be found here: https://help.fortinet.com/fos50hlp/54/Content/FortiOS/fortigate-high-availability-52/HA_operating.htm
        how: |
            The script runs the FortiOS command "get system ha status" to retrieve HA session pickup status information.
        can-with-snmp: false
        can-with-syslog: false
    ha-session-override:
        why: |
            This metric captures the status of the override option for the current cluster unit (enable or disable). When override is disabled a cluster may not always renegotiate when an event occurs that affects primary unit selection. For example, when override is disabled a cluster will not renegotiate when you change a cluster unit device priority or when you add a new cluster unit to a cluster. This is true even if the unit added to the cluster has a higher device priority than any other unit in the cluster. Also, when override is disabled a cluster does not negotiate if the new unit added to the cluster has a failed or disconnected monitored interface. More details can be found here: https://help.fortinet.com/fos50hlp/54/Content/FortiOS/fortigate-high-availability-52/HA_operating.htm
        how: |
            The script runs the FortiOS command "get system ha status" to retrieve HA override status information.
        can-with-snmp: false
        can-with-syslog: false
    ha-config-sync-status:
        why: |
            This setting shows whether the configurations of all cluster units are synchronized. A configuration that is not synchronized can cause a service outage during a switchover and should be investigated. More details can be found here: https://help.fortinet.com/fos50hlp/54/Content/FortiOS/fortigate-high-availability-52/HA_operating.htm
        how: |
            The script runs the FortiOS command "get system ha status" to retrieve HA configuration sync status information.
        can-with-snmp: false
        can-with-syslog: false
    ha-heartbeat-link-status:
        why: |
            This setting shows the status of each cluster unit's heartbeat interfaces. If all of the heartbeat interfaces are down, the network will be severely impacted. More details can be found here: https://help.fortinet.com/fos50hlp/54/Content/FortiOS/fortigate-high-availability-52/HA_operating.htm
        how: |
            The script runs the FortiOS command "get system ha status" to retrieve HA heartbeat link status information.
        can-with-snmp: true
        can-with-syslog: true
    ha-heartbeat-total-bytes:
        why: |
            This metric captures how much data the heartbeat interfaces have processed. If none of the heartbeat interfaces are sending or receiving packets, this should be investigated, since it may have a severe network impact. More details can be found here: https://help.fortinet.com/fos50hlp/54/Content/FortiOS/fortigate-high-availability-52/HA_operating.htm
        how: |
            The script runs the FortiOS command "get system ha status" to retrieve the total number of bytes received via the heartbeat interfaces.
        can-with-snmp: false
        can-with-syslog: false
    mon-interface-link-status:
        why: |
            This setting shows the status of each of the monitored interfaces of the cluster. If a monitored interface on the primary unit fails, the cluster renegotiates to select a new primary unit using the process for Primary unit selection. Because the cluster unit with the failed monitored interface has the lowest monitor priority, a different cluster unit becomes the primary unit. The new primary unit should have fewer link failures. More details can be found here: https://help.fortinet.com/fos50hlp/54/Content/FortiOS/fortigate-high-availability-52/HA_operating.htm
        how: |
            The script runs the FortiOS command "get system ha status" to retrieve HA monitored interface status information.
        can-with-snmp: true
        can-with-syslog: true
    mon-interface-total-bytes:
        why: |
            This setting shows how much data the interfaces monitored by the cluster have processed. If none of the monitored interfaces are sending or receiving packets, this should be investigated, since it may have a severe network impact. More details can be found here: https://help.fortinet.com/fos50hlp/54/Content/FortiOS/fortigate-high-availability-52/HA_operating.htm
        how: |
            The script runs the FortiOS command "get system ha status" to retrieve the total number of bytes received via the monitored interfaces.
        can-with-snmp: false
        can-with-syslog: false
    ha-heartbeat-int-oper-status-over1:
        why: |
            This metric captures whether the HA heartbeat is keeping cluster units in communication with each other. The vendor highly recommends configuring at least two heartbeat interfaces (two links) per Fortinet firewall. If all the heartbeat interfaces are down, a severe network outage will result. This metric checks whether there are at least two operational HA heartbeat interfaces per firewall. More details can be found here: https://help.fortinet.com/fos50hlp/54/Content/FortiOS/fortigate-high-availability-52/HA_failoverHeartbeat.htm
        how: |
            The script runs the FortiOS command "get system ha status" to retrieve HA heartbeat interfaces information.
        can-with-snmp: false
        can-with-syslog: false
    model:
        why: |
            Two or more devices which operate as part of a single cluster must be running on the same hardware.
        how: |
            This script logs in to each device to retrieve its hardware model. Indeni then compares the result with the output of the same script run on the other members of the same cluster.
        can-with-snmp: false
        can-with-syslog: false
steps:
-   run:
        type: SSH
        command: get system ha status
    parse:
        type: AWK
        file: get_system_ha_status.parser.1.awk
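
The referenced AWK parser is not shown in this post. As a rough idea of what a check like ha-heartbeat-int-oper-status-over1 involves, here is a minimal Python sketch that counts operational heartbeat links per unit. The “HBDEV stats:” sample layout, the serial numbers, and the function names are assumptions for illustration, not the actual Indeni implementation.

```python
import re

# Assumed excerpt of "get system ha status" output; values are made up.
SAMPLE_OUTPUT = """\
HA Health Status: OK
Mode: HA A-P
HBDEV stats:
    FGVM010000000001(updated 4 seconds ago):
        port7: physical/1000auto, up, rx-bytes/packets/dropped/errors=2233550/5187/0/0, tx=2554321/5228/0/0
        port8: physical/1000auto, up, rx-bytes/packets/dropped/errors=1200/10/0/0, tx=1300/11/0/0
    FGVM010000000002(updated 3 seconds ago):
        port7: physical/1000auto, up, rx-bytes/packets/dropped/errors=2554321/5228/0/0, tx=2233550/5187/0/0
        port8: physical/1000auto, down, rx-bytes/packets/dropped/errors=0/0/0/0, tx=0/0/0/0
MONDEV stats:
    FGVM010000000001(updated 4 seconds ago):
        port1: physical/1000auto, up, rx-bytes/packets/dropped/errors=500/5/0/0, tx=600/6/0/0
"""

UNIT_LINE = re.compile(r"^(\S+?)\(updated .*?\):$")
LINK_LINE = re.compile(r"^(\S+):\s*[^,]+,\s*(up|down)\b")


def heartbeat_links_up(output):
    """Return {serial: number of heartbeat interfaces reported as up}."""
    counts = {}
    in_section = False
    unit = None
    for raw in output.splitlines():
        line = raw.strip()
        if not raw.startswith((" ", "\t")):
            # An unindented line starts a new top-level section.
            in_section = line == "HBDEV stats:"
            unit = None
            continue
        if not in_section:
            continue
        unit_match = UNIT_LINE.match(line)
        if unit_match:
            unit = unit_match.group(1)
            counts[unit] = 0
            continue
        link_match = LINK_LINE.match(line)
        if link_match and unit and link_match.group(2) == "up":
            counts[unit] += 1
    return counts


def units_lacking_redundancy(output, minimum=2):
    """Return the sorted serial numbers with fewer than `minimum` heartbeat links up."""
    return sorted(
        serial for serial, up in heartbeat_links_up(output).items() if up < minimum
    )


print(units_lacking_redundancy(SAMPLE_OUTPUT))
```

The threshold of two links mirrors the vendor recommendation quoted in the metric description. Interfaces under “MONDEV stats:” are deliberately ignored, since they are monitored interfaces rather than heartbeat links.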

FortinetHaConfigSyncStatusRule

Failed to fetch the data: https://bitbucket.org/indeni/indeni-knowledge/src/master/rules/templatebased/fortinet/FortinetHaConfigSyncStatusRule.scala