Firewall cluster problem-fortinet-FortiOS
Vendor: fortinet
OS: FortiOS
Description:
Indeni will alert if one or more firewalls in this cluster are having problems.
Remediation Steps:
|1. Log in via HTTPS to the Fortinet firewall and review the System Information widget and the cluster units on the system dashboard. Check the Unit Operation widget graphic to verify that the correct cluster unit interfaces are connected. Go to System > HA, or from the System Information dashboard widget select HA Status > Configure, and verify that all of the cluster units are displayed in the HA Cluster list.
|2. For a new cluster setup, log in via HTTPS to each Fortinet firewall and check that the HA configurations are the same. Try re-entering the HA password on each cluster unit in case it was mistyped. Validate that all Fortinet firewalls in the cluster have the same level of licensing for FortiGuard, FortiCloud, FortiClient, and VDOMs, which is required to form the cluster.
|3. Check that the correct interfaces of each cluster unit are connected. Check the cables and interface LEDs. Use the Unit Operation dashboard widget, the system network interface list, or the cluster members list to verify that each interface that should be connected actually is connected. If a link is down, re-verify the physical connection. If you are using port monitoring, check the status of the monitored interfaces. Replace network cables or switches as required.
|4. Log in via SSH to the Fortinet firewall and run the FortiOS command “get system ha status”. This command reports the cluster health and operation status, some information about the cluster configuration, and how long the cluster has been operating. It also shows how the primary unit was selected, the configuration synchronization status, usage statistics for each cluster unit, the heartbeat status, and the relative priorities of the cluster units. The output indicates whether all cluster units are operating normally (OK) or whether a problem was detected with the cluster. For example, a message similar to ERROR <serial-number> is lost @ <date> <time> appears if one of the subordinate units leaves the cluster, and a message like WARNING: <serial-number> has mondev down appears if the state of a monitored interface link changed to down. More details can be found here: https://help.fortinet.com/fos50hlp/54/Content/FortiOS/fortigate-high-availability-52/HA_operatingStatusCLI.htm
|5. More information can be found in the “FortiOS™ Handbook - High Availability” at the following link: https://docs.fortinet.com/uploaded/files/3997/fortigate-ha-56.pdf
|6. Contact Fortinet Technical support at https://support.fortinet.com/ for further assistance.
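The health check described in step 4 can be sketched in Python as follows. This is only an illustration, not the Indeni implementation: it assumes the output contains a line starting with "HA Health Status:" and that problem messages begin with ERROR or WARNING, either on that line or on indented lines directly below it. The function name and the sample text are hypothetical, and real output may differ between FortiOS versions.

```python
from typing import List, Tuple


def classify_ha_health(output: str) -> Tuple[str, str]:
    """Return (state, detail); state is OK, WARNING, ERROR or UNKNOWN.

    Assumption: the status follows an "HA Health Status:" label, either on
    the same line or on the indented lines that come right after it.
    """
    lines = output.splitlines()
    for index, line in enumerate(lines):
        if not line.startswith("HA Health Status:"):
            continue
        messages: List[str] = [line.split(":", 1)[1].strip()]
        for follow in lines[index + 1:]:
            if not follow.startswith((" ", "\t")):
                break  # next top-level field, status block is over
            messages.append(follow.strip())
        messages = [m for m in messages if m]
        detail = " ".join(messages)
        if any(m.startswith("ERROR") for m in messages):
            return "ERROR", detail
        if any(m.startswith("WARNING") for m in messages):
            return "WARNING", detail
        if "OK" in messages:
            return "OK", detail
        return "UNKNOWN", detail
    return "UNKNOWN", ""


# Hypothetical sample modeled on the message forms quoted in step 4.
sample = "HA Health Status: WARNING: FGT0000000000001 has mondev down\nMode: HA A-P\n"
state, detail = classify_ha_health(sample)
print(state, "-", detail)
```

Reporting ERROR ahead of WARNING is a design choice for this sketch, so that a lost cluster member is not hidden behind a less severe message.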
How does this work?
The script runs the FortiOS command “get system ha status” to retrieve High Availability status information.
Why is this important?
This indicates whether all cluster units are operating normally (OK) or whether a problem was detected with the cluster. For example, a message similar to ERROR <serial-number> is lost @ <date> <time> appears if one of the subordinate units leaves the cluster. More details can be found here: https://help.fortinet.com/fos50hlp/54/Content/FortiOS/fortigate-high-availability-52/HA_operating.htm
Without Indeni how would you find this?
An administrator can run the FortiOS command “get system ha status” over an SSH connection to retrieve the same information.
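As a rough sketch of that manual check, the command can also be run non-interactively from a workstation. The example below assumes a local OpenSSH client, key-based authentication, and an account named "admin"; the helper names, the account name, and the address are hypothetical placeholders.

```python
import subprocess
from typing import List


def build_ha_status_command(host: str, user: str = "admin") -> List[str]:
    """Build the argument list for a non-interactive SSH call."""
    return ["ssh", f"{user}@{host}", "get system ha status"]


def fetch_ha_status(host: str, user: str = "admin", timeout: int = 30) -> str:
    """Run the command over SSH and return its standard output.

    Raises subprocess.CalledProcessError if ssh exits with a non-zero code.
    """
    result = subprocess.run(
        build_ha_status_command(host, user),
        capture_output=True,
        text=True,
        timeout=timeout,
        check=True,
    )
    return result.stdout


# 192.0.2.10 is a documentation address; replace it with a real firewall.
print(" ".join(build_ha_status_command("192.0.2.10")))
```

Passing the command as an argument list avoids shell quoting problems on the workstation side.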
fortios-get-system-ha-status
name: fortios-get-system-ha-status
description: FortiGate Cluster High Availability
type: monitoring
monitoring_interval: 5 minutes
requires:
vendor: fortinet
os.name: FortiOS
product: firewall
high-availability: true
comments:
ha-health-status:
why: |
Indicates whether all cluster units are operating normally (OK) or whether a problem was detected with the cluster. For example, a message similar to ERROR <serial-number> is lost @ <date> <time> appears if one of the subordinate units leaves the cluster. More details can be found here: https://help.fortinet.com/fos50hlp/54/Content/FortiOS/fortigate-high-availability-52/HA_operating.htm
how: |
The script runs the FortiOS command "get system ha status" to retrieve High Availability status information.
can-with-snmp: true
can-with-syslog: true
ha-health-mode:
why: |
This metric collects and displays the HA mode of the cluster, for example, HA A-P or HA A-A. This metric should be the same for all the members of the cluster. More details can be found here: https://help.fortinet.com/fos50hlp/54/Content/FortiOS/fortigate-high-availability-52/HA_operating.htm
how: |
The script runs the FortiOS command "get system ha status" to retrieve HA mode information.
can-with-snmp: true
can-with-syslog: false
ha-group-id:
why: |
This metric captures the configured group ID of the cluster, which should be the same for all the members of the cluster. A misconfigured group ID may cause HA problems or dropped traffic. More details can be found here: https://help.fortinet.com/fos50hlp/54/Content/FortiOS/fortigate-high-availability-52/HA_operating.htm
how: |
The script runs the FortiOS command "get system ha status" to retrieve group ID information.
can-with-snmp: true
can-with-syslog: false
ha-debug-status:
why: |
This metric captures the HA debug status of the cluster. It is recommended to keep HA debug disabled on the cluster members to avoid additional hardware resource consumption. More details can be found here: https://help.fortinet.com/fos50hlp/54/Content/FortiOS/fortigate-high-availability-52/HA_operating.htm
how: |
The script runs the FortiOS command "get system ha status" to retrieve debug status information.
can-with-snmp: false
can-with-syslog: false
ha-cluster-uptime:
why: |
This metric captures the number of days, hours, minutes, and seconds that the cluster has been operating. An unexpectedly low uptime should be investigated. More details can be found here: https://help.fortinet.com/fos50hlp/54/Content/FortiOS/fortigate-high-availability-52/HA_operating.htm
how: |
The script runs the FortiOS command "get system ha status" to retrieve uptime information.
can-with-snmp: true
can-with-syslog: false
ha-session-pickup:
why: |
This metric captures the status of session pickup (enabled or disabled). When session-pickup is enabled, the FGCP synchronizes the primary unit's TCP session table to all cluster units. As soon as a new TCP session is added to the primary unit session table, that session is synchronized to all cluster units. This synchronization happens as quickly as possible to keep the session tables synchronized. If the primary unit fails, the new primary unit uses its synchronized session table to resume all TCP sessions that were being processed by the former primary unit with only minimal interruption. Under ideal conditions all TCP sessions should be resumed. This is not guaranteed though and under less than ideal conditions some TCP sessions may need to be restarted. More details can be found here: https://help.fortinet.com/fos50hlp/54/Content/FortiOS/fortigate-high-availability-52/HA_operating.htm
how: |
The script runs the FortiOS command "get system ha status" to retrieve HA session pickup status information.
can-with-snmp: false
can-with-syslog: false
ha-session-override:
why: |
This metric captures the status of the override option for the current cluster unit (enable or disable). When override is disabled a cluster may not always renegotiate when an event occurs that affects primary unit selection. For example, when override is disabled a cluster will not renegotiate when you change a cluster unit device priority or when you add a new cluster unit to a cluster. This is true even if the unit added to the cluster has a higher device priority than any other unit in the cluster. Also, when override is disabled a cluster does not negotiate if the new unit added to the cluster has a failed or disconnected monitored interface. More details can be found here: https://help.fortinet.com/fos50hlp/54/Content/FortiOS/fortigate-high-availability-52/HA_operating.htm
how: |
The script runs the FortiOS command "get system ha status" to retrieve HA override status information.
can-with-snmp: false
can-with-syslog: false
ha-config-sync-status:
why: |
This setting shows whether or not the configurations of each of the cluster units are synchronized. A configuration that is not synchronized can cause service outage in case of a switchover and should be investigated. More details can be found here: https://help.fortinet.com/fos50hlp/54/Content/FortiOS/fortigate-high-availability-52/HA_operating.htm
how: |
The script runs the FortiOS command "get system ha status" to retrieve HA configuration sync status information.
can-with-snmp: false
can-with-syslog: false
ha-heartbeat-link-status:
why: |
This setting shows the status of each cluster unit's heartbeat interfaces. All of the heartbeat interfaces being down will cause a severe network impact. More details can be found here: https://help.fortinet.com/fos50hlp/54/Content/FortiOS/fortigate-high-availability-52/HA_operating.htm
how: |
The script runs the FortiOS command "get system ha status" to retrieve HA heartbeat link status information.
can-with-snmp: true
can-with-syslog: true
ha-heartbeat-total-bytes:
why: |
This metric captures how much data the heartbeat interfaces have processed. If none of the heartbeat interfaces are receiving or sending packets, this needs to be investigated, since it may have a severe network impact. More details can be found here: https://help.fortinet.com/fos50hlp/54/Content/FortiOS/fortigate-high-availability-52/HA_operating.htm
how: |
The script runs the FortiOS command "get system ha status" to retrieve the total number of bytes received via the heartbeat interfaces.
can-with-snmp: false
can-with-syslog: false
mon-interface-link-status:
why: |
This setting shows the status of each of the monitored interfaces of the cluster. If a monitored interface on the primary unit fails, the cluster renegotiates to select a new primary unit using the process for Primary unit selection. Because the cluster unit with the failed monitored interface has the lowest monitor priority, a different cluster unit becomes the primary unit. The new primary unit should have fewer link failures. More details can be found here: https://help.fortinet.com/fos50hlp/54/Content/FortiOS/fortigate-high-availability-52/HA_operating.htm
how: |
The script runs the FortiOS command "get system ha status" to retrieve HA monitored interface status information.
can-with-snmp: true
can-with-syslog: true
mon-interface-total-bytes:
why: |
This setting shows how much data the interfaces monitored by the cluster have processed. If none of the monitored interfaces are receiving or sending packets, this needs to be investigated, since it may have a severe network impact. More details can be found here: https://help.fortinet.com/fos50hlp/54/Content/FortiOS/fortigate-high-availability-52/HA_operating.htm
how: |
The script runs the FortiOS command "get system ha status" to retrieve the total number of bytes received via the monitored interfaces.
can-with-snmp: false
can-with-syslog: false
ha-heartbeat-int-oper-status-over1:
why: |
This metric captures whether the HA heartbeat keeps the cluster units communicating with each other. The vendor highly recommends configuring at least two heartbeat interfaces (two links) per Fortinet firewall. If all the heartbeat interfaces are down, a severe network outage will result. This metric checks whether there are at least two operational HA heartbeat interfaces per firewall. More details can be found here: https://help.fortinet.com/fos50hlp/54/Content/FortiOS/fortigate-high-availability-52/HA_failoverHeartbeat.htm
how: |
The script runs the FortiOS command "get system ha status" to retrieve HA heartbeat interfaces information.
can-with-snmp: false
can-with-syslog: false
model:
why: |
Two or more devices that operate as part of a single cluster must be running on the same hardware model.
how: |
This script logs into the devices to retrieve the hardware model of the device. Indeni then compares the result to the same script run on other members of the same cluster.
can-with-snmp: false
can-with-syslog: false
steps:
- run:
type: SSH
command: get system ha status
parse:
type: AWK
file: get_system_ha_status.parser.1.awk
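The parse step above relies on an AWK file that is not shown here. As a rough illustration of the kind of extraction the metric comments describe, the following Python sketch reads a few of those values from the command output. It is not the actual parser: the sample text, its field labels and layout, and all function names are assumptions made for illustration, and real "get system ha status" output may differ between FortiOS versions.

```python
import re
from typing import Dict, Optional

# Illustrative sample only: labels and layout are assumptions modeled on the
# metrics described above, not output captured from a real device.
SAMPLE = """\
HA Health Status: OK
Model: FortiGate-VM64
Mode: HA A-P
Group: 0
Debug: 0
Cluster Uptime: 2 days 03:04:05
ses_pickup: disable
override: disable
Configuration Status:
    FGVM000000000001(updated 4 seconds ago): in-sync
    FGVM000000000002(updated 3 seconds ago): out-of-sync
HBDEV stats:
    FGVM000000000001(updated 1 seconds ago):
        port1: physical/1000full, up, rx-bytes/packets/dropped/errors=100/10/0/0, tx=100/10/0/0
        port2: physical/1000full, down, rx-bytes/packets/dropped/errors=0/0/0/0, tx=0/0/0/0
"""

UPTIME_RE = re.compile(r"(\d+)\s+days?\s+(\d+):(\d+):(\d+)")


def parse_header(output: str) -> Dict[str, str]:
    """Collect top-level "key: value" lines (mode, group, debug, ...)."""
    fields: Dict[str, str] = {}
    for line in output.splitlines():
        if line[:1].isspace() or ":" not in line:
            continue  # skip indented detail lines and lines without a label
        key, value = line.split(":", 1)
        if value.strip():  # section titles such as "HBDEV stats:" have no value
            fields[key.strip()] = value.strip()
    return fields


def uptime_seconds(text: str) -> int:
    """Convert "<d> days <h>:<m>:<s>" into a number of seconds."""
    match = UPTIME_RE.search(text)
    if match is None:
        raise ValueError(f"unrecognized uptime: {text!r}")
    days, hours, minutes, seconds = (int(g) for g in match.groups())
    return ((days * 24 + hours) * 60 + minutes) * 60 + seconds


def heartbeat_links_up(output: str) -> Dict[str, int]:
    """Count heartbeat interfaces reported as up, per cluster unit."""
    counts: Dict[str, int] = {}
    in_section = False
    unit: Optional[str] = None
    for line in output.splitlines():
        if not line.strip():
            continue
        if not line[:1].isspace():  # a new top-level section starts here
            in_section = line.startswith("HBDEV stats:")
            unit = None
            continue
        if not in_section:
            continue
        text = line.strip()
        if text.endswith("):"):  # e.g. "SERIAL(updated 1 seconds ago):"
            unit = text.split("(", 1)[0]
            counts[unit] = 0
        elif unit is not None and ", up," in text:
            counts[unit] += 1
    return counts


fields = parse_header(SAMPLE)
print(fields["Mode"], fields["Group"], uptime_seconds(fields["Cluster Uptime"]))
# Units with fewer than two operational heartbeat links (see
# ha-heartbeat-int-oper-status-over1 above) would be flagged here.
print({unit: n >= 2 for unit, n in heartbeat_links_up(SAMPLE).items()})
```

Comparing the parsed mode, group, and model values between cluster members, as several of the comments recommend, would be a straightforward extension of `parse_header`.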
FortinetHaHealthStatusRule
https://bitbucket.org/indeni/indeni-knowledge/src/master/rules/templatebased/fortinet/FortinetHaHealthStatusRule.scala