High memory usage-fortinet-FortiOS


Vendor: fortinet

OS: FortiOS

Description:
Indeni will alert if the memory utilization of a device is above a high threshold. If the device has multiple memory elements, each will be inspected separately and alerted on individually.

Remediation Steps:
Determine the cause for the high memory usage of the listed elements.

  1. Log in via HTTPS to the Fortinet firewall and go to System > Dashboard > Status. Review the current memory utilization graph in the System Resources widget.
  2. Log in via SSH to the Fortinet firewall and run the FortiOS command "diagnose hardware sysinfo memory", which provides information about current memory usage.
  3. Check whether the unit is dealing with a high traffic volume or with connection pool limits.
  4. Check whether the Fortinet firewall is in "conserve mode" by running the FortiOS command "diagnose hardware sysinfo conserve". For more information, review the following Fortinet guides:
     - http://kb.fortinet.com/kb/viewContent.do?externalId=FD33103
     - http://kb.fortinet.com/kb/viewContent.do?externalId=11076
  5. If the problem persists, contact Fortinet Technical Support at https://support.fortinet.com/ for further assistance.

How does this work?
Indeni uses the built-in Fortinet “get system performance status” command to retrieve the device memory utilization.
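
The parsing step can be illustrated with a short sketch. This is not the Indeni parser itself (that is the AWK script further down this page), only a Python illustration of the same idea. The two sample lines are the ones quoted in the parser comments; the function names are hypothetical, and the regular expressions assume the output always follows those two layouts.

```python
import re

# Sample lines taken from the comments in the parser script (FortiOS 5.4 and 5.6).
V54_LINE = "Memory states: 66% used"
V56_LINE = "Memory: 1019996k total, 354312k used (34%), 665684k free (66%), 1616k buffers"


def parse_memory_usage(line):
    """Return memory usage as a percentage, or None if this is not a memory line."""
    # 5.4 layout: "Memory states: 66% used"
    m = re.match(r"^Memory states:\s+(\d+(?:\.\d+)?)%\s+used", line)
    if m:
        return float(m.group(1))
    # 5.6 layout: "... 354312k used (34%), ..."
    m = re.match(r"^Memory:.*?\bused\s+\((\d+(?:\.\d+)?)%\)", line)
    if m:
        return float(m.group(1))
    return None


def parse_memory_kbytes(line):
    """Return total/used/free kilobytes from a 5.6-style line, or None."""
    m = re.match(r"^Memory:\s+(\d+)k total,\s+(\d+)k used\b.*?(\d+)k free", line)
    if not m:
        return None
    total, used, free = (int(g) for g in m.groups())
    return {"total": total, "used": used, "free": free}
```

Matching the digits with a regular expression, rather than slicing a fixed number of characters, keeps single-digit and three-digit percentages intact.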

Why is this important?
If the firewall memory becomes fully utilized, performance may be impacted and traffic may be dropped, and in extreme cases the firewall could crash. It is critical to monitor the memory usage and handle the issue prior to resource exhaustion.

Without Indeni how would you find this?
An administrator could log in and manually run the command via the CLI, check the system resources widget via the GUI, enable SNMP, configure a syslog server to receive a log message containing the utilization every 5 minutes, or use Fortinet FortiAnalyzer.

fortios-get-system-performance-status

#! META
name: fortios-get-system-performance-status
description: Performance metrics based on "get system performance status" command on Fortinet firewall
type: monitoring
monitoring_interval: 1 minute
includes_resource_data: true
requires:
    vendor: "fortinet"
    os.name: "FortiOS"
    product: "firewall"

#! COMMENTS
memory-usage:
    why: |
        If the firewall memory becomes fully utilized, performance may be impacted and traffic may be dropped, and in extreme cases the firewall could crash. It is critical to monitor the memory usage and handle the issue prior to resource exhaustion.
    how: |
        Indeni uses the built-in Fortinet "get system performance status" command to retrieve the device memory utilization.
    without-indeni: |
        An administrator could log in and manually run the command via the CLI, check the system resources widget via the GUI, enable SNMP, configure a syslog server to receive a log message containing the utilization every 5 minutes, or use Fortinet FortiAnalyzer.
    can-with-snmp: true
    can-with-syslog: true

cpu-usage:
    why: |
        If the firewall CPU becomes fully utilized, performance may be impacted and traffic may be dropped, and in extreme cases the firewall could crash. It is critical to monitor the CPU usage and handle the issue prior to resource exhaustion.
    how: |
        Indeni uses the built-in Fortinet "get system performance status" command to retrieve the device CPU utilization.
    without-indeni: |
        An administrator could log in and manually run the command via the CLI, check the system resources widget via the GUI, enable SNMP, configure a syslog server to receive a log message containing the utilization every 5 minutes, or use Fortinet FortiAnalyzer.
    can-with-snmp: true
    can-with-syslog: true

uptime-milliseconds:
    why: |
        Capture the uptime of the device. If the uptime is lower than the previous sample, the device must have reloaded.
    how: |
        Indeni uses the built-in Fortinet "get system performance status" command to retrieve the current device up-time.
    without-indeni: |
        An administrator could log in and manually run the command via the CLI, check the system resources widget via the GUI, enable SNMP, or use Fortinet FortiAnalyzer.
    can-with-snmp: true
    can-with-syslog: false

memory-free-kbytes:
    skip-documentation: true
memory-total-kbytes:
    skip-documentation: true
memory-used-kbytes:
    skip-documentation: true

#! REMOTE::SSH
get system performance status

#! PARSER::AWK

function writeCpuUsageMetric(id, cpuIdleAmount, cpuIsAverage) {
    sub(/%/, "", cpuIdleAmount)

    tags_cpu["cpu-id"] = id
    tags_cpu["cpu-is-avg"] = cpuIsAverage
    tags_cpu["resource-metric"] = "true"
    writeDoubleMetricWithLiveConfig("cpu-usage", tags_cpu, "gauge", 0, 100 - cpuIdleAmount, "CPU Usage", "percentage", "cpu-id")
}

# v5.4
#Memory states: 66% used
/^Memory states:/ {
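    # Strip the trailing "%" instead of taking a fixed-width substring, so that
    # single-digit and three-digit values (e.g. "5%" or "100%") are parsed correctly.
    memory_usage = $3
    sub(/%/, "", memory_usage)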

    # the following "RAM" tag value does NOT surface in the UI. It's here just to satisfy the
    # requirements of the rule -- for some reason, we need to have this tag _with_ a value for things
    # to function properly.

    tags_memory["name"] = "RAM"
    tags_memory["resource-metric"] = "true"
    writeDoubleMetricWithLiveConfig("memory-usage", tags_memory, "gauge", 0, memory_usage, "Memory Usage", "percentage", "")
}

# v5.6
#Memory: 1019996k total, 354312k used (34%), 665684k free (66%), 1616k buffers
/^Memory:/ {
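    # $6 looks like "(34%),". Keep only the digits so that values such as
    # "(5%)," or "(100%)," are parsed correctly.
    percent_memory_usage = $6
    gsub(/[^0-9]/, "", percent_memory_usage)

    # The size fields end with a "k" suffix (e.g. "665684k"); drop the last character.
    free = substr($7, 1, length($7) - 1)
    total = substr($2, 1, length($2) - 1)
    used = substr($4, 1, length($4) - 1)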

    tags_memory["name"] = "Memory: Free"
    writeDoubleMetricWithLiveConfig("memory-free-kbytes", tags_memory, "gauge", "60", free, "Memory Usage", "kilobytes", "name")

    tags_memory["name"] = "Memory: Total"
    writeDoubleMetricWithLiveConfig("memory-total-kbytes", tags_memory, "gauge", "60", total, "Memory Usage", "kilobytes", "name")

    tags_memory["name"] = "Memory: Used"
    writeDoubleMetricWithLiveConfig("memory-used-kbytes", tags_memory, "gauge", "60", used, "Memory Usage", "kilobytes", "name")

    tags_memory["name"] = "Memory Usage"
    tags_memory["resource-metric"] = "true"
    writeDoubleMetricWithLiveConfig("memory-usage", tags_memory, "gauge", 0, percent_memory_usage, "Memory Usage", "percentage", "name")
}

# This section handles the "per core" metrics for CPU usage
# v5.4
#CPU0 states: 2% user 4% system 0% nice 94% idle
# v5.6
#CPU1 states: 6% user 8% system 0% nice 86% idle 0% iowait 0% irq 0% softirq
/^CPU[0-9]+ states:/ {
    writeCpuUsageMetric("Per Core - " $1, $9, "false")
}

# "CPU states:" shows the average CPU usage across all CPU cores
# v5.4
#CPU states: 8% user 10% system 0% nice 82% idle
# v5.6
#CPU states: 4% user 6% system 0% nice 90% idle 0% iowait 0% irq 0% softirq
/^CPU states:/ {
    writeCpuUsageMetric("Average - CPU", $9, "true")
}

#Uptime: 3 days,  6 hours,  10 minutes
/^Uptime:/ {
    days = $2
    hours = $4
    minutes = $6
    uptime_in_seconds = days * 86400 + hours * 3600 + minutes * 60
    # Display the uptime in Overview - Live Config (computed in seconds, reported in milliseconds)
    writeDoubleMetricWithLiveConfig("uptime-milliseconds", null, "gauge", 0, (uptime_in_seconds*1000), "Device Uptime", "duration", "")
}
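
The CPU and uptime branches of the parser above can be sketched in the same way. The following Python snippet is only an illustration, not part of the Indeni script: the function names are hypothetical, and the patterns assume the output matches the sample lines quoted in the comments (including the plural "days", "hours" and "minutes"). CPU usage is derived as 100 minus the idle percentage, and a reload is inferred when the uptime is lower than in the previous sample, as described in the uptime-milliseconds comment.

```python
import re


def cpu_usage_from_line(line):
    """Return (cpu_id, usage_percent) from a 'CPU states:' or 'CPUn states:' line."""
    m = re.match(r"^(CPU\d*) states:.*?(\d+)% idle", line)
    if not m:
        return None
    # Usage is whatever is not idle.
    return m.group(1), 100 - int(m.group(2))


def uptime_ms(line):
    """Convert an 'Uptime: D days,  H hours,  M minutes' line to milliseconds."""
    m = re.match(r"^Uptime:\s+(\d+) days,\s+(\d+) hours,\s+(\d+) minutes", line)
    if not m:
        return None
    days, hours, minutes = (int(g) for g in m.groups())
    return (days * 86400 + hours * 3600 + minutes * 60) * 1000


def has_reloaded(previous_ms, current_ms):
    """A lower uptime than the previous sample implies the device reloaded."""
    return current_ms < previous_ms
```

For the sample line "Uptime: 3 days,  6 hours,  10 minutes", the conversion is 3 * 86400 + 6 * 3600 + 10 * 60 = 281400 seconds, which is 281400000 milliseconds.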



cross_vendor_high_memory_usage

package com.indeni.server.rules.library.templatebased.crossvendor

import com.indeni.server.rules.RuleContext
import com.indeni.server.rules.library.ConditionalRemediationSteps
import com.indeni.server.rules.library.templates.NearingCapacityWithItemsTemplateRule

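/**
  * Alerts when the "memory-usage" metric of a memory element is above the configured threshold.
  * Vendor-specific remediation steps are appended to the base remediation text.
  */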
case class cross_vendor_high_memory_usage() extends NearingCapacityWithItemsTemplateRule(
  ruleName = "cross_vendor_high_memory_usage",
  ruleFriendlyName = "All Devices: High memory usage",
  ruleDescription = "Indeni will alert if the memory utilization of a device is above a high threshold. If the device has multiple memory elements, each will be inspected separately and alerted on individually.",
  usageMetricName = "memory-usage",
  applicableMetricTag = "name",
  threshold = 92.0,
  alertDescription = "Some memory elements are nearing their maximum capacity.",
  alertItemDescriptionFormat = "Current memory utilization is: %.0f%%",
  baseRemediationText = "Determine the cause for the high memory usage of the listed elements.",
  alertItemsHeader = "Memory Elements Affected",
  itemsToIgnore = Set("^vCMP host - (swap|linux).*".r, "^PA firewall management plane.*".r))(
  ConditionalRemediationSteps.VENDOR_CP ->
    """
      |Consider reading https://supportcenter.checkpoint.com/supportcenter/portal?eventSubmit_doGoviewsolutiondetails=&solutionid=sk33781#MEMORY
      |
      |Note: In trying to understand this alert, you can use the Linux "free" command to view memory utilization. The output of this command can be confusing. To be sure you correctly understand the output, see: https://serverfault.com/questions/85470/meaning-of-the-buffers-cache-line-in-the-output-of-free
      |
      |Also note that Linux has recently changed the "free" output format to make it more intuitive. This change was made in procps 3.3.10 ("procps" is the group of utilities which includes "free"). Use "ps -V" in expert mode to see your version of procps. See also: https://askubuntu.com/questions/770108/what-do-the-changes-in-free-output-from-14-04-to-16-04-mean""".stripMargin,
  ConditionalRemediationSteps.VENDOR_PANOS -> "Consider opening a support ticket with Palo Alto Networks.",
  ConditionalRemediationSteps.VENDOR_CISCO -> "Review http://docwiki.cisco.com/wiki/Cisco_Nexus_7000_Series_NX-OS_Troubleshooting_Guide_--_Troubleshooting_Memory",
  ConditionalRemediationSteps.OS_NXOS ->
    """|
      |1. In Indeni, check the memory utilization history graph for this device and review the pattern. Correlate any change in the pattern with any configuration change.
      |2. The output of the following NX-OS commands can indicate whether the platform memory utilization is normal or unexpected:
      |• show system resources
      |• show processes memory
      |3. For more information, review the following troubleshooting guide for high memory utilization: http://docwiki.cisco.com/wiki/Cisco_Nexus_7000_Series_NX-OS_Troubleshooting_Guide_--_Troubleshooting_Memory""".stripMargin,
  ConditionalRemediationSteps.VENDOR_FORTINET ->
    """
      |1. Log in via HTTPS to the Fortinet firewall and go to System > Dashboard > Status. Review the current memory utilization graph in the System Resources widget.
      |2. Log in via SSH to the Fortinet firewall and run the FortiOS command "diagnose hardware sysinfo memory", which provides information about current memory usage.
      |3. Check whether the unit is dealing with a high traffic volume or with connection pool limits.
      |4. Check whether the Fortinet firewall is in "conserve mode" by running the FortiOS command "diagnose hardware sysinfo conserve". For more information, review the following Fortinet guides:
      |- http://kb.fortinet.com/kb/viewContent.do?externalId=FD33103
      |- http://kb.fortinet.com/kb/viewContent.do?externalId=11076
      |5. If the problem persists, contact Fortinet Technical Support at https://support.fortinet.com/ for further assistance.""".stripMargin,
  ConditionalRemediationSteps.VENDOR_BLUECOAT ->
    """
      |1. Log in via HTTPS to the ProxySG and go to Statistics > System > Resources > Memory use. Review the current memory utilization graph.
      |2. Log in via SSH to the ProxySG and run the command "show resources", which provides information about current memory usage.
      |3. Check whether the unit is dealing with a high traffic volume.
      |4. Check the maximum number of connections for the ICAP service. For more information, review the following Blue Coat guides:
      |- https://origin-symwisedownload.symantec.com/resources/webguides/proxysg/certification/sg_firststeps_webguide/Content/Troubleshooting/Malware%20Prevention/troubleshoot_sg_unresponsive.htm
      |- https://origin-symwisedownload.symantec.com/resources/webguides/contentanalysis/13/system_webguide/Content/Topics/Tasks/Stats_Mem.htm
      |5. If the problem persists, contact Symantec Technical support at https://support.symantec.com for further assistance.
    """.stripMargin
)
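
The evaluation this rule configures can be approximated with a short Python sketch. The internals of NearingCapacityWithItemsTemplateRule are not shown on this page, so the logic below is an assumption: items whose name matches an ignore pattern are skipped, and the remaining items alert when their usage is strictly above the threshold. The threshold, ignore patterns and message format are copied from the rule; the function name and the sample data in the test are hypothetical.

```python
import re

# Values copied from the rule definition above.
THRESHOLD = 92.0
ITEMS_TO_IGNORE = [
    re.compile(r"^vCMP host - (swap|linux).*"),
    re.compile(r"^PA firewall management plane.*"),
]
ALERT_ITEM_FORMAT = "Current memory utilization is: %.0f%%"


def find_alert_items(usage_by_name, threshold=THRESHOLD):
    """Return {element name: alert text} for memory elements above the threshold.

    Assumption: "above" means strictly greater than, and ignore patterns are
    matched from the start of the element name.
    """
    alerts = {}
    for name, usage in usage_by_name.items():
        if any(pattern.match(name) for pattern in ITEMS_TO_IGNORE):
            continue  # element is excluded from this rule
        if usage > threshold:
            alerts[name] = ALERT_ITEM_FORMAT % usage
    return alerts
```

Each memory element is checked on its own, which mirrors the statement in the rule description that elements are inspected separately.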