High utilization of generic dataplane pool-paloaltonetworks-panos

Vendor: paloaltonetworks

OS: panos

Description:
The dataplane of a Palo Alto Networks firewall has several pools, each with a different role. Indeni will alert when a pool is near exhaustion.

Remediation Steps:
Contact Palo Alto Networks technical support.

How does this work?
This script logs into the Palo Alto Networks firewall through SSH and retrieves the status of all the pools. The output includes the total size of each pool and how many of its elements are available. Indeni uses this output to determine when a pool is running low on available elements.
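
As a rough illustration of the parsing step, the sketch below extracts the available and total counts per pool. It assumes each pool line has the shape `[index] name : available/total address`, with an optional element size in parentheses before the colon; the real layout may differ between PAN-OS versions, and the sample text is illustrative rather than captured from a device. The names `POOL_LINE` and `parse_pools` are hypothetical.

```python
import re

# Assumed line shape (may vary by PAN-OS version):
#   [ 0] Packet Buffers            :    57241/57344    0x8000000410000000
# Software pools may carry an element size in parentheses before the colon.
POOL_LINE = re.compile(
    r"^\[\s*\d+\]\s+(.+?)\s*(?:\(\s*\d+\))?\s*:\s*(\d+)/(\d+)"
)


def parse_pools(output):
    """Return {pool name: (available, total)} for every matching line."""
    pools = {}
    for line in output.splitlines():
        match = POOL_LINE.match(line.strip())
        if match:
            name, available, total = match.groups()
            pools[name] = (int(available), int(total))
    return pools


# Illustrative sample only, not real device output.
SAMPLE = """\
Hardware Pools
[ 0] Packet Buffers            :    57241/57344    0x8000000410000000
Software Pools
[ 0] software packet buffer 0  ( 512):    32767/32768    0x8000000021b8a680
"""

pools = parse_pools(SAMPLE)
```

Section headings such as "Hardware Pools" do not match the pattern and are skipped, so only pool lines end up in the result.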

Why is this important?
On a Palo Alto Networks firewall, the data plane is where traffic is handled. While processing traffic, the firewall needs to retain certain pieces of information. This information is stored in memory pools, which are quick to access but limited in size. When the firewall needs to store information, it takes an element from the pool and returns it when it is done. If a pool runs out of elements, the firewall may have trouble handling traffic and may drop some of it.
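
The take-and-return behaviour described above can be pictured with a toy model. This is only a sketch of the general fixed-size pool idea, not PAN-OS code; the `FixedPool` class and its methods are hypothetical.

```python
class FixedPool:
    """Toy fixed-size pool: elements are handed out and later returned."""

    def __init__(self, size):
        self.size = size
        self.free = list(range(size))  # ids of elements not currently in use

    def acquire(self):
        # Returns an element id, or None when the pool is exhausted.
        return self.free.pop() if self.free else None

    def release(self, element):
        # Hand the element back so it can be reused.
        self.free.append(element)

    @property
    def used(self):
        return self.size - len(self.free)


pool = FixedPool(size=2)
first = pool.acquire()
second = pool.acquire()
third = pool.acquire()   # pool exhausted: nothing left to hand out
pool.release(first)
fourth = pool.acquire()  # succeeds again once an element has been returned
```

The failed third request is the situation the alert is meant to warn about before it happens.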

Without Indeni how would you find this?
An administrator would need to write a script to poll their firewalls for the information, or simply wait until there’s an issue and open a support ticket with TAC.

panos-debug-dataplane-pool-statistics

name: panos-debug-dataplane-pool-statistics
description: Grab debug dataplane pool statistics
type: monitoring
monitoring_interval: 30 minutes
requires:
    vendor: paloaltonetworks
    os.name: panos
    product: firewall
comments:
    dataplane-pool-used:
        why: |
            On a Palo Alto Networks firewall, the data plane is where the traffic is handled. In the course of processing traffic the firewall needs to retain certain bits of information. This information is saved in pools of memory, easily accessible but limited in size. When the firewall needs to save information it retrieves a member of the pool and when it is done it returns it. If a pool runs out of members the firewall may have trouble handling traffic, potentially losing part of it.
        how: |
            This script logs into the Palo Alto Networks firewall through SSH and retrieves the status of all the pools. The output includes the total size of each pool and how many of its elements are available. Indeni uses this output to determine when a pool is running low on available elements.
        can-with-snmp: false
        can-with-syslog: false
    dataplane-pool-limit:
        why: |
            Capture the pool limit. This information is necessary for tracking health status of the device. If a pool runs out of members the firewall may have trouble handling traffic.
        how: |
            This script logs into the Palo Alto Networks firewall through SSH and retrieves the status of all the pools. The output includes the total size of each pool and how many of its elements are available. Indeni uses this output to determine when a pool is running low on available elements.
        can-with-snmp: false
        can-with-syslog: false
steps:
-   run:
        type: SSH
        command: debug dataplane pool statistics
    parse:
        type: AWK
        file: debug-dataplane-pool-statistics.parser.1.awk
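
The AWK parser itself is not shown here. As a hedged sketch of what it would need to produce, the snippet below turns per-pool available/total counts into the two metrics named in the script, tagged with the pool name. It assumes "used" is derived as total minus available; the record layout and the `to_metrics` helper are hypothetical.

```python
def to_metrics(pools):
    """Convert {name: (available, total)} into a list of metric records."""
    metrics = []
    for name, (available, total) in pools.items():
        tags = {"name": name}
        # Assumption: used elements = total elements - available elements.
        metrics.append({
            "metric": "dataplane-pool-used",
            "tags": tags,
            "value": float(total - available),
        })
        metrics.append({
            "metric": "dataplane-pool-limit",
            "tags": tags,
            "value": float(total),
        })
    return metrics


# "Example Pool" is a made-up name used only for illustration.
metrics = to_metrics({"Example Pool": (900, 1000)})
```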

panw_pool_usage_generic

package com.indeni.server.rules.library.paloalto

import com.indeni.ruleengine.expressions.conditions.GreaterThanOrEqual
import com.indeni.ruleengine.expressions.core.{StatusTreeExpression, _}
import com.indeni.ruleengine.expressions.data.{SelectTagsExpression, SelectTimeSeriesExpression, TimeSeriesExpression}
import com.indeni.ruleengine.expressions.math.{DivExpression, TimesExpression}
import com.indeni.server.common.data.conditions.{Equals, Not, True}
import com.indeni.server.params.ParameterDefinition
import com.indeni.server.params.ParameterDefinition.UIType
import com.indeni.server.rules._
import com.indeni.server.rules.library.{ConditionalRemediationSteps, PerDeviceRule, RuleHelper}
import com.indeni.server.sensor.models.managementprocess.alerts.dto.AlertSeverity

case class GenericDataplanePoolUsageRule() extends PerDeviceRule with RuleHelper {

  private[library] val highThresholdParameterName = "High_Threshold_of_Pool_usage"
  private val highThresholdParameter = new ParameterDefinition(highThresholdParameterName,
    "",
    "High Threshold of Pool Usage",
    "The pool usage threshold, as a percentage of the pool limit, at or above which an alert is issued.",
    UIType.DOUBLE,
    80.0)

  override val metadata: RuleMetadata = RuleMetadata.builder(
      "panw_pool_usage_generic",
      "High utilization of generic dataplane pool",
      "The dataplane of a Palo Alto Networks firewall has several pools, each with a different role. indeni will alert when a pool is near exhaustion.",
      AlertSeverity.ERROR,
      categories = Set(RuleCategory.HealthChecks),
      deviceCategory = DeviceCategory.PaloAltoNetworksFirewalls
    ).configParameter(highThresholdParameter).build()

  override def expressionTree(context: RuleContext): StatusTreeExpression = {
    val inUseValue = TimeSeriesExpression[Double]("dataplane-pool-used").last
    val limitValue = TimeSeriesExpression[Double]("dataplane-pool-limit").last

    StatusTreeExpression(
      // Which objects to pull (normally, devices)
      SelectTagsExpression(context.metaDao, Set(DeviceKey), True),

      // What constitutes an issue
        StatusTreeExpression(

          // The additional tags we care about (we'll be including this in alert data)
          SelectTagsExpression(context.tsDao, Set("name"), withTagsCondition("dataplane-pool-used", "dataplane-pool-limit")),

            StatusTreeExpression(
              // The time-series we check the test condition against:
              SelectTimeSeriesExpression[Double](context.tsDao, Set("dataplane-pool-used", "dataplane-pool-limit"), denseOnly = false,
                // Skip the pools named below. Each name is negated on its own,
                // because a single "name" tag can never equal several values at once.
                condition = Not(Equals("name", "Packet Buffers")) &
                    Not(Equals("name", "Work Queue Entries")) &
                    Not(Equals("name", "software packet buffer 0")) &
                    Not(Equals("name", "software packet buffer 1")) &
                    Not(Equals("name", "software packet buffer 2")) &
                    Not(Equals("name", "software packet buffer 3")) &
                    Not(Equals("name", "software packet buffer 4")) &
                    Not(Equals("name", "CTD Flow")) &
                    Not(Equals("name", "CTD AV Block")) &
                    Not(Equals("name", "SML VM Fields")) &
                    Not(Equals("name", "SML VM Vchecks")) &
                    Not(Equals("name", "Detector Threats")) &
                    Not(Equals("name", "CTD DLP FLOW")) &
                    Not(Equals("name", "CTD DLP DATA")) &
                    Not(Equals("name", "CTD DECODE FILTER")) &
                    Not(Equals("name", "SML VM EmlInfo")) &
                    Not(Equals("name", "Regex Results")) &
                    Not(Equals("name", "TIMER Chunk")) &
                    Not(Equals("name", "FPTCP segs")) &
                    Not(Equals("name", "Proxy session")) &
                    Not(Equals("name", "SSL Handshake State")) &
                    Not(Equals("name", "SSL State")) &
                    Not(Equals("name", "SSL Handshake MAC State")) &
                    Not(Equals("name", "SSH Handshake State")) &
                    Not(Equals("name", "SSH State")) &
                    Not(Equals("name", "TCP host connections")) &
                    Not(Equals("name", "DFA Result"))),

              // The condition which, if true, we have an issue. Checked against the time-series we've collected
              GreaterThanOrEqual(
                inUseValue,
                TimesExpression[Double](limitValue, DivExpression[Double](getParameterDouble(highThresholdParameter), ConstantExpression[Option[Double]](Some(100.0)))))

              // The Alert Item to add for this specific item
              ).withSecondaryInfo(
                scopableStringFormatExpression("${scope(\"name\")}"),
                scopableStringFormatExpression("There are %.0f elements used from the pool where the limit is %.0f.", inUseValue, limitValue),
                title = "Affected Pools"
            ).asCondition()
        ).withRootInfo(
              getHeadline(),
              ConstantExpression("The firewall dataplane has several different memory pools, each with its own role."),
              ConditionalRemediationSteps("Contact Palo Alto Networks technical support.")
        ).asCondition()
    ).withoutInfo()
  }
}
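
To make the rule's comparison concrete outside the rule engine, here is a minimal sketch of the same check: a pool is flagged when its used count is at or above the limit multiplied by the threshold percentage (80 by default). It assumes the pools named in the rule's condition are meant to be left out of this generic check; only a few of those names are repeated here, and the pool names in `samples` are made up.

```python
HIGH_THRESHOLD_PERCENT = 80.0  # default of High_Threshold_of_Pool_usage

# Subset of the pool names listed in the rule's condition, for brevity.
EXCLUDED_POOLS = {"Packet Buffers", "Work Queue Entries", "CTD Flow"}


def affected_pools(samples, threshold_percent=HIGH_THRESHOLD_PERCENT):
    """samples: {name: (used, limit)} -> names at or above the threshold."""
    affected = []
    for name, (used, limit) in samples.items():
        if name in EXCLUDED_POOLS:
            continue
        # Same comparison as the rule: used >= limit * (threshold / 100)
        if used >= limit * (threshold_percent / 100.0):
            affected.append(name)
    return affected


samples = {
    "Example Pool A": (850, 1000),  # 85% used: flagged
    "Example Pool B": (500, 1000),  # 50% used: not flagged
    "Packet Buffers": (990, 1000),  # skipped by this generic check
}
result = affected_pools(samples)
```

Raising the threshold, for example `affected_pools(samples, threshold_percent=90.0)`, would leave Example Pool A unflagged, which mirrors what changing the rule's configurable parameter does.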