High utilization of generic dataplane pool-paloaltonetworks-panos


Vendor: paloaltonetworks

OS: panos

Description:
The dataplane of a Palo Alto Networks firewall has several pools, each with a different role. Indeni alerts when the used portion of a pool reaches a configurable threshold (80% by default).

Remediation Steps:
Contact Palo Alto Networks technical support.

How does this work?
This script logs into the Palo Alto Networks firewall over SSH and runs the command "debug dataplane pool statistics" to retrieve the status of all the pools. For each pool, the output lists how many elements are available and the total size of the pool. Indeni uses this output to determine when a pool is running low on available elements.
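As a rough illustration of the parsing step, the sketch below reads pool lines in the format shown in the script's sample comments and derives the used count as the limit minus the available count. It is a Python approximation, not the script Indeni runs; the function name parse_pool_line and the returned tuple layout are assumptions made for this example.

```python
import re

# Matches lines such as "[ 0] Packet Buffers   :   57343/57344   0x7f0002005d00".
# Group 1: pool name, group 2: available elements, group 3: pool size (limit).
POOL_LINE = re.compile(r"^\[[0-9 ]+\]\s*([^:]*?)\s*:\s*(\d+)/(\d+)")


def parse_pool_line(line):
    """Return (name, used, limit) for a pool line, or None if the line is not one."""
    match = POOL_LINE.match(line)
    if match is None:
        return None
    name = match.group(1)
    available = int(match.group(2))
    limit = int(match.group(3))
    # The firewall reports available elements, so used = limit - available.
    return name, limit - available, limit


sample = "[ 0] Packet Buffers            :    57343/57344    0x7f0002005d00"
print(parse_pool_line(sample))  # ('Packet Buffers', 1, 57344)
```

Lines that do not start with a bracketed index, such as section headers in the command output, yield None and can simply be skipped.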

Why is this important?
On a Palo Alto Networks firewall, the data plane is where traffic is handled. While processing traffic, the firewall needs to retain certain pieces of information. This information is saved in pools of memory, which are easily accessible but limited in size. When the firewall needs to save information, it takes an element from the pool, and it returns the element when it is done. If a pool runs out of elements, the firewall may have trouble handling traffic and may lose part of it.
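The take-and-return behavior described above can be pictured with a toy model. This is a conceptual Python sketch only, not how PAN-OS implements its pools; the FixedPool class and its method names are hypothetical.

```python
class FixedPool:
    """Toy fixed-size pool: elements are borrowed and must be returned."""

    def __init__(self, size):
        self.size = size
        self.available = size

    def acquire(self):
        """Borrow one element; return False when the pool is exhausted."""
        if self.available == 0:
            return False
        self.available -= 1
        return True

    def release(self):
        """Return a previously borrowed element to the pool."""
        if self.available < self.size:
            self.available += 1

    @property
    def used(self):
        """Number of elements currently borrowed."""
        return self.size - self.available


pool = FixedPool(size=2)
print(pool.acquire(), pool.acquire(), pool.acquire())  # True True False
pool.release()
print(pool.used)  # 1
```

The third acquire fails because both elements are already in use, which is the exhaustion case the alert is meant to catch before it happens.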

Without Indeni how would you find this?
An administrator would need to write a script to poll their firewalls for the information, or simply wait until there’s an issue and open a support ticket with TAC.

panos-debug-dataplane-pool-statistics

#! META
name: panos-debug-dataplane-pool-statistics
description: Grab debug dataplane pool statistics
type: monitoring
monitoring_interval: 30 minutes
requires:
    vendor: paloaltonetworks
    os.name: panos
    product: firewall

#! COMMENTS
dataplane-pool-used:
    why: |
        On a Palo Alto Networks firewall, the data plane is where the traffic is handled. In the course of processing traffic the firewall needs to retain certain bits of information. This information is saved in pools of memory, easily accessible but limited in size. When the firewall needs to save information it retrieves a member of the pool and when it is done it returns it. If a pool runs out of members the firewall may have trouble handling traffic, potentially losing part of it.
    how: |
        This script logs into the Palo Alto Networks firewall through SSH and retrieves the status of all the pools. The output includes the total size of the pools and how many elements are available. indeni utilizes this output to determine when the pool is running low on available elements.
    without-indeni: |
        An administrator would need to write a script to poll their firewalls for the information, or simply wait until there's an issue and open a support ticket with TAC.
    can-with-snmp: false
    can-with-syslog: false
dataplane-pool-limit:
    skip-documentation: true

#! REMOTE::SSH
debug dataplane pool statistics

#! PARSER::AWK

#[ 0] Packet Buffers            :   262062/262144   0x8000000020c00000
#[ 1] Work Queue Entries        :   491466/491520   0x8000000410000000
/^DP / {
    # Remember the dataplane name (second field, colon removed) for the pool lines that follow
    dp = $2
    sub(/:/, "", dp)
}

#[ 0] Packet Buffers            :    57343/57344    0x7f0002005d00
#[10] SML VM Vchecks            :    65536/65536    0x7f005c226ef8
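/^\[[0-9 ]+\]/ {
    # Get the pool name: strip the leading "[ n]" index, everything from the
    # first colon onward, and any trailing whitespace
    line = $0
    sub(/^\[[0-9 ]+\][ \t]*/, "", line)
    sub(/:.*/, "", line)
    sub(/[ \t]+$/, "", line)
    poolname = line

    # Get pool utilization; the field before the address is "available/total"
    util = $(NF-1)
    split(util, util_parts, "/")
    limit = util_parts[2]
    used = limit - util_parts[1]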

    pooltags["name"] = dp "-" poolname
    writeDoubleMetric("dataplane-pool-used", pooltags, "gauge", 1800, used)
    writeDoubleMetric("dataplane-pool-limit", pooltags, "gauge", 1800, limit)
}

panw_pool_usage_generic

package com.indeni.server.rules.library.paloalto

import com.indeni.ruleengine.expressions.conditions.GreaterThanOrEqual
import com.indeni.ruleengine.expressions.core.{StatusTreeExpression, _}
import com.indeni.ruleengine.expressions.data.{SelectTagsExpression, SelectTimeSeriesExpression, TimeSeriesExpression}
import com.indeni.ruleengine.expressions.math.{DivExpression, TimesExpression}
import com.indeni.server.common.data.conditions.{Equals, Not, True}
import com.indeni.server.params.ParameterDefinition
import com.indeni.server.params.ParameterDefinition.UIType
import com.indeni.server.rules._
import com.indeni.server.rules.library.RuleHelper
import com.indeni.server.rules.library.core.PerDeviceRule
import com.indeni.server.sensor.models.managementprocess.alerts.dto.AlertSeverity

case class GenericDataplanePoolUsageRule() extends PerDeviceRule with RuleHelper {

  private[library] val highThresholdParameterName = "High_Threshold_of_Pool_usage"
  private val highThresholdParameter = new ParameterDefinition(highThresholdParameterName,
    "",
    "High Threshold of Pool Usage",
    "The pool usage threshold, in percent, at or above which an alert is issued.",
    UIType.DOUBLE,
    80.0)

  override val metadata: RuleMetadata = RuleMetadata.builder("panw_pool_usage_generic", "Palo Alto Networks Firewalls: High utilization of generic dataplane pool",
    "The dataplane of a Palo Alto Networks firewall has several pools, each with a different role. indeni will alert when a pool is near exhaustion.", AlertSeverity.ERROR).configParameter(highThresholdParameter).build()

  override def expressionTree(context: RuleContext): StatusTreeExpression = {
    val inUseValue = TimeSeriesExpression[Double]("dataplane-pool-used").last
    val limitValue = TimeSeriesExpression[Double]("dataplane-pool-limit").last

    StatusTreeExpression(
      // Which objects to pull (normally, devices)
      SelectTagsExpression(context.metaDao, Set(DeviceKey), True),

      // What constitutes an issue
        StatusTreeExpression(

          // The additional tags we care about (we'll be including this in alert data)
          SelectTagsExpression(context.tsDao, Set("name"), withTagsCondition("dataplane-pool-used", "dataplane-pool-limit")),

            StatusTreeExpression(
              // The time-series we check the test condition against:
              SelectTimeSeriesExpression[Double](context.tsDao, Set("dataplane-pool-used", "dataplane-pool-limit"), denseOnly = false,
                condition = Not(
                    Equals("name", "Packet Buffers") &
                    Equals("name", "Work Queue Entries") &
                    Equals("name", "software packet buffer 0") &
                    Equals("name", "software packet buffer 1") &
                    Equals("name", "software packet buffer 2") &
                    Equals("name", "software packet buffer 3") &
                    Equals("name", "software packet buffer 4") &
                    Equals("name", "CTD Flow") &
                    Equals("name", "CTD AV Block") &
                    Equals("name", "SML VM Fields") &
                    Equals("name", "SML VM Vchecks") &
                    Equals("name", "Detector Threats") &
                    Equals("name", "CTD DLP FLOW") &
                    Equals("name", "CTD DLP DATA") &
                    Equals("name", "CTD DECODE FILTER") &
                    Equals("name", "SML VM EmlInfo") &
                    Equals("name", "Regex Results") &
                    Equals("name", "TIMER Chunk") &
                    Equals("name", "FPTCP segs") &
                    Equals("name", "Proxy session") &
                    Equals("name", "SSL Handshake State") &
                    Equals("name", "SSL State") &
                    Equals("name", "SSL Handshake MAC State") &
                    Equals("name", "SSH Handshake State") &
                    Equals("name", "SSH State") &
                    Equals("name", "TCP host connections") &
                    Equals("name", "DFA Result")
                  )),

              // The condition which, if true, we have an issue. Checked against the time-series we've collected
              GreaterThanOrEqual(
                inUseValue,
                TimesExpression[Double](limitValue, DivExpression[Double](getParameterDouble(highThresholdParameter), ConstantExpression[Option[Double]](Some(100.0)))))

              // The Alert Item to add for this specific item
              ).withSecondaryInfo(
                scopableStringFormatExpression("${scope(\"name\")}"),
                scopableStringFormatExpression("There are %.0f elements used from the pool where the limit is %.0f.", inUseValue, limitValue),
                title = "Affected Pools"
            ).asCondition()
        ).withRootInfo(
              getHeadline(),
              ConstantExpression("The firewall dataplane has several different memory pools, each with its own role."),
              ConstantExpression("Contact Palo Alto Networks technical support.")
        ).asCondition()
    ).withoutInfo()
  }
}
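
To make the alert condition concrete: the rule compares the used count against the limit multiplied by the threshold percentage divided by 100. The following Python sketch restates that arithmetic with the default threshold of 80.0 from the rule. The function name pool_over_threshold is hypothetical, and the sketch ignores the rule engine's time-series and tag selection.

```python
def pool_over_threshold(used, limit, threshold_percent=80.0):
    """Mirror the rule's check: used >= limit * (threshold_percent / 100)."""
    return used >= limit * (threshold_percent / 100.0)


# With a limit of 57344 elements and the default 80% threshold,
# the cutoff is 45875.2, so 45876 used elements trips the check.
print(pool_over_threshold(45875, 57344))  # False
print(pool_over_threshold(45876, 57344))  # True
```

Because the comparison is greater-than-or-equal, a pool whose used count reaches the cutoff is reported, not only one that exceeds it.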