Critical process(es) down (per VS)-paloaltonetworks-panos

Vendor: paloaltonetworks

OS: panos

Description:
Many devices have critical processes, usually daemons, that must be up for certain functions to work. Indeni alerts if any of these processes goes down.

Remediation Steps:
Review the cause for the processes being down.

How does this work?
This script logs into the Palo Alto Networks firewall through SSH and retrieves the status of running processes. It then compares the list of running processes against a known list of critical processes and checks that they are all up. Any that are down are flagged.
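
The comparison described above can be sketched in a few lines of Python. This is only an illustration, not the script Indeni ships: the function names are hypothetical, and the `CRITICAL_PROCESSES` set is an assumption built from the names mentioned on this page (the real list, and the exact daemon names on the device, are not shown here).

```python
# Hypothetical list of critical daemons. Only names mentioned in this
# page are used; the real list is not part of this document.
CRITICAL_PROCESSES = {"mgmtsrvr", "dhcp", "snmp", "ikemgr", "keymgr"}


def find_down_processes(running, critical=CRITICAL_PROCESSES):
    """Return the critical process names that are missing from `running`."""
    return sorted(set(critical) - set(running))


def process_states(running, critical=CRITICAL_PROCESSES):
    """Map each critical process to 1.0 (up) or 0.0 (down).

    The 1.0/0.0 encoding is an assumption made for this sketch.
    """
    running = set(running)
    return {name: (1.0 if name in running else 0.0) for name in sorted(critical)}
```

Calling `find_down_processes(["mgmtsrvr", "dhcp", "snmp", "ikemgr"])` would report `keymgr` as down, since it is the only critical name absent from the running list. Non-critical processes in the running list are ignored.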

Why is this important?
Each device has certain executable processes that are critical to its stable operation. On Palo Alto Networks firewalls, these processes are responsible for the management layer (mgmtsrvr), certain services (like dhcp and snmp), VPN (like ikemgr and keymgr) and many other functions. A process being down may indicate a critical failure.

Without Indeni how would you find this?
An administrator would need to write a script that polls the firewalls for this data. The only other option is to pull the data manually during an outage.

panos-debug-system-process-info

name: panos-debug-system-process-info
description: Grab list of processes
type: monitoring
monitoring_interval: 10 minutes
requires:
    vendor: paloaltonetworks
    os.name: panos
    product: firewall
comments:
    process-state:
        why: |
            Each device has certain executable processes that are critical to its stable operation. On Palo Alto Networks firewalls, these processes are responsible for the management layer (mgmtsrvr), certain services (like dhcp and snmp), VPN (like ikemgr and keymgr) and many other functions. A process being down may indicate a critical failure.
        how: |
            This script logs into the Palo Alto Networks firewall through SSH and retrieves the status of running processes. It then compares the list of running processes against a known list of critical processes and checks that they are all up. Any that are down are flagged.
        can-with-snmp: false
        can-with-syslog: false
    process-cpu:
        why: |
            Capture the per-process CPU utilization. This information can be used to troubleshoot the root cause of overall high system CPU conditions.
        how: |
            This script logs into the Palo Alto Networks firewall through SSH and retrieves the per-process CPU utilization.
        can-with-snmp: false
        can-with-syslog: false
steps:
-   run:
        type: SSH
        command: debug system process-info
    parse:
        type: AWK
        file: debug-system-process-info.parser.1.awk
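
The AWK parser referenced above (`debug-system-process-info.parser.1.awk`) is not included on this page. As a rough illustration of what the parse step has to do, the Python sketch below turns tabular process output into per-process records. The column layout in `SAMPLE_OUTPUT` is an assumption made for this example and may not match the real output of `debug system process-info`; the function name and record keys are likewise hypothetical.

```python
# Hypothetical sample: the real column layout of
# "debug system process-info" is assumed here, not confirmed.
SAMPLE_OUTPUT = """\
Name                   PID      CPU%  FDs Open   Virt Mem     Res Mem      State
mgmtsrvr               2456     1     45         1203456      250112       S
ikemgr                 3012     0     18         412300       10240        S
"""


def parse_process_info(text):
    """Return one record per process row, skipping headers and blank lines."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # A data row is assumed to have a numeric PID in the second column;
        # this also filters out the header line.
        if len(fields) < 3 or not fields[1].isdigit():
            continue
        rows.append({
            "process-name": fields[0],
            "pid": int(fields[1]),
            "cpu": float(fields[2]),
        })
    return rows
```

Each record carries what the two metrics above need: the process name (used as the `process-name` tag) and a CPU figure for `process-cpu`. The names found this way can then be checked against the critical list to derive `process-state`.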

cross_vendor_critical_process_down_vsx

// Deprecation warning : Scala template-based rules are deprecated. Please use YAML format rules instead.

package com.indeni.server.rules.library.templatebased.crossvendor

import com.indeni.server.rules.RuleContext
import com.indeni.apidata.time.TimeSpan
import com.indeni.server.rules.library.templates.StateDownTemplateRule
import com.indeni.server.rules.RemediationStepCondition

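/**
  * Alerts when the "process-state" metric reports a critical process as down.
  * Affected processes are listed by their "process-name" tag, and the
  * "vs.name" tag is used in the alert description.
  */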
case class cross_vendor_critical_process_down_vsx() extends StateDownTemplateRule(
  ruleName = "cross_vendor_critical_process_down_vsx",
  ruleFriendlyName = "All Devices: Critical process(es) down (per VS)",
  ruleDescription = "Many devices have critical processes, usually daemons, that must be up for certain functions to work. indeni will alert if any of these goes down.",
  metricName = "process-state",
  applicableMetricTag = "process-name",
  descriptionMetricTag = "vs.name",
  alertItemsHeader = "Processes Affected",
  alertDescription = "One or more processes that are critical to the operation of this device are down.",
  baseRemediationText = "Review the cause for the processes being down.")(
  RemediationStepCondition.VENDOR_CP -> "Check if \"cpstop\" was run. If this is an MDS, check if \"mdsstop\" was run.",
  RemediationStepCondition.VENDOR_CISCO ->
    """|
      |1. Use the "show processes cpu" NX-OS command in order to show the CPU usage at the process level.
      |2. Use the "show process cpu detail <pid>" NX-OS command to find out the CPU usage for all threads that belong to a specific process ID (PID).
      |3. Use the "show system internal sysmgr service pid <pid>" NX-OS command in order to display additional details, such as restart time, crash status, and current state, on the process/service by PID.
      |4. Run the "show system internal processes cpu" NX-OS command which is equivalent to the top command in Linux and provides an ongoing look at processor activity in real time.""".stripMargin
)
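
The internals of `StateDownTemplateRule` are not shown on this page. As a hedged illustration of what a rule configured like the one above has to compute, the Python sketch below groups down processes by virtual system. The metric dictionary shape, the function name, and the convention that a value of 0.0 means "down" are all assumptions made for this example.

```python
from collections import defaultdict


def down_processes_by_vs(metrics):
    """Group the names of down processes by their "vs.name" tag.

    `metrics` is an iterable of dicts with "name", "value" and "tags" keys.
    This shape is hypothetical and a value of 0.0 is assumed to mean "down".
    """
    affected = defaultdict(list)
    for metric in metrics:
        if metric["name"] != "process-state" or metric["value"] != 0.0:
            continue
        tags = metric["tags"]
        # Devices without virtual systems are grouped under an empty name.
        affected[tags.get("vs.name", "")].append(tags["process-name"])
    return {vs: sorted(names) for vs, names in affected.items()}
```

A non-empty result corresponds to an alert whose "Processes Affected" items are the process names, grouped per virtual system. Metrics with other names, such as `process-cpu`, are ignored.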