High number of zombie processes-checkpoint-secureplatform

Vendor: checkpoint

OS: secureplatform

Description:
indeni will trigger an issue when there are too many zombie processes.

Remediation Steps:
Review the list of processes on the device to determine the possible root cause. To identify what zombie processes are running, you may run the following command: ps -ef | grep defunct
If a parent process on your system keeps creating zombies, you may need to file a support ticket with the associated software vendor.
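
Because a zombie can only be cleared by its parent, it usually helps to see the parent PID next to each zombie. Note also that "ps -ef | grep defunct" can list the grep process itself. The following is a minimal sketch, not part of the Indeni script: the function name list_zombies is hypothetical, and it assumes a procps-style ps that accepts -eo with header-suppressing "name=" columns.

```shell
#!/bin/bash
# Hypothetical helper: print each zombie with its parent PID.
# Expects rows of "PID PPID STAT COMMAND" on stdin, so it can also be
# run against saved ps output.
list_zombies() {
    # A process state beginning with "Z" marks a zombie (defunct) process.
    awk '$3 ~ /^Z/ { print "zombie pid=" $1 " parent pid=" $2 " command=" $4 }'
}

# Live usage (assumes procps ps; the trailing "=" suppresses the header):
#   ps -eo pid=,ppid=,stat=,comm= | list_zombies
```

The parent PID in the output is the process to investigate or report to the software vendor.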

How does this work?
The number of zombie processes is retrieved using the built-in “ps” command.
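
The contents of zombie.remote.1.bash and zombie.parser.1.awk are not shown here, so the following is only a sketch of how such a count could be obtained. The function name count_zombies is hypothetical, and the approach (filtering the process state column with awk) is an assumption rather than the actual Indeni implementation.

```shell
#!/bin/bash
# Hypothetical helper: count zombie processes.
# Expects one process state per line on stdin (for example "Ss", "R+", "Z").
count_zombies() {
    # Count states that begin with "Z"; "n + 0" prints 0 when none match.
    awk '$1 ~ /^Z/ { n++ } END { print n + 0 }'
}

# Live usage (assumes procps ps; "stat=" prints the state column without a header):
#   ps -eo stat= | count_zombies
```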

Why is this important?
A zombie process is a child process that has terminated but whose parent has not yet read its exit status. Until the parent does so, the kernel keeps the child's entry in the process table, so each zombie still holds a process ID and a small amount of kernel memory. Zombie processes are often a result of poorly written software. A large number of them could cause issues, for example by exhausting the available process IDs.

Without Indeni how would you find this?
An administrator could log in and manually run the command.
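
A manual check could compare the count against a threshold, in the same spirit as the rule (which triggers when the count is greater than or equal to the threshold, 20 by default). This is a rough sketch only: the function name zombie_alert and its message format are hypothetical and are not produced by Indeni.

```shell
#!/bin/bash
# Hypothetical helper: compare a zombie count against a threshold.
# $1 = zombie count, $2 = threshold (defaults to 20, the rule's default).
zombie_alert() {
    count=$1
    threshold=${2:-20}
    # Mirror the rule's "greater than or equal" comparison.
    if [ "$count" -ge "$threshold" ]; then
        echo "ALERT: $count zombie processes (threshold $threshold)"
        return 1
    fi
    echo "OK: $count zombie processes (threshold $threshold)"
}

# Example: zombie_alert 3    (reports OK, since 3 is below the default of 20)
```

The non-zero return status on an alert lets the function be used from a scheduled job that reacts to failures.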

chkp-secureplatform-zombie

name: chkp-secureplatform-zombie
description: displays list of zombie processes
type: monitoring
monitoring_interval: 10 minute
requires:
    vendor: checkpoint
    os.name: secureplatform
comments:
    tasks-zombies:
        why: |
            A zombie process is a child process that has terminated but whose parent has not yet read its exit
            status. Until the parent does so, the kernel keeps the child's entry in the process table, so each
            zombie still holds a process ID and a small amount of kernel memory. Zombie processes are often a
            result of poorly written software. A large number of them could cause issues, for example by
            exhausting the available process IDs.
        how: |
            The number of zombie processes is retrieved using the built-in "ps" command.
        can-with-snmp: false
        can-with-syslog: false
steps:
-   run:
        type: SSH
        file: zombie.remote.1.bash
    parse:
        type: AWK
        file: zombie.parser.1.awk

high_zombie_tasks_count

package com.indeni.server.rules.library.core

import com.indeni.apidata.time.TimeSpan
import com.indeni.ruleengine.expressions.OptionalExpression
import com.indeni.ruleengine.expressions.conditions.GreaterThanOrEqual
import com.indeni.ruleengine.expressions.core._
import com.indeni.ruleengine.expressions.data.{SelectTagsExpression, SelectTimeSeriesExpression, TimeSeriesExpression}
import com.indeni.server.common.data.conditions.True
import com.indeni.server.params.ParameterDefinition
import com.indeni.server.params.ParameterDefinition.UIType
import com.indeni.server.rules._
import com.indeni.server.rules.library.{ConditionalRemediationSteps, PerDeviceRule, RuleHelper}
import com.indeni.server.sensor.models.managementprocess.alerts.dto.AlertSeverity
import com.indeni.server.rules.library.core.HighZombieCountRule._

case class HighZombieCountRule() extends PerDeviceRule with RuleHelper {

  private[library] val highThresholdParameterName = "High_Threshold_of_Zombie_Count"
  private val highThresholdParameter = new ParameterDefinition(highThresholdParameterName,
    "",
    "High Threshold of Zombie Process Count",
    "The number of zombie processes at or above which an issue will be triggered.",
    UIType.DOUBLE,
    20.0)

  override val metadata: RuleMetadata = RuleMetadata.builder(NAME, "High number of zombie processes",
    "indeni will trigger an issue when there are too many zombie processes.",
    AlertSeverity.WARN, categories = Set(RuleCategory.HealthChecks), deviceCategory = DeviceCategory.LinuxbasedDevices)
    .interval(TimeSpan.fromMinutes(10))
    .configParameter(highThresholdParameter)
    .build()

  override def expressionTree(context: RuleContext): StatusTreeExpression = {
    val actualValue = TimeSeriesExpression[Double]("tasks-zombies").last
    val threshold: OptionalExpression[Double] = getParameterDouble(highThresholdParameter)

    StatusTreeExpression(
      // Which objects to pull (normally, devices)
      SelectTagsExpression(context.metaDao, Set(DeviceKey), True),

      StatusTreeExpression(
            // The time-series we check the test condition against:
            SelectTimeSeriesExpression[Double](context.tsDao, Set("tasks-zombies"), denseOnly = false),

            // The condition which, if true, we have an issue. Checked against the time-series we've collected
            GreaterThanOrEqual(
              actualValue,
              threshold)

      ).withRootInfo(
            getHeadline(),
            scopableStringFormatExpression("The number of zombie processes is %.0f, which is at or above the threshold of %.0f.", actualValue, threshold),
            ConditionalRemediationSteps("Review the list of processes on the device to determine the possible root cause. To identify what zombie processes are running, you may run the following command: ps -ef | grep defunct\nIf a parent process on your system keeps creating zombies, you may need to file a support ticket with the associated software vendor.")

      ).asCondition()
    ).withoutInfo()
  }
}

object HighZombieCountRule {

  /* --- Constants --- */

  private[library] val NAME = "high_zombie_tasks_count"
}