Pnote(s) down-checkpoint-False

Vendor: checkpoint

OS: False

Description:
ClusterXL has multiple problem notifications (pnotes) - if any of them fail an alert will be issued.

Remediation Steps:
Review the list of problematic elements and take appropriate action.

How does this work?
By using the Check Point built-in command “cphaprob list” the pnotes are retrieved.

Why is this important?
If a device in a cluster discovers a problem with itself, it is because one of its pre-defined checks, called a “pnote” (“problem notification”), has failed. Knowing which pnote failed is the starting point for investigating why it happened.

Without Indeni how would you find this?
An administrator could log in and run the command manually.
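
As a rough illustration of that manual check, the sketch below filters "cphaprob list"-style text down to the names of the pnotes reporting a problem. The function name pnotes_in_problem and the sample text are hypothetical; the sample only imitates the two line formats ("Device Name:" and "Current state:") that the parser further down expects, and real output contains more fields.

```shell
# Print the name of every pnote whose "Current state:" line reports a problem.
# Reads "cphaprob list"-style text on stdin.
pnotes_in_problem() {
  awk '
    # Remember the most recent device name.
    /^Device Name: /   { name = $0; sub(/^Device Name: /, "", name) }
    # The third field is the state word, e.g. "problem" or "OK".
    /^Current state: / { if ($3 == "problem") print name }
  '
}

# Hypothetical sample input (an assumption, not captured from a real gateway).
sample='Device Name: Synchronization
Current state: OK

Device Name: admin_down
Current state: problem'

problems=$(printf '%s\n' "$sample" | pnotes_in_problem)
printf '%s\n' "$problems"
```

On a cluster member the same function could presumably be fed from the live command, for example by piping "cphaprob list" into pnotes_in_problem.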

cphaprob_list

#! META
name: cphaprob_list
description: Run "cphaprob list" to find pnotes in "problem"
type: monitoring
monitoring_interval: 1 minute
requires:
    vendor: "checkpoint"
    high-availability: "true"
    vsx:
        neq: "true"
    role-firewall: "true"

#! COMMENTS
clusterxl-pnote-state:
    why: |
        If a device in a cluster discovers a problem with itself, it is because one of its pre-defined checks, called a "pnote" ("problem notification"), has failed. Knowing which pnote failed is the starting point for investigating why it happened.
    how: |
        By using the Check Point built-in command "cphaprob list" the pnotes are retrieved.
    without-indeni: |
        An administrator could log in and run the command manually.
    can-with-snmp: false
    can-with-syslog: false
    vendor-provided-management: |
        Listing pnotes is only available from the command line interface.

#! REMOTE::SSH
${nice-path} -n 15 cphaprob -l list;${nice-path} -n 15 cphaprob list

#! PARSER::AWK

# "cphaprob list" only echoes the pnotes that have problems. This means that when a problem appears indeni alerts
# as it should, but when the issue heals itself the pnote drops out of the output and indeni cannot auto-resolve the alert.
# 
# To solve this we've added the -l switch to the command section. However, since -l is not available on SPLAT we need to run
# both "cphaprob list" and "cphaprob -l list" to account for both versions.
# As a consequence of that some information would be echoed twice on platforms supporting both commands.
# A verification has been added to avoid the same pnote state being written twice.

#Device Name: admin_down
/^Device Name:\s/ {
    deviceName = $0
    sub(/^Device Name:\s/, "", deviceName)
}

#Current state: problem (non-blocking)
#Current state: problem
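/^Current state:\s/ {
    state = $3

    # 0 = pnote is in "problem" state, 1 = any other state
    if (state == "problem") {
        status = 0
    } else {
        status = 1
    }

    # Both commands may echo the same pnote, so keep only the first state seen for each device
    if (!(deviceName in states)) {
        states[deviceName] = status
    }
}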

END {

    for (deviceName in states) {
        
        tags["name"] = deviceName
        status = states[deviceName]

        writeDoubleMetricWithLiveConfig("clusterxl-pnote-state", tags, "gauge", "60", status, "ClusterXL Devices", "state", "name")

    }

}
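
The duplicate check described in the parser comments can be tried offline with a small sketch like the one below. It is not the indeni parser itself: the function name pnote_states, the NAME=STATUS output format, and the sample text are assumptions made for illustration, and the writeDoubleMetricWithLiveConfig call is replaced by a plain printf. The sample imitates the output of both commands concatenated, so the same pnote appears twice.

```shell
# Map each pnote to 0 (problem) or 1 (any other state), keeping only the
# first state seen per pnote. Reads concatenated command output on stdin.
pnote_states() {
  awk '
    /^Device Name: / { name = $0; sub(/^Device Name: /, "", name) }
    /^Current state: / {
      status = ($3 == "problem") ? 0 : 1
      # First write wins: a repeated pnote does not overwrite its earlier entry.
      if (!(name in states)) {
        states[name] = status
        order[++n] = name   # remember insertion order for stable output
      }
    }
    END {
      for (i = 1; i <= n; i++) printf "%s=%d\n", order[i], states[order[i]]
    }
  '
}

# Hypothetical input: first the full list, then the problem-only list.
combined='Device Name: Synchronization
Current state: OK

Device Name: admin_down
Current state: problem (non-blocking)

Device Name: admin_down
Current state: problem (non-blocking)'

result=$(printf '%s\n' "$combined" | pnote_states)
printf '%s\n' "$result"
```

Each pnote is emitted once even though admin_down appears in both halves of the input, which is the behavior the verification in the parser is meant to guarantee.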




clusterxl_pnote_down_non_vsx

package com.indeni.server.rules.library.templatebased.checkpoint

import com.indeni.ruleengine.expressions.conditions.{Equals => RuleEquals, Not => RuleNot, Or => RuleOr}
import com.indeni.server.common.data.conditions.{Equals => DataEquals, Not => DataNot}
import com.indeni.server.rules.RuleContext
import com.indeni.server.rules.library._
import com.indeni.server.rules.library.templates.StateDownTemplateRule

/**
  *
  */
case class clusterxl_pnote_down_non_vsx() extends StateDownTemplateRule(
  ruleName = "clusterxl_pnote_down_non_vsx",
  ruleFriendlyName = "Check Point ClusterXL (Non-VSX): Pnote(s) down",
  ruleDescription = "ClusterXL has multiple problem notifications (pnotes) - if any of them fail an alert will be issued.",
  metricName = "clusterxl-pnote-state",
  applicableMetricTag = "name",
  alertItemsHeader = "Problematic Elements",
  alertDescription = "This cluster member is down due to certain elements being in a \"problem state\".\n\nThis alert was added per the request of <a target=\"_blank\" href=\"http://il.linkedin.com/pub/gal-vitenberg/83/484/103\">Gal Vitenberg</a>.",
  baseRemediationText = "Review the list of problematic elements and take appropriate action.",
  metaCondition = !DataEquals("vsx", "true"),
  itemSpecificDescription = Seq (
    "(?i).*FIB.*".r -> "The FIB device is responsible for supporting dynamic routing under ClusterXL. Review the firewall logs to ensure traffic with the FIBMGR service is flowing correctly.",

    // Catch-all
    ".*".r -> "Please consult with your technical support provider about this pnote."
  ),
  // Ignore interface active check, we alert about interface count separately
  itemsToIgnore = Set ("Interface Active Check".r))()