,
I created an .IND script that parses for a custom metric: "cpu-soft-lockup-state" and have seen the indeni-collector correctly apply the metric in my test environment:
Command chkp-asg-var-log-messages ran as monitoring for device CP-TEST1(192.168.197.30), returned 1 metrics: cpu-soft-lockup-state (1)
INFO [2017-11-22 14:20:43,435] com.indeni.server.rules.manager.factory.FileSystemRuleFactory: Loading rule from file "/usr/share/indeni/rules/test.rule" INFO [2017-11-22 14:20:56,314] com.indeni.server.rules.manager.factory.FileSystemRuleFactory: Successfully loaded rule "CpuSoftLockup" from file "/usr/share/indeni/rules/test.rule"
----------------------------
Here is how the IND script looks like:
#! META
name: chkp-asg-var-log-messages
description: Retrieve soft lock-up from var/log/messages
type: monitoring
monitoring_interval: 2 minute
requires:
vendor: checkpoint
asg: true
#! REMOTE::SSH
awk '$0>=from' from="$(date +%b" "%d" "%H:%M:%S -d -10min)" /var/log/messages | grep "soft lockup"
#! PARSER::AWK
BEGIN {
logEntries = 0
}
#Jun 20 13:02:18 [remote host] kernel: BUG: soft lockup - CPU#3 stuck for 10s! [cphaconf:12533]
/BUG: soft lockup/ {
logEntries++
lockups[logEntries, "entries"] = $0
}
END{
writeComplexMetricObjectArray("cpu-softlockups", tags, lockups)
}
---------------
Sample Scala Rule
package com.indeni.server.rules.library.templatebased.crossvendor
import com.indeni.data.conditions.{Equals => DataEquals, Not => DataNot}
import com.indeni.ruleengine.expressions.conditions.{Equals => RuleEquals, Not => RuleNot, Or => RuleOr}
import com.indeni.ruleengine.expressions.data.SnapshotExpression
import com.indeni.server.rules.RuleContext
import com.indeni.server.rules.library._
/**
*
*/
case class CrossVendorCpuSoftLockup(context: RuleContext) extends MultiSnapshotValueCheckTemplateRule(context,
ruleName = "CpuSoftLockup",
ruleFriendlyName = "All Devices: CPU Soft Lockup",
ruleDescription = "Many devices that rely on Linux kernels can experience CPU soft lockups. It can be critical to identify them early enough before all the cores are seized.",
metricName = "cpu-softlockups",
alertItemsHeader = "Affected Cores",
alertDescription = "The following device had core(s) experience lockup. Please be aware that if all cores lockup, the device may fail. Indeni tracks if the cores have lockedup in the last 10 minutes",
baseRemediationText = "Please review the process that is causing the lockup for the core. Indeni will autoresolve the alert over the next 20 minutes if the cores are unlocked.",
complexCondition = RuleNot(RuleEquals(RuleHelper.createEmptyComplexArrayConstantExpression(), SnapshotExpression("cpu-softlockups").asMulti().mostRecent().noneable)))()
------------------------
I can't identify in the logs whether the Rule is picking up the new metrics from the TSDB and I can't seem to trigger the alert in the GUI. How can I find whether the indeni-server is picking up the metric "cpu-soft-lockup-state"?