High Memory Usage per Chassis and Blade-juniper-junos
Vendor: juniper
OS: junos
Description:
Alert when Memory usage is high
Remediation Steps:
Review the load on this thread to see if the memory utilization is valid.
1. On the device command line interface, execute the “show chassis routing-engine” command to check overall routing engine memory usage.
2. Run the “show system processes extensive” command to review the memory allocation status for processes.
3. Identify the processes which are consuming too much memory.
4. Consider turning off some processes which are not vital to ensure bigger memory space allocation for other sessions and processes.
5. Review the following article on the Juniper tech support site: Checking Memory Status (https://www.juniper.net/documentation/en_US/release-independent/nce/topics/task/operational/security-policy-memory-testing.html).
How does this work?
This script runs the “show chassis routing-engine node X” command over an SSH connection to retrieve the memory usage for both the control plane and the data plane.
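As a rough illustration of the retrieval step, the sketch below pulls utilization percentages out of XML command output. The sample XML and the element names (`memory-control-plane-util`, `memory-data-plane-util`) are assumptions made for illustration; the actual output differs between platforms and Junos releases, and this is not the parser the script ships with.

```scala
import java.io.ByteArrayInputStream
import javax.xml.parsers.DocumentBuilderFactory
import scala.util.Try

object RoutingEngineMemorySketch {
  // Hypothetical, trimmed sample of "show chassis routing-engine | display xml" output.
  // Element names are assumed for illustration only.
  val sampleXml: String =
    """<rpc-reply>
      |  <route-engine-information>
      |    <route-engine>
      |      <memory-control-plane-util>62</memory-control-plane-util>
      |      <memory-data-plane-util>48</memory-data-plane-util>
      |    </route-engine>
      |  </route-engine-information>
      |</rpc-reply>""".stripMargin

  /** Returns the text of the first element named `tag` as a Double, if present and numeric. */
  def firstDouble(xml: String, tag: String): Option[Double] = {
    val builder = DocumentBuilderFactory.newInstance().newDocumentBuilder()
    val doc = builder.parse(new ByteArrayInputStream(xml.getBytes("UTF-8")))
    val nodes = doc.getElementsByTagName(tag)
    if (nodes.getLength == 0) None
    else Try(nodes.item(0).getTextContent.trim.toDouble).toOption
  }

  def main(args: Array[String]): Unit = {
    println(firstDouble(sampleXml, "memory-control-plane-util"))
    println(firstDouble(sampleXml, "memory-data-plane-util"))
  }
}
```

Returning an `Option` keeps a missing or non-numeric element from being reported as a zero reading.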
Why is this important?
Memory usage on both the control plane and the data plane must be tracked to ensure smooth operation.
Without Indeni how would you find this?
An administrator needs to log in to the device and run the “show chassis routing-engine node X” command to retrieve memory usage for the control plane and the data plane.
junos-show-chassis-routing-engine-cluster
name: junos-show-chassis-routing-engine-cluster
description: Retrieve the statistics and memory for the Routing Engine (CPU/mem).
type: monitoring
includes_resource_data: true
monitoring_interval: 1 minute
requires:
    vendor: juniper
    os.name: junos
    product: firewall
    high-availability: true
comments:
    cpu-usage:
        why: |
            Control and data plane CPU utilization is important to track to ensure smooth operation.
            A high CPU utilization of the control plane may impact the management interface, while a high CPU utilization in the data plane may impact traffic handling.
        how: |
            This script runs the "show chassis routing-engine node X" command over an SSH connection to retrieve the routing engine CPU usage.
        can-with-snmp: false
        can-with-syslog: false
    memory-usage:
        why: |
            Memory usage on both the control plane and the data plane must be tracked to ensure smooth operation.
        how: |
            This script runs the "show chassis routing-engine node X" command over an SSH connection to retrieve the memory usage for both the control plane and the data plane.
        can-with-snmp: false
        can-with-syslog: false
    memory-total-kbytes:
        why: |
            Tracking total memory on the system is critical to evaluate and assess current memory utilization.
        how: |
            This script runs the "show chassis routing-engine node X" command over an SSH connection to retrieve the total memory for both the control plane and the data plane.
        can-with-snmp: false
        can-with-syslog: false
    memory-free-kbytes:
        why: |
            Tracking free memory on the system is critical to evaluate memory utilization and identify possible memory leaks.
        how: |
            This script runs the "show chassis routing-engine node X" command over an SSH connection to retrieve the free memory for both the control plane and the data plane.
        can-with-snmp: false
        can-with-syslog: false
steps:
-   run:
        type: SSH
        command: show chassis hardware node local | display xml
    parse:
        type: XML
        file: show-chassis-routing-engine-cluster.parser.1.xml.yaml
-   run:
        type: SSH
        command: show chassis routing-engine node ${node} | display xml
    parse:
        type: XML
        file: show-chassis-routing-engine-cluster.parser.2.xml.yaml
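The second step's command contains a `${node}` placeholder, which suggests that a value obtained in the first step is substituted into the command before it runs. How the collector performs this substitution is not shown here, so the following is only a minimal sketch of the idea; `CommandTemplateSketch` and `render` are hypothetical names, not part of any Indeni API.

```scala
import scala.util.matching.Regex

object CommandTemplateSketch {
  // Matches placeholders of the form ${name}.
  private val Placeholder: Regex = """\$\{(\w+)\}""".r

  /** Replaces each ${name} with its value from `vars`; unknown placeholders are left untouched. */
  def render(template: String, vars: Map[String, String]): String =
    Placeholder.replaceAllIn(template, m =>
      // quoteReplacement keeps '$' and '\' in the substituted text literal.
      Regex.quoteReplacement(vars.getOrElse(m.group(1), m.matched)))

  def main(args: Array[String]): Unit = {
    val template = "show chassis routing-engine node ${node} | display xml"
    // Assumes the first step yielded node "0".
    println(render(template, Map("node" -> "0")))
  }
}
```

Leaving unknown placeholders in place makes a missing variable visible in the rendered command rather than silently producing an empty argument.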
high_per_chassis_blade_memory_usage
package com.indeni.server.rules.library.core
import com.indeni.ruleengine.Scope.{Scope, ScopeValueHelper}
import com.indeni.ruleengine.expressions.Expression
import com.indeni.ruleengine.expressions.conditions.GreaterThanOrEqual
import com.indeni.ruleengine.expressions.core.{StatusTreeExpression, _}
import com.indeni.ruleengine.expressions.data.{SelectTagsExpression, SelectTimeSeriesExpression, TimeSeriesExpression}
import com.indeni.ruleengine.expressions.math.AverageExpression
import com.indeni.ruleengine.expressions.scope.ScopableExpression
import com.indeni.server.common.ParameterValue
import com.indeni.server.common.data.conditions.{Equals, True}
import com.indeni.server.params.ParameterDefinition
import com.indeni.server.params.ParameterDefinition.UIType
import com.indeni.server.rules._
import com.indeni.server.rules.config.expressions.DynamicParameterExpression
import com.indeni.server.rules.library.{ConditionalRemediationSteps, PerDeviceRule}
import com.indeni.server.sensor.models.managementprocess.alerts.dto.AlertSeverity
/**
  * Created by amir on 04/02/2016.
  */
case class HighPerChassisBladeMemoryUsageRule() extends PerDeviceRule {

  private val highThresholdParameterName: String = "High_Threshold_of_Memory_Usage"
  // User-configurable threshold in percent; the default is 85.0.
  private val highThresholdParameter = new ParameterDefinition(highThresholdParameterName,
    "",
    "High Threshold of Memory Usage",
    "What is the threshold for the memory usage for which once it is crossed an issue will be triggered.",
    UIType.DOUBLE,
    new ParameterValue((85.0).asInstanceOf[Object])
  )

  override val metadata: RuleMetadata =
    RuleMetadata.builder(
      "high_per_chassis_blade_memory_usage",
      "High Memory Usage per Chassis and Blade",
      "Alert when Memory usage is high",
      AlertSeverity.ERROR,
      Set(RuleCategory.HealthChecks),
      deviceCategory = DeviceCategory.AllDevices).configParameter(highThresholdParameter).build()

  override def expressionTree(context: RuleContext): StatusTreeExpression = {
    // Average of the "memory-usage" time series, compared against the configured threshold.
    val usagePercentage = AverageExpression(TimeSeriesExpression[Double]("memory-usage"))
    val usagePercentageThreshold = DynamicParameterExpression.withConstantDefault(highThresholdParameter.getName, highThresholdParameter.getDefaultValue.asDouble.toDouble).noneable
    val isUsagePercentageAboveThreshold = GreaterThanOrEqual(usagePercentage, usagePercentageThreshold)

    // Per-blade description shown when the condition holds.
    val mountSpaceFailDescription = new ScopableExpression[String] {
      override protected def evalWithScope(time: Long, scope: Scope): String =
        "Memory usage (" + usagePercentage.eval(time) + "%) above threshold (" + usagePercentageThreshold.eval(time) + "%) " +
          "for chassis: " + scope.getVisible("Chassis").get + ", blade: " + scope.getVisible("Blade").get

      override def args: Set[Expression[_]] = Set(usagePercentage, usagePercentageThreshold)
    }

    // Per-blade headline built from the Chassis and Blade tags in scope.
    val mountSpaceFailHeadline = new ScopableExpression[String] {
      override protected def evalWithScope(time: Long, scope: Scope): String = "chassis: " + scope.getVisible("Chassis").get + ", blade: " + scope.getVisible("Blade").get

      override def args: Set[Expression[_]] = Set()
    }

    val tsQuery = SelectTimeSeriesExpression[Double](context.tsDao, Set("memory-usage"), denseOnly = false)
    val forTsCondition = StatusTreeExpression(tsQuery, isUsagePercentageAboveThreshold).withSecondaryInfo(
      mountSpaceFailHeadline, mountSpaceFailDescription, title = "Problematic Blades"
    ).asCondition()

    // Evaluate the time-series condition for the Chassis and Blade tags.
    val chassisBladeQuery = SelectTagsExpression(context.tsDao, Set("Chassis", "Blade"), True)
    val highMemoryUsagePerDevicePerChassisBladeLogic = StatusTreeExpression(chassisBladeQuery, forTsCondition)
      .withoutInfo().asCondition()

    val headline = ConstantExpression("High memory usage")
    val description = ConstantExpression("The memory usage in the operating system is higher than the high threshold.")
    val remediation = ConditionalRemediationSteps("Review the load on this thread to see if the memory utilization is valid.",
      RemediationStepCondition.VENDOR_CISCO ->
        """|1. Check the memory utilization history graph for this device in Indeni and review the pattern. Correlate any change to the pattern with any configuration change.
           |2. The output of the following NX-OS commands can indicate whether the platform memory utilization is normal or unexpected:
           |   a. "show system resources"
           |   b. "show processes memory"
           |3. For more information, please review: <a target="_blank" href="http://docwiki.cisco.com/wiki/Cisco_Nexus_7000_Series_NX-OS_Troubleshooting_Guide_--_Troubleshooting_Memory">Troubleshooting Guide For High Memory Utilization</a>.""".stripMargin,
      RemediationStepCondition.VENDOR_JUNIPER ->
        """|1. On the device command line interface execute "show chassis routing-engine" command to check overall routing engine memory usage.
           |2. Run "show system processes extensive" command to review the memory allocation status for processes.
           |3. Identify the processes which are consuming too much memory.
           |4. Consider turning off some processes which are not vital to ensure bigger memory space allocation for other sessions and processes.
           |5. Review the following article on Juniper tech support site: <a target="_blank" href="https://www.juniper.net/documentation/en_US/release-independent/nce/topics/task/operational/security-policy-memory-testing.html">Checking Memory Status</a>.""".stripMargin
    )

    // The device filter matches the "model" tag against "CheckPoint61k".
    val devicesFilter = Equals("model", "CheckPoint61k")
    val devicesQuery = SelectTagsExpression(context.metaDao, Set(DeviceKey), devicesFilter)
    StatusTreeExpression(devicesQuery, highMemoryUsagePerDevicePerChassisBladeLogic).withRootInfo(
      headline, description, remediation
    )
  }
}
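The core check in the rule above (average "memory-usage" per chassis and blade, flagged when it is at or above the threshold) can be sketched in plain Scala without the rule engine. Everything here is an illustrative stand-in: `BladeKey`, `average`, and `problematicBlades` are hypothetical names, and the in-memory `Map` replaces the time-series store the real rule queries.

```scala
object MemoryThresholdSketch {
  // Stand-in for the Chassis and Blade tags the rule reads from scope.
  final case class BladeKey(chassis: String, blade: String)

  /** Average of the samples, or None when there are no samples. */
  def average(samples: Seq[Double]): Option[Double] =
    if (samples.isEmpty) None else Some(samples.sum / samples.size)

  /** Blades whose average usage is at or above `threshold`, each with a description. */
  def problematicBlades(usage: Map[BladeKey, Seq[Double]],
                        threshold: Double = 85.0): Map[BladeKey, String] =
    usage.flatMap { case (key, samples) =>
      average(samples).filter(_ >= threshold).map { avg =>
        key -> s"Memory usage ($avg%) above threshold ($threshold%) for chassis: ${key.chassis}, blade: ${key.blade}"
      }
    }

  def main(args: Array[String]): Unit = {
    val usage = Map(
      BladeKey("1", "1") -> Seq(80.0, 90.0, 100.0), // average is at or above 85
      BladeKey("1", "2") -> Seq(50.0, 60.0)         // average is below 85
    )
    problematicBlades(usage).values.foreach(println)
  }
}
```

The comparison is inclusive to mirror the rule's `GreaterThanOrEqual`, so an average of exactly 85.0 is flagged with the default threshold.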