High CPU usage per core(s)-juniper-junos

Vendor: juniper

OS: junos

Description:
High CPU usage is a symptom of a system that cannot handle the required load, or of a specific issue with the system and the applications and services running on it. Indeni monitors the CPU usage of each core separately and alerts if any core's CPU usage crosses the threshold.
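The per-core check described above can be sketched roughly as follows. This is only an illustration, not Indeni's rule implementation: the function name `cores_over_threshold`, the 80% default threshold, and the core names in the usage example are all assumptions made for this sketch.

```python
def cores_over_threshold(usage_by_core, threshold=80.0):
    """Return the sorted names of cores whose CPU usage exceeds the threshold.

    usage_by_core maps a core identifier (for example "RE") to its CPU usage
    as a percentage. The 80.0 default is an assumed value for illustration.
    """
    return sorted(
        core for core, usage in usage_by_core.items() if usage > threshold
    )


if __name__ == "__main__":
    # Hypothetical readings; only the second core is above the threshold.
    sample = {"RE": 35.0, "FPC0": 91.0, "FPC1": 80.0}
    print(cores_over_threshold(sample))
```

Each core is evaluated on its own, so a single saturated core is reported even when the average across all cores looks healthy.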

Remediation Steps:
Determine the cause for the high CPU usage of the listed cores.
The Juniper SRX device may start dropping packets if CPU utilization reaches 100%. To determine the root cause of high CPU usage:
1. Check the CPU status of the routing engine by running “show chassis routing-engine” in the command-line interface.
2. Identify the processes consuming the most CPU cycles by running “show system processes extensive”. Consider restarting or ending processes if too many events are being handled (e.g. sampling, traceoptions, syslog, SNMP).
3. Check CPU utilization of the forwarding engine by running “show chassis forwarding”. High CPU usage here may indicate that the device is reaching capacity.
See the Juniper TechLibrary on CPU load thresholds or contact Juniper technical support for further troubleshooting.
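Step 2 can be partly automated by sorting saved output from “show system processes extensive” by CPU share. The sketch below assumes a top-style table with a header row containing PID, WCPU and COMMAND columns. The exact column layout varies between Junos releases, and the sample text is made up for illustration rather than captured from a device. The helper name `top_cpu_processes` is also hypothetical.

```python
# Illustrative sample only; real output differs between platforms and releases.
SAMPLE_OUTPUT = """\
last pid:  2412;  load averages:  0.30,  0.25,  0.20
  PID USERNAME  THR PRI NICE   SIZE    RES STATE    TIME   WCPU COMMAND
 1310 root        1  76    0 12000K  5000K select   0:42  1.50% snmpd
 1295 root        4  76    0 50000K 20000K select  10:11 62.30% flowd_octeon_hm
 1300 root        1  96    0 30000K 15000K select   2:05  7.95% rpd
"""


def top_cpu_processes(output, limit=3):
    """Return up to `limit` (cpu_percent, pid, command) tuples, busiest first."""
    lines = output.splitlines()
    # Locate the table header so column positions are not hard-coded.
    header_idx = next(
        (i for i, line in enumerate(lines)
         if {"PID", "WCPU", "COMMAND"} <= set(line.split())),
        None,
    )
    if header_idx is None:
        return []
    header = lines[header_idx].split()
    pid_col = header.index("PID")
    cpu_col = header.index("WCPU")
    cmd_col = header.index("COMMAND")

    rows = []
    for line in lines[header_idx + 1:]:
        fields = line.split()
        if len(fields) <= cmd_col:
            continue  # blank or truncated line
        try:
            cpu = float(fields[cpu_col].rstrip("%"))
        except ValueError:
            continue  # not a process row
        rows.append((cpu, fields[pid_col], " ".join(fields[cmd_col:])))
    rows.sort(key=lambda row: row[0], reverse=True)
    return rows[:limit]


if __name__ == "__main__":
    for cpu, pid, command in top_cpu_processes(SAMPLE_OUTPUT, limit=2):
        print(f"{cpu:5.1f}%  pid={pid}  {command}")
```

Reading column positions from the header row, rather than fixing them, is a deliberate choice here because the table layout is not guaranteed to be the same on every device.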

How does this work?
This script and others use the CLI over SSH to retrieve the current status of multiple different CPU elements.

Why is this important?
Tracking the control plane and data plane CPU utilization of a Juniper JUNOS device is important for ensuring smooth operation. High CPU utilization in the control plane may impact the management interface, while high CPU utilization in the data plane may impact traffic handling.

Without Indeni how would you find this?
CPU utilization information at both the control and data plane levels is available via SNMP and can be monitored using an SNMP-based tool. An administrator can then define thresholds against these values.

junos-show-chassis-routing-engine

#! META
name: junos-show-chassis-routing-engine
description: JUNOS get routing engine stats (CPU/mem)
type: monitoring
includes_resource_data: true
monitoring_interval: 1 minute
requires:
    vendor: juniper
    os.name: junos
    product: firewall
    high-availability: 
        neq: true

#! COMMENTS
cpu-usage:
    why: |
        Tracking the control plane and data plane CPU utilization of a Juniper JUNOS device is important for ensuring smooth operation. High CPU utilization in the control plane may impact the management interface, while high CPU utilization in the data plane may impact traffic handling.
    how: |
        This script and others use the CLI over SSH to retrieve the current status of multiple different CPU elements.
    without-indeni: |
        CPU utilization information at both the control and data plane levels is available via SNMP and can be monitored using an SNMP-based tool. An administrator can then define thresholds against these values.
    can-with-snmp: true
    can-with-syslog: false
memory-usage:
    why: |
        The various memory components of a Juniper JUNOS device are important to track to ensure a smooth operation. This includes the routing engine's memory element (RE) as well as the variety of data plane elements.
    how: |
        This script and others use the CLI over SSH to retrieve the current status of multiple different memory elements.
    without-indeni: |
        Some of the memory elements' status is accessible over SNMP, but many of the memory elements in the data plane are solely accessible over SSH. An administrator would need to write their own scripts to collect this information.
    can-with-snmp: false
    can-with-syslog: false

#! REMOTE::SSH
show chassis routing-engine | display xml

#! PARSER::XML
_vars:
    root: /rpc-reply//route-engine-information[1]
_metrics:
    -
        _tags:
            "im.name":
                _constant: "cpu-usage"
            "live-config":
                _constant: "true"
            "display-name":
                _constant: "CPU Usage"
            "im.dstype.displayType":
                _constant: "percentage"
            "cpu-id":
                _constant: "RE"
            "cpu-is-avg":
                _constant: "false"
            "resource-metric":
                _constant: "true"
            "im.identity-tags":
                _constant: "cpu-id"
        _temp:
            "cpu_idle":
                _text: ${root}/route-engine/cpu-idle
        _transform:
            _value.double: |
                {
                    cpu_usage = 100 - temp("cpu_idle")
                    print cpu_usage
                }

# CONTROL PLANE MEMORY
    -
        _tags:
            "im.name":
                _constant: "memory-total-kbytes"
            "live-config":
                _constant: "true"
            "display-name":
                _constant: "Memory Total"
            "im.dstype.displayType":
                _constant: "kilobytes"
            "name":
                _constant: "Control Plane"
            "im.identity-tags":
                _constant: "name"

        _temp:
            "cp_total_mem":
                _text: ${root}/route-engine/memory-control-plane
        _transform:
            _value.double: |
                {
                    cp_total_memory = temp("cp_total_mem") * 1024
                    print cp_total_memory
                }
    -
        _tags:
            "im.name":
                _constant: "memory-free-kbytes"
            "live-config":
                _constant: "true"
            "display-name":
                _constant: "Memory Free"
            "im.dstype.displayType":
                _constant: "kilobytes"
            "name":
                _constant: "Control Plane"
            "im.identity-tags":
                _constant: "name"

        _temp:
            "cp_total_mem":
                _text: ${root}/route-engine/memory-control-plane
            "cp_used_mem":
                _text: ${root}/route-engine/memory-control-plane-used
        _transform:
            _value.double: |
                {
                    cp_free_memory = (temp("cp_total_mem") - temp("cp_used_mem")) * 1024
                    print cp_free_memory 
                }
    -
        _tags:
            "im.name":
                _constant: "memory-usage"
            "live-config":
                _constant: "true"
            "display-name":
                _constant: "Memory Usage"
            "im.dstype.displayType":
                _constant: "percentage"
            "name":
                _constant: "Control Plane"
            "resource-metric":
                _constant: "true"
            "im.identity-tags":
                _constant: "name"
        _temp:
            "cp_mem_usage":
                _text: ${root}/route-engine/memory-control-plane-util
        _transform:
            _value.double: |
                {
                    print temp("cp_mem_usage")
                }

# DATA PLANE MEMORY
    -
        _tags:
            "im.name":
                _constant: "memory-total-kbytes"
            "live-config":
                _constant: "true"
            "display-name":
                _constant: "Memory Total"
            "im.dstype.displayType":
                _constant: "kilobytes"
            "name":
                _constant: "Data Plane"
            "im.identity-tags":
                _constant: "name"

        _temp:
            "dp_total_mem":
                _text: ${root}/route-engine/memory-data-plane
        _transform:
            _value.double: |
                {
                    dp_total_memory = temp("dp_total_mem") * 1024
                    print dp_total_memory
                }
    -
        _tags:
            "im.name":
                _constant: "memory-free-kbytes"
            "live-config":
                _constant: "true"
            "display-name":
                _constant: "Memory Free"
            "im.dstype.displayType":
                _constant: "kilobytes"
            "name":
                _constant: "Data Plane"
            "im.identity-tags":
                _constant: "name"

        _temp:
            "dp_total_mem":
                _text: ${root}/route-engine/memory-data-plane
            "dp_used_mem":
                _text: ${root}/route-engine/memory-data-plane-used
        _transform:
            _value.double: |
                {
                    dp_free_memory = (temp("dp_total_mem") - temp("dp_used_mem")) * 1024
                    print dp_free_memory
                }
    -
        _tags:
            "im.name":
                _constant: "memory-usage"
            "live-config":
                _constant: "true"
            "display-name":
                _constant: "Memory Usage"
            "im.dstype.displayType":
                _constant: "percentage"
            "name":
                _constant: "Data Plane"
            "im.identity-tags":
                _constant: "name"
        _temp:
            "dp_mem_usage":
                _text: ${root}/route-engine/memory-data-plane-util
        _transform:
            _value.double: |
                {
                    print temp("dp_mem_usage")
                }
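To make the arithmetic in the parser section easier to follow, here is a rough standalone sketch of the same calculations in Python. It is not how Indeni evaluates the script. The sample XML is hypothetical, built only from the element names the script references, and omits the XML namespaces a real device reply carries. The function name `routing_engine_metrics` is an assumption, and so is the reading that the memory values are in megabytes, which is inferred from the script multiplying them by 1024 to obtain kilobytes.

```python
import xml.etree.ElementTree as ET

# Hypothetical reply using only the element names found in the script.
SAMPLE_XML = """<rpc-reply>
  <route-engine-information>
    <route-engine>
      <memory-control-plane>1024</memory-control-plane>
      <memory-control-plane-used>512</memory-control-plane-used>
      <memory-control-plane-util>50</memory-control-plane-util>
      <memory-data-plane>1024</memory-data-plane>
      <memory-data-plane-used>256</memory-data-plane-used>
      <memory-data-plane-util>25</memory-data-plane-util>
      <cpu-idle>88</cpu-idle>
    </route-engine>
  </route-engine-information>
</rpc-reply>"""


def _number(engine, tag):
    """Read a child element of <route-engine> as a float."""
    text = engine.findtext(tag)
    if text is None:
        raise ValueError(f"missing <{tag}> element")
    return float(text.strip())


def routing_engine_metrics(xml_text):
    """Compute the same metrics as the script, keyed by (metric name, tag value)."""
    root = ET.fromstring(xml_text)
    engine = root.find(".//route-engine-information/route-engine")
    if engine is None:
        raise ValueError("no <route-engine> element found")

    # CPU usage is whatever is not idle.
    metrics = {("cpu-usage", "RE"): 100 - _number(engine, "cpu-idle")}

    for label, prefix in (("Control Plane", "memory-control-plane"),
                          ("Data Plane", "memory-data-plane")):
        total = _number(engine, prefix)
        used = _number(engine, prefix + "-used")
        # Assumed MB -> kB conversion, mirroring the "* 1024" in the script.
        metrics[("memory-total-kbytes", label)] = total * 1024
        metrics[("memory-free-kbytes", label)] = (total - used) * 1024
        metrics[("memory-usage", label)] = _number(engine, prefix + "-util")
    return metrics


if __name__ == "__main__":
    for key, value in routing_engine_metrics(SAMPLE_XML).items():
        print(key, value)
```

Each key pairs the `im.name` value with the identifying tag (`cpu-id` or `name`), which is how the script tells the control plane and data plane series apart.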

high_per_core_cpu_use_by_device

https://bitbucket.org/indeni/indeni-knowledge/src/master/rules/sync_core_rules/HighPerCoreCpuUsageRule.scala