Indeni will alert if one or more blades in a chassis are down.
Review the cause for the blades being down.
How does this work?
This script uses the F5 iControl API to retrieve the state of the blades.
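As a rough illustration of the retrieval step, the sketch below builds (but does not send) a GET request for the /mgmt/tm/sys/hardware endpoint named in the script further down. The object name HardwareRequestSketch, the helper hardwareRequest, the example host and the use of HTTP basic authentication are assumptions made for this example; they are not taken from the Indeni collector.

```scala
import java.net.URI
import java.net.http.HttpRequest
import java.nio.charset.StandardCharsets
import java.util.Base64

// Hypothetical helper, not part of Indeni: shows the shape of the REST call only.
object HardwareRequestSketch {

  // Builds a GET request for the hardware endpoint without sending it.
  // Basic authentication is an assumption; a real deployment may use token auth.
  def hardwareRequest(host: String, user: String, password: String): HttpRequest = {
    val credentials = Base64.getEncoder.encodeToString(
      s"$user:$password".getBytes(StandardCharsets.UTF_8))

    HttpRequest
      .newBuilder(URI.create(s"https://$host/mgmt/tm/sys/hardware"))
      .header("Authorization", s"Basic $credentials")
      .header("Accept", "application/json")
      .GET()
      .build()
  }
}
```

Calling HardwareRequestSketch.hardwareRequest("bigip.example.com", "admin", "secret") returns a request object that could be passed to a java.net.http.HttpClient. Parsing the JSON reply is left out here because the response layout is not shown in this document.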
Why is this important?
A blade that is not powered up could indicate a hardware issue. This could result in reduced performance or, in the worst case, system downtime.
Without Indeni how would you find this?
An administrator can check the status of the blades by entering TMSH and running “show sys hardware”.
name: f5-rest-mgmt-tm-sys-hardware
description: Get hardware status metrics
type: monitoring
monitoring_interval: 5 minutes
requires:
    vendor: f5
    product: load-balancer
    rest-api: 'true'
comments:
    hardware-element-status:
        why: |
            A critical aspect to track on a given device is the health of the hardware components. A power supply which stopped working or a dead fan can spell trouble down the line.
        how: |
            This alert uses the F5 iControl REST API to retrieve the health of the power components in a chassis.
        without-indeni: |
            An administrator would be able to extract this information by logging into the device through SSH, entering TMSH and executing the command "show sys hardware". The output would then show the status of each hardware element.
        can-with-snmp: true
        can-with-syslog: false
    hardware-eos-date:
        why: |
            Ensuring the hardware being used is always within the vendor's list of supported models is critical. Otherwise, during a critical issue, the vendor may decline to provide technical support. indeni tracks the official list from F5 and updates this script to match.
        how: |
            This script uses the F5 iControl API to retrieve the current hardware model (the equivalent of running "show sys hardware" in TMSH), and based on the model and the F5 documentation at https://support.f5.com/csp/article/K4309 the correct end of support date is used.
        without-indeni: |
            Manual tracking by an administrator is usually the only method for knowing when a given device may be nearing its end of support and is in need of replacement.
        can-with-snmp: false
        can-with-syslog: false
    serial-numbers:
        skip-documentation: true
    blade-state:
        why: |
            A blade that is not powered up could indicate a hardware issue. This could result in reduced performance, or in worst case system downtime.
        how: |
            This script uses the F5 iControl API to retrieve the state of the blades.
        without-indeni: |
            An administrator can check the status of the blades by entering TMSH and running "show sys hardware".
        can-with-snmp: true
        can-with-syslog: false
    model:
        why: |
            Two or more devices which operate as part of a single cluster must be running on the same hardware.
        how: |
            This script uses the F5 REST API to retrieve the hardware model of the device. Indeni then compares the result to the same script run on other members of the same cluster.
        without-indeni: |
            Manual tracking by an administrator is usually the only method for knowing when two devices are not running on the same hardware.
        can-with-snmp: false
        can-with-syslog: false
steps:
-   run:
        type: HTTP
        command: /mgmt/tm/sys/hardware
    parse:
        type: JSON
        file: rest-mgmt-tm-sys-hardware.parser.1.json.yaml
// Deprecation warning : Scala template-based rules are deprecated. Please use YAML format rules instead.
package com.indeni.server.rules.library.templatebased.crossvendor

import com.indeni.server.rules.RuleContext
import com.indeni.server.rules.library.templates.StateDownTemplateRule
import com.indeni.server.rules.RemediationStepCondition

/**
 *
 */
case class chassis_blade_down() extends StateDownTemplateRule(
  ruleName = "chassis_blade_down",
  ruleFriendlyName = "Chassis Devices: Blade(s) down",
  ruleDescription = "Indeni will alert one or more blades in a chassis is down.",
  metricName = "blade-state",
  applicableMetricTag = "name",
  alertItemsHeader = "Blades Affected",
  alertDescription = "One or more blades in this chassis are down.",
  baseRemediationText = "Review the cause for the blades being down.")(
  RemediationStepCondition.VENDOR_CP -> "If the blade was not stopped intentionally (admin down), check to see it wasn't disconnected physically.",
  RemediationStepCondition.VENDOR_CISCO ->
    """|
      |Most of the module related failures (such as the module not coming up, the module getting reloaded, and so on) can be analyzed by looking at the logs stored on the switch. Use the following CLI commands to identify the problem:
      |•show system reset-reason module
      |•show version
      |•show logging
      |•show module internal exception-log
      |•show module internal event-history module
      |•show module internal event-history errors
      |•show platform internal event-history errors
      |•show platform internal event-history module
      |Further details can be found to the next CISCO troubleshooting guide:
      |https://www.cisco.com/en/US/products/ps5989/prod_troubleshooting_guide_chapter09186a008067a0ef.html""".stripMargin
)
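
The rule above relies on StateDownTemplateRule, whose internals are not shown here. As a minimal sketch of the idea, assuming each "blade-state" sample carries the blade's "name" tag and a value where 1.0 means up and 0.0 means down, the check could look roughly like the following. BladeDownSketch, BladeState, bladesDown and alert are hypothetical names used only for illustration and are not part of the Indeni rule library.

```scala
// Hypothetical stand-in for the state-down check; not the Indeni implementation.
object BladeDownSketch {

  // One "blade-state" sample: the "name" tag plus its value.
  // Assumption: 1.0 means the blade is up, 0.0 means it is down.
  final case class BladeState(name: String, value: Double)

  // Names of all blades currently reported as down, in input order.
  def bladesDown(samples: Seq[BladeState]): Seq[String] =
    samples.filter(_.value == 0.0).map(_.name)

  // Returns an alert message when at least one blade is down, otherwise None.
  def alert(samples: Seq[BladeState]): Option[String] = {
    val down = bladesDown(samples)
    if (down.isEmpty) None
    else Some(
      "One or more blades in this chassis are down. Blades Affected: " +
        down.mkString(", "))
  }
}
```

With samples for two healthy blades and one blade at 0.0, alert returns a message listing only the blade that is down; with all blades at 1.0 it returns None and no alert would be raised.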