Server(s) down-f5-False


Vendor: f5

OS: False

Description:
indeni will alert if one or more servers that the load balancer is directing traffic to are down.

Remediation Steps:
Review the cause for the servers being down.

How does this work?
This alert uses the iControl REST interface to extract the node statuses on the device.
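
As a rough illustration of the query described above, the following Python sketch builds the same HTTPS request that the script's `url` setting points at. The host name, the credentials, and the use of HTTP basic authentication are assumptions made for the example; the request is only constructed, not sent.

```python
import base64
import urllib.request

# Path and query string taken from the script's REMOTE::HTTP section.
NODE_PATH = "/mgmt/tm/ltm/node?$select=fullPath,session,state"


def build_node_request(host: str, user: str, password: str) -> urllib.request.Request:
    """Build (but do not send) an HTTPS GET request for the node list."""
    # Basic authentication is an assumption; other auth schemes may be in use.
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    request = urllib.request.Request(f"https://{host}{NODE_PATH}")
    request.add_header("Authorization", f"Basic {token}")
    return request


if __name__ == "__main__":
    # Hypothetical host and credentials, for illustration only.
    req = build_node_request("bigip.example.com", "admin", "secret")
    print(req.get_method(), req.full_url)
```

Passing the returned object to `urllib.request.urlopen` would perform the call; the `$select` parameter limits the response to the three fields the parser needs.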

Why is this important?
A node marked as down by a monitor, or disabled by an administrator, results in reduced pool capacity or, in the worst case, downtime. Disabling nodes is common during, e.g., a maintenance window, but it is easy to forget to re-enable them afterwards. This metric warns administrators when a node is not ready to accept traffic.

Without Indeni how would you find this?
Log in to the device’s web interface and click “Local Traffic” -> “Nodes”. This shows a list of the nodes and their statuses. If the configuration is divided into multiple partitions, changing to the “All [Read-only]” partition is recommended.

f5-rest-mgmt-tm-ltm-node

#! META
name: f5-rest-mgmt-tm-ltm-node
description: Determine node state and availability
type: monitoring
monitoring_interval: 5 minutes
requires:
    vendor: "f5"
    product: "load-balancer"
    rest-api: "true"

#! COMMENTS
lb-server-state:
    why: |
        A node marked as down by a monitor, or disabled by an administrator, results in reduced pool capacity or, in the worst case, downtime. Disabling nodes is common during, e.g., a maintenance window, but it is easy to forget to re-enable them afterwards. This metric warns administrators when a node is not ready to accept traffic.
    how: |
        This alert uses the iControl REST interface to extract the node statuses on the device.
    without-indeni: |
        Log in to the device's web interface and click "Local Traffic" -> "Nodes". This shows a list of the nodes and their statuses. If the configuration is divided into multiple partitions, changing to the "All [Read-only]" partition is recommended.
    can-with-snmp: true
    can-with-syslog: false
    vendor-provided-management: Unknown

#! REMOTE::HTTP
url: /mgmt/tm/ltm/node?$select=fullPath,session,state
protocol: HTTPS

#! PARSER::JSON

_metrics:

    #   State determines whether the node is ready to accept traffic according to the LB
    #
    #   Possible states:
    #   checking - no result yet
    #   down - monitor down
    #   up - monitor up
    #   unchecked - no assigned monitor
    #   user-down - forced offline (disabled, and manually set to state down)
    #   unavailable - e.g. connection limit has been reached

    - # Record nodes ready to receive traffic
        _groups:
            "$.items[0:][?((@.state == 'up' || @.state == 'unchecked') && @.session != 'user-disabled')]":
                _tags:
                    "im.name":
                        _constant: "lb-server-state"
                    "name":
                        _value: "fullPath"
                _value.double:
                    _constant: "1"
    - # Record nodes not ready to receive traffic
        _groups:
            "$.items[0:][?((@.state != 'up' && @.state != 'unchecked') || @.session == 'user-disabled')]":
                _tags:
                    "im.name":
                        _constant: "lb-server-state"
                    "name":
                        _value: "fullPath"
                _value.double:
                    _constant: "0"
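
The two JSONPath filters above are logical complements of each other, so every node ends up with exactly one `lb-server-state` value. A minimal Python sketch of the same classification is shown below; the function names and the sample response are hypothetical and only mirror the fields requested by the script (`fullPath`, `session`, `state`).

```python
# States that the script treats as "ready to receive traffic".
READY_STATES = {"up", "unchecked"}


def lb_server_state(node: dict) -> float:
    """Return 1.0 if the node is ready to receive traffic, else 0.0."""
    ready = (
        node.get("state") in READY_STATES
        and node.get("session") != "user-disabled"
    )
    return 1.0 if ready else 0.0


def to_metrics(response: dict) -> dict:
    """Map each node's fullPath to its lb-server-state value."""
    return {n["fullPath"]: lb_server_state(n) for n in response.get("items", [])}


# Hypothetical sample response, for illustration only.
SAMPLE = {
    "items": [
        {"fullPath": "/Common/web01", "state": "up", "session": "monitor-enabled"},
        {"fullPath": "/Common/web02", "state": "down", "session": "monitor-enabled"},
        {"fullPath": "/Common/web03", "state": "up", "session": "user-disabled"},
        {"fullPath": "/Common/web04", "state": "unchecked", "session": "user-enabled"},
    ]
}

if __name__ == "__main__":
    for name, value in to_metrics(SAMPLE).items():
        print(name, value)
```

Note that a node whose monitor reports `up` is still recorded as 0 when an administrator has disabled it, which is how the forgotten maintenance-window case gets caught.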

lb_server_down

package com.indeni.server.rules.library.templatebased.loadbalancer

import com.indeni.server.rules.RuleContext
import com.indeni.server.rules.library.templates.StateDownTemplateRule
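/**
  * Alerts when one or more servers (nodes) behind a load balancer
  * report the lb-server-state metric as down.
  */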
case class lb_server_down() extends StateDownTemplateRule(
  ruleName = "lb_server_down",
  ruleFriendlyName = "Load Balancers: Server(s) down",
  ruleDescription = "indeni will alert if one or more servers that the load balancer is directing traffic to are down.",
  metricName = "lb-server-state",
  applicableMetricTag = "name",
  alertItemsHeader = "Servers Affected",
  alertDescription = "One or more servers, to which this device is forwarding traffic, are down.",
  baseRemediationText = "Review the cause for the servers being down.")()
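
The internals of `StateDownTemplateRule` are not shown here, so the following Python sketch is only an approximation of the behaviour the rule parameters suggest: it assumes an alert is raised when any `lb-server-state` value is 0 and that the affected items are listed by their `name` tag. The function names and the output layout are hypothetical; the header and description strings are copied from the rule.

```python
from typing import Optional


def servers_down(metrics: dict) -> list:
    """Return the names of all servers whose state metric is 0, sorted."""
    return sorted(name for name, value in metrics.items() if value == 0.0)


def format_alert(metrics: dict) -> Optional[str]:
    """Build a plain-text alert, or return None when every server is up."""
    down = servers_down(metrics)
    if not down:
        return None
    lines = [
        "One or more servers, to which this device is forwarding traffic, are down.",
        "Servers Affected:",
    ]
    lines.extend(f"  {name}" for name in down)
    lines.append("Review the cause for the servers being down.")
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical metric values, for illustration only.
    print(format_alert({"/Common/web01": 1.0, "/Common/web02": 0.0}))
```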