Pool member(s) unavailable-f5-False

Vendor: f5

OS: False

Description:
indeni will alert if a pool member which should be available is not.

Remediation Steps:
Determine why the members are down and resolve the issue as soon as possible.

How does this work?
This alert uses the iControl REST interface to extract the node states on the device.
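
As a rough illustration of that check, the Python sketch below walks a trimmed pool listing and assigns each member the value 1 (enabled) or 0 (disabled by a user), mirroring the "lb-pool-member-state" metric in the script further down this page. The sample payload and the function name `member_states` are hypothetical and only loosely modelled on an iControl REST response; this is not the collector's actual implementation.

```python
import json

# Hypothetical, heavily trimmed response body (assumption, not captured from a device)
SAMPLE = json.loads("""
{
  "items": [
    {
      "fullPath": "/Common/web_pool",
      "membersReference": {
        "items": [
          {"fullPath": "/Common/web-01:80", "session": "monitor-enabled", "state": "up"},
          {"fullPath": "/Common/web-02:80", "session": "user-disabled", "state": "up"}
        ]
      }
    },
    {"fullPath": "/Common/empty_pool", "membersReference": {}}
  ]
}
""")


def member_states(response):
    """Return (pool, member, value) tuples; value is 0.0 for user-disabled members."""
    results = []
    for pool in response.get("items", []):
        # Pools without members may lack the "items" list entirely
        members = pool.get("membersReference", {}).get("items", [])
        for member in members:
            enabled = member.get("session") != "user-disabled"
            results.append((pool["fullPath"], member["fullPath"], 1.0 if enabled else 0.0))
    return results


for pool, member, value in member_states(SAMPLE):
    print(pool, member, value)
```

Any member reported with the value 0 here is one an administrator has disabled, which is the condition this alert is meant to surface.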

Why is this important?
A node disabled by an administrator results in reduced pool capacity or, in the worst case, downtime. Disabling nodes is common during maintenance, for example, but it is easy to forget to re-enable them afterwards. This metric warns administrators when a node is not ready to accept traffic.

Without Indeni how would you find this?
Log in to the device’s web interface and click “Local Traffic” -> “Pools” -> “Statistics”. This shows a list of the pools, their members and their states. If the configuration is divided into multiple partitions, switching to the “All [Read-only]” partition is recommended.

f5-rest-mgmt-tm-ltm-pool

#! META
name: f5-rest-mgmt-tm-ltm-pool
description: Determine pool member state, availability, capacity and action on service down
type: monitoring
monitoring_interval: 5 minutes
requires:
    vendor: "f5"
    product: "load-balancer"
    rest-api: "true"

#! COMMENTS
lb-pool-member-availability:
    why: |
        A member marked as down by a monitor results in reduced pool capacity or, in the worst case, downtime. This metric warns administrators when a member is marked as down.
    how: |
        This alert uses the iControl REST interface to extract the member statuses on the device.
    without-indeni: |
        Log in to the device's web interface and click "Local Traffic" -> "Pools" -> "Statistics". This shows a list of the pools, their members and their availability. If the configuration is divided into multiple partitions, switching to the "All [Read-only]" partition is recommended.
    can-with-snmp: true
    can-with-syslog: false
    vendor-provided-management: Unknown
lb-pool-member-state:
    why: |
        A node disabled by an administrator results in reduced pool capacity or, in the worst case, downtime. Disabling nodes is common during maintenance, for example, but it is easy to forget to re-enable them afterwards. This metric warns administrators when a node is not ready to accept traffic.
    how: |
        This alert uses the iControl REST interface to extract the node states on the device.
    without-indeni: |
        Log in to the device's web interface and click "Local Traffic" -> "Pools" -> "Statistics". This shows a list of the pools, their members and their states. If the configuration is divided into multiple partitions, switching to the "All [Read-only]" partition is recommended.
    can-with-snmp: true
    can-with-syslog: false
    vendor-provided-management: Unknown
lb-pool-capacity:
    why: |
        A pool that is not running at full capacity could cause application slowness, service disruption or, in the worst case, downtime. indeni tracks this by measuring the percentage of pool members that are available.
    how: |
        This alert uses the iControl REST interface to extract the members available to process traffic compared to the total members of the pool.
    without-indeni: |
        An administrator could manually check member availability by logging in to the device's web interface and clicking "Local Traffic" -> "Pools" -> "Statistics". This shows the pool statistics for the active partition.
f5-default-action-on-service-down:
    why: |
        The default option is "None", which maintains existing connections to a pool member even after its monitor fails but does not create new ones. In most cases the better option is "Reject", which resets the existing connection and forces the client to establish a new one. Coupled with good monitors, this gives the client the best chance of connecting to a functioning pool member.
    how: |
        This alert uses the iControl REST interface to extract the option "Action On Service Down" for all configured pools.
    without-indeni: |
        An administrator could manually check this setting by logging in to the device's web interface, clicking "Local Traffic" -> "Pools", and verifying the option "Action On Service Down" for each pool in the list.

#! REMOTE::HTTP
url: /mgmt/tm/ltm/pool?expandSubcollections=true&$select=fullPath,serviceDownAction,membersReference/items/fullPath,membersReference/items/selfLink,membersReference/items/state,membersReference/items/session
protocol: HTTPS

#! PARSER::JSON

_metrics:

    #   State determines if pool members are ready to accept traffic according to the LB
    #
    #   Possible states:
    #   checking - No result yet
    #   down - Monitor down
    #   up - Monitor up
    #   unchecked - No assigned monitor
    #   user-down - Forced offline (disabled and manually set to state down)
    #   unavailable - E.g. the connection limit has been reached

    - # Record members ready to receive traffic
        _groups:
            "$.items[0:].membersReference.items[0:][?(@.state == 'up' || @.state == 'unchecked')]":
                _tags:
                    "im.name":
                        _constant: "lb-pool-member-availability"
                    "name":
                        _value: "fullPath"
                _temp:
                    "poolName":
                        _value: "selfLink"
                _value.double:
                    _constant: "1"
        _transform:
            _tags:
                "pool-name": |
                    {
                        # Extract the pool name from the selfLink, for example:
                        # https://localhost/mgmt/tm/ltm/pool/~MyPartition~MyPool/members/~Common~MyWeb-03:5905?ver\u003d12.1.1
                        poolName = temp("poolName")

                        # Remove everything up to the first ~
                        sub(/^[^~].+?(?=~)/, "", poolName)

                        # Remove everything from the first slash to the end
                        # Input: ~MyPartition~MyPool/members/~Common~MyWeb-03:5905?ver\u003d12.1.1
                        sub(/\/.*/, "", poolName)

                        # Finally, replace ~ with /
                        # Input: ~MyPartition~MyPool
                        gsub(/\~/, "/", poolName)

                        print poolName

                    }

    - # Record members that are not ready to receive traffic
        _groups:
            "$.items[0:].membersReference.items[0:][?(@.state != 'up' && @.state != 'unchecked')]":
                _tags:
                    "im.name":
                        _constant: "lb-pool-member-availability"
                    "name":
                        _value: "fullPath"
                _temp:
                    "poolName":
                        _value: "selfLink"
                _value.double:
                    _constant: "0"
        _transform:
            _tags:
                "pool-name": |
                    {
                        # Extract the pool name from the selfLink, for example:
                        # https://localhost/mgmt/tm/ltm/pool/~MyPartition~MyPool/members/~Common~MyWeb-03:5905?ver\u003d12.1.1
                        poolName = temp("poolName")

                        # Remove everything up to the first ~
                        sub(/^[^~].+?(?=~)/, "", poolName)

                        # Remove everything from the first slash to the end
                        # Input: ~MyPartition~MyPool/members/~Common~MyWeb-03:5905?ver\u003d12.1.1
                        sub(/\/.*/, "", poolName)

                        # Finally, replace ~ with /
                        # Input: ~MyPartition~MyPool
                        gsub(/\~/, "/", poolName)

                        print poolName

                    }
    #   Session determines if pool members are enabled or disabled
    #
    #   Session:   monitor-enabled - Enabled and with a monitor
    #              user-enabled - Enabled, no monitor
    #              user-disabled - Disabled
    - # Record members that are enabled
        _groups:
            "$.items[0:].membersReference.items[0:][?(@.session != 'user-disabled')]":
                _tags:
                    "im.name":
                        _constant: "lb-pool-member-state"
                    "name":
                        _value: "fullPath"
                _temp:
                    "poolName":
                        _value: "selfLink"
                _value.double:
                    _constant: "1"
        _transform:
            _tags:
                "pool-name": |
                    {
                        # Extract the pool name from the selfLink, for example:
                        # https://localhost/mgmt/tm/ltm/pool/~MyPartition~MyPool/members/~Common~MyWeb-03:5905?ver\u003d12.1.1
                        poolName = temp("poolName")

                        # Remove everything up to the first ~
                        sub(/^[^~].+?(?=~)/, "", poolName)

                        # Remove everything from the first slash to the end
                        # Input: ~MyPartition~MyPool/members/~Common~MyWeb-03:5905?ver\u003d12.1.1
                        sub(/\/.*/, "", poolName)

                        # Finally, replace ~ with /
                        # Input: ~MyPartition~MyPool
                        gsub(/\~/, "/", poolName)

                        print poolName

                    }
    - # Record members that are disabled
        _groups:
            "$.items[0:].membersReference.items[0:][?(@.session == 'user-disabled')]":
                _tags:
                    "im.name":
                        _constant: "lb-pool-member-state"
                    "name":
                        _value: "fullPath"
                _temp:
                    "poolName":
                        _value: "selfLink"
                _value.double:
                    _constant: "0"
        _transform:
            _tags:
                "pool-name": |
                    {
                        # Extract the pool name from the selfLink, for example:
                        # https://localhost/mgmt/tm/ltm/pool/~MyPartition~MyPool/members/~Common~MyWeb-03:5905?ver\u003d12.1.1
                        poolName = temp("poolName")

                        # Remove everything up to the first ~
                        sub(/^[^~].+?(?=~)/, "", poolName)

                        # Remove everything from the first slash to the end
                        # Input: ~MyPartition~MyPool/members/~Common~MyWeb-03:5905?ver\u003d12.1.1
                        sub(/\/.*/, "", poolName)

                        # Finally, replace ~ with /
                        # Input: ~MyPartition~MyPool
                        gsub(/\~/, "/", poolName)

                        print poolName

                    }
    - # Calculate pool capacity
        _groups:
            "$.items[0:][?(@.membersReference.items[0:].session)]":
                _tags:
                    "im.name":
                        _constant: "lb-pool-capacity"
                    "name":
                        _value: "fullPath"
                _temp:
                    "membersPassingTraffic":
                        _count: "membersReference.items[0:][?(@.session != 'user-disabled' && (@.state == 'up' || @.state == 'unchecked'))]"
                    "totalMembers":
                        _count: "membersReference.items[0:]"
        _transform:
            _value.double: |
                {
                    capacity = temp("membersPassingTraffic")/temp("totalMembers")*100
                    print capacity
                }
    - # Flag pools where the service down action is still the default "none" instead of reject (F5 calls this "reset" in the configuration files and "Reject" in the web interface)
        _groups:
            "$.items[0:]":
                _tags:
                    "im.name":
                        _constant: "f5-default-action-on-service-down"
                    "name":
                        _value: "fullPath"
                _temp:
                    "actionOnServiceDown":
                        _value: "serviceDownAction"
        _transform:
            _value.complex:
                value: |
                    {
                        if(temp("actionOnServiceDown") == "none"){
                            print "true"
                        } else {
                            print "false"
                        }
                    }
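
The two transforms used in the script above, deriving the pool name from a member's selfLink and computing pool capacity as a percentage, can be sketched in Python as follows. This is a minimal illustration only: the function names are hypothetical, and the first regular expression is a simplified equivalent of the one in the script rather than a copy of it.

```python
import re


def pool_name_from_self_link(self_link):
    """Turn a member selfLink into a pool path such as /MyPartition/MyPool."""
    # Remove everything up to the first "~"
    name = re.sub(r"^[^~]*", "", self_link, count=1)
    # Remove everything from the first "/" to the end
    name = re.sub(r"/.*", "", name, count=1)
    # Replace "~" with "/"
    return name.replace("~", "/")


def pool_capacity(members):
    """Percentage of members that are enabled and either up or unchecked."""
    total = len(members)
    if total == 0:
        # The script only evaluates pools that have members; guard anyway
        return None
    passing = sum(
        1
        for m in members
        if m.get("session") != "user-disabled"
        and m.get("state") in ("up", "unchecked")
    )
    return passing / total * 100


link = ("https://localhost/mgmt/tm/ltm/pool/~MyPartition~MyPool"
        "/members/~Common~MyWeb-03:5905?ver=12.1.1")
print(pool_name_from_self_link(link))
```

A pool with four members, one of which is down, would yield a capacity of 75 percent under this calculation.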

lb_pool_members_unavailable

package com.indeni.server.rules.library.templatebased.loadbalancer

import com.indeni.server.rules.RuleContext
import com.indeni.server.rules.library.templates.StateDownTemplateRule
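/**
  * Alerts on pool members that should be available but are not, based on the
  * "lb-pool-member-state" metric. Affected members are listed by name together
  * with the pool they belong to.
  */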
case class lb_pool_members_unavailable() extends StateDownTemplateRule(
  ruleName = "lb_pool_members_unavailable",
  ruleFriendlyName = "Load Balancers: Pool member(s) unavailable",
  ruleDescription = "indeni will alert if a pool member which should be available is not.",
  metricName = "lb-pool-member-state",
  applicableMetricTag = "name",
  descriptionMetricTag = "pool-name",
  alertItemsHeader = "Pool Members Affected",
  alertDescription = "Certain pool members which should be available are not. Review list below.",
  baseRemediationText = "Determine why the members are down and resolve the issue as soon as possible.")()
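
The internals of `StateDownTemplateRule` are not shown on this page, so the Python sketch below only illustrates the general idea. It assumes that a metric value of 0 means "down", that the "name" tag identifies the affected item, and that the "pool-name" tag supplies its description. The function name and sample data are hypothetical.

```python
def unavailable_members(samples):
    """Collect alert items for every metric sample whose value is 0."""
    items = []
    for sample in samples:
        if sample["value"] == 0:
            tags = sample["tags"]
            items.append({
                "item": tags["name"],                      # applicableMetricTag
                "description": tags.get("pool-name", ""),  # descriptionMetricTag
            })
    return items


# Hypothetical samples of the "lb-pool-member-state" metric
samples = [
    {"tags": {"name": "/Common/web-01:80", "pool-name": "/Common/web_pool"}, "value": 1},
    {"tags": {"name": "/Common/web-02:80", "pool-name": "/Common/web_pool"}, "value": 0},
]

affected = unavailable_members(samples)
if affected:
    print("Pool Members Affected")
    for entry in affected:
        print(entry["item"], "in pool", entry["description"])
```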