Pool member(s) unavailable-radware-alteon-os

Vendor: radware

OS: alteon-os

Description:
indeni will alert if a pool member which should be available is not.

Remediation Steps:
Determine why the members are down and resolve the issue as soon as possible.

How does this work?
This script runs the “/info/slb/dump” command to retrieve the status of each real server. A real server counts as available only when its status is “RUNNING”; “FAILED” and “DISABLED” both count as unavailable.
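
As a rough illustration of the status check described above, the Python sketch below maps one real-server line of the dump to an up/down value. It is not part of the Indeni script. The function name `member_state` is hypothetical, and the sample line is the one quoted in the parser's own comment.

```python
from typing import Tuple


def member_state(line: str) -> Tuple[str, int]:
    """Return (server index, state) for one real-server line of the dump.

    The state is 1 when the last comma-separated field is RUNNING and 0
    for anything else (FAILED or DISABLED in the parser's pattern).
    """
    fields = [field.strip() for field in line.split(",")]
    # The first field looks like "1: 100.100.100.1"; the index precedes the colon.
    server_index = fields[0].split(":")[0].strip()
    state = 1 if fields[-1] == "RUNNING" else 0
    return server_index, state


sample = "1: 100.100.100.1, 123, 00:00:00:00:00:00,  vlan , port , health inherit, FAILED"
print(member_state(sample))  # ('1', 0)
```

A DISABLED server maps to 0 just like a FAILED one, which is why a node left disabled after maintenance still triggers the alert.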

Why is this important?
A real server being down results in reduced pool capacity or, in the worst case, downtime. Disabling nodes is common (during maintenance, for example), but it is easy to forget to re-enable them. This metric warns administrators when a node is not ready to accept traffic.

Without Indeni how would you find this?
An administrator would have to log in to the device’s web interface and click on “Application Delivery” -> “Server Resources” -> “Real Servers”. This would show a list of the real servers and their states. Select the affected real server to determine if it needs to be reconfigured or has a failing health check.

radware-ssh-info-slb-dump

#! META
name: radware-ssh-info-slb-dump
description: Determine pool member state and capacity
type: monitoring
monitoring_interval: 5 minutes
requires:
    vendor: "radware"
    os.name: "alteon-os"

#! COMMENTS
lb-pool-member-state:
    why: |
        A real server being down results in reduced pool capacity or, in the worst case, downtime. Disabling nodes is common (during maintenance, for example), but it is easy to forget to re-enable them. This metric warns administrators when a node is not ready to accept traffic.
    how: |
        This script runs the "/info/slb/dump" command to retrieve the status of each real server. A real server counts as available only when its status is "RUNNING"; "FAILED" and "DISABLED" both count as unavailable.
    without-indeni: |
        An administrator would have to log in to the device's web interface and click on "Application Delivery" -> "Server Resources" -> "Real Servers". This would show a list of the real servers and their states. Select the affected real server to determine if it needs to be reconfigured or has a failing health check.
    can-with-snmp: true
    can-with-syslog: false
    vendor-provided-management: |
        Can be done through Management GUI (Vision or Alteon VX).
lb-pool-capacity:
    why: |
        A server group that is not running at full capacity could cause application slowness, a service disruption, or, in the worst case, downtime. Knowing the percentage of available members in a group helps prevent this type of issue.
    how: |
        This script runs the "/info/slb/dump" command to retrieve the status of each real server and the groups it belongs to. A real server counts as available only when its status is "RUNNING". Indeni then reports the percentage of "RUNNING" servers out of the total server count in each group.
    without-indeni: |
        An administrator would have to check member availability manually by logging in to the web interface of the device and clicking on "Application Delivery" -> "Server Resources" -> "Server Groups". The administrator would then have to check the state of each real server and determine whether enough real servers are active.
    can-with-snmp: false
    can-with-syslog: false
    vendor-provided-management: |
        Can be done through Management GUI (Vision or Alteon VX).

#! REMOTE::SSH
/info/slb/dump / /

#! PARSER::AWK
BEGIN{
    FS = ","
}

#1: 100.100.100.1, 123, 00:00:00:00:00:00,  vlan , port , health inherit, FAILED
/FAILED|RUNNING|DISABLED/ {
    #1: 100.100.100.1
    split($1, arr, ":")
    srvIndex = trim(arr[1])
    runtimeStatus = trim($NF)
    #"runtimeStatus -> srvStatus mapping": {RUNNING->1, FAILED|DISABLED->0}
    if (runtimeStatus == "RUNNING") {
        srvStatus = 1
    } else {
        srvStatus = 0
    }
}

#    Real Server Group 1, health tcp (runtime ICMP)
/Real Server Group/ {
    #"   Real Server Group 1"
    split(ltrim($1), arr, " ")
    groupIndex = arr[4]
    if (! (groupIndex in groupSrvCount)) {
        # groupIndex not seen yet: add it and initialize its counters
        groupSrvCount[groupIndex] = 0
        groupRunningSrvCount[groupIndex] = 0
    }
    groupSrvCount[groupIndex]++
    if (srvStatus == 1) {
        groupRunningSrvCount[groupIndex]++
    }
   
    # Emit one member-state metric per server for each group it belongs to
    serverTags["name"] = srvIndex
    serverTags["pool-name"] = groupIndex
    writeDoubleMetric("lb-pool-member-state", serverTags, "gauge", 300, srvStatus)
}

END {
    for (groupIndex in groupSrvCount) {
        groupTags["name"] = groupIndex
        # Percentage of RUNNING servers in the group
        percentage = 100.0 * groupRunningSrvCount[groupIndex] / groupSrvCount[groupIndex]
        writeDoubleMetric("lb-pool-capacity", groupTags, "gauge", 300, percentage)
    }
}
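
To check the grouping logic offline, the Python sketch below re-implements the counting done by the parser above. It is only an approximation for experimentation, not the Indeni script itself. The function name `pool_capacity` and the sample dump are hypothetical; the sample's line layout is assumed from the comments in the parser, and real "/info/slb/dump" output may contain additional lines.

```python
import re
from collections import defaultdict
from typing import Dict

# Hypothetical sample: each real-server line is followed by its group line(s).
SAMPLE_DUMP = """\
  1: 100.100.100.1, 123, 00:00:00:00:00:00,  vlan , port , health inherit, FAILED
    Real Server Group 1, health tcp (runtime ICMP)
  2: 100.100.100.2, 123, 00:00:00:00:00:00,  vlan , port , health inherit, RUNNING
    Real Server Group 1, health tcp (runtime ICMP)
    Real Server Group 2, health tcp (runtime ICMP)
"""


def pool_capacity(dump: str) -> Dict[str, float]:
    """Return the percentage of RUNNING servers per group index."""
    total = defaultdict(int)
    running = defaultdict(int)
    status = 0  # state of the most recently seen real server
    for line in dump.splitlines():
        if re.search(r"FAILED|RUNNING|DISABLED", line):
            status = 1 if line.split(",")[-1].strip() == "RUNNING" else 0
        if "Real Server Group" in line:
            # "    Real Server Group 1" -> the fourth word is the group index
            group = line.split(",")[0].split()[3]
            total[group] += 1
            running[group] += status
    return {group: 100.0 * running[group] / total[group] for group in total}


print(pool_capacity(SAMPLE_DUMP))  # {'1': 50.0, '2': 100.0}
```

As in the AWK parser, a group line is attributed to the most recently seen real-server line, so the result depends on the server line appearing before its group lines.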

lb_pool_members_unavailable

package com.indeni.server.rules.library.templatebased.loadbalancer

import com.indeni.server.rules.RuleContext
import com.indeni.server.rules.library.templates.StateDownTemplateRule
/**
  * Alerts when the "lb-pool-member-state" metric reports a pool member as down.
  */
case class lb_pool_members_unavailable() extends StateDownTemplateRule(
  ruleName = "lb_pool_members_unavailable",
  ruleFriendlyName = "Load Balancers: Pool member(s) unavailable",
  ruleDescription = "indeni will alert if a pool member which should be available is not.",
  metricName = "lb-pool-member-state",
  applicableMetricTag = "name",
  descriptionMetricTag = "pool-name",
  alertItemsHeader = "Pool Members Affected",
  alertDescription = "Certain pool members which should be available are not. Review the list below.",
  baseRemediationText = "Determine why the members are down and resolve the issue as soon as possible.")()
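
The rule's effect can be approximated with a short Python sketch. It assumes that the state-down template flags every time series whose latest "lb-pool-member-state" value is 0, and lists it by its "name" tag with the "pool-name" tag as description. That behaviour is inferred from the rule's parameters, not from the template's source. The function name `unavailable_members` and the sample values are hypothetical.

```python
from typing import Dict, List, Tuple

Sample = Tuple[Dict[str, str], float]  # (metric tags, latest value)


def unavailable_members(samples: List[Sample]) -> List[Tuple[str, str]]:
    """Return sorted (name, pool-name) pairs for members whose state is 0."""
    return sorted(
        (tags["name"], tags["pool-name"])
        for tags, value in samples
        if value == 0
    )


# Hypothetical latest values of the "lb-pool-member-state" metric.
latest = [
    ({"name": "1", "pool-name": "1"}, 0.0),
    ({"name": "2", "pool-name": "1"}, 1.0),
    ({"name": "2", "pool-name": "2"}, 1.0),
]
print(unavailable_members(latest))  # [('1', '1')]
```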