Next hop inaccessible-juniper-junos

error
junos
health-checks
juniper

Vendor: juniper

OS: junos

Description:
Indeni will review the routing table and identify when a next hop router is showing as FAILED or INCOMPLETE in the ARP table.

Remediation Steps:
Determine why the next hops are not responding.
1. Log into the device over SSH and run the “show arp no-resolve” command to review the next-hop MAC and IP address information in the ARP table.
2. Check for a misconfiguration on the interfaces or a physical issue.
3. Review the following article on the Juniper tech support site: Operational Commands (https://www.juniper.net/documentation/en_US/junos/topics/reference/command-summary/show-arp.html#jd0e289)

How does this work?
This script retrieves the ARP table over an SSH connection to the device by running the “show arp no-resolve” command. It then extracts the IP address, MAC address, interface and status of each entry, and also reports the total number of entries in the ARP table.
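
The extraction step can be sketched outside Indeni in a few lines of Python. This is only an illustration: the element names are taken from the XPath expressions in the script below, but the sample XML is hand-written (real Junos output carries namespaces and more fields), and the function name and dictionary keys are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hand-written sample, NOT captured from a device. Real
# "show arp no-resolve | display xml" output includes XML namespaces
# and additional elements.
SAMPLE_XML = """
<rpc-reply>
  <arp-table-information>
    <arp-table-entry>
      <mac-address>00:11:22:33:44:55</mac-address>
      <ip-address>10.0.0.1</ip-address>
      <interface-name>ge-0/0/0.0</interface-name>
    </arp-table-entry>
    <arp-table-entry>
      <mac-address>Incomplete</mac-address>
      <ip-address>10.0.0.2</ip-address>
      <interface-name>ge-0/0/1.0</interface-name>
    </arp-table-entry>
    <arp-entry-count>2</arp-entry-count>
  </arp-table-information>
</rpc-reply>
"""


def parse_arp_table(xml_text):
    """Return (entries, total) from un-namespaced ARP XML (hypothetical helper)."""
    root = ET.fromstring(xml_text.strip())
    info = root.find(".//arp-table-information")
    entries = []
    for entry in info.findall("arp-table-entry"):
        mac = (entry.findtext("mac-address") or "").strip()
        entries.append({
            "ip": (entry.findtext("ip-address") or "").strip(),
            "interface": (entry.findtext("interface-name") or "").strip(),
            "mac": mac,
            # Same idea as the script's transform: an entry whose MAC
            # reads "Incomplete" is flagged as failed ("0").
            "success": "0" if "Incomplete" in mac else "1",
        })
    total = int((info.findtext("arp-entry-count") or "0").strip())
    return entries, total


if __name__ == "__main__":
    arp_entries, arp_total = parse_arp_table(SAMPLE_XML)
    for e in arp_entries:
        print(e)
    print("total entries:", arp_total)
```

The string values "0" and "1" mirror what the script prints for its "success" field, which is what the rule later compares against.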

Why is this important?
The ARP table stores the mapping between IP and MAC addresses to minimize ARP traffic. An oversized ARP table or incorrect mappings between IP and MAC addresses can cause many different network issues, so it is critical to monitor it.

Without Indeni how would you find this?
An administrator could log on to the device and run the “show arp no-resolve” command to collect the same information.

junos-show-arp-no-resolve

#! META
name: junos-show-arp-no-resolve
description: JUNOS get ARP table information 
type: monitoring
monitoring_interval: 5 minute
requires:
    vendor: juniper
    os.name: junos
    product: firewall

#! COMMENTS
arp-total-entries:
    skip-documentation: true
arp-table:
    why: |
        The ARP table stores the mapping between IP and MAC addresses to minimize ARP traffic. An oversized ARP table or incorrect mappings between IP and MAC addresses can cause many different network issues, so it is critical to monitor it.
    how: |
        This script retrieves the ARP table over an SSH connection to the device by running the "show arp no-resolve" command. It then extracts the IP address, MAC address, interface and status of each entry, and also reports the total number of entries in the ARP table.
    without-indeni: |
        An administrator could log on to the device to run the command "show arp no-resolve" to collect the same information.
    can-with-snmp: true 
    can-with-syslog: false
    vendor-provided-management: |
        The command line is available to retrieve this information.

#! REMOTE::SSH
show arp no-resolve | display xml

#! PARSER::XML
_vars:
    root: /rpc-reply//arp-table-information[1]
_metrics:
    -
        _groups:
            ${root}/arp-table-entry:
                _tags:
                    "im.name":
                        _constant: "arp-table"
                    "live-config":
                        _constant: "true"
                    "display-name":
                        _constant: "ARP Table"
                _value.complex:
                    targetip:
                        _text: ip-address
                    interface:
                        _text: interface-name
                    mac:
                        _text: mac-address
                _temp:
                    mac_address:
                        _text: "mac-address"
        _transform:
            _value.complex:
                "success": |
                    {
                        if(temp("mac_address") ~ /Incomplete/) {
                           arp_success = "0"
                        } else {
                           arp_success = "1"
                        }
                        print arp_success
                    }         
        _value: complex-array
    -
        _tags:
            "im.name":
                _constant: "arp-total-entries"
            "live-config":
                _constant: "true"
            "display-name":
                _constant: "ARP - Total Entries"
            "im.dstype.displayType":
                _constant: "number"
        _value.double:
            _text: ${root}/arp-entry-count

junos-show-route-protocol-static-terse

#! META
name: junos-show-route-protocol-static-terse
description: JUNOS get static routes information 
type: monitoring
monitoring_interval: 5 minute
requires:
    vendor: juniper
    os.name: junos
    product: firewall

#! COMMENTS
static-routing-table:
    why: |
        Static routes are manually defined on a device. Incorrectly defined static routes can cause network outages or unpredictable network behavior.
    how: |
        This script retrieves routes statically defined on a device by running the command "show route protocol static terse" via SSH connection to a device. 
    without-indeni: |
        An administrator could log on to the device to run the command "show route protocol static terse" to collect the same information.
    can-with-snmp: false 
    can-with-syslog: false
    vendor-provided-management: |
        The command line is available to retrieve this information.

#! REMOTE::SSH
show route protocol static terse

#! PARSER::AWK
#inet.0: 13 destinations, 13 routes (13 active, 0 holddown, 0 hidden)
/inet/ {
  table_name = $1
}

#* 30.30.30.0/24      D   0                       >lt-0/0/0.1
/^(\*\s+[0-9]|\+\s+[0-9]|\-\s+[0-9])/ {
    line++
    network = $2
    next_hop = $NF
    split(network, network_prefix, "/")
    route[line, "table-name"] = table_name
    route[line, "network"] = network_prefix[1]
    route[line, "mask"] = network_prefix[2]  
    gsub(/\>/, "", next_hop)
    route[line, "next-hop"] = next_hop
}

END{
    writeComplexMetricObjectArray("static-routing-table", null, route)
}
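
For readers who want to try the parsing logic without an Indeni collector, the AWK above can be approximated in Python. This is a rough sketch, not the script Indeni runs: the sample output is hand-written in the shape of the comment lines above (real terse output may include extra columns), and the function name is hypothetical.

```python
import re

# Hand-written sample in the shape of the comment lines in the AWK parser.
SAMPLE_OUTPUT = """\
inet.0: 13 destinations, 13 routes (13 active, 0 holddown, 0 hidden)
+ = Active Route, - = Last Active, * = Both

A Destination        P Prf   Metric 1   Metric 2  Next hop        AS path
* 0.0.0.0/0          S   5                       >10.0.0.1
* 30.30.30.0/24      S   5                       >lt-0/0/0.1
"""

# A route line starts with *, + or -, then whitespace, then a digit.
ROUTE_LINE = re.compile(r"^[*+-]\s+[0-9]")


def parse_static_routes(output):
    """Return one dict per route line (hypothetical helper)."""
    routes = []
    table_name = ""
    for line in output.splitlines():
        fields = line.split()
        if not fields:
            continue
        if "inet" in line:
            # Like the AWK's $1, this keeps the trailing colon ("inet.0:").
            table_name = fields[0]
        if ROUTE_LINE.match(line):
            network, _, mask = fields[1].partition("/")
            routes.append({
                "table-name": table_name,
                "network": network,
                "mask": mask,
                # Last column, with the ">" marker removed.
                "next-hop": fields[-1].replace(">", ""),
            })
    return routes


if __name__ == "__main__":
    for route in parse_static_routes(SAMPLE_OUTPUT):
        print(route)
```

Note that a next hop can be an interface name (such as lt-0/0/0.1) rather than an IP address; such routes have nothing to match against in the ARP table.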

cross_vendor_next_hop_router_inaccessible

package com.indeni.server.rules.library

import com.indeni.ruleengine.InvisibleScopeKey
import com.indeni.ruleengine.expressions.conditions.Equals
import com.indeni.ruleengine.expressions.core._
import com.indeni.ruleengine.expressions.data._
import com.indeni.ruleengine.expressions.scope.ScopeValueExpression
import com.indeni.server.common.data.conditions.True
import com.indeni.server.rules._
import com.indeni.server.rules.library.core.PerDeviceRule
import com.indeni.server.sensor.models.managementprocess.alerts.dto.AlertSeverity


case class NextHopRouterInaccessibleRule() extends PerDeviceRule with RuleHelper {

  override val metadata: RuleMetadata =
    RuleMetadata.builder(
      "cross_vendor_next_hop_router_inaccessible",
      "All Devices: Next hop inaccessible",
      "Indeni will review the routing table and identify when a next hop router is showing as FAILED or INCOMPLETE in the ARP table.",
      AlertSeverity.ERROR,
      Set(RuleCategory.HealthChecks)).build()

  override def expressionTree(context: RuleContext): StatusTreeExpression = {
    StatusTreeExpression(
      // Which objects to pull (normally, devices)
      SelectTagsExpression(context.metaDao, Set(DeviceKey), True),

      // What constitutes an issue
      StatusTreeExpression(
        // The time-series we check the test condition against:
        SelectSnapshotsExpression(context.snapshotsDao, Set("arp-table", "static-routing-table")).multi(),

        // The condition which, if true, we have an issue. Checked against the time-series we've collected
        StatusTreeExpression(

          JoinSnapshotsExpression("arp-table" -> "targetip", "static-routing-table" -> "next-hop")
            .distinct(InvisibleScopeKey("next-hop", "static-routing-table")),

          Equals(
            ScopeValueExpression("success").invisible("arp-table").optional(),
            ConstantExpression(Some("0"))
          )
        ).withSecondaryInfo(
          scopableStringFormatExpression("${scope(\"static-routing-table:next-hop\")}"),
          EMPTY_STRING,
          title = "Inaccessible Next Hops",
          invisibleIdKeys = Set(InvisibleScopeKey("next-hop", "static-routing-table"))
        ).asCondition()
      ).withoutInfo().asCondition()


      // Details of the alert itself
    ).withRootInfo(
      getHeadline(),
      scopableStringFormatExpression("Some of the routes in this device have a next hop which is inaccessible."),
      ConditionalRemediationSteps("Determine why the next hops are not responding.",
        ConditionalRemediationSteps.VENDOR_CP -> "Try pinging the next hop routers in the list above and resolve any connectivity issues one by one until all pings are successful.",
        ConditionalRemediationSteps.VENDOR_PANOS -> "Log into the device over SSH and review the output of \"show arp\" to identify failures.",
        ConditionalRemediationSteps.OS_NXOS ->
          """|
             |1. Execute the "show spanning-tree" and "show spanning-tree summary"  NX-OS commands to quickly identify the STP root for all the configured vlans.
             |2. Run the "show spanning-tree vlan X detail" NX-OS command to collect more info about the STP topology (X=vlanid).
             |3. Check the event history to find where the Topology Change Notifications originate by running the NX-OS command "show spanning-tree internal event-history tree X brief" (X=vlanid).
             |4. Display the STP events of an interface with the NX-OS command "show spanning-tree internal event-history tree X interface Y brief" (X=vlanid, Y=interfaceid).
             |5. Consider hard-coding the STP root and backup root to the core switches by configuring a lower STP priority.
             |6. Enable the recommended vPC "peer switch" NX-OS command in a pure peer switch topology in which all the devices belong to the vPC.
             |7. Consider using the Root Guard feature to enforce the root bridge placement in the network. If a received BPDU triggers an STP convergence that makes that designated port become a root port, that port is put into a root-inconsistent (blocked) state.
             |8. For more information please review the following links:
             | <a target="_blank" href="https://www.cisco.com/c/en/us/support/docs/switches/nexus-5000-series-switches/116199-technote-stp-00.html">Spanning Tree Protocol Troubleshooting on a Nexus 5000 Series Switch</a>
             | <a target="_blank" href="https://www.cisco.com/c/dam/en/us/products/collateral/switches/nexus-7000-series-switches/C07-572834-00_STDG_NX-OS_vPC_DG.pdf">Spanning Tree Design Guidelines for Cisco NX-OS Software and Virtual PortChannels</a>
          """.stripMargin,
        ConditionalRemediationSteps.VENDOR_BLUECOAT ->
          """ARP resolve failure to the next hop of the ProxySG.
            |1. Log in via SSH to the ProxySG and run the "show arp-table" command.
            |2. Check for incomplete ARP entries.
            |3. Run the "show interface all" command and check the current status of the network interface with the incomplete ARP entry.
            |4. Diagnose the layer 2 connectivity between the ProxySG and the other device.
            |5. If the problem persists, contact Symantec Technical Support at https://support.symantec.com for further assistance.""".stripMargin,
        ConditionalRemediationSteps.VENDOR_JUNIPER ->
          """|1. Log into the device over SSH and run the “show arp no-resolve” command to review the next-hop MAC and IP address information in the ARP table.
             |2. Check for a misconfiguration on the interfaces or a physical issue.
             |3. Review the following article on the Juniper tech support site: <a target="_blank" href="https://www.juniper.net/documentation/en_US/junos/topics/reference/command-summary/show-arp.html#jd0e289">Operational Commands</a>""".stripMargin
      )
    )
  }
}
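
Read as plain data processing, the rule above appears to join ARP entries to static routes on "targetip" = "next-hop" and flag each distinct next hop whose ARP entry has "success" equal to "0". The Python sketch below illustrates that reading only. The function name and the list-of-dicts inputs are hypothetical, and the real evaluation happens inside Indeni's rule engine, which may differ in detail.

```python
def inaccessible_next_hops(arp_table, static_routes):
    """Return distinct next hops whose ARP entry is marked failed.

    arp_table:     list of dicts with "targetip" and "success" keys
    static_routes: list of dicts with a "next-hop" key
    (Both shapes are assumptions made for this illustration.)
    """
    # IPs whose ARP resolution failed, i.e. success == "0".
    failed_ips = {e["targetip"] for e in arp_table if e.get("success") == "0"}

    flagged = []
    for route in static_routes:
        next_hop = route["next-hop"]
        # Join on next-hop == targetip, reporting each next hop once.
        if next_hop in failed_ips and next_hop not in flagged:
            flagged.append(next_hop)
    return flagged


if __name__ == "__main__":
    arp = [
        {"targetip": "10.0.0.1", "success": "1"},
        {"targetip": "10.0.0.2", "success": "0"},
    ]
    routes = [
        {"network": "0.0.0.0", "mask": "0", "next-hop": "10.0.0.2"},
        {"network": "192.168.5.0", "mask": "24", "next-hop": "10.0.0.2"},
        {"network": "172.16.0.0", "mask": "16", "next-hop": "10.0.0.1"},
    ]
    print(inaccessible_next_hops(arp, routes))
```

Under this reading, a next hop that is missing from the ARP table altogether is not flagged; only entries explicitly marked as failed are. The join can only match if the field names emitted by the collection scripts are exactly the ones the rule joins on.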