R80.10 Clusters only broadcast active GW.

We ran into a weird issue with an R80.10 cluster (13500 models). Thoughts?


FW1 mngmt: 10.10.10.1

FW2 mngmt: 10.10.10.2


Initially, Indeni kept failing to connect to FW2. We checked the ARP table on the Indeni instance, and FW2's MAC address was missing. It seemed like FW2 was no longer advertising its IP (there was no ARP entry for it). SSH and ping both failed with "No Route to Host".


Luckily, both firewalls are also reachable for management through the DMZ interface. We then forced a failover by rebooting FW1. As soon as FW2 became the primary, Indeni was able to connect to FW2 and the alert resolved.


However, we then got alerts that FW1 had ClusterXL issues and that VPN tunnels were down, and within 10 minutes we got a "failed to connect" alert for FW1.


The same issue had occurred with FW1: it stopped broadcasting its MAC, and Indeni no longer had a route to the device. SSH and ping failed. We tried to force another failover by rebooting FW2, and we were back to square one with FW2 failing to connect.
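
For anyone who wants to repeat the ARP check described above from a Linux box on the management subnet, here is a rough sketch. It assumes the kernel ARP table is readable at /proc/net/arp in the usual six-column layout; the function names and the sample table (including the MAC address) are made up for illustration and are not output from the real devices.

```python
from typing import Optional

# Illustrative sample in /proc/net/arp layout. The MAC below is invented.
SAMPLE_ARP_TABLE = """\
IP address       HW type     Flags       HW address            Mask     Device
10.10.10.1       0x1         0x2         00:1c:7f:aa:bb:01     *        eth0
10.10.10.2       0x1         0x0         00:00:00:00:00:00     *        eth0
"""

ATF_COM = 0x02  # kernel flag: the ARP entry is complete (MAC was resolved)
EMPTY_MAC = "00:00:00:00:00:00"


def resolved_mac(arp_table: str, ip: str) -> Optional[str]:
    """Return the MAC for ip if the table has a completed entry, else None."""
    for line in arp_table.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 6 or fields[0] != ip:
            continue
        flags = int(fields[2], 16)
        mac = fields[3]
        if flags & ATF_COM and mac != EMPTY_MAC:
            return mac
    return None


def read_arp_table(path: str = "/proc/net/arp") -> str:
    """Read the live ARP table (Linux only)."""
    with open(path) as handle:
        return handle.read()


if __name__ == "__main__":
    # Swap SAMPLE_ARP_TABLE for read_arp_table() to check a live host.
    for name, ip in (("FW1", "10.10.10.1"), ("FW2", "10.10.10.2")):
        mac = resolved_mac(SAMPLE_ARP_TABLE, ip)
        status = mac if mac else "no completed ARP entry"
        print(f"{name} ({ip}): {status}")
```

An entry only shows up after the host has tried to reach that IP, so ping the address first and then check the table.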

I've seen something like this. We usually don't use a management port for this; we use a dedicated sync interface so that sync is the only traffic on those ports.

Interesting. How are these two interfaces configured in the topology in SmartConsole?

And what is the output of "cphaprob state"?

Also, while the issue is happening and you are unable to connect to FW2, is it possible to ping/ssh from FW1 to FW2 (to 10.10.10.2)?

Very weird stuff.