Automation for failovers

hey

I'm quite new here, but I have a feature request to bring up to the team.

I would like to get your opinion on failing over an entire datacenter from one side to the other.

Meaning: if I have clustering across two datacenters with multiple devices, I would like an automated or interactive failover that moves all active devices to one datacenter.


Example:

Device        Datacenter A    Datacenter B
F5            Passive         Active
Palo Alto     Active          Passive
Juniper       Passive         Active
Checkpoint    Passive         Active


A management decision comes in and I need to perform a full failover to datacenter B because of electricity or maintenance issues.

One click in Indeni, and the devices move:


Device        Datacenter A    Datacenter B
F5            Passive         Active
Palo Alto     Passive         Active
Juniper       Passive         Active
Checkpoint    Passive         Active


I've collected some sample commands to perform each task, but I need to schedule them and add some delay between them.


F5

tmsh run /sys failover standby



Checkpoint

clusterXL_admin down

clusterXL_admin up


Each other device can be added with its own failover commands ...
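To make the "scheduled, with some delay" part concrete, here is a minimal Python sketch of an ordered plan with a pause between devices. The two commands are the ones quoted above; the hostnames, the 30-second delays, and the `run_command` callback (which would wrap SSH or whatever transport you use) are assumptions for illustration only.

```python
import time
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class FailoverStep:
    device: str          # hypothetical hostname of the member to demote
    commands: List[str]  # commands to run on that member, in order
    delay_after: float   # seconds to wait before the next device


# Commands are taken from the post; hostnames and delays are made up.
PLAN = [
    FailoverStep("f5-dc-a", ["tmsh run /sys failover standby"], 30.0),
    FailoverStep("checkpoint-dc-a", ["clusterXL_admin down"], 30.0),
]


def run_plan(
    plan: List[FailoverStep],
    run_command: Callable[[str, str], None],
    sleep: Callable[[float], None] = time.sleep,
) -> List[Tuple[str, str]]:
    """Run every step in order, pausing between devices (not after the last)."""
    log = []
    for index, step in enumerate(plan):
        for command in step.commands:
            run_command(step.device, command)
            log.append((step.device, command))
        if index < len(plan) - 1:
            sleep(step.delay_after)
    return log


if __name__ == "__main__":
    # Dry run: print what would be executed instead of touching any device.
    def dry_run(device: str, command: str) -> None:
        print(f"[dry-run] {device}: {command}")

    run_plan(PLAN, dry_run, sleep=lambda seconds: None)
```

Passing the runner and the sleep function in as arguments keeps the plan testable without real devices; a cron job or any scheduler could then call `run_plan` at the agreed maintenance time.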



Is that something we could do with the help of the Indeni product?


If anything is unclear, please let me know.


Thanks

Haviv

Hi Haviv, I love this topic. We get questions like this a lot about using Indeni as an automation or orchestration tool. At this time we want to keep Indeni pulling and analyzing metrics and device diagnostic information. Our focus is to connect the dots between device-specific events and alert preemptively.


I would suggest, however, that if you have an automation utility as part of your infrastructure, you leverage our REST API or other integration means such as SNMP traps or syslog. For example, if a "special" event is triggered within Indeni, a trap could be sent to this automation utility, which would then trigger an automatic reaction.


Hey Indeni Users out there, any suggestions or creative means to help Haviv accomplish his task?


Hi Haviv,

Interesting idea. Currently, our focus is monitoring automation, to keep the same problem from recurring. The natural next step would be to actually perform actions as you suggest here. As Liz mentioned, we don't do this yet, but we do have the components: device credentials, transport, device state, parsers, etc.


Here's a conceptual idea. We can develop a script to:

1) SSH into the device

2) Execute the "failover" commands

3) Use CommandRunner to parse the response. If it succeeded, repeat 1) and 2) for the next device. If not, roll back the failover.


Let us know if this is worth pursuing. We can brainstorm more...
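The three steps above could be sketched roughly like this in Python. This is not Indeni code: `run` stands in for the SSH step, `succeeded` stands in for whatever parsing CommandRunner would do, and the `Action` tuple with its rollback command is a hypothetical structure for illustration.

```python
from typing import Callable, List, NamedTuple


class Action(NamedTuple):
    device: str        # hypothetical hostname
    failover_cmd: str  # command that moves the device to standby
    rollback_cmd: str  # command that undoes it (vendor specific)


def fail_over_all(
    actions: List[Action],
    run: Callable[[str, str], str],
    succeeded: Callable[[str, str], bool],
) -> bool:
    """Fail devices over one by one; undo everything on the first failure.

    run(device, command) returns the raw command output.
    succeeded(device, output) decides whether the failover worked.
    """
    touched: List[Action] = []
    for action in actions:
        output = run(action.device, action.failover_cmd)
        touched.append(action)
        if not succeeded(action.device, output):
            # Roll back in reverse order, including the device that failed,
            # since its state is unknown at this point.
            for previous in reversed(touched):
                run(previous.device, previous.rollback_cmd)
            return False
    return True


if __name__ == "__main__":
    # Dry run with a fake transport that always reports success.
    plan = [Action("checkpoint-dc-a", "clusterXL_admin down", "clusterXL_admin up")]
    ok = fail_over_all(plan, lambda d, c: "ok", lambda d, out: out == "ok")
    print("all devices failed over" if ok else "rolled back")
```

Rolling back in reverse order is a design choice, not a requirement; depending on how the devices depend on each other, stopping and alerting a human may be the safer reaction.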




If we focus on how Indeni as a monitoring tool can help you, it could potentially alert you when the active nodes are not all in the same data center. This would be a custom rule, though, since it's not applicable to most installations, and it's maybe not that useful compared to actual automation.


If I were you, though, I'd look into writing a script for this specific purpose. At least the F5 part is very easy, as you could use the REST API for this if you're running version 11.6 or above (11.5.4 could work too, but it's not recommended, as the REST API was still in EA at that time).
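As a rough illustration of what such a "split active nodes" check boils down to (this is plain Python, not Indeni rule syntax, and the data layout is an assumption), using the first example layout from this thread:

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Member = Tuple[str, str, str]  # (cluster name, datacenter, state)


def active_by_datacenter(members: List[Member]) -> Dict[str, List[str]]:
    """Group the clusters by the datacenter that holds their active member."""
    result: Dict[str, List[str]] = defaultdict(list)
    for cluster, datacenter, state in members:
        if state.lower() == "active":
            result[datacenter].append(cluster)
    return dict(result)


def active_nodes_are_split(members: List[Member]) -> bool:
    """True when active members live in more than one datacenter."""
    return len(active_by_datacenter(members)) > 1


# The "before" layout from the original post.
BEFORE = [
    ("F5", "A", "Passive"), ("F5", "B", "Active"),
    ("Palo Alto", "A", "Active"), ("Palo Alto", "B", "Passive"),
    ("Juniper", "A", "Passive"), ("Juniper", "B", "Active"),
    ("Checkpoint", "A", "Passive"), ("Checkpoint", "B", "Active"),
]

if __name__ == "__main__":
    if active_nodes_are_split(BEFORE):
        print("Active nodes are split:", active_by_datacenter(BEFORE))
```

With the layout above the check fires because Palo Alto is active in datacenter A while the other three clusters are active in B.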


Example of how to fail over traffic-group-1 on yourltm.domain.local using user adminuser and password mypassword (note the single quotes around the JSON body, so the shell does not break up the inner double quotes):

curl -k -u adminuser:mypassword https://yourltm.domain.local/mgmt/tm/sys/failover/ -H "Content-Type: application/json" -X POST -d '{"command":"run","standby":true,"trafficGroup":"traffic-group-1"}'

If you're brave and running R80 on the gateways, you should also be able to use the Checkpoint API for this. Otherwise, you could just create a shell script that uses SSH. I'll let the other experts chime in for the other vendors. :)
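If you would rather call this from a script than from curl, the same request could look roughly like this using only the Python standard library. The endpoint and JSON body mirror the curl example; the function names are made up for this sketch, and the unverified TLS context is the equivalent of curl's -k, so only use it where you would accept that.

```python
import base64
import json
import ssl
import urllib.request


def build_failover_request(host: str, user: str, password: str,
                           traffic_group: str) -> urllib.request.Request:
    """Build (but do not send) the POST that asks the unit to go standby."""
    body = json.dumps(
        {"command": "run", "standby": True, "trafficGroup": traffic_group}
    ).encode("utf-8")
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url=f"https://{host}/mgmt/tm/sys/failover",
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
    )


def send_insecure(request: urllib.request.Request) -> str:
    """Send the request without certificate checks (same trade-off as curl -k)."""
    context = ssl.create_default_context()
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE
    with urllib.request.urlopen(request, context=context, timeout=30) as response:
        return response.read().decode("utf-8")


if __name__ == "__main__":
    req = build_failover_request(
        "yourltm.domain.local", "adminuser", "mypassword", "traffic-group-1"
    )
    # Printing instead of sending; call send_insecure(req) to actually fail over.
    print(req.get_method(), req.full_url, req.data.decode("utf-8"))
```

Keeping the request building separate from the sending makes it easy to inspect exactly what will be posted before pointing it at a production unit.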


/Patrik

How are the two data centers connected (physically or logically)?