Delay In HA Failover When Active Firewall Goes Non-Functional
Created On 09/28/22 21:27 PM - Last Modified 04/02/24 22:59 PM
Symptom
- After the Primary-Active firewall goes into the non-functional state, there is a traffic outage of 5-10 minutes until the Primary boots up again.
- The Secondary-Passive firewall shows that its HA status changed to Active, but the traffic outage persists.
- System logs (show log system) on both firewalls indicate the message "Ignoring session synchronization due to HA2-unavailable".
- After the HA2 links on both HA peers are Up, session synchronization will complete and the traffic will start passing successfully through the Secondary-Active firewall.
- There are no logs related to keep-alive in the System logs.
- The 'ha_agent' logs (less mp-log ha_agent.log) indicate that the keep-alive setting is turned off.
System logs (most recent entry first):
info ha session 0 HA Group 1: Completed session synchronization with peer
info ha session 0 HA Group 1: Starting session synchronization with peer on slots 1
info ha ha2-lin 0 HA2 peer link up
info ha ha2-lin 0 HA2 link up
info port HA2 link-ch 0 Port HA2: Up 40Gb/s-full duplex
high ha session 0 HA Group 1: Ignoring session synchronization due to HA2-unavailable
critical ha ha2-lin 0 All HA2 links down
critical ha ha2-lin 0 HA2 link down
info port HA2 link-ch 0 Port HA2: Down 40Gb/s-full duplex
critical ha ha2-lin 0 All HA2 links down
high ha session 0 HA Group 1: Ignoring session synchronization due to HA2-unavailable
high ha ha2-lin 0 HA2 peer link down
high ha state-c 0 HA Group 1: Moved from state Passive to state Active
ha_agent logs:
0700 debug: ha_dpmon_peer_action_set(src/ha_dpmon.c:155): Setting peer keep-alive setting to off. <<----
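The three log indicators listed under Symptom can be checked together with a small script. The following is a minimal sketch in Python, assuming the output of "show log system" and the contents of ha_agent.log have already been saved as text. The function names diagnose and matches_symptom are hypothetical helpers, not part of PAN-OS, and the patterns are taken only from the log excerpts above.

```python
import re

# Patterns copied from the log excerpts in this article.
HA2_UNAVAILABLE = re.compile(r"Ignoring session synchronization due to HA2-unavailable")
KEEPALIVE_OFF = re.compile(r"Setting peer keep-alive setting to off")
# Assumption: any keep-alive related system log would mention "keep-alive" or "keepalive".
KEEPALIVE_ANY = re.compile(r"keep-?alive", re.IGNORECASE)


def diagnose(system_log: str, ha_agent_log: str) -> dict:
    """Summarize the indicators described in the Symptom section."""
    sys_lines = system_log.splitlines()
    return {
        # Number of times session sync was skipped because HA2 was down.
        "ha2_unavailable_events": sum(
            1 for line in sys_lines if HA2_UNAVAILABLE.search(line)
        ),
        # True if the system log mentions keep-alive at all.
        "keepalive_in_system_log": any(
            KEEPALIVE_ANY.search(line) for line in sys_lines
        ),
        # True if ha_agent.log reports the peer keep-alive setting as off.
        "keepalive_off_in_ha_agent": bool(KEEPALIVE_OFF.search(ha_agent_log)),
    }


def matches_symptom(result: dict) -> bool:
    """True when all three indicators from this article are present."""
    return (
        result["ha2_unavailable_events"] > 0
        and not result["keepalive_in_system_log"]
        and result["keepalive_off_in_ha_agent"]
    )
```

If matches_symptom returns True for a pair of saved logs, the firewall shows the same pattern as this article: HA2 went down, session synchronization was skipped, and keep-alive was off.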
Environment
- Palo Alto Networks firewalls
- PAN-OS (All)
- High Availability Active/Passive
Cause
- HA2 keep-alive is not enabled on the HA2 link.
- HA2 keep-alive is a mechanism to validate the health of the HA state synchronization path (HA2).
Resolution
- Enable HA2 Keep-alive on the HA2 link.
- When enabled, the peers will use keep-alive messages to monitor the HA2 connection to detect a failure based on the Threshold set (default is 10,000 ms).
- When a failure is detected, the configured HA2 Keep-alive Action is taken.
- The HA2 keep-alive option can be configured on both firewalls, or on just one firewall in the HA pair.
- If the option is only enabled on one firewall, only that firewall will send the keep-alive messages. The other firewall will be notified if a failure occurs.
- To enable it, go to Device > High Availability > General in the GUI and edit the Data Link (HA2) section.
- To check HA2 keep-alive settings, use > show high-availability ha2_keepalive
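The threshold-based detection described above can be pictured with a short conceptual model. This is only a sketch in Python of the general idea (declare a failure when no keep-alive has been seen for longer than the threshold). It is not the PAN-OS implementation. The class KeepAliveMonitor and its methods are hypothetical, and whether the real check uses "greater than" or "greater than or equal to" is an assumption.

```python
from dataclasses import dataclass

# Default threshold stated in this article, in milliseconds.
DEFAULT_THRESHOLD_MS = 10_000


@dataclass
class KeepAliveMonitor:
    """Conceptual model of a keep-alive timeout check (hypothetical)."""

    threshold_ms: int = DEFAULT_THRESHOLD_MS
    last_seen_ms: int = 0  # time the most recent keep-alive was received

    def record_keepalive(self, now_ms: int) -> None:
        # Called whenever a keep-alive message arrives from the peer.
        self.last_seen_ms = now_ms

    def link_failed(self, now_ms: int) -> bool:
        # Assumption: failure is declared once the silence exceeds the threshold.
        return now_ms - self.last_seen_ms > self.threshold_ms


monitor = KeepAliveMonitor()
monitor.record_keepalive(now_ms=1_000)
print(monitor.link_failed(now_ms=11_000))  # exactly at the threshold
print(monitor.link_failed(now_ms=11_001))  # past the threshold
```

With keep-alive disabled there is no such timer at all, which is consistent with the Symptom section showing no keep-alive entries in the System logs.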
Additional Information
Please refer to the following articles for a better understanding of the HA2 keep-alive feature: