How to troubleshoot HA2 or HA2-backup keep-alive down
Created On 07/09/24 15:50 PM - Last Modified 08/23/24 20:15 PM
Objective
- Verify physical and network connectivity.
- Verify configuration settings.
- Check firewall resources.
Environment
- NGFW
- HA2 keep-alive
- HA2-backup keep-alive
Procedure
- Read the ha_agent.log to track the timestamp of the issue and check the HA events that preceded it.
less mp-log ha_agent.log
- Check the physical connectivity of the HA2 link (HA2-backup link) by ensuring that the physical cables are properly connected. Check the brdagent.log and system.log:
less mp-log brdagent.log
show log system direction equal backward
show log system subtype equal port eventid equal link-change direction equal backward
- Verify the HA2 (HA2-backup) configuration and that the HA2 keep-alive thresholds are properly configured on both firewalls in the HA setup. Navigate to DEVICE > High Availability > HA communication > HA2 (HA2 Backup) in the UI.
- Verify that the correct interfaces are used for the HA2 (HA2 Backup) connection.
- Verify the HA2 keep-alive threshold: DEVICE > High Availability > HA communication > HA2 > HA2 Keep-alive > Threshold (ms).
- Check the network connectivity between the HA2 interfaces (HA2 backup interfaces). Use ping or traceroute:
- For HA2:
ping source <HA2's IP address of the FW> host <HA2's IP address of the peer FW>
traceroute source <HA2's IP address of the FW> host <HA2's IP address of the peer FW>
- For HA2 Backup:
ping source <HA2 backup's IP address of the FW> host <HA2 backup's IP address of the peer FW>
traceroute source <HA2 backup's IP address of the FW> host <HA2 backup's IP address of the peer FW>
- Check the firewall resources: verify CPU, packet descriptor, and packet buffer utilization to confirm that the firewall was not short of resources at the time of the problem.
show running resource-monitor
- Ensure that the firewall is running the preferred PAN-OS software version and is not hitting a known software issue:
- PAN-231507: On PA-1400 Series firewalls only, when an HSCI interface is used as an HA2 interface, HA2 packets are intermittently dropped on the passive device, which can cause the HA2 connection to flap due to missing HA2 keep-alive messages. Workaround: configure data ports as the HA2 interface.
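The connectivity checks in the procedure above follow one pattern for both links, so the commands can be generated from the interface addresses. The following is a minimal sketch in Python, not a Palo Alto Networks tool: the function name `connectivity_checks` and the addresses are placeholders chosen for illustration, and the output is meant to be pasted into the firewall CLI.

```python
from typing import List


def connectivity_checks(local_ip: str, peer_ip: str) -> List[str]:
    """Return the ping and traceroute CLI commands for one HA2 link."""
    return [
        f"ping source {local_ip} host {peer_ip}",
        f"traceroute source {local_ip} host {peer_ip}",
    ]


# Placeholder documentation addresses (assumptions); substitute the real
# HA2 and HA2 backup interface IPs of the local firewall and its peer.
links = {
    "HA2": ("192.0.2.1", "192.0.2.2"),
    "HA2 Backup": ("198.51.100.1", "198.51.100.2"),
}

for name, (local_ip, peer_ip) in links.items():
    print(f"# {name}")
    for command in connectivity_checks(local_ip, peer_ip):
        print(command)
```

Run the generated commands on both firewalls, swapping the local and peer addresses, so that each direction of the link is tested.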
Additional Information
HA2 keep-alive: When HA2 keep-alive is enabled, the firewall monitors the connection stability between itself and the HA peer on the HA2 connection.
A threshold (in milliseconds) can be set so that the HA2 connection is considered down if keep-alive packets do not reach the connected peer within that interval.
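To make the threshold behavior concrete, the sketch below models it in Python under simplifying assumptions. It is not the PAN-OS implementation: the function name `first_timeout`, the arrival times, and the 10000 ms threshold are illustrative values only.

```python
from typing import Optional, Sequence


def first_timeout(arrivals_ms: Sequence[int], threshold_ms: int) -> Optional[int]:
    """Return the time (ms) at which the link would be considered down.

    arrivals_ms holds the sorted arrival times of keep-alive packets.
    The link is treated as down once the gap after a packet exceeds
    threshold_ms; None means no gap in the sample exceeded the threshold.
    """
    for previous, current in zip(arrivals_ms, arrivals_ms[1:]):
        if current - previous > threshold_ms:
            # The threshold expires threshold_ms after the last packet seen.
            return previous + threshold_ms
    return None


# Illustrative values: packets arrive every second, then one 11-second gap.
arrivals = [0, 1000, 2000, 13000]
print(first_timeout(arrivals, threshold_ms=10000))  # prints 12000
```

In this model, a gap that merely approaches the threshold does not bring the link down, which is why intermittent packet loss can show up as flapping only when several consecutive keep-alives are missed.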
HA2: The HA2 link synchronizes sessions, forwarding tables, IPsec security associations, and ARP tables between firewalls in an HA pair. Data flow on the HA2 link is always unidirectional (except for the HA2 keep-alive); it flows from the active or active-primary firewall to the passive or active-secondary firewall. The HA2 link is a Layer 2 link, and it uses EtherType 0x7261 by default.
Ports used for HA2: The HA data link can be configured to use IP (protocol number 99) or UDP (port 29281) as the transport, allowing the HA data link to span subnets.
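When HA2 traffic crosses an intermediate device, it can help to know which identifier to match for each transport. The sketch below maps the three values quoted above to filter expressions in standard pcap-filter syntax. It is an illustration only: the dictionary keys and the function name `capture_filter` are labels assumed for this example, not PAN-OS configuration values or commands.

```python
# Identifiers quoted in this article for each HA2 transport option.
# The keys are illustrative labels, not PAN-OS configuration values.
HA2_CAPTURE_FILTERS = {
    "ethernet": "ether proto 0x7261",  # Layer 2 link, default EtherType
    "ip": "ip proto 99",               # IP transport, protocol number 99
    "udp": "udp port 29281",           # UDP transport, port 29281
}


def capture_filter(transport: str) -> str:
    """Return a pcap-filter expression matching HA2 traffic for a transport."""
    try:
        return HA2_CAPTURE_FILTERS[transport.lower()]
    except KeyError:
        raise ValueError(f"unknown HA2 transport: {transport!r}") from None


print(capture_filter("udp"))  # prints: udp port 29281
```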
Refer to HA Links and Backup Links.