High-Availability - HA links status

Created On 04/26/22 16:46 PM - Last Modified 11/18/25 16:02 PM


Symptom


  • A link failure is detected on one of the HA links between the firewalls in an HA setup.
  • The HA links status is shown on the firewall Dashboard or, for firewalls with a Strata Cloud Manager subscription, in the Strata Cloud Manager HA links status graph.


Environment


  • PAN-OS
  • HA Link


Cause


If any HA link is down (the HA1 link, the HA2 link, an HA backup link, or, in active/active deployments, the HA3 link), the firewall generates a system log message and the Strata Cloud Manager HA links status alert is triggered.

Resolution


  1. Initial Detection and Stabilization
    First, identify which HA link is down. You can find this information in the firewall's web interface by navigating to Dashboard > Widgets > High Availability or in the HA links status graph in Strata Cloud Manager (SCM).
    If the HA1 (Control) link is down and there is no functional HA1 backup, your immediate priority is to prevent a "split-brain" scenario. A split-brain occurs when both firewalls in the pair believe they are the active device, leading to network instability.
    To stabilize the setup:
    1. Manually suspend the secondary firewall:
      • In an Active/Passive (A/P) setup, suspend the passive device.
      • In an Active/Active (A/A) setup, suspend the active-secondary device.
    2. To do this, log in to the device that you need to suspend, navigate to Device > High Availability > Operational Commands, and click Suspend local device for high availability.
      This action isolates the secondary device, allowing you to troubleshoot the HA1 link without risking a split-brain event.
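
The stabilization rule above can be sketched as a small decision function. This is only an illustration of the logic described in this step, not a PAN-OS API: the function name `device_to_suspend`, the mode strings, and the boolean inputs are hypothetical and would have to be populated from whatever link status you read off the Dashboard or SCM.

```python
from typing import Optional

# Which device to suspend for each HA mode, as described in the step above.
# The mode strings are illustrative labels, not PAN-OS configuration values.
SUSPEND_ROLE = {
    "active-passive": "passive",
    "active-active": "active-secondary",
}


def device_to_suspend(mode: str, ha1_up: bool, ha1_backup_up: bool) -> Optional[str]:
    """Return the role of the device to suspend, or None if none is needed.

    Suspension is only suggested when HA1 is down AND there is no
    functional HA1 backup, which is the split-brain risk case.
    """
    if ha1_up or ha1_backup_up:
        return None  # the control link still has a working path
    if mode not in SUSPEND_ROLE:
        raise ValueError(f"unknown HA mode: {mode!r}")
    return SUSPEND_ROLE[mode]


if __name__ == "__main__":
    print(device_to_suspend("active-passive", ha1_up=False, ha1_backup_up=False))
```

The function only names the device; the suspension itself is still performed manually on that device as described in the step above.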
  2. Physical Layer (Layer 1) Troubleshooting
    Once the HA pair is stabilized, begin troubleshooting at the physical layer.
    1. Best Practice (Direct Connection): Whenever possible, connect the HA ports of the two firewalls directly to each other. Intermediate switches or routers add potential points of failure and can cause communication issues.
    2. Port and Cable Verification:
  3. Data Link and Network Layers (Layer 2 & 3) Troubleshooting
    If the physical layer is confirmed to be healthy, proceed to the next layers.
    1. Layer 2 (for HA2, HA3, and HA1 via Switch)
      • If a switch connects the HA links, check the switch configuration for issues.
      • Ensure the relevant ports are in an active (non-blocked) state and are not affected by VLAN or Spanning Tree Protocol (STP) issues.
      • HA3 Specifics: The HA3 link uses MAC-in-MAC encapsulation, and its proprietary header increases the packet size beyond 1500 bytes. Jumbo frames must therefore be enabled on any intermediate Layer 2 devices so that the MTU along the path accommodates the larger packets.
    2. Layer 3 (for HA1 and HA2 via Router)
      The HA1 link is the only HA link that operates at Layer 3 by default. The HA2 link operates at Layer 2 by default and uses EtherType 0x7261, but it can be configured to use either IP (protocol number 99) or UDP (port 29281) as the transport. This allows the HA2 data link to span subnets and function as a Layer 3 link that requires an IP address. For more information about the HA2 link with IP or UDP transport, refer to How is the HA2 IP Information Communicated Between Peers When HA2 UDP/IP Transport is Enabled?
      If the HA link traverses a router, verify the router configuration and ensure proper IP connectivity between the HA1 (and HA2, if configured as a Layer 3 link) interfaces of both firewalls.
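
As a quick reference while checking a routed path, the three HA2 transport options described above can be captured in a small lookup table. This is a minimal sketch built only from the values stated in this step; the class name `Ha2Transport`, the dictionary keys, and the helper `can_cross_router` are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ha2Transport:
    identifier: str         # how the traffic is identified on the wire
    routable: bool          # True if the link can span subnets via a router
    needs_ip_address: bool  # True if the HA2 interface requires an IP address


# Values taken from the description above; keys are illustrative labels.
HA2_TRANSPORTS = {
    "ethernet": Ha2Transport("EtherType 0x7261", routable=False, needs_ip_address=False),
    "ip": Ha2Transport("IP protocol 99", routable=True, needs_ip_address=True),
    "udp": Ha2Transport("UDP port 29281", routable=True, needs_ip_address=True),
}


def can_cross_router(transport: str) -> bool:
    """Return True if the given HA2 transport can traverse a Layer 3 hop."""
    return HA2_TRANSPORTS[transport].routable


if __name__ == "__main__":
    for name, info in HA2_TRANSPORTS.items():
        print(name, info.identifier, "routable" if info.routable else "Layer 2 only")
```

If HA2 runs over a router while still set to the default Ethernet transport, the table makes the mismatch obvious: the transport must be changed to IP or UDP before Layer 3 troubleshooting is meaningful.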
  4. Transport, Application Layer, and Configuration Troubleshooting
    Finally, verify the protocol and configuration settings.
    1. Firewall Port Configuration
      Ensure that the necessary TCP and UDP ports are allowed for communication between the firewalls. The specific ports vary based on the link and its configuration (e.g., clear text vs. encrypted). Refer to HA Links and Backup Links, and Which Ports Need to Be Opened for PAN-OS in HA to Sync and Communicate?
      • HA1 (Clear Text): TCP ports 28769 and 28260.
      • HA1 (Encrypted): TCP port 28 (SSH).
      • HA2: Can be configured to use IP protocol 99 or UDP port 29281.
    2. HA1 Encryption
      Encryption is recommended when HA1 traffic traverses a network where it could be inspected or captured.
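
When an intermediate device filters traffic between the peers, it can help to write down exactly which entries it must permit for the chosen configuration. The sketch below derives that list from the ports named in this step, using UDP port 29281 for HA2 UDP transport as given in the Layer 3 discussion above. The function name `required_permits`, its parameters, and the tuple format are hypothetical; this is not output from, or input to, any PAN-OS tool.

```python
from typing import List, Tuple

# (protocol label, number) pairs. For "tcp"/"udp" the number is a port;
# for "ip-protocol" it is the IP protocol number.
Permit = Tuple[str, int]


def required_permits(ha1_encrypted: bool, ha2_transport: str) -> List[Permit]:
    """List what an intermediate filtering device must allow between peers."""
    permits: List[Permit] = []

    # HA1: SSH over TCP port 28 when encrypted, otherwise the clear-text ports.
    if ha1_encrypted:
        permits.append(("tcp", 28))
    else:
        permits.extend([("tcp", 28769), ("tcp", 28260)])

    # HA2: the default Ethernet transport is Layer 2, so no IP-level entry applies.
    if ha2_transport == "ip":
        permits.append(("ip-protocol", 99))
    elif ha2_transport == "udp":
        permits.append(("udp", 29281))
    elif ha2_transport != "ethernet":
        raise ValueError(f"unknown HA2 transport: {ha2_transport!r}")

    return permits


if __name__ == "__main__":
    print(required_permits(ha1_encrypted=True, ha2_transport="udp"))
```

Comparing this list against the access rules on the intermediate device is a quick way to spot a blocked HA port before digging into the firewall configuration itself.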

  5. Proactive Redundancy Measures
    To minimize downtime from a single link failure, implement redundancy for your HA connections.
    1. HA1 and HA2 Links: Configure backup links for both HA1 and HA2. It is a best practice for the primary and backup links to use different subnets.
    2. HA3 Link: The firewall does not support a dedicated backup link for HA3. To provide redundancy, configure a Link Aggregation Group (LAG) using two or more physical interfaces for the HA3 connection.
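
The different-subnet recommendation for primary and backup links is easy to check before committing addresses. The following is a minimal sketch using Python's standard `ipaddress` module; the function name `links_share_subnet` and the sample addresses are hypothetical and only illustrate the comparison.

```python
import ipaddress


def links_share_subnet(primary_cidr: str, backup_cidr: str) -> bool:
    """Return True if the primary and backup link addresses overlap in subnet.

    Each argument is an interface address with prefix length,
    for example "10.0.0.1/30".
    """
    primary_net = ipaddress.ip_interface(primary_cidr).network
    backup_net = ipaddress.ip_interface(backup_cidr).network
    return primary_net.overlaps(backup_net)


if __name__ == "__main__":
    # Illustrative addresses only.
    if links_share_subnet("10.0.0.1/30", "10.0.1.1/30"):
        print("Primary and backup links share a subnet; consider separating them.")
    else:
        print("Primary and backup links are on different subnets.")
```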


Additional Information


For more information about HA timers, refer to the HA timers document.
