When does an HA node go into Suspended state due to Non-Functional loop?
Created On 09/25/18 19:52 PM - Last Modified 10/02/23 17:35 PM
Symptom
One of the firewalls in a High Availability (HA) pair moves into the "suspended" state due to a Non-functional loop.
The device with the higher priority (the lower numerical priority value) moves into the suspended state (Non-functional loop detected).
HA link monitoring triggers an active-passive loop even when the cables are not connected.
This is slightly different from a device moving to suspended due to a Preemption loop. Refer: When does an HA node go into Suspended state due to Preemption loop?
Environment
- Devices in a High Availability (HA) configuration.
- Link monitoring or path monitoring is configured on the individual nodes.
- Passive link state is set to shutdown. Refer: What is the Difference Between Auto and Shutdown Mode for Passive Link?
- The monitored link remains disconnected or down on both the devices in the HA pair.
Cause
- When a link or path monitoring (or both) failure condition is detected, the Active device moves to non-functional state. Refer: https://docs.paloaltonetworks.com/pan-os/9-0/pan-os-admin/high-availability/ha-firewall-states.html
- Monitoring takes effect even if the cables are not connected, so the active firewall moves from active to non-functional.
- When the active firewall leaves the active state, the peer firewall, which was previously passive, becomes active, and link monitoring takes effect on it as well. Because the cables are not connected (link down), that firewall also transitions from active to non-functional and finally to passive.
- The node moves into the "Suspended" state due to a non-functional loop once the "Maximum number of flaps" has been observed.
- A flap is counted when the firewall leaves the active state within 15 minutes of the last time it left the active state.
- The "Maximum number of flaps" value is the number of flaps permitted before the firewall is suspended and the passive firewall takes over (range 0-16, default 3).
- Maximum number of flaps can be configured as follows:
Resolution
Additional Information
Flap-Max Timer Setting
The flap-max is the number of times a device is allowed to go into a Non-Functional or Tentative state before it moves into the Suspended state, which keeps the devices from flapping. The flap-max defaults to 3, and the counter is cleared after 10 to 20 minutes, depending on the kind of loop being detected. For a non-functional loop, a flap is counted whenever a device goes into the Non-Functional state. For a preemption loop, a flap is counted every time a device preempts the other device. On every failure, the count is checked against the flap-max.
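The flap-counting rule from the Cause section can be illustrated with a small simulation. This is only a sketch of the logic as described in this article, not firewall code: the class name FlapTracker and its fields are hypothetical. It assumes the node is suspended as soon as the flap count reaches the maximum, and that the count resets when more than 15 minutes pass between departures from the active state. The meaning of a maximum of 0 is not modeled.

```python
from dataclasses import dataclass
from typing import Optional

FLAP_WINDOW_SECONDS = 15 * 60  # 15-minute window, as stated in the article


@dataclass
class FlapTracker:
    """Hypothetical model of the non-functional loop counter on one HA node."""
    max_flaps: int = 3                       # article default (range 0-16)
    flap_count: int = 0
    last_left_active: Optional[float] = None
    suspended: bool = False

    def leave_active(self, now: float) -> str:
        """Record that the node left the active state at time `now` (seconds)."""
        recent = (
            self.last_left_active is not None
            and now - self.last_left_active <= FLAP_WINDOW_SECONDS
        )
        if recent:
            # Left active again within the window: this counts as a flap.
            self.flap_count += 1
        else:
            # Assumption: a quiet period longer than the window clears the count.
            self.flap_count = 0
        self.last_left_active = now

        # Assumption: suspend once the count reaches the configured maximum.
        if self.flap_count >= self.max_flaps:
            self.suspended = True
            return "suspended"
        return "non-functional"


# The monitored link stays down, so the node leaves active once a minute.
tracker = FlapTracker(max_flaps=3)
states = [tracker.leave_active(t) for t in (0, 60, 120, 180)]
print(states)
```

With the default of 3, the first departure from active is not a flap, the next three within the window are, and the node is suspended on the third flap. Raising max_flaps in this model only delays the suspension; the loop itself continues until the monitored link comes up.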