DHCP sessions between a branch firewall configured as DHCP Relay and the DHCP server at the hub location go into DISCARD state after a power outage
Created On 02/01/22 16:12 PM - Last Modified 05/07/24 18:43 PM
Symptom
- Client devices at branch office are unable to get IP addresses
- On the Branch location FW, DHCP sessions between the FW (DHCP Relay) and the DHCP server at the Hub location are seen to be in a DISCARD state
admin@Lab70-158-PA-220> show session all filter application dhcp
--------------------------------------------------------------------------------
ID Application State Type Flag Src[Sport]/Zone/Proto (translated IP[Port])
Vsys Dst[Dport]/Zone (translated IP[Port])
--------------------------------------------------------------------------------
64858 dhcp DISCARD FLOW NS 192.168.2.1[67]/L3-Trust/17 (10.129.72.158[53110])
vsys1 192.168.1.5[67]/L3-Untrust (192.168.1.5[67])
- The end-reason of the DISCARD session is policy-deny, and the tracker stage shows "appid policy lookup deny"
admin@Lab70-158-PA-220> show session id 64858
Session 64858
c2s flow:
source: 192.168.2.1 [L3-Trust]
dst: 192.168.1.5
proto: 17
sport: 67 dport: 67
state: DISCARD type: FLOW
src user: unknown
dst user: unknown
s2c flow:
source: 192.168.1.5 [L3-Untrust]
dst: 10.129.72.158
proto: 17
sport: 67 dport: 53110
state: DISCARD type: FLOW
src user: unknown
dst user: unknown
start time : Fri Jan 28 04:23:30 2022
timeout : 60 sec
time to live : 58 sec
total byte count(c2s) : 8312
total byte count(s2c) : 0
layer7 packet count(c2s) : 24
layer7 packet count(s2c) : 0
vsys : vsys1
application : dhcp
rule : vsys1+interzone-default
service timeout override(index) : False
session to be logged at end : True
session in session ager : True
session updated by HA peer : False
address/port translation : source
nat-rule : Trust-NAT(vsys1)
layer7 processing : enabled
URL filtering enabled : False
session via syn-cookies : False
session terminated on host : True
session traverses tunnel : False
session terminate tunnel : False
captive portal session : False
ingress interface : ethernet1/2
egress interface : ethernet1/1
session QoS rule : N/A (class 4)
tracker stage firewall : appid policy lookup deny
end-reason : policy-deny
Environment
- Next-Generation Firewall (NGFW)
- LSVPN
- Dynamic routing between hub and satellite locations using BGP
- Branch location firewall has an interface configured as a DHCP Relay
Cause
- The DHCP (UDP) session is initially created and allowed while the tunnel is up: it matches the expected security policy and goes through the tunnel (zones: Trust -> Trust)
- When the tunnel goes down, the BGP-learned route (learned via peering over the tunnel) is removed and the default route takes effect
- New DHCP traffic creates a new session in the slowpath based on a "potential" security policy match. After App-ID identifies the application, the firewall sets the session to DISCARD, as expected with the security policies in place (zones: Trust -> Untrust; DHCP not allowed)
- Because the traffic is not blocked before session creation (the potential match happens before App-ID identifies the application), the DISCARD session is refreshed by every new packet with the same 6-tuple
- When the tunnel comes back up, traffic may keep hitting and refreshing the old DISCARD session, so no new session is created and DHCP traffic is not forwarded
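The refresh behaviour described in the bullets above can be illustrated with a small Python model. This is only a sketch under stated assumptions, not PAN-OS code: the `SessionTable` class, its methods, the 10-second relay retransmit interval, and the choice of ingress zone as the sixth element of the 6-tuple are all hypothetical. The addresses, ports, zones, and 60-second timeout are taken from the session output shown earlier.

```python
from dataclasses import dataclass

TIMEOUT = 60  # seconds; mirrors "timeout : 60 sec" in the session output


@dataclass
class Session:
    state: str        # "ACTIVE" or "DISCARD"
    expires_at: int   # absolute time (seconds) at which the session ages out


class SessionTable:
    """Toy session table keyed by the flow's 6-tuple (hypothetical model)."""

    def __init__(self, allowed_egress_zones):
        self.allowed_egress_zones = set(allowed_egress_zones)
        self.sessions = {}

    def handle_packet(self, key, now, egress_zone):
        session = self.sessions.get(key)
        if session is not None and now < session.expires_at:
            # Existing session: the packet refreshes the timer and inherits
            # the stored verdict. Route and policy are not evaluated again.
            session.expires_at = now + TIMEOUT
            return session.state
        # No live session: slowpath. The current route picks the egress zone
        # and the policy decides whether DHCP is allowed towards that zone.
        allowed = egress_zone in self.allowed_egress_zones
        state = "ACTIVE" if allowed else "DISCARD"
        self.sessions[key] = Session(state, now + TIMEOUT)
        return state

    def clear(self, key):
        # Stands in for manually clearing the stale session.
        self.sessions.pop(key, None)


# Assumed 6-tuple: src IP, dst IP, src port, dst port, protocol, ingress zone
RELAY_FLOW = ("192.168.2.1", "192.168.1.5", 67, 67, 17, "L3-Trust")

table = SessionTable(allowed_egress_zones={"L3-Trust"})

# Tunnel down: the default route sends the flow towards L3-Untrust.
down = [table.handle_packet(RELAY_FLOW, t, "L3-Untrust") for t in (0, 10, 20)]

# Tunnel back up at t=30: the route points to the tunnel again, but the relay
# keeps sending within the timeout, so the DISCARD session is reused.
up = [table.handle_packet(RELAY_FLOW, t, "L3-Trust") for t in (30, 40, 50)]

# Removing the stale entry forces a fresh slowpath evaluation.
table.clear(RELAY_FLOW)
after_clear = table.handle_packet(RELAY_FLOW, 60, "L3-Trust")

print(down, up, after_clear)
```

In this model the relay's packets are discarded both while the tunnel is down and after it returns, and are forwarded only once the stale entry has been removed or has aged out because no packet arrived within the timeout.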
Resolution
The following workarounds are available:
- Create a strict, port-based deny rule at the top of the rulebase for the Trust -> Untrust direction, which avoids potential security policy matches before App-ID identifies the application. This ensures that the traffic is denied in the slowpath stage and does not create a session (which would otherwise need to be cleared manually afterwards)
- Similar to the above, but using a DoS policy with the "Deny" action. This is less resource intensive because it is processed before security policies, and it can match specific source/destination pairs
- Create a static null (Discard) route matching the specific destination/egress interface, with a lower admin distance than the default route but a higher admin distance than the BGP route. With this option, the match cannot be as specific on source, port, or App-ID
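To show how the discard route in the last workaround interacts with the BGP and default routes, here is a minimal Python sketch of route selection by longest prefix match, with admin distance as the tie-breaker between routes of the same prefix. The `Route` type, the `best_route` helper, the 192.168.1.0/24 prefix, the next-hop labels, and the admin distance values (20, 30, 40) are hypothetical and chosen only to satisfy the ordering described above. The sketch also assumes the discard route uses the same prefix as the BGP-learned route.

```python
import ipaddress
from typing import List, NamedTuple, Optional


class Route(NamedTuple):
    prefix: str
    admin_distance: int
    next_hop: str


def best_route(routes: List[Route], destination: str) -> Optional[Route]:
    """Pick the longest matching prefix; lower admin distance breaks ties."""
    dst = ipaddress.ip_address(destination)
    candidates = [r for r in routes if dst in ipaddress.ip_network(r.prefix)]
    if not candidates:
        return None
    return min(
        candidates,
        key=lambda r: (-ipaddress.ip_network(r.prefix).prefixlen, r.admin_distance),
    )


# Hypothetical routes; admin distances are illustrative only.
DEFAULT = Route("0.0.0.0/0", 40, "ethernet1/1")
DISCARD = Route("192.168.1.0/24", 30, "discard")
BGP = Route("192.168.1.0/24", 20, "tunnel")

DHCP_SERVER = "192.168.1.5"

# Tunnel up: BGP and discard routes share a prefix, so the lower
# admin distance (BGP) is preferred.
tunnel_up = best_route([DEFAULT, DISCARD, BGP], DHCP_SERVER)

# Tunnel down: the BGP route is withdrawn and the discard route is used
# instead of the default route.
tunnel_down = best_route([DEFAULT, DISCARD], DHCP_SERVER)

print(tunnel_up.next_hop, tunnel_down.next_hop)
```

In this model the discard route takes over from the default route because it is the more specific prefix, and the admin distance decides only between the discard route and the BGP route. If the discard route were more specific than the BGP-learned prefix, it would be selected even while the tunnel is up, so the prefix should be checked against the routing table before applying this workaround.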