How to troubleshoot an IKEv2 IPsec VPN tunnel brought down by DPD

Created On 07/15/24 21:31 PM - Last Modified 07/18/24 16:22 PM


Objective


  • Verify the Liveness Check configuration on both endpoints of the IPsec VPN tunnel.
  • Use the timestamp of the DPD down event to correlate with other events that could affect the connectivity between the VPN peers.
  • Troubleshoot the IPsec VPN connectivity.
  • Collect debug packet captures to check if the firewall is sending and receiving Liveness Check packets.


Environment


  • IPsec VPN tunnel
  • IKEv2
  • DPD / Liveness Check


Procedure


  1. Verify the Liveness Check configuration on both endpoints of the IPsec VPN tunnel. For NGFW, navigate to Network > Network Profiles > IKE Gateways > Advanced Options. Ensure that Liveness Check is enabled and that the interval matches the setting on the other end of the tunnel.
    1. IKEv2 uses a liveness check (similar to Dead Peer Detection (DPD) in IKEv1) to determine whether a peer is still available. The liveness check option is enabled by default.
  2. When the IPsec tunnel goes down due to DPD, the following messages appear in the system log:
    > show log system direction equal backward
    2019/03/06 18:24:04 low      vpn     seapa- ikev2-n 0  IKEv2 IKE SA is down determined by DPD.
    2019/03/06 17:26:27 low      vpn     seapa- ikev2-n 0  IKEv2 IKE SA is down determined by DPD.

    The following messages also appear in the ikemgr log:
    > less mp-log ikemgr.log
    2019-03-06 17:26:28.000 -0800  [INFO]: {    1:    1}: DPD down, rekey vpn tunnel <awsseapa-to-azeapa>, SA state ESTABLISHED
    2019-03-06 18:24:05.000 -0800  [INFO]: {    1:    1}: DPD down, rekey vpn tunnel <awsseapa-to-azeapa>, SA state ESTABLISHED
    By default, the liveness check runs every 5 seconds while the SA is idle. If the connection is lost, the firewall makes 10 liveness retries. Once the maximum number of retries is reached, the firewall tears down the phase 1 and phase 2 SAs. Look for the messages above and note their timestamps so you can correlate them with other related events; this helps identify the root cause of the liveness check failure.
  3. When the IPsec tunnel goes down because of DPD, it indicates a connectivity issue between the IPsec VPN peers. For details on how to remediate, refer to How to Troubleshoot IPSec VPN connectivity issue.
  4. If the tunnel keeps going down due to DPD and you are unsure whether the firewall is sending or receiving the liveness check packet (an empty informational packet), schedule a maintenance window and collect a debug ikemgr packet capture:
    debug ike pcap on
    view-pcap no-dns-lookup yes no-port-lookup yes debug-pcap ikemgr.pcap
    scp export debug-pcap from ikemgr.pcap to username@host:path
    debug ike pcap off 
    
    Note: Be very cautious with the above commands, as the debug packet capture may become resource intensive. That is why it is recommended to run them during a maintenance window.
    1. If the firewall is sending and receiving the empty informational liveness check packets, the packet capture will look like the screenshot below: [image: packet-capture]
    2. Similarly, you can perform a dataplane packet capture between the two endpoints of the tunnel by setting the source and destination of the packet filter to the IP addresses of the endpoints, as shown in the picture below: [image: packet-capture 3]
    3. Enable the debugs for the affected gateway using the command:
      debug ike gateway <name of the gateway> on debug
      The ikemgr logs in this case will show the messages below:
      2024-07-15 10:55:41.005 -0700  [DEBG]: 10.46.36.241[500] - 10.46.36.240[500]:(nil) 1 times of 76 bytes message will be sent over socket 1024
      2024-07-15 10:55:41.007 -0700  [DEBG]: processing isakmp packet
      2024-07-15 10:55:41.007 -0700  [DEBG]: ===
      2024-07-15 10:55:41.007 -0700  [DEBG]: 76 bytes message received from 10.46.36.240
      2024-07-15 10:55:41.007 -0700  [DEBG]: {    2:     }: [IKE Initiator] response message_id 704 expected 704
      2024-07-15 10:55:41.013 -0700  [DEBG]: {    2:     }: response exch type 37
      2024-07-15 10:55:41.013 -0700  [DEBG]: {    2:     }: update response message_id 0x2c0
      2024-07-15 10:55:46.005 -0700  [DEBG]: 10.46.36.241[500] - 10.46.36.240[500]:(nil) 1 times of 76 bytes message will be sent over socket 1024
      2024-07-15 10:55:46.006 -0700  [DEBG]: processing isakmp packet
      2024-07-15 10:55:46.006 -0700  [DEBG]: ===
      2024-07-15 10:55:46.006 -0700  [DEBG]: 76 bytes message received from 10.46.36.240
      2024-07-15 10:55:46.006 -0700  [DEBG]: {    2:     }: [IKE Initiator] response message_id 705 expected 705
      2024-07-15 10:55:46.014 -0700  [DEBG]: {    2:     }: response exch type 37
      2024-07-15 10:55:46.014 -0700  [DEBG]: {    2:     }: update response message_id 0x2c1
    4. Disable the debugs using the command:
      debug ike gateway <name of the gateway> off
  5. Important notes about IKEv2 liveness check behavior:
    1. DPD (called liveness check in IKEv2) is always on. All IKEv2 packets, not only the empty informational packet, serve the purpose of a liveness check.
    2. The liveness check (informational) packet is sent only when there has been no activity on the IKE SA and child SA for dpd_interval.
    3. enable: if set to yes, an empty informational message is sent after a period of IKE inactivity. This wait time is defined by dpd_interval.
    4. If the liveness check fails, the IKE SA and all child SAs set up through that IKE SA are deleted. The IKE gateway then starts a new IKE_SA_INIT exchange.
  6. Based on the packet capture collected between the two peers of the tunnel, the resolution may include increasing the liveness check interval on both endpoints to make the check less strict. Other resolutions could involve correcting misconfigured policies, configuring proper route entries, or addressing network issues that affect the reachability of the remote peer.
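
To make the timestamp correlation in step 2 easier, the DPD down events can be pulled out of saved log text with a short script. The following is a minimal sketch in Python, not a Palo Alto Networks tool. It assumes the log lines have been exported to text and keep the formats shown in step 2; the function name dpd_down_events and the SAMPLE list are hypothetical, and the timezone offset in the ikemgr lines is ignored, so all timestamps are treated as firewall local time.

```python
import re
from datetime import datetime

# System log format shown in step 2 (slash-separated date, no milliseconds).
SYSTEM_RE = re.compile(
    r"^\s*(\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2})\s.*IKE SA is down determined by DPD"
)
# ikemgr.log format shown in step 2 (dash-separated date, milliseconds, UTC offset).
IKEMGR_RE = re.compile(
    r"^\s*(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\.\d+\s.*DPD down, rekey vpn tunnel <([^>]+)>"
)


def dpd_down_events(lines):
    """Return (timestamp, source, tunnel) tuples sorted by time.

    tunnel is None for system log lines, which do not carry the full tunnel name.
    The UTC offset in ikemgr lines is dropped (assumption: local firewall time).
    """
    events = []
    for line in lines:
        m = SYSTEM_RE.search(line)
        if m:
            when = datetime.strptime(m.group(1), "%Y/%m/%d %H:%M:%S")
            events.append((when, "system", None))
            continue
        m = IKEMGR_RE.search(line)
        if m:
            when = datetime.strptime(m.group(1), "%Y-%m-%d %H:%M:%S")
            events.append((when, "ikemgr", m.group(2)))
    # Sort on the timestamp only, so None and str tunnel values are never compared.
    return sorted(events, key=lambda e: e[0])


# Sample lines copied from this article, plus one unrelated debug line.
SAMPLE = [
    "2019/03/06 18:24:04 low      vpn     seapa- ikev2-n 0  IKEv2 IKE SA is down determined by DPD.",
    "2019/03/06 17:26:27 low      vpn     seapa- ikev2-n 0  IKEv2 IKE SA is down determined by DPD.",
    "2019-03-06 17:26:28.000 -0800  [INFO]: {    1:    1}: DPD down, rekey vpn tunnel <awsseapa-to-azeapa>, SA state ESTABLISHED",
    "2019-03-06 18:24:05.000 -0800  [INFO]: {    1:    1}: DPD down, rekey vpn tunnel <awsseapa-to-azeapa>, SA state ESTABLISHED",
    "2024-07-15 10:55:41.007 -0700  [DEBG]: processing isakmp packet",
]

for when, source, tunnel in dpd_down_events(SAMPLE):
    print(when.isoformat(sep=" "), source, tunnel or "-")
```

The sorted list gives one timeline across both logs, which can then be compared against interface flaps, routing changes, or upstream outages around the same times.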


Additional Information


Liveness Check: If there has only been outgoing traffic on all of the SAs associated with an IKE SA, it is essential to confirm the liveness of the other endpoint to avoid black holes. IKEv2 gateways can perform liveness checks to prevent sending messages to a dead peer. Receipt of a fresh cryptographically protected message on an IKE SA or any of its child SAs ensures the liveness of the IKE SA and all of its child SAs.
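
The behavior described above can be illustrated with a small model. The following Python sketch is only an illustration of the logic, not the firewall's implementation. The class LivenessTracker and its method names are hypothetical; the defaults (5-second interval, 10 retries) come from this article. As a simplifying assumption, every probe, including the first, counts toward the retry limit, and the spacing between retries is left to the caller, because the article does not specify the retransmission timing.

```python
from dataclasses import dataclass


@dataclass
class LivenessTracker:
    dpd_interval: float = 5.0  # idle seconds before probing (default per this article)
    max_retries: int = 10      # unanswered probes tolerated (assumption: first probe counts)
    last_inbound: float = 0.0  # time of the last protected message from the peer
    unanswered: int = 0        # probes sent since the last inbound message
    sa_up: bool = True

    def on_protected_message(self, now: float) -> None:
        """Any fresh protected message on the IKE SA or a child SA proves liveness."""
        self.last_inbound = now
        self.unanswered = 0

    def tick(self, now: float) -> str:
        """Decide what to do at time `now` (seconds on any monotonic clock)."""
        if not self.sa_up:
            return "down"
        if now - self.last_inbound < self.dpd_interval:
            return "idle-ok"      # recent inbound traffic, no probe needed
        if self.unanswered >= self.max_retries:
            self.sa_up = False    # IKE SA and its child SAs would be deleted here
            return "teardown"
        self.unanswered += 1
        return "send-probe"       # empty informational message


# A peer that stops answering: ten probes, then teardown.
tracker = LivenessTracker()
actions = [tracker.tick(5.0 + i) for i in range(12)]
print(actions)
```

Calling on_protected_message whenever traffic arrives resets the counter, which mirrors the point that any received protected message, not only a liveness reply, confirms that the peer is alive.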
