TCP session end reason aged-out in VM-series firewalls behind AWS Gateway Load Balancer

Created On 11/25/22 16:07 PM - Last Modified 01/02/24 14:56 PM


Symptom


  • TCP sessions passing through one of the multiple VM-series firewalls behind a Gateway Load Balancer (GWLB) show "Session end reason" as "aged-out" under Monitor > Logs > Traffic
aws-gwlb-tcp-session-aged-out.PNG
  • PA-VM-1 is the VM-series firewall that receives the traffic from the AWS GWLB and creates the original session. Its packet capture at the receive stage shows the correct establishment of (i) the underlay GENEVE tunnel between the GWLB (172.21.200.78) and the firewall's untrust interface (172.21.200.84), (ii) the TCP 3-way handshake and (iii) the TLS exchange. It also shows some Application Data traffic and TCP acknowledgements (ACK) exchanged, as part of the overlay TCP session, between the client (172.21.43.170) and the server (XXX.18.226.52), which sits outside the AWS environment on the public internet.
aws-gwlb-tcp-tls-handshake.PNG
  • At some point the packet flow stops (for example, because the client is idle). If either the client or the server hits its TCP session timeout limit (for example, 400 seconds for the server in the capture below), it sends a TCP FIN-ACK message to gracefully terminate the TCP session.
aws-gwlb-tcp-timeout-server.PNG
  • However, the AWS GWLB's TCP session timeout is 350 seconds, so the underlay GENEVE tunnel with PA-VM-1 for that TCP session will have already been torn down. Therefore, the server's original TCP FIN-ACK and subsequent retransmissions (as it does not hear back from the client) will not be seen on PA-VM-1's packet capture at the transmit stage.
  • At some point, the client will send another request to the server, but the AWS GWLB will treat it as a brand new TCP session. The GWLB will therefore choose a firewall from its target group, which may or may not be the firewall on which the TCP session was originally created, and create a new underlay GENEVE tunnel with that firewall. In this example it selects a different firewall, PA-VM-2. As PA-VM-2 never saw the initial TCP 3-way handshake, it drops the packets sent by the client to the server. This can be seen in a packet capture taken on PA-VM-2 at the same time as the one on PA-VM-1, at both the receive and drop stages.
aws-gwlb-tcp-timeout-drop.PNG
  • Meanwhile, the original TCP session on PA-VM-1 will eventually time out and appear with "Session end reason" "aged-out" under Monitor > Logs > Traffic. No session will be shown in PA-VM-2's traffic logs: because PA-VM-2 never saw the original TCP 3-way handshake, it did not create a session.
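
The sequence above can be illustrated with a toy model of per-flow idle tracking. This is only a sketch of the timing logic, not AWS or PAN-OS code: the `FlowTable` class, its method names and the "strictly greater than" boundary check are assumptions made for illustration. The 350-second and 400-second values are the ones quoted in this article.

```python
from dataclasses import dataclass, field

GWLB_IDLE_TIMEOUT_S = 350  # GWLB TCP session timeout quoted in this article


@dataclass
class FlowTable:
    """Hypothetical model of a load balancer's idle tracking per flow."""
    idle_timeout_s: int = GWLB_IDLE_TIMEOUT_S
    last_seen: dict = field(default_factory=dict)

    def packet(self, flow_id: str, now_s: float) -> str:
        """Record a packet and report how the flow entry was matched."""
        previous = self.last_seen.get(flow_id)
        self.last_seen[flow_id] = now_s
        if previous is None:
            return "new"
        if now_s - previous > self.idle_timeout_s:
            # Entry expired while idle: the packet is handled as a new flow,
            # so a different target (firewall) may be selected for it.
            return "expired-then-new"
        return "existing"


table = FlowTable()
# (time in seconds, description): last data at t=1, server FIN-ACK after 400 s idle
for t, label in [(0, "SYN"), (1, "application data"), (401, "server FIN-ACK")]:
    print(t, label, "->", table.packet("client<->server", t))
```

In this model the FIN-ACK arrives 400 seconds after the last packet, which exceeds the 350-second limit, so it no longer matches the original flow entry. That corresponds to the behaviour described above, where later packets may reach a firewall that never saw the handshake.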


Environment


  • Amazon Web Services (AWS)
  • Consumer VPCs connected via Transit Gateway to a Security VPC
  • Security VPC contains multiple VM-series firewalls in different availability zones (AZ) behind a Gateway Load Balancer (GWLB)
  • Each availability zone in the Security VPC contains a Gateway Load Balancer Endpoint (GWLBE)
  • VM-series firewall (PAN-OS 10.2.2-h2)


Cause


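The AWS GWLB's TCP session timeout is 350 seconds, while the TCP session timeouts of the client and the server are higher. The GWLB therefore discards its state for an idle session before either endpoint closes it.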

Resolution


  1. Reduce the client's or the server's TCP session timeout to a value lower than the AWS GWLB's timeout (e.g. 325 seconds); or
  2. Enable TCP keep-alive messages on the client, with an interval lower than the AWS GWLB's timeout (e.g. 325 seconds).
NOTE: Reducing the TCP timeout on the VM-series firewalls to a value lower than the AWS GWLB's timeout (e.g. 325 seconds) will not resolve the issue. This applies both to the global default of 3,600 seconds and to the default TCP timeout of the affected application under Objects > Applications (e.g. 1,800 seconds for ssl). Like the AWS GWLB, the VM-series firewall terminates the TCP session silently, i.e. without sending TCP RESET messages to the client and the server, so both endpoints remain unaware that the TCP session has been terminated. This firewall behaviour cannot be modified.
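
As an illustration of option 2, the following is a minimal sketch of enabling TCP keep-alive on a client socket in Python. The function name `enable_keepalive`, the 30-second probe interval and the probe count of 3 are assumptions chosen for the example; only the 325-second and 350-second values come from this article. The `TCP_KEEPIDLE`, `TCP_KEEPINTVL` and `TCP_KEEPCNT` options are platform specific (they exist on Linux), so the sketch checks for them before use.

```python
import socket

GWLB_IDLE_TIMEOUT_S = 350  # GWLB TCP session timeout quoted in this article
KEEPALIVE_IDLE_S = 325     # example value from the Resolution steps


def enable_keepalive(sock: socket.socket, idle_s: int = KEEPALIVE_IDLE_S,
                     interval_s: int = 30, probes: int = 3) -> None:
    """Enable TCP keep-alive so probes start before the GWLB timeout."""
    if idle_s >= GWLB_IDLE_TIMEOUT_S:
        raise ValueError("keep-alive idle time must be below the GWLB timeout")
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # Per-socket tuning is only applied where the platform exposes the option;
    # otherwise the operating system's keep-alive defaults remain in effect.
    if hasattr(socket, "TCP_KEEPIDLE"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle_s)
    if hasattr(socket, "TCP_KEEPINTVL"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, interval_s)
    if hasattr(socket, "TCP_KEEPCNT"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, probes)


client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
enable_keepalive(client)  # call before or after connect(); shown unconnected here
client.close()
```

Setting `SO_KEEPALIVE` alone is usually not enough: the default keep-alive idle time of the operating system (7,200 seconds on Linux) is far above the GWLB's 350 seconds, so the idle time also has to be lowered, either per socket as sketched here or system-wide.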


Additional Information


PALO ALTO NETWORKS DOCUMENTATION
When Does Palo Alto Networks Firewall Send a TCP Reset (RST) to Terminate a Session?

AWS DOCUMENTATION
Best practices for deploying Gateway Load Balancer

