Resolving the URL Category in Decryption When Multiple URLs Use the Same IP
A problem occurs when multiple web services sit behind the same IP address. Google is a typical case: it hosts its services (such as Drive, Translate, Search, Google+, Maps, Play, Gmail, Calendar and so on) behind the same group of IP addresses.
When DNS resolves both www.google.com and www.drive.google.com to the same IP address, hosts use that one address to reach either service. So, if the first session is to www.google.com, the local cache maps the shared IP address to “search-engines.” Then, if the next host goes to www.drive.google.com using the same destination IP, the URL category is resolved as “search-engines” instead of “online-personal-storage.”
A decryption policy set to decrypt only the “online-personal-storage” category does not match this traffic, so the actual drive.google.com data is not decrypted.
When troubleshooting issues related to SSL decryption, a good starting point is to understand how the decryption mechanism works with URL categorization. To establish a secure SSL tunnel, the client and server authenticate during the handshake. The client usually authenticates the server’s identity based on its certificate. An HTTPS connection is always initiated by the client, which first resolves the server’s hostname and then sends a Client Hello to the resolved IP address. The client then waits for a response from the server, which should include the server's certificate.
To resolve the proper URL category and determine whether to decrypt certain SSL traffic, the Palo Alto Networks firewall relies on the Common Name (CN) field of the certificate received from the server. URL categorization is therefore based on what is found in the CN field. The resolved URL category is then mapped to the destination IP of the intercepted packet sent from the client side. To speed up URL category resolution, the firewall stores each destination-IP-to-URL-category mapping in its local cache memory. The next instance of SSL traffic to the same destination IP is then resolved to the URL category already stored in the local cache. The mechanism of URL categorization for purposes of decryption works as follows:
- The firewall intercepts the Client Hello message
- The firewall determines the packet’s destination IP
- The firewall compares that destination IP with the list of IP-to-URL-category mappings in its local cache memory
- If the IP is in the list, the URL category is taken from local cache memory
- If there is no match in the local cache, the firewall waits for the server's response and inspects the CN field of the server certificate
- The URL category is resolved from the CN field, mapped to the server’s IP, and added to the list in local cache memory for future use
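The steps above can be illustrated with a small simulation. This is only a sketch of the described behavior, not the firewall's actual implementation: the class name, the category table, and the 192.0.2.10 address (taken from the range reserved for documentation) are all assumptions made for the example.

```python
# Hypothetical hostname-to-category table; the values are illustrative only.
CATEGORY_BY_HOSTNAME = {
    "www.google.com": "search-engines",
    "www.drive.google.com": "online-personal-storage",
}


class IpKeyedCategoryCache:
    """Toy model of a URL category cache keyed by destination IP."""

    def __init__(self):
        self._category_by_ip = {}

    def resolve(self, dest_ip, cert_common_name):
        # Cache hit: the category stored for this IP is reused,
        # regardless of which service the client is actually requesting.
        if dest_ip in self._category_by_ip:
            return self._category_by_ip[dest_ip]
        # Cache miss: categorize from the certificate CN and remember it.
        category = CATEGORY_BY_HOSTNAME.get(cert_common_name, "unknown")
        self._category_by_ip[dest_ip] = category
        return category


cache = IpKeyedCategoryCache()
shared_ip = "192.0.2.10"  # assumed address shared by both services

first = cache.resolve(shared_ip, "www.google.com")
second = cache.resolve(shared_ip, "www.drive.google.com")
print(first)   # search-engines
print(second)  # search-engines (stale category from the first session)
```

The second lookup never consults the certificate, which is why a policy that decrypts only “online-personal-storage” would not match it.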
PAN-OS 6.0 introduces a new method of resolving the URL category for purposes of decryption. This method is based not on the CN field of the server's certificate, but on the SNI (Server Name Indication) value in the SSL Client Hello message. Because the SNI value names the specific host the client is requesting, the firewall can properly resolve the URL category of upstream traffic even when several services share one IP address and, with that information, engage the correct decryption policy.
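A minimal sketch of the SNI-based idea follows. It is an illustration only and does not reflect PAN-OS internals: the function names, the category table, and the set of categories to decrypt are assumptions for the example. What happens when a client sends no SNI is deliberately left out; the sketch just reports that no category could be derived.

```python
from typing import Optional

# Hypothetical hostname-to-category table; the values are illustrative only.
CATEGORY_BY_HOSTNAME = {
    "www.google.com": "search-engines",
    "www.drive.google.com": "online-personal-storage",
}

# Assumed policy: decrypt only this category.
DECRYPT_CATEGORIES = {"online-personal-storage"}


def category_from_sni(sni: Optional[str]) -> Optional[str]:
    """Categorize by the hostname in the Client Hello, not by destination IP."""
    if not sni:
        return None  # client sent no SNI, so nothing can be derived from it
    return CATEGORY_BY_HOSTNAME.get(sni.lower(), "unknown")


def should_decrypt(sni: Optional[str]) -> bool:
    """Return True when the SNI-derived category is in the decrypt set."""
    return category_from_sni(sni) in DECRYPT_CATEGORIES


# Both hosts may share one IP, but each session carries its own SNI value.
for host in ("www.google.com", "www.drive.google.com"):
    print(host, category_from_sni(host), should_decrypt(host))
```

Since the lookup key is the requested hostname, two sessions to the same destination IP can still receive different categories.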
Note: The SNI field is not supported by older browser versions, such as IE 8.0, which is the latest version available on Windows XP. The SNI-based solution therefore does not work for Windows XP end hosts or other clients using an old browser version. In that case, or for firewalls running a PAN-OS version earlier than 6.0, the workaround is to create a broader decryption policy that covers the URL category of every service located behind the same IP address.