How to Troubleshoot VoIP Issues with Palo Alto Networks Firewall

Created On 12/28/18 07:07 AM - Last Modified 01/11/24 21:29 PM


Objective


VoIP (Voice over Internet Protocol, the transmission of voice over IP networks) is widely used today because it is economical and scalable.

However, using an IP network for VoIP communication brings challenges such as:
  1. Voice transmission delay and loss: Packets are routed over the internet rather than through traditional telephone switching equipment, which makes voice traffic more vulnerable to delay and loss
  2. Complexity: Call establishment, call termination
  3. Interoperability: Multiple players operate in the domain. While most of them adhere to the RFCs, some deviations cause interoperability issues
  4. Backward compatibility with the existing PSTN (Public Switched Telephone Network)

In general, VoIP traffic has two components:
  1. Signaling:
    • Process of establishing and terminating calls
    • Commonly used protocols are SIP, H.323, MGCP, Skinny, etc.
  2. Audio/Video Data transmission
    • Audio is transferred using the Real-time Transport Protocol (RTP)
    • RTP message is encapsulated in a UDP datagram that is further encapsulated in an IP datagram for transmission
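
To make the encapsulation concrete, the following is a minimal Python sketch that decodes the 12-byte fixed RTP header (as defined in RFC 3550) from the payload of a UDP datagram. The function name `parse_rtp_header` and the sample bytes are illustrative assumptions, not part of any firewall tooling.

```python
import struct

def parse_rtp_header(udp_payload: bytes) -> dict:
    """Decode the 12-byte fixed RTP header (RFC 3550) found at the start of a UDP payload."""
    if len(udp_payload) < 12:
        raise ValueError("payload too short for an RTP header")
    # Network byte order: two flag bytes, 16-bit sequence, 32-bit timestamp, 32-bit SSRC
    b0, b1, seq, timestamp, ssrc = struct.unpack("!BBHII", udp_payload[:12])
    return {
        "version": b0 >> 6,
        "padding": bool(b0 & 0x20),
        "extension": bool(b0 & 0x10),
        "csrc_count": b0 & 0x0F,
        "marker": bool(b1 & 0x80),
        "payload_type": b1 & 0x7F,
        "sequence": seq,
        "timestamp": timestamp,
        "ssrc": ssrc,
    }

# Hypothetical sample: RTP version 2, payload type 0 (PCMU), sequence number 1
sample = struct.pack("!BBHII", 0x80, 0, 1, 160, 0x12345678) + b"\x00" * 160
header = parse_rtp_header(sample)
print(header["version"], header["payload_type"], header["sequence"])
```

A check like this can help confirm that the UDP packets seen in a capture on the negotiated ports really are RTP (version field equal to 2).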

Initially, a parent signaling session is established between the entities involved. The information for RTP/RTCP communication (IP addresses and ports) is exchanged over this signaling channel, after which the RTP/RTCP streams carry the actual media.

Because VoIP solutions are implemented in many different ways, it is hard to explain or predict the behavior of Palo Alto Networks firewalls for every solution. However, there are general guidelines to help troubleshoot VoIP issues.


Environment


  • Palo Alto Firewall
  • VoIP


Procedure


Step 1: Identify the signaling protocol and product brief

This step is essential for understanding the communication flow. Different signaling protocols have different message structures, and understanding the fields in the signaling communication is the key to debugging VoIP issues.
For example:
  1. SIP uses SDP (Session Description Protocol) to exchange information about RTP/RTCP streams.
  2. H.323 uses RAS and H.245 channels to exchange information about RTP/RTCP streams.
VoIP solutions have product briefs that help you understand the implementation. These briefs often specify the firewall settings required for a successful deployment.
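
As a first hint when looking at a capture, the signaling protocol can often be guessed from its default port. The sketch below is a hypothetical Python helper based on the commonly used default ports; deployments may use other ports, and the firewall itself identifies the application with App-ID rather than by port, so treat the result only as a starting point.

```python
# Commonly used default signaling ports; real deployments may differ.
DEFAULT_SIGNALING_PORTS = {
    ("udp", 5060): "SIP",
    ("tcp", 5060): "SIP",
    ("tcp", 5061): "SIP over TLS",
    ("tcp", 1720): "H.323 (H.225 call signaling)",
    ("udp", 1719): "H.323 (RAS)",
    ("udp", 2427): "MGCP",
    ("tcp", 2000): "Skinny (SCCP)",
}

def guess_signaling_protocol(transport: str, dst_port: int) -> str:
    """Return a best-guess protocol name for a transport/port pair, or 'unknown'."""
    return DEFAULT_SIGNALING_PORTS.get((transport.lower(), dst_port), "unknown")

print(guess_signaling_protocol("UDP", 5060))
print(guess_signaling_protocol("tcp", 1720))
```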

Step 2: Identify the entities involved

Commonly used entities in SIP are:
  • User Agent Client (UAC): The entity that sends a request and receives a response.
  • User Agent Server (UAS): The entity that receives a request and sends a response.
  • Proxy Server: Sits between two user agents, takes a request from one user agent, and forwards it to another user agent.
  • Registrar Server: Accepts registration requests from user agents and helps users authenticate themselves within the network. It stores the URI and location of users in a database to help other SIP servers.
Commonly used entities in H.323 are:
  • Terminals: Devices that users would normally encounter, such as a simple IP phone or a powerful, high-definition videoconferencing system.
  • Multi-point control units: Responsible for managing multi-point conferences. By placing a video call into an H.323 MCU, the user might be able to see all of the other participants in the conference, not only hear their voices.
  • Gateways: Enable communication between H.323 networks and other networks. Also used in order to enable videoconferencing devices based on H.320 and H.324 to communicate with H.323 systems.
  • Gatekeepers: (Optional component in the H.323 network.) Services include endpoint registration, address resolution, admission control, user authentication, and more. Gatekeepers may be designed to operate in one of two signaling modes, namely "direct routed" and "gatekeeper routed" mode. Direct routed mode is the most efficient and most widely deployed mode. In this mode, endpoints utilize the RAS protocol in order to learn the IP address of the remote endpoint and a call is established directly with the remote device. In the gatekeeper routed mode, call signaling always passes through the gatekeeper. While the latter requires the gatekeeper to have more processing power, it also gives the gatekeeper complete control over the call and the ability to provide supplementary services on behalf of the endpoints.
Additionally, there could be common components such as a PBX (private branch exchange) and media servers.

Step 3: Identify the physical and logical network topology

To understand the role of the firewall and troubleshoot any issues, it is imperative to know where each component identified in Step 2 is located with respect to the firewall.

Some common possible topologies are:

Endpoints --------- Firewall --------- Internet --------- Servers, Endpoints

Endpoints, Servers --------- Firewall --------- Internet --------- Servers, Endpoints

Servers --------- Firewall --------- Internet --------- Servers, Endpoints

(Servers above represent any components like proxy, registrar, gateways, gatekeeper, etc.)

Step 4: Identify whether the firewall is doing NAT (inbound destination NAT/ outbound source NAT, static NAT) for any of the communications involved

This is crucial for determining whether the firewall's VoIP ALGs are involved. Also, identify whether the endpoints, PBX, or proxy servers are capable of NAT traversal for VoIP, such as STUN or TURN.

Role of Firewall in VoIP Communication:
  1. The firewall identifies the signaling application protocol using App-ID and allows or blocks it based on security policies
  2. If enabled, the ALG (Application Level Gateway) is invoked, after which the firewall performs two important functions for the subsequent communication:
    1. It opens dynamic sessions, called predict sessions, when RTP channel information is communicated over the signaling channel. These predict sessions are required to allow inbound and outbound RTP/RTCP streams, which use random ports. This matters most for inbound UDP connections. Otherwise, you would have to open high UDP ports in the configuration, exposing your network to attacks.
    2. The most important function of the ALG is to perform NAT on the payloads of the signaling channel. When an endpoint or proxy server sends its private IP address in the SDP or H.245 channel as an RTP parameter, the ALG translates it to the public IP address according to the configured NAT rules. Without this translation, RTP communication will not work, resulting in audio or video issues.
To verify the payload translation for SIP, check the SDP payload in the SIP INVITE and SIP 200 OK packets. They contain the IP address for RTP in the Connection field (c=) and the ports in the Media field (m=).

For H.323, check the openLogicalChannel and openLogicalChannelAck packets. They contain the IP addresses and ports for RTP/RTCP in the H.245 mediaChannel and mediaControlChannel fields.

In the transmit-stage capture, verify that these IP addresses have been translated where NAT applies. Otherwise, the subsequent media communication will fail.
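
The SIP check described above can be sketched in Python. This is a minimal, hypothetical helper (the names `sdp_media_endpoints` and `untranslated` are assumptions) that pulls the c= and m= lines from an SDP body and reports any media endpoint that still advertises the pre-NAT address. The sample SDP is invented for illustration, and a real ALG may also rewrite the port in the m= line.

```python
def sdp_media_endpoints(sdp: str) -> list:
    """Return (ip, port, media) tuples built from the c= and m= lines of an SDP body."""
    session_ip = None
    endpoints = []
    for raw in sdp.splitlines():
        line = raw.strip()
        if line.startswith("c="):
            # Example: c=IN IP4 192.168.1.100 (an optional /ttl suffix is dropped)
            ip = line.split()[-1].split("/")[0]
            if endpoints:
                # A c= line after an m= line overrides the address for that media
                _, port, media = endpoints[-1]
                endpoints[-1] = (ip, port, media)
            else:
                session_ip = ip
        elif line.startswith("m="):
            # Example: m=audio 49170 RTP/AVP 0 (an optional /count suffix is dropped)
            media, port = line[2:].split()[:2]
            endpoints.append((session_ip, int(port.split("/")[0]), media))
    return endpoints

def untranslated(sdp: str, pre_nat_ip: str) -> list:
    """List media endpoints that still carry the pre-NAT address."""
    return [e for e in sdp_media_endpoints(sdp) if e[0] == pre_nat_ip]

# Hypothetical SDP bodies taken from receive-stage and transmit-stage captures
rx_sdp = ("v=0\r\no=- 1 1 IN IP4 192.168.1.100\r\ns=call\r\n"
          "c=IN IP4 192.168.1.100\r\nt=0 0\r\nm=audio 49170 RTP/AVP 0\r\n")
tx_sdp = rx_sdp.replace("192.168.1.100", "198.51.100.100")

print(sdp_media_endpoints(rx_sdp))
print(untranslated(tx_sdp, "192.168.1.100"))
```

An empty list for the transmit-stage SDP suggests that the payload was translated; a non-empty list points to the ALG not rewriting the payload.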

NOTE: If the internal endpoints and servers are capable of NAT traversal, the ALG's NAT function on the firewall is redundant, and pinholing (opening the dynamic RTP/RTCP sessions) is its only remaining task. In this case, ALGs can be disabled, but the ports for RTP and RTCP must then be allowed statically in policies.
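
The pinholing idea can be pictured with a toy Python model. This is only a conceptual sketch under stated assumptions and not how PAN-OS implements predict sessions: the class `PinholeTable` is hypothetical, and it assumes the common convention that RTCP uses the port directly above the RTP port.

```python
class PinholeTable:
    """Toy model of pinholes: UDP flows that are expected because signaling announced them."""

    def __init__(self):
        self._expected = set()

    def open_for_media(self, ip: str, rtp_port: int) -> None:
        # Assumption: RTCP conventionally uses the next port above the RTP port
        self._expected.add((ip, rtp_port))
        self._expected.add((ip, rtp_port + 1))

    def allows(self, dst_ip: str, dst_port: int) -> bool:
        return (dst_ip, dst_port) in self._expected

table = PinholeTable()
# Address and port as they would be announced in the signaling payload
table.open_for_media("198.51.100.100", 49170)

print(table.allows("198.51.100.100", 49170))  # RTP port announced in signaling
print(table.allows("198.51.100.100", 49171))  # matching RTCP port
print(table.allows("198.51.100.100", 50000))  # never announced
```

Without such dynamic entries, the equivalent static policy would have to allow the whole range of UDP ports the endpoints might choose.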


What information to collect:
  1. Complete the above steps and document the results (i.e., signaling protocol, entities, topology, and presence of NAT)
  2. Set up a packet capture on the Palo Alto Networks firewall (see "How to Run a Packet Capture"). Use specific filters to look at the initial signaling communication first. If NAT is involved, use one filter for pre-NAT client-to-server (C > S) traffic and one for post-NAT server-to-client (S > C) traffic.
Here's an example:
Let's assume the following communication flow:
  1. The client establishes an H.225 channel with an external server:
Client (192.168.1.100)----Firewall (NAT to 198.51.100.100)-- H.225 ------- 203.0.113.100
  2. In the H.225 channel, dynamic information for the H.245 channel is sent across, which points to a different external IP:
Client (192.168.1.100)----Firewall (NAT to 198.51.100.100)-- H.245 ------- 203.0.113.101
  3. In the H.245 channel, openLogicalChannel packets are used to exchange the IPs and ports for RTP/RTCP, which point to a third public IP:
Client (192.168.1.100)----Firewall (NAT to 198.51.100.100)-- RTP/RTCP ------- 203.0.113.102
 
 

In the above example, the ideal filters for captures will be:
Filter 1: 192.168.1.100 > 203.0.113.100
Filter 2: 192.168.1.100 > 203.0.113.101
Filter 3: 192.168.1.100 > 203.0.113.102
Filter 4: 203.0.113.100 > 198.51.100.100
Filter 5: 203.0.113.101 > 198.51.100.100
Filter 6: 203.0.113.102 > 198.51.100.100
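
The six filters above follow a simple pattern: one pre-NAT client-to-server pair and one post-NAT server-to-client pair per external server. A minimal Python sketch of that pattern follows; `CaptureFilter` and `build_filters` are hypothetical names used only to illustrate the derivation, not a firewall API.

```python
from typing import List, NamedTuple

class CaptureFilter(NamedTuple):
    source: str
    destination: str

def build_filters(client_ip: str, nat_ip: str, server_ips: List[str]) -> List[CaptureFilter]:
    """Build pre-NAT C > S filters followed by post-NAT S > C filters."""
    c2s = [CaptureFilter(client_ip, server) for server in server_ips]
    # Return traffic is addressed to the translated (public) IP, not the client's private IP
    s2c = [CaptureFilter(server, nat_ip) for server in server_ips]
    return c2s + s2c

filters = build_filters(
    "192.168.1.100",
    "198.51.100.100",
    ["203.0.113.100", "203.0.113.101", "203.0.113.102"],
)
for number, f in enumerate(filters, start=1):
    print(f"Filter {number}: {f.source} > {f.destination}")
```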

Packet capture filters on Palo Alto Networks firewalls are bidirectional in nature (i.e., a single filter matching the C2S parameters can capture both C2S and S2C flows). However, there are times when a single filter does not yield captures for both directions, which is why separate filters are listed for each direction.

Before PAN-OS 8.1, it was not possible to set a netmask in packet capture filters. This option is available in versions after PAN-OS 8.1.

For versions before PAN-OS 8.1, the best filters to cover the above flows would be (keep in mind the limit of 4 filters):
Filter 1: Source = 192.168.1.100
Filter 2: Source = 203.0.113.100, Protocol = 6
Filter 3: Source = 203.0.113.101, Protocol = 6
Filter 4: Source = 203.0.113.102, Protocol = 17

NOTE: The above filters could result in a large capture if 192.168.1.100 generates a lot of traffic. Exercise caution: break the capture into multiple attempts, or run it during non-business hours when there is less traffic.
  3. If possible, start the signaling communication afresh by clearing existing sessions or restarting the client.
  4. External captures may also help, such as captures taken on the clients, the inside server, or another router or switch in the path of the communication.
  5. On the CLI, run the following commands while running a test with captures enabled:
> show session all filter source 192.168.1.100
> show session all filter source 203.0.113.100
> show session all filter source 203.0.113.101
> show session all filter source 203.0.113.102
> show session all filter type predict
> show running appinfo2ip
> show counter global filter packet-filter yes
  6. The above captures and outputs show how the communication is working and which functions the firewall is performing. Compare the receive-stage and transmit-stage captures to see whether NAT was done properly on the payloads. Check whether any dropped packets are seen.
  7. If any problems are noticed, additionally enable packet-diags and run the same test again. This time, focus on how the firewall processes the packets.
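
When the same test has to be repeated for different endpoints, the list of CLI commands shown above can be generated rather than typed by hand. The Python sketch below only assembles the command strings already listed in this article; the helper name `session_commands` is an assumption, and the output is meant to be pasted into the firewall CLI manually.

```python
from typing import List

def session_commands(ips: List[str]) -> List[str]:
    """Build the per-IP session lookups followed by the fixed diagnostic commands."""
    commands = [f"show session all filter source {ip}" for ip in ips]
    commands += [
        "show session all filter type predict",
        "show running appinfo2ip",
        "show counter global filter packet-filter yes",
    ]
    return commands

commands = session_commands(
    ["192.168.1.100", "203.0.113.100", "203.0.113.101", "203.0.113.102"]
)
for command in commands:
    print(f"> {command}")
```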


Additional Information


To understand the operation of ALGs, it is important to collect data using flow basic and ctd basic.
Some common issues:
  1. Phones are unable to register or have initial connection issues.
  2. Phones are able to register but unable to make calls.
  3. Phones are able to make calls, but audio/video is not working or works in only one direction.
  4. Audio/video is working, but the quality is very poor.
NOTE:
  1. If NAT is involved and the clients or servers have neither NAT traversal capability nor a setting to configure static public IPs on the system, the ALG is mandatory.
  2. Disable the ALG only after verifying that none of the ALG functionality is required.
  3. Sometimes, it is better to apply an Application Override to the parent signaling sessions to bypass the ALG functionality rather than disabling the ALG globally. This applies when some traffic needs the ALG and some does not.
  4. Make sure the security policies also allow RTP/RTCP traffic: even when predict sessions are formed, a policy lookup is still done to check whether the application is allowed.

