How to troubleshoot the connection failure to MLAV cloud server
Created On 08/14/24 17:52 PM - Last Modified 07/11/25 15:57 PM
Objective
- Find the reason behind the connection loss between the firewall and the MLAV cloud server.
- Restore the connection between the firewall and the MLAV cloud server.
Environment
- Next Generation Firewall
- MLAV
Procedure
- Check the status of the connection between the firewall and the MLAV cloud server. Run the command:
> show mlav cloud-status
Example output:
> show mlav cloud-status
MLAV cloud
Current cloud server: ml.service.paloaltonetworks.com
Cloud connection: disconnected
In certain scenarios, you might get the following output:
> show mlav cloud-status
Server error : device busy or unavailable
- Check the TCP connection status between the firewall and the MLAV cloud server:
- Check the service route to the MLAV cloud server. It is the same as the one configured for "Palo Alto Networks Services".
- If the default is used, the traffic is sourced from the management interface, and there is no need to add the "source" parameter to the ping command below.
- If it is configured as a dataplane interface, add the IP address of that interface as the source of the ping.
- Use the ping command to check the network reachability to the MLAV cloud server and to resolve the IP address of its FQDN:
- For default or management service route to "Palo Alto Networks Services", use the command:
ping host ml.service.paloaltonetworks.com
- For dataplane interface service route to "Palo Alto Networks Services", use the command:
ping source <IP address of the dataplane interface configured as service route to PANW Services> host ml.service.paloaltonetworks.com
- Check the netstat using the resolved IP address:
show netstat numeric-hosts yes numeric-ports yes | match <IP address of the MLAV cloud server>
Ensure that no device between the firewall interface used as service route to PANW Services and the MLAV cloud server is blocking port 443.
- For further troubleshooting, perform a packet capture of the traffic between the firewall interface used as the service route to PANW Services and the MLAV cloud server. Also check devsrv.log for error messages related to MLAV:
less mp-log devsrv.log
- Common issues and resolutions:
- MLAV server certificate validation error. This issue may occur if there is a device decrypting the traffic between the firewall interface used as service route to PANW Services and the MLAV cloud server. The devsrv.log messages will look like below:
2023-08-28 13:21:23.813 -0400 Error: pan_mlav_get_metadata(pan_mlav_cloud.c:1667): Failed to send metadata query
2023-08-28 13:36:23.882 -0400 Error: pan_mlav_get_ver_from_file(pan_mlav_cloud.c:187): Failed to open file[/opt/pancfg/mgmt/content/pan_threatversion]
2023-08-28 13:36:24.605 -0400 Error: verify_cb(pan_ssl_curl_utils.c:641): Error with certificate at depth: 2
2023-08-28 13:36:24.605 -0400 Error: verify_cb(pan_ssl_curl_utils.c:643): Basic Validation of x509 cert Fail ; Code : 20
2023-08-28 13:36:24.606 -0400 Error: verify_cb(pan_ssl_curl_utils.c:652): Failed to validate x509 cert from ctx: (20) unable to get local issuer certificate
2023-08-28 13:36:24.606 -0400 Error: pan_mlav_send_query(pan_mlav_cloud.c:1394): Failed to send req type[3], curl error: Peer certificate cannot be authenticated with given CA certificates
In this case, the system log will show the message below:
2023/08/08 16:07:21 high tls tls-X50 0 MLAV Server certificate validation failed. Dest Addr: ml.service.paloaltonetworks.com, Reason: unable to get local issuer certificate
Resolution: One solution is to exclude *.paloaltonetworks.com from decryption.
- MLAV authentication failure. This is related to a software issue that affects only firewalls that do not have a Threat Prevention license and are running a PAN-OS version earlier than 10.0.2. Note that a firewall does not require any license to establish a successful connection to the MLAV cloud server, and that the Machine Learning-based Antivirus (MLAV) feature was introduced with the release of PAN-OS 10.0.0. In this case, the devsrv.log messages will look like the ones below:
2020-08-12 15:37:02.062 +0800 Error: pan_mlav_get_ver_from_file(pan_mlav_cloud.c:183): Failed to open file[/opt/pancfg/mgmt/content/pan_threatversion]
2020-08-12 15:37:02.203 +0800 Error: pan_mlav_process_http_resp_code(pan_mlav_cloud.c:538): Got HTTP resp code 400
2020-08-12 15:37:02.203 +0800 Error: pan_mlav_resp_parser(pan_mlav_cloud.c:802): Server resp code[400], msglen[71], msg: invalid flatbuffer request: invalid content version (with build number)
2020-08-12 15:37:02.203 +0800 Error: pan_mlav_send_query(pan_mlav_cloud.c:1301): Failed to parse type[3] response
2020-08-12 15:37:02.203 +0800 Error: pan_mlav_get_metadata(pan_mlav_cloud.c:1555): Failed to send metadata query
2020-08-12 15:37:02.203 +0800 Error: pan_mlav_check_model(pan_mlav_cloud.c:1793): Failed to get metadata from cloud
2020-08-12 15:37:02.203 +0800 Error: pan_mlav_req_handler(pan_mlav_cloud.c:1915): meta download s_connection_tries [1]
2020-08-12 15:37:02.203 +0800 mlav force_reconnect is true
In this case, the system log will show the message below:
2020/07/24 20:59:32 high general general 0 MLAV: Authentication or Client Certificate failure.
Resolution: Upgrade your firewall to the current recommended PAN-OS release.
- MLAV unknown error. In this case, the system log will show the message below:
2022/09/22 20:30:01 medium general general 0 MLAV: Unknown error.
Further investigation is required, particularly checking the devsrv.log.
- If devsrv.log shows the message below:
2024-07-26 08:49:42.148 +0900 Error: pan_mlav_process_http_resp_code(pan_mlav_cloud.c:558): Got HTTP resp code 404
The issue is most likely on the MLAV cloud server side.
Resolution: Contact customer support if the issue persists.
- If devsrv.log shows the messages below:
2022-09-22 20:30:02.513 +0900 Error: pan_mlav_process_http_resp_code(pan_mlav_cloud.c:558): Got HTTP resp code 500
2022-09-22 20:30:02.513 +0900 Error: pan_mlav_resp_parser(pan_mlav_cloud.c:822): Server resp code[500], msglen[244], msg:
The issue is most likely related to the Google Cloud Platform (GCP) load balancer.
Resolution: Contact customer support if the issue persists.
- MLAV cloud error, all machine Learning engines stopped. In this case the system log will show the message below:
2022/04/22 15:29:53 high general general 0 MLAV cloud error, all machine Learning engines stopped
The devsrv.log will show the message below:
2022-04-26 07:32:59.061 -0500 Error: pan_mlav_send_query(pan_mlav_cloud.c:1396): Failed to send req type[3], curl error: Couldn't resolve host name
Resolution: Check whether the firewall is running a PAN-OS version affected by PAN-229832. The issue is fixed starting from versions 10.1.14, 10.2.11, 11.0.5, and 11.1.3.
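The version check for PAN-229832 can be scripted when many firewalls need to be reviewed. The following is a minimal Python sketch, not an official tool. It assumes version strings of the form "X.Y.Z" or "X.Y.Z-hN", and it uses only the four fixed versions listed in this article. The function names are hypothetical. For release trains not listed here, it returns None instead of guessing, and a hotfix build on an earlier maintenance release is reported as not fixed, so confirm such builds against the release notes.

```python
import re

# First fixed maintenance release per train for PAN-229832,
# as listed in this article. Other trains are not covered here.
FIXED_IN = {
    (10, 1): (14, 0),
    (10, 2): (11, 0),
    (11, 0): (5, 0),
    (11, 1): (3, 0),
}

# Assumed version format: "X.Y.Z" with an optional "-hN" hotfix suffix.
_VERSION_RE = re.compile(r"^(\d+)\.(\d+)\.(\d+)(?:-h(\d+))?$")


def parse_panos_version(text):
    """Split a version string into (major, minor, maintenance, hotfix)."""
    match = _VERSION_RE.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognized PAN-OS version: {text!r}")
    major, minor, maint, hotfix = match.groups()
    return int(major), int(minor), int(maint), int(hotfix or 0)


def has_pan_229832_fix(version):
    """Return True or False for listed trains, None for any other train."""
    major, minor, maint, hotfix = parse_panos_version(version)
    fixed = FIXED_IN.get((major, minor))
    if fixed is None:
        return None  # train not listed in the article: unknown
    return (maint, hotfix) >= fixed
```

For example, has_pan_229832_fix("10.2.10-h3") returns False, because 10.2.10 is earlier than 10.2.11 in the table above.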
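The devsrv.log signatures from the common issues above can be matched automatically. The following is a rough Python sketch, assuming the devsrv.log lines have been exported from the firewall and are available as text. The substrings are copied from the log samples in this article. The labels and the function name are hypothetical names chosen for this sketch, and the list is not exhaustive.

```python
# Substrings copied from the devsrv.log samples in this article.
# The labels on the right are hypothetical names used only in this sketch.
SIGNATURES = [
    ("unable to get local issuer certificate", "certificate-validation"),
    ("Peer certificate cannot be authenticated", "certificate-validation"),
    ("invalid content version", "authentication-failure"),
    ("Got HTTP resp code 404", "cloud-side-404"),
    ("Got HTTP resp code 500", "cloud-side-500"),
    ("Couldn't resolve host name", "dns-resolution"),
]


def classify_devsrv_lines(lines):
    """Return labels of the known MLAV issues, in order of first appearance."""
    found = []
    for line in lines:
        for needle, label in SIGNATURES:
            # Report each issue once, even if many lines match it.
            if needle in line and label not in found:
                found.append(label)
    return found
```

The function accepts any iterable of strings, so an open file object for the exported log can be passed directly. A label only points to the matching item in the list above. The resolution still has to be taken from that item.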
Additional Information
IMPORTANT NOTE:
If you restart the device-server using the command "debug software restart process device-server", you will need to perform a "commit force" for the firewall to reconnect to the MLAV cloud server.
In some cases, restarting the device-server followed by a commit force can be used to attempt a reconnection to the MLAV Cloud server. However, this process can be disruptive, so it is recommended to use this approach with caution and preferably during a maintenance window.