How to troubleshoot the connection failure to MLAV cloud server

Created On 08/14/24 17:52 PM - Last Modified 07/11/25 15:57 PM


Objective


  • Find the reason behind the connection loss between the firewall and the MLAV cloud server.
  • Restore the connection between the firewall and the MLAV cloud server.


Environment


  • Next Generation Firewall
  • MLAV


Procedure


  1. Check the status of the connection between the firewall and the MLAV cloud server. Run the command:
    > show mlav cloud-status
    Example output:
    > show mlav cloud-status
    MLAV cloud
    Current cloud server:   ml.service.paloaltonetworks.com
    Cloud connection:       disconnected
    In certain scenarios, you might get the following output:
    > show mlav cloud-status
    Server error : device busy or unavailable
    
  2. Check the TCP connection status between the firewall and the MLAV cloud server:
    1. Check which service route is used to reach the MLAV cloud server. It is the same route that is configured for "Palo Alto Networks Services".
      1. If the default is used, the traffic leaves through the management interface, and there is no need to add the "source" parameter to the command in 2.b.
      2. If it is configured to be a dataplane interface, add the IP address of that interface as the source of the ping.
    2. Use the ping command to check the network reachability to the MLAV cloud server and to resolve the IP address of its FQDN:
      1. For default or management service route to "Palo Alto Networks Services", use the command:
        ping host ml.service.paloaltonetworks.com
        
      2. For dataplane interface service route to "Palo Alto Networks Services", use the command:
        ping source <IP address of the dataplane interface configured as service route to PANW Services> host ml.service.paloaltonetworks.com
    3. Check the netstat output for the resolved IP address:
      show netstat numeric-hosts yes numeric-ports yes | match <IP address of the MLAV cloud server>
      Ensure that no device between the firewall interface used as service route to PANW Services and the MLAV cloud server is blocking port 443.
  3. For further troubleshooting, perform a packet capture of the traffic between the firewall interface used as service route to PANW Services and the MLAV cloud server. Also check devsrv.log for error messages related to MLAV:
    less mp-log devsrv.log
     
  4. Common issues and resolutions:
    1. MLAV server certificate validation error. This issue may occur if a device is decrypting the traffic between the firewall interface used as service route to PANW Services and the MLAV cloud server. The devsrv.log messages look like the following:
      2023-08-28 13:21:23.813 -0400 Error: pan_mlav_get_metadata(pan_mlav_cloud.c:1667): Failed to send metadata query
      2023-08-28 13:36:23.882 -0400 Error: pan_mlav_get_ver_from_file(pan_mlav_cloud.c:187): Failed to open file[/opt/pancfg/mgmt/content/pan_threatversion]
      2023-08-28 13:36:24.605 -0400 Error: verify_cb(pan_ssl_curl_utils.c:641): Error with certificate at depth: 2
      2023-08-28 13:36:24.605 -0400 Error: verify_cb(pan_ssl_curl_utils.c:643): Basic Validation of x509 cert Fail ; Code : 20
      2023-08-28 13:36:24.606 -0400 Error: verify_cb(pan_ssl_curl_utils.c:652): Failed to validate x509 cert from ctx: (20) unable to get local issuer certificate
      2023-08-28 13:36:24.606 -0400 Error: pan_mlav_send_query(pan_mlav_cloud.c:1394): Failed to send req type[3], curl error: Peer certificate cannot be authenticated with given CA certificates
      In this case the system log will show the message below:
      2023/08/08 16:07:21 high     tls            tls-X50 0   MLAV Server certificate validation failed. Dest Addr: ml.service.paloaltonetworks.com, Reason: unable to get local issuer certificate
      Resolution: One option is to exclude *.paloaltonetworks.com from decryption on that device.
    2. MLAV authentication failure. This is caused by a software issue that affects only firewalls that do not have a Threat Prevention license and are running a PAN-OS version earlier than 10.0.2. Note that a firewall does not require any license to establish a successful connection to the MLAV cloud server, and that the Machine Learning-based Antivirus (MLAV) feature was introduced in PAN-OS 10.0.0. In this case the devsrv.log messages look like the following:
      2020-08-12 15:37:02.062 +0800 Error: pan_mlav_get_ver_from_file(pan_mlav_cloud.c:183): Failed to open file[/opt/pancfg/mgmt/content/pan_threatversion]
      2020-08-12 15:37:02.203 +0800 Error: pan_mlav_process_http_resp_code(pan_mlav_cloud.c:538): Got HTTP resp code 400
      2020-08-12 15:37:02.203 +0800 Error: pan_mlav_resp_parser(pan_mlav_cloud.c:802): Server resp code[400], msglen[71], msg: invalid flatbuffer request: invalid content version (with build number)
      2020-08-12 15:37:02.203 +0800 Error: pan_mlav_send_query(pan_mlav_cloud.c:1301): Failed to parse type[3] response
      2020-08-12 15:37:02.203 +0800 Error: pan_mlav_get_metadata(pan_mlav_cloud.c:1555): Failed to send metadata query
      2020-08-12 15:37:02.203 +0800 Error: pan_mlav_check_model(pan_mlav_cloud.c:1793): Failed to get metadata from cloud
      2020-08-12 15:37:02.203 +0800 Error: pan_mlav_req_handler(pan_mlav_cloud.c:1915): meta download s_connection_tries [1]
      2020-08-12 15:37:02.203 +0800 mlav force_reconnect is true
      In this case the system log will show the message below:
      2020/07/24 20:59:32 high     general        general 0  MLAV: Authentication or Client Certificate failure.
      Resolution: Upgrade your firewall to the current recommended PAN-OS release.
    3. MLAV unknown error. In this case the system log will show the message below:
      2022/09/22 20:30:01 medium   general        general 0  MLAV: Unknown error.
      Further investigation is required, particularly checking the devsrv.log.
      1. If devsrv.log shows the message below:
        2024-07-26 08:49:42.148 +0900 Error: pan_mlav_process_http_resp_code(pan_mlav_cloud.c:558): Got HTTP resp code 404
        The issue is most likely on the MLAV cloud server side. Resolution: Contact customer support if the issue persists.
      2. If the devsrv.log shows the message below:
        2022-09-22 20:30:02.513 +0900 Error:  pan_mlav_process_http_resp_code(pan_mlav_cloud.c:558): Got HTTP resp code 500
        2022-09-22 20:30:02.513 +0900 Error:  pan_mlav_resp_parser(pan_mlav_cloud.c:822): Server resp code[500], msglen[244], msg:
        The issue is most likely related to the Google Cloud Platform (GCP) load balancer. Resolution: Contact customer support if the issue persists.
    4. MLAV cloud error, all machine Learning engines stopped. In this case the system log will show the message below:
      2022/04/22 15:29:53 high     general        general 0  MLAV cloud error, all machine Learning engines stopped
      The devsrv.log will show the message below:
      2022-04-26 07:32:59.061 -0500 Error:  pan_mlav_send_query(pan_mlav_cloud.c:1396): Failed to send req type[3], curl error: Couldn't resolve host name
      Resolution: Check whether the firewall is running a PAN-OS version affected by PAN-229832. The issue is fixed starting with versions 10.1.14, 10.2.11, 11.0.5, and 11.1.3.
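
When devsrv.log has been exported from the firewall, the signatures listed under "Common issues and resolutions" can be searched for in one pass. The following is a minimal Python sketch, not a Palo Alto Networks tool. The function name classify_devsrv_lines and the SIGNATURES table are hypothetical, and the substrings are taken only from the sample log lines in this article, so other wording in a real log will not be matched.

```python
# Substrings copied from the sample devsrv.log lines in this article, paired
# with the issue they point to. The table itself is an illustrative assumption.
SIGNATURES = [
    ("unable to get local issuer certificate",
     "MLAV server certificate validation error: look for a device decrypting the traffic"),
    ("invalid content version",
     "MLAV authentication failure: check license and PAN-OS version (earlier than 10.0.2)"),
    ("Got HTTP resp code 404",
     "MLAV unknown error (HTTP 404): most likely on the MLAV cloud server side"),
    ("Got HTTP resp code 500",
     "MLAV unknown error (HTTP 500): most likely related to the GCP load balancer"),
    ("Couldn't resolve host name",
     "MLAV cloud error: name resolution failed, check PAN-229832"),
]


def classify_devsrv_lines(lines):
    """Return the distinct diagnoses matched in the given log lines,
    in the order in which they first appear."""
    findings = []
    for line in lines:
        for needle, diagnosis in SIGNATURES:
            if needle in line and diagnosis not in findings:
                findings.append(diagnosis)
    return findings
```

A typical use would be to read the exported file with open(path).readlines() and print each returned diagnosis. An empty result only means that none of the sample signatures appeared; it does not prove that the connection is healthy.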
 

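The PAN-229832 check can also be expressed as a small version comparison. This is a sketch in Python under two assumptions: the version string has the form X.Y.Z with an optional suffix such as a hotfix tag, and only the four release trains named in this article are known. The names FIXED_IN, parse_panos_version and has_pan_229832_fix are hypothetical. For any other release train the function returns None instead of guessing.

```python
import re

# First fixed maintenance release per release train, as listed in this article.
FIXED_IN = {(10, 1): 14, (10, 2): 11, (11, 0): 5, (11, 1): 3}


def parse_panos_version(version):
    """Parse 'X.Y.Z' (any trailing suffix is ignored) into a tuple of ints."""
    match = re.match(r"^(\d+)\.(\d+)\.(\d+)", version.strip())
    if not match:
        raise ValueError(f"unrecognized PAN-OS version: {version!r}")
    return tuple(int(group) for group in match.groups())


def has_pan_229832_fix(version):
    """Return True or False for the release trains listed in the article,
    and None when the release train is not listed there."""
    major, minor, patch = parse_panos_version(version)
    fixed_patch = FIXED_IN.get((major, minor))
    if fixed_patch is None:
        return None
    return patch >= fixed_patch
```

For example, has_pan_229832_fix("10.1.13") evaluates to False and has_pan_229832_fix("10.1.14") to True. A None result means the article gives no information for that release train, so the release notes should be consulted.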

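The reachability checks in step 2 (name resolution and TCP port 443) can be repeated from a host that sits on the same network path as the service-route interface, which helps to tell a firewall problem from an upstream block. The sketch below uses only the Python standard library. It does not run on the firewall, and the helper names resolve_ipv4 and tcp_port_open are hypothetical. A result from this host only approximates what the firewall sees.

```python
import socket


def resolve_ipv4(fqdn):
    """Return the sorted, de-duplicated IPv4 addresses that DNS returns for fqdn."""
    infos = socket.getaddrinfo(fqdn, None, socket.AF_INET, socket.SOCK_STREAM)
    return sorted({info[4][0] for info in infos})


def tcp_port_open(host, port=443, timeout=5.0):
    """Return True if a TCP handshake to host:port completes within timeout seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable networks.
        return False
```

To use it, call resolve_ipv4("ml.service.paloaltonetworks.com") and pass each returned address to tcp_port_open. If the name resolves but the handshake fails, a device in the path blocking port 443 is a likely cause.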
Additional Information


IMPORTANT NOTE:

If you restart the device-server using the command "debug software restart process device-server", you must perform a "commit force" for the firewall to reconnect to the MLAV cloud server.

In some cases, restarting the device-server followed by a commit force can be used to attempt a reconnection to the MLAV Cloud server. However, this process can be disruptive, so it is recommended to use this approach with caution and preferably during a maintenance window.


