
How to troubleshoot inter Log Collector connection issue

Created On 05/11/23 05:46 AM - Last Modified 12/21/24 03:26 AM


Objective


  • To troubleshoot and resolve issues related to Panorama and Log Collector connectivity and synchronization.
  • By following these instructions, users should be able to diagnose and rectify common problems in their Panorama and Log Collector setups.
  • Common issues seen with this setup, and covered below, include:
    • Firewall logs are not sent to Panorama/Log Collector.
    • Log Collector is disconnected from Panorama.
    • Some logs are missing or delayed when a query is launched.

 



Environment


  • Any Panorama
  • Log Collectors
  • PAN-OS 10.0 and above.


Procedure


  1. Verify that the appropriate permissions are in place for the Panoramas (Panorama mode and/or Log Collector mode) to communicate.
    1. Refer to Troubleshooting Panorama Connectivity.
    2. Make sure TCP port 28270 is open between the devices if a firewall is in the path.
    3. If PAN-OS 11.1 or higher is used, ensure port 28 and ports 9300-9302 are open; inter-Log Collector communication uses these ports. A quick reachability check is sketched below.
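A minimal reachability sketch, assuming the commands are run from the Panorama/Log Collector CLI; the IP address below is only an example taken from the ms.log sample later in this article and should be replaced with the peer device's address.
> ping count 3 host 10.253.0.106
> show netstat numeric yes | match 9300
If the ping fails, or no listening/established sessions show up for the expected ports, revisit the security policy and routing on the firewall in the path before continuing.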
  2. Check for any flapping connection between the devices. Execute the command below a few times to check for frequent disconnections/reconnections; a live-monitoring example follows the sample output.
> show netstat numeric yes | match 28270
Proto Recv-Q Send-Q Local Address           Foreign Address         State   
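To catch a flapping connection in real time, ms.log (used in the next step) can also be followed and filtered on the connection messages; this is a sketch, matching on the "COMM" keyword seen in the log sample below.
> tail follow yes mp-log ms.log | match COMM
Press Ctrl+C to stop following the log. Repeated "connection established" lines within a short interval indicate the session is flapping.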
  3. If previously working communication between the Panoramas/Log Collectors stops:
    1. Check ms.log for error messages using the command below.
> less mp-log ms.log
01:20:06.757 -0700 COMM: connection established. sock=29 remote ip=10.253.0.106 port=3978 local port=60640
01:20:06.757 -0700 cms agent: Pre. send buffer limit=46080. s=29
01:20:06.757 -0700 cms agent: Post. send buffer limit=425984. s=29
01:20:06.757 -0700 Error: cs_load_certs_ex(cs_common.c:655): keyfile not exists
01:20:06.757 -0700 Error: pan_cmsa_tcp_channel_setup(src_panos/cms_agent.c:883): cms agent: cs_load_certs_ex failed
01:20:06.757 -0700 cmsa: client will use default context
01:20:06.757 -0700 Warning: pan_cmsa_tcp_channel_setup(src_panos/cms_agent.c:988): client will not use SNI
09:13:27.723 -0800 Error: sc3_ca_exists(sc3_certs.c:221): SC3: Failed to get the current CA name. 
09:13:27.723 -0800 Warning: sc3_init_sc3(sc3_utils.c:351): SC3: Failed to get the Current CC name
      1. On a dedicated Log Collector, the sc3 issues seen in ms.log can be resolved by resetting the connection. Refer to How to reset secure communication between firewall and Panorama.
      2. Do not run the sc3 reset command on a Panorama managing firewalls or in mixed mode; running the command will cause all firewalls to disconnect from Panorama. Consult Support for resolution.
    2. Was there any upgrade to one of the devices in the Collector Group? If yes, ensure all devices within the Collector Group run the same version (a quick version check is sketched below).
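This is a generic version check, not specific to Collector Groups; run it on each Panorama/Log Collector and compare the results.
> show system info | match sw-version
All members of the same Collector Group should report the same sw-version.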
    3. Restart the management server on both devices to reset the connection.
debug software restart process management-server
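The management server takes a short time to come back up after the restart. Assuming the command below is available on your Panorama release, it lists the software processes so you can confirm the management server process is running again before re-testing the connection.
> show system software status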
  4. Verify that the Log Collectors are all healthy from Panorama's perspective, including the Log Collector local to Panorama; a CLI status check is also sketched after the list of KBs below.
    1. From Panorama's GUI: Panorama > Managed Collectors
    2. Depending on the issue seen, use the following KBs accordingly.
      1. Log-Collector showing status as "out of sync" and "disconnected" 
      2. Log Collector showing Ring version mismatch
      3. Log Collector showing 'Out of Sync' due to "IP mismatch for mgmt interface"
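As a sketch of the equivalent CLI check from Panorama, the command below shows the connection and configuration status of a specific Log Collector; <serial-number> is a placeholder for the Log Collector's serial number.
> show log-collector serial-number <serial-number>
Review the connected and config-sync status reported in the output.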
  5. Verify that the Log Collector's ElasticSearch is healthy, including the Log Collector local to the Panorama.
    1. Below is a sample of an unhealthy ElasticSearch.
admin@Log_Collector> debug elasticsearch es-state option health 

epoch      timestamp cluster         status node.total node.data shards pri relo init unassign pending_tasks max_task_wait_time active_shards_percent
1683877500 07:45:00  __pan_cluster__ red             1         1    461 461    0    0      115             0                  -     80.03472222222221
    2. Restart ElasticSearch using the command below and recheck the health. Note: if multiple Log Collectors are configured, identify the Log Collector that is having trouble and restart ElasticSearch only on that Log Collector. If help is needed, contact Support.
admin@Log_Collector> debug elasticsearch es-restart option all
Elasticsearch was restarted with option all by user admin
    3. Below is a sample of a healthy ElasticSearch.
admin@Log_Collector> debug elasticsearch es-state option health 

epoch      timestamp cluster         status node.total node.data shards pri relo init unassign pending_tasks max_task_wait_time active_shards_percent
1683877131 07:38:51  __pan_cluster__ green           1         1     32  32    0    0        0             0                  -                100.0%
  6. If logs are missing or delayed, refer to Log forwarding delays or Missing Logs due to high latency between log collectors in a collector group. A per-firewall forwarding status check is sketched below.
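To confirm whether logs from a particular firewall are reaching the Collector Group and how far behind they are, the Panorama command below can be used; <firewall-serial> is a placeholder for the firewall's serial number.
> show logging-status device <firewall-serial>
Compare the last-log-created and last-log-forwarded/received timestamps in the output to gauge the delay.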

