
How to troubleshoot inter Log Collector connection issue

Created On 05/11/23 05:46 AM - Last Modified 12/21/24 03:26 AM


Objective


  • To troubleshoot and resolve issues related to Panorama and Log Collector connectivity and synchronization.
  • By following these instructions, users should be able to diagnose and rectify common problems in their Panorama and Log Collector setups.
  • Common issues seen with this setup, and covered below, include:
    • Firewall logs are not sent to Panorama/Log Collector.
    • Log Collector is disconnected from Panorama.
    • Some logs are missing or delayed when a query is launched.

 



Environment


  • Any Panorama
  • Log Collectors
  • PAN-OS 10.0 and above.


Procedure


  1. Verify that the appropriate permissions are in place for the Panoramas (Panorama mode and/or Log Collector mode) to communicate.
    1. Refer to Troubleshooting Panorama Connectivity.
    2. Make sure TCP port 28270 is open between the devices if a firewall is in the path.
    3. If PAN-OS 11.1 or higher is used, ensure port 28 and ports 9300-9302 are open; inter-Log Collector communication uses these ports. A quick reachability check is sketched below.
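A minimal reachability sketch, assuming the commands are run from the Panorama/Log Collector CLI; the IP address below is only an example taken from the ms.log sample later in this article and should be replaced with the peer device's address.
> ping count 3 host 10.253.0.106
> show netstat numeric yes | match 9300
If the ping fails, or no listening/established sessions show up for the expected ports, revisit the security policy and routing on the firewall in the path before continuing.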
  2. Check for any flapping connection between the devices. Execute the command below a few times to check for frequent disconnections/reconnections; a live-monitoring example follows the sample output.
> show netstat numeric yes | match 28270
Proto Recv-Q Send-Q Local Address           Foreign Address         State   
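To catch a flapping connection in real time, ms.log (used in the next step) can also be followed and filtered on the connection messages; this is a sketch, matching on the "COMM" keyword seen in the log sample below.
> tail follow yes mp-log ms.log | match COMM
Press Ctrl+C to stop following the log. Repeated "connection established" lines within a short interval indicate the session is flapping.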
  3. If previously working communication between the Panoramas/Log Collectors stops:
    1. Check ms.log for error messages using the command below.
> less mp-log ms.log
01:20:06.757 -0700 COMM: connection established. sock=29 remote ip=10.253.0.106 port=3978 local port=60640
01:20:06.757 -0700 cms agent: Pre. send buffer limit=46080. s=29
01:20:06.757 -0700 cms agent: Post. send buffer limit=425984. s=29
01:20:06.757 -0700 Error: cs_load_certs_ex(cs_common.c:655): keyfile not exists
01:20:06.757 -0700 Error: pan_cmsa_tcp_channel_setup(src_panos/cms_agent.c:883): cms agent: cs_load_certs_ex failed
01:20:06.757 -0700 cmsa: client will use default context
01:20:06.757 -0700 Warning: pan_cmsa_tcp_channel_setup(src_panos/cms_agent.c:988): client will not use SNI
09:13:27.723 -0800 Error: sc3_ca_exists(sc3_certs.c:221): SC3: Failed to get the current CA name. 
09:13:27.723 -0800 Warning: sc3_init_sc3(sc3_utils.c:351): SC3: Failed to get the Current CC name
      1. On a dedicated Log Collector, the sc3 issues seen in ms.log can be resolved by resetting the connection. Refer to How to reset secure communication between firewall and Panorama.
      2. Do not run the sc3 reset command on a Panorama managing firewalls or in mixed mode; running the command will cause all firewalls to disconnect from Panorama. Consult Support for resolution.
    2. Was there any upgrade to one of the devices in the Collector Group? If yes, ensure all devices within the Collector Group run the same version (a quick version check is sketched below).
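This is a generic version check, not specific to Collector Groups; run it on each Panorama/Log Collector and compare the results.
> show system info | match sw-version
All members of the same Collector Group should report the same sw-version.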
    3. Restart the management server on both devices to reset the connection.
debug software restart process management-server
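The management server takes a short time to come back up after the restart. Assuming the command below is available on your Panorama release, it lists the software processes so you can confirm the management server process is running again before re-testing the connection.
> show system software status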
  4. Verify that the Log Collectors are all healthy from Panorama's perspective, including the Log Collector local to Panorama; a CLI status check is also sketched after the list of KBs below.
    1. From Panorama's GUI: Panorama > Managed Collectors
    2. Depending on the issue seen, use the following KBs accordingly.
      1. Log-Collector showing status as "out of sync" and "disconnected" 
      2. Log Collector showing Ring version mismatch
      3. Log Collector showing 'Out of Sync' due to "IP mismatch for mgmt interface"
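As a sketch of the equivalent CLI check from Panorama, the command below shows the connection and configuration status of a specific Log Collector; <serial-number> is a placeholder for the Log Collector's serial number.
> show log-collector serial-number <serial-number>
Review the connected and config-sync status reported in the output.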
  5. Verify that the Log Collector's ElasticSearch is healthy, including the Log Collector local to the Panorama.
    1. Below is a sample of an unhealthy ElasticSearch.
admin@Log_Collector> debug elasticsearch es-state option health 

epoch      timestamp cluster         status node.total node.data shards pri relo init unassign pending_tasks max_task_wait_time active_shards_percent
1683877500 07:45:00  __pan_cluster__ red             1         1    461 461    0    0      115             0                  -     80.03472222222221
    2. Restart ElasticSearch using the command below and recheck the health. Note: if multiple Log Collectors are configured, identify the Log Collector that is having trouble and restart ElasticSearch only on that Log Collector. If help is needed, contact Support.
admin@Log_Collector> debug elasticsearch es-restart option all
Elasticsearch was restarted with option all by user admin
    3. Below is a sample of a healthy ElasticSearch.
admin@Log_Collector> debug elasticsearch es-state option health 

epoch      timestamp cluster         status node.total node.data shards pri relo init unassign pending_tasks max_task_wait_time active_shards_percent
1683877131 07:38:51  __pan_cluster__ green           1         1     32  32    0    0        0             0                  -                100.0%
  6. If logs are missing or delayed, refer to Log forwarding delays or Missing Logs due to high latency between log collectors in a collector group. A per-firewall forwarding status check is sketched below.
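To confirm whether logs from a particular firewall are reaching the Collector Group and how far behind they are, the Panorama command below can be used; <firewall-serial> is a placeholder for the firewall's serial number.
> show logging-status device <firewall-serial>
Compare the last-log-created and last-log-forwarded/received timestamps in the output to gauge the delay.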

