Genindex.sh causing high MP CPU after upgrade

Created On 09/03/20 22:20 PM - Last Modified 09/04/20 21:42 PM


Symptom


After an upgrade to one of the affected releases listed in the Environment section, the genindex.sh process shows sustained 100% CPU usage on the management plane (MP).

CPU usage of genindex.sh stays at 100% and does not drop for days or weeks. Alternatively, CPU spikes may be observed 30 days after an upgrade, or the CLI and GUI may become inaccessible. On low-end platforms such as PA-2xx or PA-8xx, this behavior can impact the functioning of the firewall because a single CPU handles both data plane (DP) and management plane (MP) functions.

Expected Behavior: genindex.sh CPU spikes start every 15 minutes and last less than one minute.
 


Environment


  • Any Palo Alto Networks firewall except the 7k series
  • PAN-OS 8.1.14 - 8.1.16, 9.0.8 - 9.0.10, 9.1.0 - 9.1.5, 10.0.0 - 10.0.1


Cause


genindex.sh is a script that runs every 15 minutes and indexes log files (such as traffic, threat, and system) for faster log queries. For performance reasons, genindex.sh does not index all log directories, only those that have been modified in the last 30 days.

The script is CPU intensive and can result in CPU spikes every 15 minutes. These spikes are not detrimental to the firewall and last only until the script has finished indexing the logs.

Due to a fix introduced for PAN-132898, the modified timestamp of all log directories gets updated, causing genindex.sh to walk through all log directories instead of only those from the last 30 days. If logs have been stored for more than a year, the number of log directories traversed can reach 1,000, leading to sustained CPU usage by genindex.sh. This issue occurs only if you are running one of the affected releases (PAN-OS 8.1.14 - 8.1.16, 9.0.8 - 9.0.10, 9.1.0 - 9.1.5, 10.0.0 - 10.0.1).
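
The effect of the updated timestamps can be illustrated with a small Python sketch. This is not the actual genindex.sh logic, which is not shown in this article; the function names, the directory naming, and the 400-day history are assumptions made for illustration. It only models a selection rule based on modification time with a 30-day window.

```python
from datetime import datetime, timedelta

WINDOW_DAYS = 30  # window described in this article


def dirs_to_index(dir_mtimes, now, window_days=WINDOW_DAYS):
    """Return the directories whose modified timestamp falls inside the window."""
    cutoff = now - timedelta(days=window_days)
    return sorted(name for name, mtime in dir_mtimes.items() if mtime >= cutoff)


def touch_all(dir_mtimes, timestamp):
    """Model the side effect described above: every directory gets a new mtime."""
    return {name: timestamp for name in dir_mtimes}


if __name__ == "__main__":
    now = datetime(2020, 9, 3)
    # Hypothetical history: one log directory per day for the last 400 days.
    normal = {
        (now - timedelta(days=d)).strftime("%Y%m%d"): now - timedelta(days=d)
        for d in range(400)
    }
    touched = touch_all(normal, now)
    print(len(dirs_to_index(normal, now)), "directories selected normally")
    print(len(dirs_to_index(touched, now)), "directories selected after the timestamps are updated")
```

In this model, only the recent directories are selected while the timestamps are intact. Once every timestamp is updated, the whole history qualifies, which matches the sustained CPU usage described above.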


Resolution


  1. Verify whether the genindex.sh process is causing sustained high CPU. Several ways to verify this are listed below.
  • Open a CLI session and run “show system resources follow”. Check whether genindex.sh stays at high CPU for a long time. Normally its CPU usage drops within a minute.
  • Open a CLI session and run “less mp-log indexgen.log” and look for messages indicating that genindex.sh is generating indexes for older logs. The following example shows indexing of a log that is 7 months old.
generating indexes for weeklytrsum db on seqno actionflags xff_ip s_decrypted s_encrypted hostid src_category
src_profile dst_category dst_profile :/opt/pancfg/mgmt/logdb/weeklytrsum/1/20200130/pan.0000000000.log 
  • Open a CLI session, check the current retention period with the command “show system logdb-quota”, and verify which logs do not have a retention policy. In the example below, the retention days can be tuned.
<Output Omitted>
.........
traffic: Logs and Indexes: 1.3G Current Retention: 656 days
threat: Logs and Indexes: 124M Current Retention: 638 days
system: Logs and Indexes: 1.4G Current Retention: 656 days
<Output Omitted>
  2. Once confirmed, set Max Days to limit the number of days of logs stored on disk. This reduces the number of log directories that genindex.sh needs to process.
The retention can be set in the GUI: Device > Setup > Management > Logging and Reporting Settings > Log Storage > Max Days.
If Max Days is already configured, lower it to a value below the current setting.
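
When indexgen.log is long, the age check from step 1 can be done offline on a copy of the log. The following Python sketch assumes that each dated log directory appears in the path as an 8-digit YYYYMMDD component, as in the sample path shown in step 1. The function names and the 30-day threshold parameter are illustrative and not part of PAN-OS.

```python
import re
from datetime import date, datetime

# An 8-digit YYYYMMDD directory component, as in the sample path
# /opt/pancfg/mgmt/logdb/weeklytrsum/1/20200130/pan.0000000000.log
DATE_DIR = re.compile(r"/(\d{8})/")


def log_dir_age_days(line, today):
    """Return the age in days of the dated directory in `line`, or None."""
    match = DATE_DIR.search(line)
    if match is None:
        return None
    try:
        dir_date = datetime.strptime(match.group(1), "%Y%m%d").date()
    except ValueError:  # eight digits that do not form a valid date
        return None
    return (today - dir_date).days


def older_than(lines, today, window_days=30):
    """Collect (age, line) pairs for directories older than the window."""
    hits = []
    for line in lines:
        age = log_dir_age_days(line, today)
        if age is not None and age > window_days:
            hits.append((age, line.strip()))
    return hits


if __name__ == "__main__":
    sample = ("src_profile dst_category dst_profile "
              ":/opt/pancfg/mgmt/logdb/weeklytrsum/1/20200130/pan.0000000000.log")
    for age, line in older_than([sample], today=date(2020, 9, 3)):
        print(f"{age} days old: {line}")
```

Entries reported here that are far older than 30 days suggest that genindex.sh is walking directories it would normally skip.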

Configure Log Storage Quotas and Expiration Periods
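
To decide which log types need a lower Max Days value, the “show system logdb-quota” output can be parsed offline. The Python sketch below assumes the line format shown in the excerpt in step 1; the full output on a real firewall may differ. The function names and the 640-day threshold are hypothetical; this article does not recommend a specific number of days.

```python
import re

# Line format taken from the excerpt in this article, for example:
# traffic: Logs and Indexes: 1.3G Current Retention: 656 days
RETENTION_LINE = re.compile(
    r"^\s*(?P<log_type>[\w-]+):\s+Logs and Indexes:\s+(?P<size>\S+)"
    r"\s+Current Retention:\s+(?P<days>\d+)\s+days"
)


def parse_retention(output):
    """Map each log type to its current retention in days."""
    retention = {}
    for line in output.splitlines():
        match = RETENTION_LINE.match(line)
        if match:
            retention[match.group("log_type")] = int(match.group("days"))
    return retention


def over_limit(retention, max_days):
    """Return the log types whose retention exceeds max_days."""
    return {name: days for name, days in retention.items() if days > max_days}


if __name__ == "__main__":
    excerpt = """\
traffic: Logs and Indexes: 1.3G Current Retention: 656 days
threat: Logs and Indexes: 124M Current Retention: 638 days
system: Logs and Indexes: 1.4G Current Retention: 656 days
"""
    # 640 is an arbitrary threshold chosen only for this example.
    print(over_limit(parse_retention(excerpt), max_days=640))
```

Lines that do not match the expected format, such as separators or omitted output markers, are skipped rather than treated as errors.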

General recommendation:

If predefined reports are enabled and you are not using them, consider disabling them to improve logging and reporting performance. This is especially important for low-end platforms.

Disable Predefined Reports



 

