Management CPU is 100% because of '%wa'
Created On 06/11/20 21:38 PM - Last Modified 07/29/20 22:12 PM
Symptom
- Management CPU stays at 100% in the dashboard widget (Dashboard > System Resources)
- The CLI output shows the following:
> show system resources
top - 23:00:40 up 67 days, 9:08, 3 users, load average: 17.42, 17.03, 17.98
Tasks: 145 total, 4 running, 139 sleeping, 1 stopped, 1 zombie
Cpu(s): 1%us, 0.2%sy, 0%ni, 0%id, 98.8%wa, 0.0%hi, 0.8%si, 0.0%st
Mem: 3849872k total, 3737992k used, 111880k free, 65284k buffers
Swap: 3056660k total, 837292k used, 2219368k free, 1196828k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
4092 20 0 1995m 345m 10m S 71.8 9.2 9348:08 mgmtsrvr
9333 20 0 313m 118m 17m D 27.2 3.1 0:02.64 pan_logquery
4326 20 0 1891m 379m 8168 S 13.6 10.1 6921:44 logrcvr
3960 20 0 1141m 403m 115m S 3.9 10.7 2584:54 useridd
20128 15 -5 95632 9992 3428 S 3.9 0.3 0:47.03 sysd
----- Skipped -----
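To make the reading of the `Cpu(s)` line explicit, here is a minimal Python sketch that splits that line into its per-state percentages and flags a high `%wa`. The function name `parse_cpu_line` and the `WA_THRESHOLD` value are assumptions made for illustration only; they are not part of PAN-OS or of any documented guidance.

```python
import re

def parse_cpu_line(line):
    """Return a dict mapping each CPU state (us, sy, ..., wa) to its percentage."""
    # Each field looks like "98.8%wa": capture the number and the state label.
    return {state: float(value) for value, state in re.findall(r"([\d.]+)%(\w+)", line)}

# The Cpu(s) line copied from the output above.
cpu_line = "Cpu(s): 1%us, 0.2%sy, 0%ni, 0%id, 98.8%wa, 0.0%hi, 0.8%si, 0.0%st"
cpu = parse_cpu_line(cpu_line)

# Hypothetical threshold, chosen only for this example.
WA_THRESHOLD = 50.0
if cpu["wa"] > WA_THRESHOLD:
    print(f"I/O wait dominates: {cpu['wa']}%wa, only {cpu['id']}% idle")
```

In this output almost all of the non-idle time is `%wa` rather than `%us` or `%sy`, which is why the widget reports 100% even though little CPU time goes to user or system work.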
Environment
- PA-3020
- 8.1.11
Cause
'%wa' is the percentage of time the CPU has spent waiting for I/O to complete. A high value may indicate heavy swap usage, which can be caused by excessive logging. In the output below, the log incoming rate (23231/sec) is roughly nine times the written rate (2604/sec), and a large number of logs are being discarded because the queue is full.
> debug log-receiver statistics
Logging statistics
------------------------------ -----------
Log incoming rate: 23231/sec
Log written rate: 2604/sec
Corrupted packets: 0
Corrupted URL packets: 0
Corrupted HTTP HDR packets: 0
Corrupted HTTP HDR Insert packets: 0
Corrupted EMAIL HDR packets: 0
Logs discarded (queue full): 23085761
Traffic logs written: 7192191877
GTP logs written: 0
Tunnel logs written: 0
Auth logs written: 0
Userid logs written: 861133263
SCTP logs written: 0
URL logs written: 1032529198
Wildfire logs written: 0
Anti-virus logs written: 85
Widfire Anti-virus logs written: 83
Spyware logs written: 42482
Spyware-DNS logs written: 348301
Attack logs written: 0
Vulnerability logs written: 655878
Fileext logs written: 15035912
Fileext logs URL not written: 14753913
Fileext logs URL not written (timedout): 34467
URL cache age out count: 0
URL cache full count: 222305358
URL cache key exist count: 59082772
URL cache wrt incomplete http hdrs count: 0
URL cache rcv http hdr before url count: 3
URL cache full drop count(url log not received): 0
One can also check the logging rate from the data plane:
> show counter global filter delta yes | match log
log_url_cnt 1569 121 info log system Number of url logs
log_urlcontent_cnt 104 8 info log system Number of url content logs
log_uid_req_cnt 786 60 info log system Number of uid request logs
log_vulnerability_cnt 1 0 info log system Number of vulnerability logs
log_fileext_cnt 32 2 info log system Number of file block logs
log_traffic_cnt 29484 2279 info log system Number of traffic logs
log_dlp_cnt 1 0 info log system Number of DLP logs
log_suppress 420 32 info log system Logs suppressed by log suppression
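The comparison made above can be sketched in a few lines of Python: extract the incoming and written rates from the log-receiver statistics, compute their ratio, and find which data-plane log counter has the highest per-second rate. The helper names `parse_rate` and `counter_rates` are hypothetical, and the sketch assumes the counter columns are name, value, rate, severity, category, aspect, and description. The sample text is copied from the outputs above.

```python
import re

def parse_rate(stats_text, label):
    """Extract the integer N from a '<label>: N/sec' line."""
    match = re.search(rf"{re.escape(label)}:\s*(\d+)/sec", stats_text)
    if match is None:
        raise ValueError(f"{label!r} not found")
    return int(match.group(1))

def counter_rates(counter_text):
    """Map counter name -> per-second rate (assumed to be the third column)."""
    rates = {}
    for line in counter_text.strip().splitlines():
        fields = line.split()
        # Assumed columns: name, value, rate, severity, category, aspect, description
        if len(fields) >= 3 and fields[1].isdigit() and fields[2].isdigit():
            rates[fields[0]] = int(fields[2])
    return rates

stats = """
Log incoming rate: 23231/sec
Log written rate: 2604/sec
"""
counters = """
log_url_cnt 1569 121 info log system Number of url logs
log_traffic_cnt 29484 2279 info log system Number of traffic logs
log_suppress 420 32 info log system Logs suppressed by log suppression
"""

incoming = parse_rate(stats, "Log incoming rate")
written = parse_rate(stats, "Log written rate")
ratio = incoming / written
print(f"incoming/written = {ratio:.1f}x")

rates = counter_rates(counters)
busiest = max(rates, key=rates.get)
print(f"busiest data-plane log counter: {busiest} at {rates[busiest]}/sec")
```

With these figures the traffic log counter accounts for most of the data-plane logging rate, which suggests starting with traffic logging when looking for logs to reduce.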
Resolution
The following suggestions help reduce logging:
- Avoid logging unnecessary traffic (e.g. DNS, NTP, OSPF/BGP, IKE, and other non-user-related traffic).
- If log forwarding is enabled, reduce the number of logs being forwarded.
- Do not log at session start.
- Ensure that no data-plane debugs are enabled. If any are enabled, disable them.
- Disable any management-plane debugs.
Additional Information
For additional information, please review the following articles: