How to interpret output of "debug dataplane pow performance" during troubleshooting high DP CPU
74107
Created On 12/31/18 06:45 AM - Last Modified 06/09/23 09:24 AM
Objective
The SSO (Schedule/ Synchronisation/ Order) unit is the component of the Dataplane which manages packet scheduling and ordering. It is also known as POW (Packet / Order / Work). It is responsible for scheduling work to cores based on QOS Priority and Work Groups. It also maintains packet ingress order.
A work group is a set of applications and each core can be assigned to more than one work group. Work groups are used to balance the processing load among the different cores. The work group value of a packet is set automatically on ingress, and can be changed by software.
Functions are a subset of tasks executed by a core within a work group.
Note: functions and groups do not always have one-to-one mapping i.e. functions are not necessarily exclusive to a group. A function can be called from within different groups for e.g. func A can be called at some stage of processing from Group A and then later from Group B at a different stage of processing
This article provides some pointers on how to interpret pow performance data.
Environment
This output is common to all PAN-OS firewalls both hardware or virtual.
Procedure
- The output of "debug dataplane pow performance" provides the cumulative performance statistics of groups and functions since the last time it was reset.
- This data gets reset every 10 minutes by default when dp-monitor runs.
- It can also be manually reset by the command - “debug dataplane reset pow”
- This data contains:
- count : Number of times a work group and function was scheduled
- avg.proc us : Average processing time in microseconds (us) for each work groups and function
- max. proc us : Maximum processing time in microseconds (us) for each work groups and function
- 9.1+ onwards there are additional fields in pow data namely total-us, ac-max-us, ac-avg-us
- total-us : avg-us * count
- ac-count : accumulated count
- ac-max-us : accumulated max-us since last dp restart/reboot
- ac-avg-us : accumulated avg-us since last dp restart/reboot
Sample output pre 9.1:
(Sample out put from the groups section)
(Sample out put from the groups section)
> debug dataplane pow performance group max-us avg-us count :flow_lookup 0 0 0 :flow_fastpath 59441 15 8391514 :flow_slowpath 57527 75 262375 :flow_forwarding 64 3 1894246 :flow_mgmt 143 89 153 :flow_ctrl 55 5 911557 :nac_result 50 1 8536548 :flow_np 89 2 8794649 :dfa_result 6152 320 6451279 :module_internal 216 25 178952 :aho_result 1709 24 8528596 :zip_result 57 37 140 :pktlog_forwarding 84 8 137767 :flow_host 60 8 16144
Sample output post 9.1:
(Sample output from the functions section)
> debug dataplane pow performance func max-us avg-us count total-us ac-max-us ac-avg-us ac-count ac-total-us dfa_match 0 0.0 0 0 94 17.1 51 875 appid_match 0 0.0 0 0 726 680.0 4 2720 policy_lookup 0 0.0 0 0 145 37.8 88 3335 user_group_policy 0 0.0 0 0 14 5.8 88 513 get_gid 0 0.0 0 0 6 1.7 74 130 regex_lookup 0 0.0 0 0 11 7.0 42 298 regex_postfpga 0 0.0 0 0 0 0.0 0 0 reg_expr_match 0 0.0 0 0 0 0.0 0 0 sml_vm 0 0.0 0 0 87 10.4 566 5926 detector_run_p1 0 0.0 0 0 274 75.1 30 2255 detector_run_p2 0 0.0 0 0 96 26.4 32 847 ssl_proxy_proc 0 0.0 0 0 3450 510.9 43 21969 ssl_encode 0 0.0 0 0 148 61.2 15 918Explanation of workgroups :
flow_np : Network Processor. Very first program packets enter flow_lookup : Responsible for existing session/flow lookup. flow_mgmt : Installation and clearing of session table. flow_fastpath : Fastpath processing of packet (Includes post slowpath processing and content inspection). flow_slowpath : Slow path processing of packet (route and policy lookup and session setup). flow_forwarding : Handles forwarding stage of packets. dfa_result : DFA pattern match processing module_internal : Control messages between internal components aho_result : DFA pattern match processing zip_result : Used for ZIP processing. pktlog_forwarding : Responsible for forwarding logs to MP. flow_host : Process handling packets destined to Firewall flow_ctrl : Aging sessions and handling net-messages nac_result : Analyzing result of signature match output
Classification of some functions:
Layer-7 processing related functions - appid + ctd (includes L7 profiles)
dfa_match appid_match get_gid regex_lookup regex_postfpga reg_expr_match sml_vm detector_run_p1 detector_run_p2 appid_result ctd_token cad_threatsig_match cad_appsig_match mlav_cache_lookup mlav_xgb_model_classify mlav_lr_model_classify http2_proc_recv http2_proc_send hs_match hs_ctxt_free_match fpga_aho fpga_dfa ctd_pw_check ctd_pw_md4 ctd_pw_extract prl_rewrite pkt_rewrite raven_match raven_match_header raven_match_body raven_sig_get_action zip_deflateDecryption/Encryption related functions
ssl_proxy_proc ssl_encode rsa_operation_pub_enc rsa_operation_pub_dec rsa_operation_priv_enc rsa_operation_priv_dec rsa_key_gen eckey_dup ecdhe_key_gen ecdhe_key_gen_mul ecdhe_key_compute_mul ecdhe_key_compute ecdhe_generate_xchg_key ecdhe_gen_key_exchange_msg ecdhe_get_key_exchange_msg ecdhe_parse_server_key_exchange_msg ecdhe_gen_server_key_exchange_msg dhe_gen_para dhe_gen_key dhe_key_compute bn_mod_exp cipher_enc rsa_verify_sw rsa_sign_sw poll_delay_qat qat_packet_latency qat_batch_latency ecdsa_verify_sw ecdsa_sign_sw ecdhe_key_gen_sw ecdhe_key_compute_sw rsa_verify_hw rsa_sign_hw ecdsa_verify_hw ecdsa_sign_hw ecdhe_key_gen_hw ecdhe_key_compute_hw rsa_verify_hw_latency rsa_sign_hw_latency ecdsa_verify_hw_latency ecdsa_sign_hw_latency ecdhe_key_gen_hw_latency ecdhe_key_compute_hw_latency
Tunnel related functions
tunnel_encap tunnel_decap tunnel_esp tunnel_esp_decap tunnel_ah_decap tunnel_ah tunnel_fwd tunnel_prepare tunnel_post vpn_send_to_client vpn_cookie_response
URL filtering related functions:
urlcache_update urlcache_lookup urlcache_insert urlcache_delete urlcache_lru
The pow performance output gives us very necessary insight into the DP CPU usage
- Compare values between times when CPU was normal to when CPU is high
- Ascertaining functions that are spiking can help determine what is contributing to high DP CPU. For instance if Layer-7 processing functions are spiking indicates high DP CPU is caused by traffic that is subjected to Layer-7 processing
- Spike in avg-us indicates change in traffic pattern and
- Spike in count indicates increase in traffic volume
Additional Information
Case Study:
For example, CPU is 100% and workgroups look as below:
:group max. proc us ave. proc us count :flow_lookup 0 0 0 :flow_fastpath 122087 55 33971601 :flow_slowpath 199 72 345516 :flow_forwarding 75 2 8910565 :flow_mgmt 57 29 69 :flow_ctrl 100 6 302011 :nac_result 50 1 2015416 :flow_np 66 4 33255061 :dfa_result 21794 242 2059391 :module_internal 525 18 173530 :aho_result 119981 154 2014275 :zip_result 16771 234 197454 :pktlog_forwarding 84 7 287854 :lwm 0 0 0 :flow_host 52 11 5494
We know from above that fastpath is highest, but to find reason of which component is using more CPU lets look at the functions for the same:
:func max. proc us ave. proc us count :dfa_match 0 0 0 :policy_lookup 125 10 2239062 :user_group_policy 87 1 477903 :get_gid 41 0 1061755 :regex_lookup 6217 102 2233481 :regex_postfpga 910 16 1039761 :reg_expr_match 0 0 0 :sml_vm 208 6 31643903 :detector_run_p1 2342 34 906163 :detector_run_p2 31327 79 889286 :ssl_proxy_proc 63072 600 2500540 :ssl_encode 202 36 805522 :rsa_operation_pub_enc 766 309 6688 :rsa_operation_pub_dec 648 263 116 :rsa_operation_priv_enc 8216 8168 14362 :rsa_operation_priv_dec 8260 8178 6427 :rsa_key_gen 40126 23467 43 :rsa_RSA_sign 13840 13236 33704 :rsa_RSA_verify 963 430 71904 :ecdhe_key_gen 22925 7559 18 :ecdhe_key_gen_mul 0 0 0 :ecdhe_key_compute_mul 22121 6640 82229 :ecdhe_key_compute 22565 6856 82224 :ecdhe_generate_xchg_key 23514 7948 18 :ecdhe_gen_key_exchange_msg 22630 6949 43919 :ecdhe_get_key_exchange_msg 22656 6932 38261 :ecdhe_parse_server_key_exchange_msg 320 244 45817 :ecdhe_gen_server_key_exchange_msg 78 29 43561 :dhe_gen_para 0 0 0 :dhe_gen_key 0 0 0 :dhe_key_compute 0 0 0 :bn_mod_exp 8226 6196 27618 :cipher_enc 142 9 1956671 :zip_deflate 50 2 197543 :pktlog_log 0 0 0 :pktlog_send 0 0 0 :tunnel_encap 0 0 0 :tunnel_decap 0 0 0 :tunnel_esp 0 0 0 :tunnel_esp_decap 0 0 0 :tunnel_ah_decap 0 0 0 :tunnel_ah 0 0 0 :tunnel_fwd 0 0 0 :tunnel_prepare 0 0 0 :tunnel_post 0 0 0 :appid_result 0 0 0 :ctd_token 21789 239 2060687 :dos_update 17 0 212713 :urlcache_update 20065 178 2555 :urlcache_lookup 107402 125 540508 :urlcache_insert 0 0 0 :urlcache_delete 0 0 0 :urlcache_lru 0 0 0 :session_install 59 8 217532 :session_age 36 0 875144 :session_delete 165 11 212717 :session_purge 0 0 0 :inline_switch 0 0 0 :age_arp 20 2 595 :age_pbf_ret_mac 1 0 595 :stack_trace 0 0 0 :ctd_pw_check 0 0 0 :ctd_pw_md4 0 0 0 :ctd_pw_extract 0 0 0 :prl_rewrite 0 0 0 :pkt_rewrite 0 0 0 :vpn_send_to_client 0 0 0 :vpn_cookie_response 0 0 0 :raven_match 46 2 131751 :raven_match_header 52 2 187227 :raven_match_body 51 14 6 :raven_sig_get_action 0 0 0 :age_mfib 1 0 595
In above, if you multiply the count by average time, you can get a clear idea of what is using most CPU. For example see below (output truncated only for top used functions)
:func max. proc us ave. proc us count count*avg :ssl_proxy_proc 63072 600 2500540 1500324000 :ecdhe_key_compute 22565 6856 82224 563727744 :ecdhe_key_compute_mul 22121 6640 82229 546000560 :ctd_token 21789 239 2060687 492504193 :rsa_RSA_sign 13840 13236 33704 446106144 :ecdhe_gen_key_exchange_msg 22630 6949 43919 305193131 :ecdhe_get_key_exchange_msg 22656 6932 38261 265225252 :regex_lookup 6217 102 2233481 227815062 :sml_vm 208 6 31643903 189863418 :bn_mod_exp 8226 6196 27618 171121128 :rsa_operation_priv_enc 8216 8168 14362 117308816 :detector_run_p2 31327 79 889286 70253594 :urlcache_lookup 107402 125 540508 67563500 :rsa_operation_priv_dec 8260 8178 6427 52560006 :rsa_RSA_verify 963 430 71904 30918720 :detector_run_p1 2342 34 906163 30809542 :ssl_encode 202 36 805522 28998792 :policy_lookup 125 10 2239062 22390620 :cipher_enc 142 9 1956671 17610039 :regex_postfpga 910 16 1039761 16636176 :ecdhe_parse_server_key_exchange_msg 320 244 45817 11179348
You can see most of the top used functions are related to SSL decryption. So this gives us a good idea of what further steps to take alleviate this.