How to check WF-500 health status
Objective
How to check the health and operational state of the WF-500.
Environment
- WF-500
- WF-500-B
Procedure
- Run show system software status to check if the processes and applications are up and running.
admin@WF-500> show system software status
Overall control-plane status: running
----------------------------------------
Group 'all' will list status of all process members
Type Name State Info
Group all running
Group base running
Group batch running
Group cluster_svc running
Group dsms running
Group fips running
Group ha_ssh running
Group services running
Group third_party running
Group vm_mgr running
Group wf_3party running
Group wf_panav running
Group wf_redis running
Group wf_services running
Process appwkr_01_elink running (pid: 21397)
Process appwkr_02_doc running (pid: 21399)
Process appwkr_03_doc running (pid: 21396)
Process appwkr_04_doc running (pid: 21398)
Process appwkr_05_pe running (pid: 21400)
Process appwkr_06_url_upload_file running (pid: 21403)
Process appwkr_07_sessiononly running (pid: 21401)
Process appwkr_08_archive running (pid: 21402)
Process authd running (pid: 5399)
Process chasd running (pid: 4072)
Process cluster-mgr running (pid: 7408)
Process clusterd running (pid: 5381)
Process configd running (pid: 5375)
Process crypto running (pid: 4351)
Process dagger running (pid: 4025)
Process dockerd running (pid: 5395)
Process ehmon running (pid: 4071)
Process elinkbenignhdlr_01 running (pid: 2228)
Process elinkbenignhdlr_02 running (pid: 2201)
Process elinkparser_01 running (pid: 1953)
Process elinkparser_02 running (pid: 1954)
Process elinkparserupload_01 running (pid: 2227)
Process elinkparserupload_02 running (pid: 2226)
Process elinkrp_01 running (pid: 1917)
Process gdb running (pid: 4030)
Process gearmand running (pid: 8962)
Process ha-sshd running (pid: 4390)
Process ha_agent running (pid: 7654)
Process masterd running (pid: 3834)
Process mdadm running (pid: 5727)
Process mgmtsrvr running (pid: 9172)
Process mongodb running (pid: 8372)
Process monitor running (pid: 4026)
Process mysql running (pid: 5764)
Process mysql_local running (pid: 8892)
Process notifier_01 running (pid: 20296)
Process notifier_02 running (pid: 20299)
Process notifier_03 running (pid: 20228)
Process panavdns_01 running (pid: 20374)
Process panavdns_02 running (pid: 20373)
Process panavsync_01 running (pid: 20375)
Process panavsync_02 running (pid: 20372)
Process rabbitmq running (pid: 8234)
Process redis_6379 running (pid: 8886)
Process redis_6380 running (pid: 8909)
Process redis_6381 running (pid: 8931)
Process rsyncd running (pid: 8019)
Process sample_sync_01 running (pid: 9392)
Process sla_01 running (pid: 20227)
Process snmpd running (pid: 8093)
Process sshd running (pid: 4408)
Process sslmgr running (pid: 4035)
Process sysd running (pid: 3853)
Process sysdagent running (pid: 4034)
Process urlvmctrl_01 running (pid: 2435)
Process urlvmctrl_02 running (pid: 2499)
Process uwsgi running (pid: 21735)
Process varrcvr running (pid: 7625)
Process verdict_sync_01 running (pid: 20226)
Process vm_decoynet running (pid: 1931)
Process vm_torsvc running (pid: 1945)
Process vmctrl_01 running (pid: 2376)
Process vmctrl_02 running (pid: 2373)
Process vmctrl_03 running (pid: 2591)
Process vmctrl_04 running (pid: 2682)
Process vmctrl_05 running (pid: 2716)
Process vmctrl_06 running (pid: 2406)
Process vmctrl_07 running (pid: 2488)
Process vmctrl_08 running (pid: 2593)
Process vmctrl_09 running (pid: 2650)
Process vmctrl_10 running (pid: 2702)
Process vmctrl_11 running (pid: 2739)
Process vmctrl_12 running (pid: 2518)
Process vmctrl_13 running (pid: 2561)
Process vmctrl_14 running (pid: 2656)
Process vmctrl_15 running (pid: 2692)
Process vmctrl_16 running (pid: 2762)
Process vmctrl_17 running (pid: 3082)
Process vmctrl_18 running (pid: 2603)
Process vmctrl_19 running (pid: 2629)
Process vmctrl_20 running (pid: 2434)
Process vmctrl_21 running (pid: 2416)
Process vmctrl_22 running (pid: 2586)
Process vmctrl_23 running (pid: 2648)
Process vmctrl_24 running (pid: 2382)
Process vmctrl_25 running (pid: 2393)
Process vmctrl_26 running (pid: 2498)
Process vpnctl running (pid: 8016)
Process websrvr running (pid: 8219)
Process wf_devsrvr running (pid: 7506)
Process wf_lisasrvr running (pid: 5385)
Process wf_siggen running (pid: 20298)
Process wf_superv running (pid: 20295)
Process wf_task_queue running (pid: 5359)
admin@WF-500>
Example of an unhealthy state:
admin@WF-500> show system software status
Overall control-plane status: startChildren
----------------------------------------
Group 'all' will list status of all process members
Type Name State Info
Group all startChildren
Group base running
Group batch scheduling - Requires services running
Group cluster_svc startChildren - Waiting for wf_services and vm_mgr ready...
Group dsms running
Group fips running
Group ha_ssh running
Group services startChildren
Group third_party startChildren
Group vm_mgr scheduling - Requires wf_services running
Group wf_3party startChildren
Group wf_panav stopped - Never Started
Group wf_redis startChildren
Group wf_services scheduling - Requires wf_redis ready
Process appwkr_01_elink stopped (pid: -1) - Never Started
Process appwkr_02_doc stopped (pid: -1) - Never Started
Process appwkr_03_doc stopped (pid: -1) - Never Started
Process appwkr_04_doc stopped (pid: -1) - Never Started
Process appwkr_05_pe stopped (pid: -1) - Never Started
Process appwkr_06_url_upload_file stopped (pid: -1) - Never Started
Process appwkr_07_sessiononly stopped (pid: -1) - Never Started
Process appwkr_08_archive stopped (pid: -1) - Never Started
Process authd running (pid: 5491)
Process chasd running (pid: 3993)
Process cluster-mgr running (pid: 6348)
Process clusterd running (pid: 5469)
Process configd running (pid: 5463)
Process crypto running (pid: 4181)
Process dagger running (pid: 3956)
Process dockerd running (pid: 5486)
Process ehmon running (pid: 3987)
Process elinkbenignhdlr_01 stopped (pid: -1) - Never Started
Process elinkbenignhdlr_02 stopped (pid: -1) - Never Started
Process elinkparser_01 stopped (pid: -1) - Never Started
Process elinkparser_02 stopped (pid: -1) - Never Started
Process elinkparserupload_01 stopped (pid: -1) - Never Started
Process elinkparserupload_02 stopped (pid: -1) - Never Started
Process elinkrp_01 stopped (pid: -1) - Never Started
Process gdb running (pid: 3961)
Process gearmand running (pid: 9738)
Process ha-sshd running (pid: 4303)
Process ha_agent running (pid: 8628)
Process masterd running (pid: 3765)
Process mdadm running (pid: 5741)
Process mgmtsrvr running (pid: 5593)
Process mongodb running (pid: 8894)
Process monitor running (pid: 3957)
Process mysql running (pid: 5817)
Process mysql_local running (pid: 9608)
Process notifier_01 stopped (pid: -1) - Never Started
Process notifier_02 stopped (pid: -1) - Never Started
Process notifier_03 stopped (pid: -1) - Never Started
Process rabbitmq running (pid: 8748)
Process redis_6379 running (pid: 9620)
Process redis_6380 execed (pid: 15763) Redis PING failed for too long!
Process redis_6381 scheduling (pid: -1) - Requires redis_6380 running
Process rsyncd running (pid: 8672)
Process sample_sync_01 scheduling (pid: -1) - Requires wf_3party ready
Process sla_01 stopped (pid: -1) - Never Started
Process snmpd running (pid: 8737)
Process sshd running (pid: 4321)
Process sslmgr running (pid: 3967)
Process sysd running (pid: 3784)
Process sysdagent running (pid: 3966)
Process urlvmctrl_01 stopped (pid: -1) - Never Started
Process urlvmctrl_02 stopped (pid: -1) - Never Started
Process uwsgi scheduling (pid: -1) - Requires wf_3party ready
Process varrcvr running (pid: 8407)
Process vm_decoynet stopped (pid: -1) - Never Started
Process vm_torsvc stopped (pid: -1) - Never Started
Process vmctrl_01 stopped (pid: -1) - Never Started
Process vmctrl_02 stopped (pid: -1) - Never Started
Process vmctrl_03 stopped (pid: -1) - Never Started
Process vmctrl_04 stopped (pid: -1) - Never Started
Process vmctrl_05 stopped (pid: -1) - Never Started
Process vmctrl_06 stopped (pid: -1) - Never Started
Process vmctrl_07 stopped (pid: -1) - Never Started
Process vmctrl_08 stopped (pid: -1) - Never Started
Process vmctrl_09 stopped (pid: -1) - Never Started
Process vmctrl_10 stopped (pid: -1) - Never Started
Process vmctrl_11 stopped (pid: -1) - Never Started
Process vmctrl_12 stopped (pid: -1) - Never Started
Process vmctrl_13 stopped (pid: -1) - Never Started
Process vmctrl_14 stopped (pid: -1) - Never Started
Process vmctrl_15 stopped (pid: -1) - Never Started
Process vmctrl_16 stopped (pid: -1) - Never Started
Process vmctrl_17 stopped (pid: -1) - Never Started
Process vmctrl_18 stopped (pid: -1) - Never Started
Process vmctrl_19 stopped (pid: -1) - Never Started
Process vmctrl_20 stopped (pid: -1) - Never Started
Process vmctrl_21 stopped (pid: -1) - Never Started
Process vmctrl_22 stopped (pid: -1) - Never Started
Process vmctrl_23 stopped (pid: -1) - Never Started
Process vmctrl_24 stopped (pid: -1) - Never Started
Process vmctrl_25 stopped (pid: -1) - Never Started
Process vmctrl_26 stopped (pid: -1) - Never Started
Process vpnctl running (pid: 8671)
Process websrvr running (pid: 9301)
Process wf_devsrvr running (pid: 8216)
Process wf_lisasrvr running (pid: 5475)
Process wf_siggen stopped (pid: -1) - Never Started
Process wf_superv stopped (pid: -1) - Never Started
Process wf_task_queue running (pid: 5449)
admin@WF-500>
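When the output is long, it can help to filter it down to the entries that are not in the running state. The following is a minimal Python sketch, assuming the CLI output has been saved as plain text in the same layout as the examples above. The function name find_not_running and the regular expression are illustrative assumptions and are not part of PAN-OS.

```python
import re

# Matches rows such as
#   "Group wf_panav stopped - Never Started"
#   "Process redis_6380 execed (pid: 15763) Redis PING failed for too long!"
# The banner line "Group 'all' will list ..." is skipped because of the quote.
ROW_RE = re.compile(r"^(Group|Process)\s+([\w-]+)\s+(\w+)\s*(.*)$")


def find_not_running(cli_output):
    """Return (type, name, state, info) for every row whose state is not 'running'."""
    problems = []
    for line in cli_output.splitlines():
        match = ROW_RE.match(line.strip())
        if match is None:
            continue  # header, separator, banner, or prompt line
        kind, name, state, info = match.groups()
        if state != "running":
            problems.append((kind, name, state, info.strip()))
    return problems


# Shortened sample taken from the unhealthy example in this article.
sample = """\
Overall control-plane status: startChildren
----------------------------------------
Group 'all' will list status of all process members
Type Name State Info
Group all startChildren
Group base running
Group wf_panav stopped - Never Started
Process authd running (pid: 5491)
Process redis_6380 execed (pid: 15763) Redis PING failed for too long!
Process redis_6381 scheduling (pid: -1) - Requires redis_6380 running
"""

for kind, name, state, info in find_not_running(sample):
    print(f"{kind} {name}: {state} {info}".rstrip())
```

The Info column is kept in the result because it often names the dependency that is blocking the start, as with redis_6381 waiting for redis_6380 in the example.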
If the box is in an unhealthy state, check the following possible causes and run the additional health checks below.
- Disk space - missing important partition
admin@WF-500> show system disk-space
Filesystem Size Used Avail Use% Mounted on
/dev/sda3 17G 5.1G 11G 33% /
/dev/sda5 27G 11G 15G 44% /opt/pancfg
/dev/sda6 21G 13G 7.2G 64% /opt/panrepo
tmpfs 63G 176K 63G 1% /dev/shm
/dev/sda8 56G 563M 53G 2% /opt/panlogs
/dev/md1 275G 15G 247G 6% /opt/panlogs/ld1_1
/dev/md2 642G 6.6G 603G 2% /opt/panlogs/ld1_2
/dev/md3 275G 45G 217G 18% /opt/vmrepo
/dev/md4 642G 6.5G 603G 2% /opt/panlogs/ld2_2
!
! Full output omitted for brevity as it is not relevant for this article !
!
admin@WF-500>
One possible example of the output in an unhealthy state (/dev/md2 is missing):
admin@WF-500> show system disk-space
Filesystem Size Used Avail Use% Mounted on
/dev/sda3 17G 3.0G 13G 19% /
/dev/sda5 27G 4.8G 21G 20% /opt/pancfg
/dev/sda6 21G 11G 8.8G 56% /opt/panrepo
tmpfs 63G 0 63G 0% /dev/shm
/dev/sda8 56G 186M 53G 1% /opt/panlogs
/dev/md1 275G 14G 248G 5% /opt/panlogs/ld1_1
/dev/md3 275G 45G 217G 18% /opt/vmrepo
/dev/md4 642G 33G 577G 6% /opt/panlogs/ld2_2
!
! Full output omitted for brevity as it is not relevant for this article !
!
admin@WF-500>
Note: Instead of /dev/sda3 you may see /dev/sda2, because the output depends on the currently active partition on the box.
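The comparison between the healthy and unhealthy disk-space output can be sketched in Python as follows. This assumes the output has been saved as text, and that the expected mount points are the ones visible in the healthy example above (the full output is omitted there, so the list is incomplete by design). The names EXPECTED_MOUNTS and missing_mounts are hypothetical.

```python
# Mount points visible in the healthy example of this article.
# The root partition is left out because its device may be sda2 or sda3.
EXPECTED_MOUNTS = frozenset({
    "/opt/pancfg",
    "/opt/panrepo",
    "/opt/panlogs",
    "/opt/panlogs/ld1_1",
    "/opt/panlogs/ld1_2",
    "/opt/vmrepo",
    "/opt/panlogs/ld2_2",
})


def missing_mounts(cli_output, expected=EXPECTED_MOUNTS):
    """Return the expected mount points that do not appear in the output."""
    seen = set()
    for line in cli_output.splitlines():
        fields = line.split()
        # A data row has six columns: Filesystem Size Used Avail Use% Mounted-on
        if len(fields) == 6 and fields[4].endswith("%"):
            seen.add(fields[5])
    return sorted(set(expected) - seen)


# Sample taken from the unhealthy example in this article.
sample = """\
Filesystem Size Used Avail Use% Mounted on
/dev/sda3 17G 3.0G 13G 19% /
/dev/sda5 27G 4.8G 21G 20% /opt/pancfg
/dev/sda6 21G 11G 8.8G 56% /opt/panrepo
tmpfs 63G 0 63G 0% /dev/shm
/dev/sda8 56G 186M 53G 1% /opt/panlogs
/dev/md1 275G 14G 248G 5% /opt/panlogs/ld1_1
/dev/md3 275G 45G 217G 18% /opt/vmrepo
/dev/md4 642G 33G 577G 6% /opt/panlogs/ld2_2
"""

print(missing_mounts(sample))
```

For the sample, the only mount point reported as missing is /opt/panlogs/ld1_2, which is the one backed by /dev/md2 in the healthy output.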
- RAID disk check
Example of a healthy state:
admin@WF-500> show system raid detail
Disk Pair A Available
Status Partition 1: active ; Partition 2: clean ;
Disk id A1 Present
model : ST1000NX0423
size : 953869 MB
partition_1 : active sync
partition_2 : active sync
Disk id A2 Present
model : ST1000NX0423
size : 953869 MB
partition_1 : active sync
partition_2 : active sync
Disk Pair B Available
Status Partition 1: clean ; Partition 2: clean ;
Disk id B1 Present
model : ST1000NX0423
size : 953869 MB
partition_1 : active sync
partition_2 : active sync
Disk id B2 Present
model : ST1000NX0423
size : 953869 MB
partition_1 : active sync
partition_2 : active sync
admin@WF-500>
One possible example of the output in an unhealthy state:
admin@WF-500> show system raid detail
Disk Pair A Unavailable
Status Disk mount failure
Disk id A1 Present
model : ST1000NX0423
size : 953869 MB
partition_1 :
partition_2 : active sync
Disk id A2 Present
model : ST1000NX0423
size : 953869 MB
status : not in use
Disk Pair B Available
Status Partition 1: clean; Partition 2: clean;
Disk id B1 Present
model : ST1000NX0423
size : 953869 MB
partition_1 : active sync
partition_2 : active sync
Disk id B2 Present
model : ST1000NX0423
size : 953869 MB
partition_1 : active sync
partition_2 : active sync
admin@WF-500>
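The RAID output above can be summarized per disk pair with a short script. The following is a sketch only, assuming the output is available as text in the layout shown in this article. The function name raid_pair_states and the returned dictionary keys are illustrative assumptions.

```python
import re

# Matches the pair header, for example "Disk Pair A Unavailable".
PAIR_RE = re.compile(r"^Disk Pair (\S+)\s+(\S+)$")


def raid_pair_states(cli_output):
    """Map each disk pair to its availability and the Status line that follows it."""
    pairs = {}
    current = None
    for raw in cli_output.splitlines():
        line = raw.strip()
        match = PAIR_RE.match(line)
        if match:
            current = match.group(1)
            pairs[current] = {"availability": match.group(2), "status": ""}
        elif line.startswith("Status ") and current is not None:
            # Capital "Status" is the pair summary; the lowercase per-disk
            # "status : not in use" line is intentionally not matched.
            pairs[current]["status"] = line[len("Status "):].strip()
    return pairs


# Shortened sample taken from the unhealthy example in this article.
sample = """\
Disk Pair A Unavailable
Status Disk mount failure
Disk id A1 Present
partition_1 :
partition_2 : active sync
Disk id A2 Present
status : not in use
Disk Pair B Available
Status Partition 1: clean; Partition 2: clean;
Disk id B1 Present
partition_1 : active sync
partition_2 : active sync
"""

for name, info in raid_pair_states(sample).items():
    if info["availability"] != "Available":
        print(f"Disk Pair {name}: {info['availability']} ({info['status']})")
```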
Note: In some situations a transient unhealthy state is expected for a short period of time. The scenarios below describe the most common ones.
Situation & Scenario 1:
The most common and expected reason is a WF-500 reboot. The box needs time to boot up properly. There is no SLA for the boot time, but if everything is OK the box should be operational around 15-20 minutes after the reboot. Even if the box has not finished booting within this time frame, backend procedures are triggered automatically by the box to self-heal the operational status.
If the reboot is part of an upgrade, the boot time can increase because of changes between major PAN-OS release trains, for example when upgrading from 9.1 to 10.0.
Situation & Scenario 2:
During a content installation or content upgrade, some processes are expected to restart themselves to prevent an atomic lock on the process, most notably the vmctrl* processes. To check this, look for any current or queued job on the box and also check masterd.log. Once the job has finished, there is no reason for further process restarts related to that job.
admin@WF-500> show jobs id 3
Enqueued Dequeued ID Type Status Result Completed
------------------------------------------------------------------------------------------------------------------------------
2022/05/19 15:44:31 15:44:31 3 WF-Content FIN OK 19:42:47
Warnings:
Details:Configuration committed successfully
Successfully committed last configuration
admin@WF-500>
admin@WF-500> tail follow yes mp-log masterd.log
2022-05-19 15:55:21.193 +0200 INFO: urlvmctrl_01: process running with pid 32213
2022-05-19 15:55:21.197 +0200 INFO: vmctrl_06: process running with pid 32221
2022-05-19 15:55:21.448 +0200 INFO: vmctrl_21: process running with pid 32233
2022-05-19 15:55:21.456 +0200 INFO: vmctrl_02: process running with pid 32250
2022-05-19 15:55:21.459 +0200 INFO: vmctrl_20: process running with pid 32275
2022-05-19 15:55:21.462 +0200 INFO: vmctrl_25: process running with pid 32294
2022-05-19 15:55:21.464 +0200 INFO: vmctrl_26: process running with pid 32311
2022-05-19 15:55:21.467 +0200 INFO: vmctrl_07: process running with pid 32327
2022-05-19 15:55:21.470 +0200 INFO: vmctrl_12: process running with pid 32336
2022-05-19 15:55:21.472 +0200 INFO: urlvmctrl_02: process running with pid 32337
2022-05-19 15:55:27.935 +0200 INFO: vmctrl_03: process running with pid 32739
2022-05-19 15:55:27.941 +0200 INFO: vmctrl_04: process running with pid 32745
2022-05-19 15:55:28.073 +0200 INFO: vmctrl_13: process running with pid 32649
2022-05-19 15:55:28.080 +0200 INFO: vmctrl_08: process running with pid 321
2022-05-19 15:55:28.085 +0200 INFO: vmctrl_09: process running with pid 336
2022-05-19 15:55:28.089 +0200 INFO: vmctrl_18: process running with pid 32681
2022-05-19 15:55:28.094 +0200 INFO: vmctrl_22: process running with pid 32691
2022-05-19 15:55:28.098 +0200 INFO: vmctrl_14: process running with pid 371
2022-05-19 15:55:28.102 +0200 INFO: vmctrl_15: process running with pid 378
2022-05-19 15:55:28.106 +0200 INFO: vmctrl_19: process running with pid 380
2022-05-19 15:55:28.116 +0200 INFO: vmctrl_23: process running with pid 386
2022-05-19 15:55:28.249 +0200 INFO: vmctrl_10: process running with pid 403
2022-05-19 15:55:28.773 +0200 INFO: vmctrl_05: process running with pid 434
2022-05-19 15:55:29.881 +0200 INFO: vmctrl_11: process running with pid 477
2022-05-19 15:55:30.398 +0200 INFO: vmctrl_16: process running with pid 490
2022-05-19 15:55:32.039 +0200 INFO: vmctrl_17: process running with pid 561
2022-05-19 15:55:32.047 +0200 INFO: vm_mgr: running
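To see at a glance which processes were restarted, the masterd.log entries above can be collected by process name. This is a minimal sketch, assuming the log lines have been copied into a text string and follow the "process running with pid" wording shown in the excerpt. The function name restarted_processes is hypothetical.

```python
import re

# Matches entries such as "INFO: vmctrl_06: process running with pid 32221".
RUNNING_RE = re.compile(r"INFO: ([\w-]+): process running with pid (\d+)")


def restarted_processes(log_text):
    """Return {process_name: pid}; if a process appears twice, the last entry wins."""
    started = {}
    for line in log_text.splitlines():
        match = RUNNING_RE.search(line)
        if match:
            started[match.group(1)] = int(match.group(2))
    return started


# Shortened sample taken from the masterd.log excerpt in this article.
sample = """\
2022-05-19 15:55:21.193 +0200 INFO: urlvmctrl_01: process running with pid 32213
2022-05-19 15:55:21.197 +0200 INFO: vmctrl_06: process running with pid 32221
2022-05-19 15:55:32.039 +0200 INFO: vmctrl_17: process running with pid 561
2022-05-19 15:55:32.047 +0200 INFO: vm_mgr: running
"""

for name, pid in sorted(restarted_processes(sample).items()):
    print(f"{name} restarted, now running with pid {pid}")
```

The group-level line for vm_mgr is deliberately not counted, because it reports a group state and not a process start.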
Situation & Scenario 3:
The first job on the box must be Auto-Commit, and it must complete. This is a requirement for the box to be in a healthy state.
admin@WF-500> show jobs id 1
Enqueued Dequeued ID Type Status Result Completed
------------------------------------------------------------------------------------------------------------------------------
2022/04/04 14:13:38 14:13:38 1 AutoCom FIN OK 14:13:42
Configuration committed successfully
Successfully committed last configuration
admin@WF-500>
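The job check above can be expressed as a small parser. The following Python sketch assumes the job row has the column order shown in the example (date, enqueued time, dequeued time, ID, type, status, result) and treats FIN with OK as a completed job. The names parse_job and job_finished_ok are illustrative, and the format of rows for jobs that are still running is not covered by this article, so it is not handled specially here.

```python
import re

# Matches a job row such as
#   "2022/04/04 14:13:38 14:13:38 1 AutoCom FIN OK 14:13:42"
JOB_RE = re.compile(
    r"^(?P<date>\d{4}/\d{2}/\d{2})\s+(?P<enqueued>\d{2}:\d{2}:\d{2})\s+"
    r"(?P<dequeued>\S+)\s+(?P<id>\d+)\s+(?P<type>\S+)\s+"
    r"(?P<status>\S+)\s+(?P<result>\S+)"
)


def parse_job(cli_output):
    """Return the first job row as a dict, or None if no row is found."""
    for line in cli_output.splitlines():
        match = JOB_RE.match(line.strip())
        if match:
            return match.groupdict()
    return None


def job_finished_ok(job):
    """True only when the job row reports status FIN and result OK."""
    return job is not None and job["status"] == "FIN" and job["result"] == "OK"


# Sample taken from the Auto-Commit example in this article.
sample = """\
Enqueued Dequeued ID Type Status Result Completed
------------------------------------------------------------
2022/04/04 14:13:38 14:13:38 1 AutoCom FIN OK 14:13:42
Configuration committed successfully
Successfully committed last configuration
"""

job = parse_job(sample)
print(job["type"], job["status"], job["result"], job_finished_ok(job))
```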
Situation & Scenario 4:
After a reboot, the RAID disks may be in a rebuilding state, which requires additional time to finish. Auto-Commit may not start until the rebuild has finished.
admin@WF-500> tail follow yes mp-log raid.log
Jun 12 03:35:21 DEBUG: raid_util: argv: ['Rebuild80', '/dev/md3']
Jun 12 03:35:21 DEBUG: Rebuild of Disk Pair B Partition 1 80 percent complete.
Jun 12 03:35:52 DEBUG: raid_util: argv: ['Rebuild60', '/dev/md1']
Jun 12 03:35:52 DEBUG: Rebuild of Disk Pair A Partition 1 60 percent complete.
Jun 12 03:43:10 DEBUG: raid_util: argv: ['RebuildFinished', '/dev/md3']
Jun 12 03:43:10 INFO: Rebuild of Disk Pair B Partition 1 finished.
Jun 12 03:44:11 DEBUG: raid_util: argv: ['Rebuild80', '/dev/md1']
Jun 12 03:44:11 DEBUG: Rebuild of Disk Pair A Partition 1 80 percent complete.
Jun 12 03:52:31 DEBUG: raid_util: argv: ['RebuildFinished', '/dev/md1']
Jun 12 03:52:32 INFO: Rebuild of Disk Pair A Partition 1 finished.
admin@WF-500>
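The rebuild progress in raid.log can be tracked per disk pair and partition. This is a sketch under the assumption that the log messages keep the wording shown above ("NN percent complete" and "finished"). The function name rebuild_progress is hypothetical, and a finished rebuild is represented as 100 for convenience.

```python
import re

# Matches both progress and completion messages, for example
#   "Rebuild of Disk Pair B Partition 1 80 percent complete."
#   "Rebuild of Disk Pair B Partition 1 finished."
PROGRESS_RE = re.compile(
    r"Rebuild of Disk Pair (\S+) Partition (\d+) "
    r"(?:(\d+) percent complete|(finished))\."
)


def rebuild_progress(log_text):
    """Return {(pair, partition): percent} from the latest entry for each partition."""
    progress = {}
    for line in log_text.splitlines():
        match = PROGRESS_RE.search(line)
        if match:
            pair, partition, percent, finished = match.groups()
            progress[(pair, int(partition))] = 100 if finished else int(percent)
    return progress


# Shortened sample taken from the raid.log excerpt in this article.
sample = """\
Jun 12 03:35:21 DEBUG: Rebuild of Disk Pair B Partition 1 80 percent complete.
Jun 12 03:35:52 DEBUG: Rebuild of Disk Pair A Partition 1 60 percent complete.
Jun 12 03:43:10 INFO: Rebuild of Disk Pair B Partition 1 finished.
Jun 12 03:44:11 DEBUG: Rebuild of Disk Pair A Partition 1 80 percent complete.
"""

for (pair, partition), percent in sorted(rebuild_progress(sample).items()):
    state = "finished" if percent == 100 else f"{percent} percent"
    print(f"Disk Pair {pair} Partition {partition}: {state}")
```

Any partition still below 100 indicates that the rebuild is in progress and that Auto-Commit may not have started yet.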
Additional Information
What to do next?
If the above does not help and the box does not boot up within a reasonable time frame, or some processes remain down, create a support ticket with the following case taxonomy:
Type: Tech Support
Technology: Strata
Product/Problem Area: PAN-OS
SME Area: Management
For initial troubleshooting, the best option is to share a tech support file. If that is not possible, share at least the CLI output of the commands listed in this article. Also share any information about the context and time frame in which the issue happened or was observed, such as after a software upgrade or after configuration changes, and anything else you are aware of that might be valuable.