System Diagnostics Configurations

Created On 05/26/21 13:18 PM - Last Modified 05/26/21 13:18 PM


Objective
The System Diagnostics page can be customized to display alerts specific to your needs. For example, the default size for a task to be considered ‘big’ is 150000 bytes. If you have system resource constraints, you might need to lower this value. If you have playbooks that require large tasks and your deployment supports it, you might want to raise it. By adding server configurations, you can set custom thresholds (for example, the size of records, the number of running Docker containers, or the duration of searches) that generate alerts on the System Diagnostics page.

Procedure

To add a server configuration, go to Settings > About > Troubleshooting > Add Server Configuration.
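
Before entering custom values, it can help to check that they are consistent with each other. The following is a minimal Python sketch, not part of the product: the function name `check_overrides` is hypothetical, and the defaults are copied from the table in Additional Information. The workplan rule is stated in that table; the rule that an at-risk threshold should sit below its matching issue threshold is an assumption made here for illustration.

```python
# Defaults copied from the table in this article (integers only).
DEFAULTS = {
    "diagnostic.workplan.big.workplan.bytes.threshold": 3000000,
    "diagnostic.workplan.big.workplan.indexer.min.save.bytes": 1000000,
    "diagnostics.docker.containers.atRisk.count": 150,
    "diagnostics.docker.containers.issue.count": 200,
    "diagnostic.disk.stats.used.space.at.risk.percent": 75,
    "diagnostic.disk.stats.used.space.issue.percent": 90,
    "diagnostic.disk.stats.latency.at.risk.ms": 7,
    "diagnostic.disk.stats.latency.issue.ms": 10,
}

# Assumption (not stated in the article): each at-risk threshold
# should be lower than the corresponding issue threshold.
AT_RISK_ISSUE_PAIRS = [
    ("diagnostics.docker.containers.atRisk.count",
     "diagnostics.docker.containers.issue.count"),
    ("diagnostic.disk.stats.used.space.at.risk.percent",
     "diagnostic.disk.stats.used.space.issue.percent"),
    ("diagnostic.disk.stats.latency.at.risk.ms",
     "diagnostic.disk.stats.latency.issue.ms"),
]


def check_overrides(overrides):
    """Return a list of problems found after applying overrides to the defaults."""
    merged = {**DEFAULTS, **overrides}
    problems = []

    # Documented rule: the big workplan threshold must be larger than
    # the indexer minimum save size.
    big = merged["diagnostic.workplan.big.workplan.bytes.threshold"]
    min_save = merged["diagnostic.workplan.big.workplan.indexer.min.save.bytes"]
    if big <= min_save:
        problems.append(
            "big.workplan.bytes.threshold must be larger than "
            "big.workplan.indexer.min.save.bytes"
        )

    # Assumed rule: at-risk thresholds sit below issue thresholds.
    for at_risk_key, issue_key in AT_RISK_ISSUE_PAIRS:
        if merged[at_risk_key] >= merged[issue_key]:
            problems.append(f"{at_risk_key} should be lower than {issue_key}")

    return problems


if __name__ == "__main__":
    planned = {"diagnostics.docker.containers.atRisk.count": 250}
    for problem in check_overrides(planned):
        print(problem)
```

An empty result means that the planned values pass these checks. It does not guarantee that the server accepts them.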


Additional Information
| Component | Key | Default Value | Description |
|---|---|---|---|
| insightCache | diagnostics.insightCache.min.KB | 100 | Minimum size (in KB) of the indicator cache for it to be considered big. |
| | diagnostic.insightCache.enabled | true | Enables measurement of the indicator cache size. |
| incident size | diagnostic.incident.size.enabled | true | Enables measurement of incident field size. |
| | diagnostics.incident.min.KB | 1000 | Minimum size (in KB) of incident fields for them to be considered big. |
| Docker containers | diagnostics.docker.containers.atRisk.count | 150 | Number of running containers that cause an at-risk alert. |
| | diagnostics.docker.containers.issue.count | 200 | Number of running containers that cause an issue alert. |
| Slow search | diagnostics.slowSearch.days.ago | 1 | Number of days back to look for slow searches. |
| | diagnostics.slowSearch.minDuration.seconds | 60 | Minimum duration (in seconds) of a search for it to be defined as slow. |
| | diagnostics.slowSearch.issue.amount | 20 | Minimum number of slow searches (within the specified look-back period) that cause an issue alert. |
| | diagnostics.slowSearch.warning.amount | 1 | Minimum number of slow searches (within the specified look-back period) that cause an at-risk alert. |
| | diagnostics.slowsearches.enable | true | Enables measurement of search duration. |
| | diagnostics.save.slow.searches.limit | 100 | Maximum number of slow search statistics that are saved. |
| | diagnostics.save.slow.searches.limit.days | 14 | Maximum number of days to save slow search statistics. |
| audits | diagnostics.audits.threshold | 1073741824 | Minimum size (in bytes) for the audit log to be considered big. Default is 1 GB. |
| Auto-quiet | task.size.limit.bytes | 250000 | Minimum size (in bytes) for a task to be changed automatically to quiet mode. |
| | task.size.limit.enabled | true | Enables measurement of task size. |
| | task.auto.quiet.mode.enabled | true | Enables automatically changing tasks to quiet mode based on task size. |
| | diagnostic.workplan.auto.quiet.filter.results.days | 14 | Number of days to look back when counting the total number of tasks automatically changed to quiet mode. The total for this period is displayed on the System Diagnostics page. |
| | diagnostic.workplan.auto.quiet.issue.last.days | 2 | Number of days to look back when counting the tasks automatically changed to quiet mode that should cause an issue alert. |
| | diagnostic.workplan.auto.quiet.task.issue.threshold | 0 | Maximum number of tasks automatically changed to quiet mode within the timeframe defined by diagnostic.workplan.auto.quiet.issue.last.days before causing an issue alert. |
| | task.auto.quiet.mode.ignore.playbook.[playbook_name] | None (set to true when added) | Turns off auto quiet mode for a specific playbook. The key must be followed by the playbook name, with spaces and periods replaced by underscores. For example, for the playbook 'Email Address Enrichment - Generic v2.1', the key is task.auto.quiet.mode.ignore.playbook.Email_Address_Enrichment_-_Generic_v2_1. The value is true (disables auto quiet mode for the playbook) or false (enables auto quiet mode for the playbook). |
| big.tasks | diagnostic.workplan.big.tasks.filter.results.days | 14 | Number of days back to look for big tasks. |
| | diagnostic.workplan.big.tasks.bytes.threshold | 150000 | Minimum size (in bytes) of a task for it to be considered big. |
| | diagnostic.workplan.big.tasks.atRisk | 1 | Minimum number of big tasks to cause an at-risk alert. |
| websocket | websocket.diagnostics.queues.percent | 80 | Maximum percentage a websocket messages queue can be full before being counted toward the websocket.diagnostics.queues.threshold. |
| | websocket.diagnostics.queues.threshold | 5 | Maximum number of queues (per app server) above the "percent full" threshold before causing an at-risk alert. An at-risk alert is caused if the threshold is passed on at least 1 app server. |
| | websocket.diagnostics.enabled | true | Enables the websockets diagnostics (disconnects) service. Requires restart to take effect. |
| | websocket.diagnostics.disconnects.threshold | 30 | Maximum number of (ungraceful) websocket disconnects (per app server), within the timeframe defined by websocket.diagnostics.disconnects.hours, before causing an at-risk alert. An at-risk alert is caused if the threshold is passed on at least 1 app server. |
| | websocket.diagnostics.disconnects.hours | 12 | Number of hours the websocket disconnects service should save data. Requires restart to take effect. |
| big.workplan | diagnostic.workplan.big.workplan.enabled | true | Enables measurement of big workplans. |
| | diagnostic.workplan.big.workplan.bytes.threshold | 3000000 | Minimum size (in bytes) of a workplan to be considered big. Must be larger than the diagnostic.workplan.big.workplan.indexer.min.save.bytes value. |
| | diagnostic.workplan.big.workplan.issue.threshold | 30 | Minimum number of big workplans that cause an issue alert. |
| | diagnostic.workplan.big.workplan.atRisk.threshold | 0 | Minimum number of big workplans that cause an at-risk alert. |
| | diagnostic.workplan.big.workplan.filter.results.days | 90 | Number of days to look back for big workplans. |
| | diagnostic.workplan.big.workplan.indexer.min.save.bytes | 1000000 | Minimum size (in bytes) of a workplan to be saved as a potentially big workplan. Saved workplans are then evaluated against diagnostic.workplan.big.workplan.bytes.threshold to check whether they meet the big workplan threshold. |
| Investigation Context | diagnostics.invContext.min.KB | 1000 | Minimum size (in KB) of incident context to be considered big. |
| | diagnostic.invContext.size.enabled | true | Enables measurement of the incident context. |
| Notifications | diagnostics.notification.send.to.roles | Administrator | Comma-separated list of roles that receive diagnostics notifications. |
| | diagnostics.notification.send.to.default.admins | true | Sends diagnostics notifications to the default administrators. |
| | diagnostics.notification.enabled | true | Enables diagnostics notifications. |
| | diagnostics.notification.send.on.atRisk | true | Sends diagnostics notifications when a diagnostic check is "at risk". Set to "false" to receive notifications only when a diagnostic check reports an "issue". |
| war.room | diagnostic.war.room.issue.above.mb | 3 | Minimum size (in MB) of a War Room to be considered big. |
| | diagnostic.war.room.min.count.for.issue | 1 | Minimum number of big War Rooms to cause an issue alert. |
| disk space | diagnostic.disk.stats.used.space.at.risk.percent | 75 | Maximum percentage a local storage mount can be full before causing an at-risk alert. |
| | diagnostic.disk.stats.used.space.issue.percent | 90 | Maximum percentage a local storage mount can be full before causing an issue alert. |
| disk latency | diagnostic.disk.stats.latency.at.risk.ms | 7 | Maximum disk latency (in milliseconds) before causing an at-risk alert. |
| | diagnostic.disk.stats.latency.issue.ms | 10 | Maximum disk latency (in milliseconds) before causing an issue alert. |
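
The per-playbook key for task.auto.quiet.mode.ignore.playbook.[playbook_name] is derived from the playbook name by replacing spaces and periods with underscores. A minimal Python sketch of that rule follows. The helper name `auto_quiet_ignore_key` is hypothetical, and leaving letter case and all other characters unchanged is an assumption, so confirm the resulting key against your server before relying on it.

```python
import re

# Key prefix taken from the table above.
PREFIX = "task.auto.quiet.mode.ignore.playbook."


def auto_quiet_ignore_key(playbook_name):
    """Build the server configuration key for a playbook name.

    Spaces and periods become underscores, as the article describes.
    Assumption: every other character, including hyphens and letter
    case, is kept as-is.
    """
    return PREFIX + re.sub(r"[ .]", "_", playbook_name)


if __name__ == "__main__":
    print(auto_quiet_ignore_key("Email Address Enrichment - Generic v2.1"))
```

The value entered for the resulting key is true to turn off auto quiet mode for that playbook, or false to leave it on.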

