What is the AV signature generation process?
Question
How do I know if Palo Alto Networks has coverage for a malicious file and how is that signature created?
Answer
Signature creation and real-time threat protection change constantly as the threat landscape evolves. To keep pace, Palo Alto Networks takes a two-fold approach to AV coverage: standard signatures and ML-AV (Machine Learning Antivirus).
Customers often ask whether there is coverage for a particular file. The first step is to search Threat Vault (https://threatvault.paloaltonetworks.com/). Threat Vault is a customer-facing portal that shows all the coverage Palo Alto Networks has. It is a useful first step because it can avoid the need to open a case with Support for information that is readily available, which speeds up research. Customers can search on a range of indicators, including CVEs, IP addresses, hashes, signature IDs, domains, and URLs.
One thing to keep in mind about file hashes is that Palo Alto Networks uses SHA256, so there may be times when the MD5 of a file does not show up but its SHA256 does. This is expected, because SHA256 is the hash that is used. The hashes listed in Threat Vault are the hashes of the file or files used to create or update the signature; they are not a list of the hashes of every file the signature can detect. Palo Alto Networks uses pattern-match detection, so a single signature may cover multiple files. This alleviates the need to create an entirely new signature for a variant, as long as the malicious pattern is still present.
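To make the difference between a file hash and a pattern-match signature concrete, here is a minimal Python sketch. It is an illustration only, not how Palo Alto Networks signatures are actually built: the byte pattern, the sample contents, and the function names are all hypothetical. It shows two variants that have different SHA256 values yet both contain the same pattern, and it also shows how to compute the SHA256 you would paste into a Threat Vault search.

```python
import hashlib

def file_hashes(data: bytes) -> dict:
    """Return the MD5 and SHA256 hex digests of a file's bytes."""
    return {
        "md5": hashlib.md5(data).hexdigest(),
        "sha256": hashlib.sha256(data).hexdigest(),
    }

# Hypothetical byte pattern standing in for a signature's match condition.
PATTERN = b"\xde\xad\xbe\xef"

def matches_signature(data: bytes, pattern: bytes = PATTERN) -> bool:
    """Toy pattern match: True if the pattern occurs anywhere in the data."""
    return pattern in data

# Two made-up "variants" that share the pattern but differ elsewhere.
variant_a = b"header-A" + PATTERN + b"payload"
variant_b = b"header-B" + PATTERN + b"payload-modified"

for name, sample in (("variant_a", variant_a), ("variant_b", variant_b)):
    hashes = file_hashes(sample)
    # The SHA256 values differ, but the same pattern matches both samples.
    print(name, hashes["sha256"], matches_signature(sample))
```

In this toy model, only the hash of the sample used to build the signature would be listed, even though the pattern also matches the other variant.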
Process for signature creation from WildFire:
0. The customer NGFW (Next-Generation Firewall) sees an unknown file
1. The file triggers NGFW forwarding
2. The NGFW submits the file to the WF (WildFire) Analyzer Cloud
3. Static analysis is run on the file, and dynamic analysis may follow. Because dynamic analysis is resource-intensive, it is only run if WF sees something that requires more in-depth analysis. If the file is given a malware verdict, it moves to step 4.
4. The WF analyzer cloud sends a message to the AV system for siggen (signature generation).
5. The AV system processes the message, generates a signature, and then stores it in the AV database.
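The steps above can be sketched as a small Python model. This is a simplified illustration under stated assumptions, not the actual WildFire or AV system implementation: the class and function names, the "inconclusive" static result, and the placeholder signature naming are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class AnalysisResult:
    verdict: str       # "malware" or "benign"
    ran_dynamic: bool  # whether dynamic analysis was needed

@dataclass
class AVSystem:
    # Hypothetical stand-in for the AV database: sha256 -> signature name.
    database: dict = field(default_factory=dict)

    def handle_siggen_message(self, sha256: str) -> str:
        """Step 5: process the message, generate a signature, store it."""
        signature = f"sig-{sha256[:8]}"  # placeholder name, not a real format
        self.database[sha256] = signature
        return signature

def analyze(static_result: str, dynamic_result: Optional[str] = None) -> AnalysisResult:
    """Step 3: static analysis always runs; dynamic analysis only runs
    when static analysis is inconclusive (an assumption of this sketch)."""
    if static_result == "inconclusive":
        return AnalysisResult(verdict=dynamic_result or "benign", ran_dynamic=True)
    return AnalysisResult(verdict=static_result, ran_dynamic=False)

def process_unknown_file(sha256: str, static_result: str, av: AVSystem,
                         dynamic_result: Optional[str] = None) -> Optional[str]:
    """Steps 3 to 5: only a malware verdict triggers siggen."""
    result = analyze(static_result, dynamic_result)
    if result.verdict != "malware":
        return None
    # Step 4: the analyzer cloud notifies the AV system.
    return av.handle_siggen_message(sha256)

av = AVSystem()
print(process_unknown_file("ab" * 32, "malware", av))
print(process_unknown_file("cd" * 32, "benign", av))
```

The point of the model is the gating: a signature is only generated and stored after a malware verdict, and dynamic analysis is an optional second stage rather than something every file goes through.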
Signature Delivery:
1. Next 5-minute WF package: the signature should be picked up and delivered to customers
2. Next daily AV package
3. The signature is also streamed to the RTSig (Real-Time Signature) cloud, where it is available for query/stream.
NOTE:
When a file is marked malicious but no signature has been created for it, there may have been an issue writing to the XML file after the WF analyzer cloud sent the message to the AV system for siggen. When this happens, a Support case should be opened so the Content Engineers can manually push the file through the siggen process.
——
ML-AV (Machine Learning Antivirus)
ML-AV does not use signatures. It looks for similarities in behavior based on trained models. It allows the NGFW data plane to apply ML to files. Each inline ML model dynamically detects malicious files of a specific type by evaluating file details, including decoder fields and patterns, to formulate a high-probability classification of a file. To keep up with the latest changes in the threat landscape, inline ML models are added or updated via content releases.
Mid-transfer, an on-box AV cache lookup and an RTSig cloud lookup take place to look for pattern matches against trained models. If a match is found, the traffic is blocked.
ML-AV is not a replacement for WildFire signatures; rather, it augments them.
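The lookup order described above can be sketched as follows. This is a hedged illustration only: the function names, the verdict strings, the idea of keying the lookup on a string, and the choice to cache cloud results locally are assumptions of this sketch, not documented firewall behavior.

```python
from typing import Callable, Optional

def lookup_verdict(key: str,
                   local_cache: dict,
                   cloud_lookup: Callable[[str], Optional[str]]) -> str:
    """Check the on-box cache first, then fall back to the cloud lookup."""
    if key in local_cache:
        return local_cache[key]
    verdict = cloud_lookup(key)
    if verdict is not None:
        # Assumption of this sketch: remember cloud answers on-box.
        local_cache[key] = verdict
        return verdict
    return "unknown"

def action_for(verdict: str) -> str:
    """Block traffic only when a match marks the file as malicious."""
    return "block" if verdict == "malicious" else "allow"

# Hypothetical stand-in for a cloud query; a real lookup is a network call.
def fake_cloud_lookup(key: str) -> Optional[str]:
    return {"file-2": "malicious"}.get(key)

cache = {"file-1": "malicious"}
for key in ("file-1", "file-2", "file-3"):
    print(key, action_for(lookup_verdict(key, cache, fake_cloud_lookup)))
```

Checking the local cache before the cloud keeps the common case fast, which matters because the decision is made while the transfer is still in progress.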
Additional Information
https://docs.paloaltonetworks.com/pan-os/10-0/pan-os-admin/threat-prevention/wildfire-inline-ml
https://threatvault.paloaltonetworks.com/