How does PAN-DB learn that a previously malicious or compromised website has been remediated?
Created On 12/18/20 18:25 PM - Last Modified 03/16/21 19:00 PM
Question
What mechanism does PAN-DB use to learn that a site no longer contains malicious content?
Environment
- Palo Alto Networks Firewall
- Valid URL Filtering PAN-DB license
Answer
We do not actively re-crawl malware sites to see if they’ve gone benign. This is true for both legacy and newly discovered malware sites. In essence, we always rely on a user manually reporting a site as clean before we correct its categorization.
There are a couple of reasons:
- The crawler can visit a site right now and observe malicious content, then go back and observe only benign content. It can then visit the site from a different source IP and get malicious content again, and so on. Given this potential intermittency in detection, if we actively went back, we could incorrectly recategorize a malware site as benign.
- The intermittency is not always 50/50. The crawler can sometimes visit a site 10 times and observe malicious content on only 1 of those visits, while the other 9 visits show benign (or highly suspicious) content. Malware can also be targeted at very specific geolocations and/or endpoints, so we could see inconsistent results.
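To put rough numbers on the 1-in-10 example above, the following minimal sketch computes how likely it is that every one of several re-crawls sees only benign content. It is illustrative arithmetic, not part of PAN-DB. The function name `prob_all_benign` is hypothetical, and the calculation assumes each visit is independent with a fixed chance of observing malicious content. That assumption is optimistic, since targeting by source IP, geolocation, or endpoint can make a crawler miss the malicious content on every visit.

```python
def prob_all_benign(p_malicious: float, visits: int) -> float:
    """Probability that all `visits` crawls observe only benign content.

    Assumes (for illustration only) that visits are independent and that
    each one observes malicious content with probability `p_malicious`.
    """
    if not 0.0 <= p_malicious <= 1.0:
        raise ValueError("p_malicious must be between 0 and 1")
    if visits < 0:
        raise ValueError("visits must be non-negative")
    return (1.0 - p_malicious) ** visits


if __name__ == "__main__":
    # Site that serves malicious content on roughly 1 visit in 10.
    for visits in (1, 3, 5, 10):
        print(visits, round(prob_all_benign(0.1, visits), 3))
    # Prints:
    # 1 0.9
    # 3 0.729
    # 5 0.59
    # 10 0.349
```

Even under this simplified model, ten re-crawls of such a site would all come back benign about a third of the time, which is why a clean re-crawl alone is not treated as proof of remediation.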
When WildFire is the source of the categorization, we apply a number of false-positive checks. We see many malicious packages that check google.com, ntp.gov, etc., and we do not mark those sites as malware (we use whitelisting).
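The whitelisting idea can be sketched as a simple filter over the hosts a sample contacts. This is an assumption-laden illustration, not the actual WildFire or PAN-DB logic: the names `ALLOWLIST`, `is_allowlisted`, and `malware_candidates` are hypothetical, the list contents are taken only from the two examples mentioned above, and matching on the domain plus its subdomains is one possible design choice.

```python
# Hypothetical allowlist; the real list and its format are not documented here.
ALLOWLIST = {"google.com", "ntp.gov"}


def is_allowlisted(host: str, allowlist=ALLOWLIST) -> bool:
    """Return True if `host` equals an allowlisted domain or is a subdomain of one."""
    host = host.lower().rstrip(".")
    return any(host == domain or host.endswith("." + domain) for domain in allowlist)


def malware_candidates(contacted_hosts, allowlist=ALLOWLIST):
    """Keep only the contacted hosts that are not covered by the allowlist."""
    return [h for h in contacted_hosts if not is_allowlisted(h, allowlist)]


if __name__ == "__main__":
    hosts = ["google.com", "www.google.com", "evil.example", "notgoogle.com"]
    print(malware_candidates(hosts))
    # Prints: ['evil.example', 'notgoogle.com']
```

Matching on a leading dot keeps look-alike names such as `notgoogle.com` from being treated as subdomains of an allowlisted domain.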