Data filtering profile not matching Regex for SMTP traffic
Created On 03/08/19 10:51 AM - Last Modified 04/08/25 08:59 AM
Symptom
- Many European SSN formats are not included natively in the predefined data patterns and require a custom data pattern of type Regular Expression.
- When filtering emails for SSNs with a regular expression, for example the UK National Insurance number (two prefix letters, six digits, and one suffix letter), the Palo Alto Networks Data Filtering engine does not detect the pattern and does not take the configured action.
- The regular expression in this example is the following data pattern:
- The SSN in this example is QQ123456C. Often, the number is printed with spaces to pair off the digits, like this: QQ 12 34 56 C.
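The two printed forms described above can be compared with a small format check. This is only an illustrative sketch in Python: `NI_FORMAT` and `normalize_ni_number` are hypothetical names, the expression is not the firewall data pattern, and Python's `re` module is not the PAN-OS regex engine.

```python
import re

# Illustrative format check only (assumption, not the firewall data pattern):
# two letters, six digits optionally paired off by single spaces, one letter.
NI_FORMAT = re.compile(r"[A-Za-z]{2} ?[0-9]{2} ?[0-9]{2} ?[0-9]{2} ?[A-Za-z]")


def normalize_ni_number(text):
    """Return the compact form of the number, or None if the format does not fit."""
    candidate = text.strip()
    if NI_FORMAT.fullmatch(candidate) is None:
        return None
    # Remove the pairing spaces so both printed forms compare equal.
    return candidate.replace(" ", "").upper()


print(normalize_ni_number("QQ123456C"))
print(normalize_ni_number("QQ 12 34 56 C"))
```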
Environment
- NGFW
- Supported PAN-OS versions
- Data Filtering Profile with Regular Expression Data Pattern
- SMTP traffic
Cause
- The data pattern shown below does not match when the SSN appears only in the email body or subject.
- However, when the content is sent with curl using the HTTP POST method, the same data pattern works as expected:
- Data pattern named "third danish":
.*(Content).*(([a-zA-Z])([a-zA-Z])( )?([0-9])([0-9])( )?([0-9])([0-9])( )?([0-9])([0-9])( )?([a-zA-Z])?)
- Content of the file:
QQ 12 34 56 C
- This file was properly detected:
2019/02/27 03:08:59 web-browsing Trust2-L3 63305 192.168.5.5 hgk allow Untrust-L3 80 93.184.216.34 info third danish(60003) 0
- The cause is that Palo Alto Networks firewalls cannot be configured to perform data pattern matching on the email text body itself. What is supported is examining SMTP traffic and performing data pattern matching on the files attached to the email, for example doc, docx, and other file types.
- The pattern match is keyed off the string match. First the string is matched and then the regex is checked. The regex match is limited to the packet where the string is matched plus the 36 bytes from the prior packet.
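How the "third danish" pattern depends on the literal string "Content" can be reproduced offline. The sketch below uses Python's `re` module, which is not the PAN-OS engine, so it only illustrates the expression itself. The sample payloads, the helper `matched_number`, and the `re.DOTALL` flag are assumptions made for this illustration.

```python
import re

# Data pattern "third danish", copied from the article. re.DOTALL is an
# assumption so that ".*" can span the line breaks between headers and body.
THIRD_DANISH = re.compile(
    r".*(Content).*(([a-zA-Z])([a-zA-Z])( )?([0-9])([0-9])( )?"
    r"([0-9])([0-9])( )?([0-9])([0-9])( )?([a-zA-Z])?)",
    re.DOTALL,
)

# Hypothetical HTTP POST request; its headers carry the string "Content".
http_post = (
    "POST /upload HTTP/1.1\r\n"
    "Host: example.com\r\n"
    "Content-Type: text/plain\r\n"
    "Content-Length: 13\r\n"
    "\r\n"
    "QQ 12 34 56 C"
)

# Hypothetical text with the same number but without the string "Content".
plain_text = "My number is QQ 12 34 56 C"


def matched_number(payload):
    """Return the number captured by the pattern, or None when nothing matches."""
    match = THIRD_DANISH.search(payload)
    return match.group(2) if match else None


print(matched_number(http_post))
print(matched_number(plain_text))
```

In this sketch the second payload yields no match because the expression requires the string "Content" before the number. On the firewall, the match is additionally limited to the packet window described above.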
Resolution
- Palo Alto Networks firewalls do not support data pattern matching on the email text body itself, because this feature scans only files.
- The workaround is to attach a file containing the body of the email to the email itself. Scanning only files is the expected behavior of Data Filtering, including for custom patterns that are not included natively in the predefined data patterns.
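One way to apply the workaround on the sending side is to attach a copy of the body text as a file when the message is built. The following is a minimal sketch with Python's standard `email` package. The function name, addresses, and file name are hypothetical, and whether a plain-text attachment is inspected depends on the file types configured in the Data Filtering profile.

```python
from email.message import EmailMessage


def build_message_with_body_attachment(sender, recipient, subject, body,
                                       filename="body.txt"):
    """Build an email whose body text is also attached as a plain-text file."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    msg.set_content(body)
    # Attach a copy of the body as a file so that file scanning can see it.
    msg.add_attachment(body, filename=filename)
    return msg


# Hypothetical addresses and subject, used only for illustration.
message = build_message_with_body_attachment(
    "sender@example.com", "recipient@example.com", "Test", "QQ 12 34 56 C"
)
print(message.as_string())
```

The resulting message can then be handed to an SMTP client such as `smtplib.SMTP.send_message`.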
Additional Information
For other affected regular expressions, refer to the following links:
https://knowledgebase.paloaltonetworks.com/KCSArticleDetail?id=kA10g000000ClqXCAS
https://docs.paloaltonetworks.com/pan-os/8-0/pan-os-web-interface-help/objects/objects-custom-objects/objects-custom-objects-data-patterns/syntax-for-regular-expression-data-patterns.html