How do I define phrases for custom dictionaries?
This article provides instructions on defining phrases for custom dictionaries in Administration > DLP Dictionaries & Engines. For information on other steps required to configure custom dictionaries, see Adding Custom Dictionaries in How do I configure Dictionaries and Engines?
You can add phrases to your custom dictionaries that represent content you want to protect for your organization. General guidelines for phrases include the following:
- A dictionary can contain up to 120 phrases.
- Each phrase can have a maximum of 128 characters.
- Dictionary phrase matching is not case-sensitive and ignores punctuation.
- The dictionary counts all matching phrases, including identical phrases.
- You can place quotes around a phrase so that the dictionary detects only text that exactly matches the phrase, in the order given within the quotes.
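The size limits above can be checked before phrases are entered. The following is a minimal sketch, not part of the Zscaler service: the function name `validate_phrases` is hypothetical, the empty-phrase check is an added convenience, and it assumes that every character on the line (including any surrounding quotes) counts toward the 128-character limit.

```python
MAX_PHRASES = 120        # per-dictionary limit stated in the guidelines
MAX_PHRASE_LENGTH = 128  # per-phrase limit stated in the guidelines


def validate_phrases(phrases):
    """Return a list of problems; an empty list means the phrases fit the limits.

    Assumption: every character, including surrounding quotes, counts
    toward the length limit.
    """
    problems = []
    if len(phrases) > MAX_PHRASES:
        problems.append(
            f"{len(phrases)} phrases exceeds the limit of {MAX_PHRASES}"
        )
    for index, phrase in enumerate(phrases, start=1):
        if not phrase.strip():
            # Not a documented rule; flagged here only as a convenience.
            problems.append(f"phrase {index} is empty")
        elif len(phrase) > MAX_PHRASE_LENGTH:
            problems.append(
                f"phrase {index} has {len(phrase)} characters "
                f"(limit {MAX_PHRASE_LENGTH})"
            )
    return problems
```

Calling `validate_phrases(["Confidential Information"])` returns an empty list, because the phrase is within both limits.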
The Zscaler service uses fuzzy matching techniques to ensure that phrases do not go undetected by dictionaries because of capitalization or spacing discrepancies and the existence of noise words (such as spurious words or HTML tags) between words. For instance, the configured phrase “security service” would match the text “security <b>service</b>”.
Sometimes this fuzzy matching matches phrases in an irrelevant context and causes false positives. In such cases, you can place quotes or double quotes around the words of a phrase to disable fuzzy matching and match only the phrase within the quotes. Zscaler then ignores only the different types of whitespace between the words. For instance, the quoted phrase “security service” will not match “security-service,” “security,service,” and other such unrelated phrases.
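To make the difference between the two modes concrete, here is a rough illustrative model. It is not Zscaler's implementation, whose matching logic is not published in this article. The function names, the `max_noise_words` parameter, and the regular expressions are all assumptions; the sketch also assumes that a quoted phrase requires its words to be separated by whitespace only.

```python
import re

TAG = re.compile(r"<[^>]+>")        # HTML tags such as <b> or </b>
NON_WORD = re.compile(r"[^\w\s]")   # punctuation characters


def fuzzy_match(phrase, text, max_noise_words=2):
    """Loose match: ignore case, tags, punctuation, and spacing, and allow a
    few noise words between the words of the phrase.

    max_noise_words is a hypothetical knob, not a documented setting.
    """
    cleaned = NON_WORD.sub(" ", TAG.sub(" ", text)).lower()
    words = [re.escape(w) for w in NON_WORD.sub(" ", phrase).lower().split()]
    if not words:
        return False
    gap = r"\s+(?:\w+\s+){0,%d}" % max_noise_words
    return re.search(r"\b" + gap.join(words) + r"\b", cleaned) is not None


def quoted_match(phrase, text):
    """Strict match: words in the given order, separated only by whitespace
    of any kind and amount. Case is still ignored."""
    words = [re.escape(w) for w in phrase.split()]
    if not words:
        return False
    pattern = r"\b" + r"\s+".join(words) + r"\b"
    return re.search(pattern, text, re.IGNORECASE) is not None
```

In this model, `fuzzy_match("security service", "security <b>service</b>")` is true because the tags are discarded before comparison, and `quoted_match("security service", "Security \t service")` is true because only whitespace separates the two words.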
To add phrase(s):
- Enter phrase(s) you want the dictionary to match when scanning content (see guidelines for phrases above). Note that you can enter phrases one by one, or you can bulk-load phrases by copying and pasting into the field. If you're bulk-loading phrases, ensure that each phrase is on a different line when you copy and paste. The dictionary will count all words on a single line as one phrase.
- For the phrase, specify the Action the dictionary takes upon detecting a valid match. Select one of the following options from the dropdown menu:
  - The dictionary ignores matches of the phrase. The Ignore action is for testing purposes; no action is taken if the phrase is detected, but occurrences of the phrase are recorded in the DLP logs for your analysis.
  - The dictionary counts each match of the phrase toward the Number of Violations threshold. For example, consider a custom dictionary in which the phrase "Confidential Information" is defined with Count as its action. If the content this dictionary scans contains three instances of the phrase "Confidential Information," they count as three matches.
  - The dictionary immediately triggers upon a match of the phrase.
- To add another phrase, select the Add Phrase icon.
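The steps above can be sketched in code to show how a pasted block becomes individual phrases and how the three actions interact. This is a simplified model under stated assumptions, not the service's actual logic. The function names are hypothetical. The labels "ignore" and "count" follow the action names used in the text, while "trigger" is a made-up label for the action that triggers immediately. The sketch also assumes the dictionary triggers once counted matches reach the Number of Violations threshold.

```python
from collections import Counter


def parse_bulk_phrases(pasted_text):
    """Split pasted text into phrases: each non-empty line is one phrase."""
    return [line.strip() for line in pasted_text.splitlines() if line.strip()]


def dictionary_triggers(match_actions, violation_threshold):
    """Decide whether the dictionary triggers for one piece of content.

    match_actions holds one action label per detected match.
    "trigger" is a hypothetical label; the >= comparison is an assumption.
    """
    tally = Counter(match_actions)
    if tally["trigger"] > 0:
        return True  # a single match with this action is enough
    # Matches with the Ignore action are logged but never counted.
    return tally["count"] >= violation_threshold
```

For example, three matches of a phrase with the Count action give `dictionary_triggers(["count", "count", "count"], 3)`, which is true under the assumed threshold rule, while replacing one of them with "ignore" leaves only two counted matches and the result is false.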