Netskope Help

Use Dictionary Files

A dictionary file is a CSV file that can contain keywords, phrases, or regular expressions you want to find using a DLP policy. Each dictionary file can contain either keywords and phrases, or regular expressions.

To use a dictionary file, create a CSV file with one keyword, phrase, or regular expression per line.

You can optionally specify a weight for each keyword or phrase. The weight is the number used to calculate the violation score. The violation score of a rule is the sum of the weights over the rule count, where the rule count is the number of times the rule is matched. The higher the keyword weight, the higher the violation score.

The violation score determines when to trigger a rule in case of a violation.

If a weight is not specified, then a default weight of 1 is assigned to the keyword or phrase.

Note

Weight is not assigned to regular expressions.

To define a keyword in the CSV file, use the format [keyword],[weight], where the weight is optional and can be any value between -100 and 100. Use positive values to increase the violation score and negative values to decrease it.
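As an illustration of the [keyword],[weight] format, the following Python sketch parses dictionary text and checks each weight before you upload the file. It is not part of the product: the function name parse_dictionary is hypothetical, and the sketch assumes that weights are whole numbers and that keywords do not themselves contain commas.

```python
import csv
import io

DEFAULT_WEIGHT = 1                  # applied when no weight is given
MIN_WEIGHT, MAX_WEIGHT = -100, 100  # allowed range stated in this topic


def parse_dictionary(text):
    """Parse keyword dictionary text into a list of (keyword, weight) tuples.

    Hypothetical helper for local validation only. Assumes integer weights
    and keywords without embedded commas.
    """
    entries = []
    reader = csv.reader(io.StringIO(text), skipinitialspace=True)
    for line_no, row in enumerate(reader, start=1):
        if not row or not row[0].strip():
            continue  # ignore blank lines
        if len(row) > 2:
            raise ValueError(f"line {line_no}: expected [keyword],[weight]")
        keyword = row[0].strip()
        weight_field = row[1].strip() if len(row) == 2 else ""
        # int() raises ValueError for non-numeric weights
        weight = int(weight_field) if weight_field else DEFAULT_WEIGHT
        if not MIN_WEIGHT <= weight <= MAX_WEIGHT:
            raise ValueError(f"line {line_no}: weight {weight} is out of range")
        entries.append((keyword, weight))
    return entries


sample = "AWS access key, 100\naccess keys\naccess, -20\n"
for keyword, weight in parse_dictionary(sample):
    print(f"{keyword}: {weight}")
```

An entry without a weight, such as "access keys" in the sample, is reported with the default weight of 1.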

Example 1

For example, if you are creating a DLP policy to identify AWS access keys, your access key dictionary can contain the following keywords and phrases with weights.

access key ID, 50
AWS, 10
AWS access key, 100
AWSAccessKeyId, 100
access keys
access, -20
Public Cloud, -100

Suppose you create a rule such as C0 NEAR D0, where:

  • C0 is a custom identifier (?<![A-Z0-9])[A-Z0-9]{20}(?![A-Z0-9]) to identify an AWS access key ID, and

  • D0 is the access key dictionary.

If a document contains the following statements,

"Generate the access key"

"Enter the AWS access key ID AKIAIVLZMKR5WEXAMPLE"

then the rule count for "Generate the access key" is zero, and the rule count for "Enter the AWS access key ID AKIAIVLZMKR5WEXAMPLE" is one.

The total violation score for this document will be 100.
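To see why only the second statement can match the rule, you can try the custom identifier C0 against both statements. The Python sketch below does this with the standard re module. It assumes that the product evaluates this expression the same way Python does, and it checks only the identifier, not the NEAR operator or the dictionary weights.

```python
import re

# Custom identifier C0 from the example: exactly 20 uppercase letters or
# digits, not preceded or followed by another uppercase letter or digit.
ACCESS_KEY_ID = re.compile(r"(?<![A-Z0-9])[A-Z0-9]{20}(?![A-Z0-9])")

statements = [
    "Generate the access key",
    "Enter the AWS access key ID AKIAIVLZMKR5WEXAMPLE",
]

# Map each statement to the list of substrings that C0 matches in it.
matches = {s: ACCESS_KEY_ID.findall(s) for s in statements}

for statement, found in matches.items():
    print(f"{statement!r}: {found}")
```

The first statement produces no match for C0, so it cannot satisfy C0 NEAR D0 even though it contains dictionary terms.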



To create a new dictionary:

  1. Go to Policies > Profiles > DLP > Edit Rules > Data Loss Prevention > Dictionary.

  2. Click New Dictionary.

  3. Enter a name for the dictionary file and choose whether the terms are case-sensitive.

  4. If your dictionary file contains regular expressions, select Regex Dictionary.

    Note

    A regex dictionary file can contain a maximum of 25 entries.

    For information on building regular expressions, see Building Regular Expressions.

  5. Click Select CSV File. Locate and select your dictionary file, click Open, and then click Save.

  6. When you create a DLP rule, select the dictionary file on the Custom page of the DLP Rule workflow.
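Before you upload a regex dictionary, it can help to check the file locally against the 25-entry limit. The Python sketch below is one possible pre-check, not a product feature: the function name check_regex_dictionary is hypothetical, it reads one expression per line, and it compiles each expression with Python's re module, whose syntax may differ from the syntax the product accepts.

```python
import re

MAX_REGEX_ENTRIES = 25  # limit stated in the note for regex dictionaries


def check_regex_dictionary(text):
    """Return a list of problems found in regex dictionary text.

    Hypothetical pre-upload check. Uses Python's regex syntax as a rough
    proxy, so an empty result does not guarantee the product accepts the file.
    """
    patterns = [line.strip() for line in text.splitlines() if line.strip()]
    problems = []
    if len(patterns) > MAX_REGEX_ENTRIES:
        problems.append(
            f"{len(patterns)} entries exceed the limit of {MAX_REGEX_ENTRIES}"
        )
    for number, pattern in enumerate(patterns, start=1):
        try:
            re.compile(pattern)
        except re.error as exc:
            problems.append(f"entry {number}: {exc}")
    return problems


sample = r"(?<![A-Z0-9])[A-Z0-9]{20}(?![A-Z0-9])"
print(check_regex_dictionary(sample) or "no problems found")
```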