DLP Entity

In Netskope DLP, Entities refer to data identifiers and dictionaries. Data identifiers are common terms used to categorize certain types of identifiable data, and dictionaries are files that contain keywords or regular expressions. Entities are used in a rule to identify sensitive data.

To open the Entities page, in the Netskope UI go to Policies > Profiles > DLP > EDIT RULES and select Data Loss Prevention. Then click on the Entities tab.

dlp_entity.png

Data Identifier

Netskope provides an extensive list of predefined data identifiers with meaningful names and descriptions. The full list is available in the New DLP Rule workflow.

To view the full list of predefined data identifiers, in the Rules tab of the Data Loss Prevention Rules page, click New Rule. In the New DLP Rule dialog box, all the predefined identifiers are listed under categories.

DLPEntityIdentifiers.png

To create a custom data identifier:

  1. In the ENTITIES tab, click NEW ENTITY. The Create Entity dialog box is displayed with the Data identifier option selected.
    dlp_create_custom_data_identifier.png
  2. Provide a name for the custom data identifier and choose whether the new identifier is case-sensitive or not.
  3. Add a predefined data identifier in the format {{predefined_data_identifier}}, a keyword, or a regular expression. For example, a predefined identifier such as {{Full Names (US)}}, a keyword such as Name, or a regex such as \d{5,10}.

    Click the Validate Regex button to validate the syntax of the regular expression. For more information on the supported operators, quantifiers, and metacharacters for regular expressions, see Building Regular Expressions.

  4. Under Advanced Options, you can set various conditions to narrow down the results when this identifier is used in a DLP rule. For more information, see the Advanced Options section.
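Before entering a regular expression in the dialog box, it can help to check it locally. The following is a minimal sketch using Python's re module, which is not the engine Netskope uses, so a pattern that compiles here may still be rejected by Validate Regex (and the reverse). The function name validate_regex and the sample pattern \d{5,10} (a run of 5 to 10 digits) are illustrative assumptions, not part of the product.

```python
import re


def validate_regex(pattern: str) -> bool:
    """Return True if the pattern compiles under Python's re module.

    This only approximates the Validate Regex button; the set of operators
    Netskope supports is described in Building Regular Expressions.
    """
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False


# Illustrative pattern: a run of 5 to 10 digits.
digits = re.compile(r"\d{5,10}")

print(validate_regex(r"\d{5,10}"))    # well-formed pattern
print(validate_regex(r"(\d{5,10}"))   # unbalanced parenthesis
print(digits.search("employee id 1234567 issued"))
```

A pattern that fails this local check is worth fixing before you paste it into the Create Entity dialog box.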

Entity Obfuscation

When Entity Obfuscation is enabled, data matched by this entity is obfuscated in the DLP incident forensic data. With any obfuscation method, if the number of obfuscated characters in the match is fewer than 5, all of the digits and letters are obfuscated.

DLPEntityObfuscationOptions.png

There are four methods of obfuscation:

  1. Obfuscate all characters, for example: XXXX-XXXX-XXXX-XXXX
  2. Display only the first 4 characters, for example: 1234-XXXX-XXXX-XXXX
  3. Display only the last 4 characters, for example: XXXX-XXXX-XXXX-7890
  4. Display only the first and last 4 characters, for example: 1234-XXXX-XXXX-7890
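The four methods can be illustrated with a short sketch. This is not Netskope's implementation: the function obfuscate, its parameters, and the sample value are hypothetical, and the handling of short matches is one assumed reading of the "fewer than 5" rule above (if fewer than 5 characters would be masked, every letter and digit is masked). Separators such as hyphens are left untouched, as in the examples.

```python
def obfuscate(match: str, keep_first: int = 0, keep_last: int = 0,
              mask: str = "X") -> str:
    """Mask letters and digits in a match, optionally keeping the edges."""
    # Indexes of the characters eligible for masking (letters and digits).
    positions = [i for i, ch in enumerate(match) if ch.isalnum()]
    # Characters between the kept prefix and the kept suffix get masked.
    hidden = positions[keep_first:len(positions) - keep_last]
    # Assumed rule: fewer than 5 masked characters means mask everything.
    if len(hidden) < 5:
        hidden = positions
    chars = list(match)
    for i in hidden:
        chars[i] = mask
    return "".join(chars)


sample = "1234-5678-9012-7890"  # made-up value, not a real card number
print(obfuscate(sample))                              # XXXX-XXXX-XXXX-XXXX
print(obfuscate(sample, keep_first=4))                # 1234-XXXX-XXXX-XXXX
print(obfuscate(sample, keep_last=4))                 # XXXX-XXXX-XXXX-7890
print(obfuscate(sample, keep_first=4, keep_last=4))   # 1234-XXXX-XXXX-7890
print(obfuscate("12345678", keep_first=4))            # XXXXXXXX
```

In the last call only 4 characters would be masked, so under the assumed rule the whole value is obfuscated.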

Filters

Filters reject matches that are unlikely or implausible. There are special considerations for predefined Entities.

The Use Filters option provides two filters: “Common-Sense” and “Unlikely Matches”. For a predefined Entity, the available options depend on which filters the Entity has:

  • If a predefined Entity has no filters, the Use Filters and Use Validators For Regex options are grayed out.
  • If a predefined Entity has only one of the two filters, the Use Validators For Regex option is grayed out and the default validator for the selected predefined Entity is applied.
  • If a predefined Entity has both filters, Use Filters and both filters underneath it are checked, and the Use Validators For Regex option is grayed out.

Dictionary

A dictionary can be a keyword dictionary or a regular expression dictionary. A dictionary file is a CSV file that contains the keywords and phrases, or the regular expressions, that you want to find using a DLP rule. Each dictionary file can contain either keywords and phrases or regular expressions, but not both.

To use a dictionary file, create a CSV file with one keyword, phrase, or regular expression per line. A regular expression dictionary file can contain up to 25 entries. For more information on the supported operators, quantifiers, and metacharacters for regular expressions, see Building Regular Expressions.

Netskope also supports weighted dictionaries, where you can specify a weight for each keyword or phrase. The weight of a keyword is the number used to calculate the violation score. The violation score of a rule is the sum of the weights over the rule count, where the rule count is the number of times the rule is matched. The higher the keyword weight, the higher the violation score. The violation score determines when a rule is triggered for a violation. If a weight is not specified, a default weight of 1 is assigned to the keyword or phrase.

Note

Weight is not assigned to regular expressions.

To define the keyword in the CSV file, use the format [keyword],[weight] where the weight is optional and can be any value between -100 and 100. Use positive values to increase the violation score and negative values to decrease the violation score.

Example

For example, if you are creating a DLP policy to identify AWS access keys, your access key dictionary can contain the following keywords and phrases with weights.

access key ID, 50
AWS, 10
AWS access key, 100
AWSAccessKeyId, 100
access keys
access, -20
Public Cloud, -100
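To check a weighted dictionary file before uploading it, the [keyword],[weight] format can be parsed locally. The sketch below is an assumption-laden illustration, not a Netskope tool: the function name parse_weighted_dictionary is hypothetical, and weights are treated as integers, which the text above does not state explicitly. It applies the default weight of 1 and the -100 to 100 range described above.

```python
import csv
import io

SAMPLE = """access key ID, 50
AWS, 10
AWS access key, 100
AWSAccessKeyId, 100
access keys
access, -20
Public Cloud, -100
"""


def parse_weighted_dictionary(text: str, default_weight: int = 1) -> dict:
    """Parse keyword,weight lines into a {keyword: weight} mapping."""
    entries = {}
    for row in csv.reader(io.StringIO(text)):
        if not row or not row[0].strip():
            continue  # skip blank lines
        keyword = row[0].strip()
        has_weight = len(row) > 1 and row[1].strip()
        # Assumption: weights are whole numbers.
        weight = int(row[1]) if has_weight else default_weight
        if not -100 <= weight <= 100:
            raise ValueError(f"weight out of range for {keyword!r}: {weight}")
        entries[keyword] = weight
    return entries


entries = parse_weighted_dictionary(SAMPLE)
for keyword, weight in entries.items():
    print(f"{keyword}: {weight}")
```

Here "access keys" has no explicit weight, so it receives the default weight of 1.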

Suppose you create a rule such as C0 NEAR D0, where:

  • C0 is a custom identifier (?<![A-Z0-9])[A-Z0-9]{20}(?![A-Z0-9]) to identify an AWS access key ID, and
  • D0 is the access key dictionary.

If a document contains the following statements:

  • Generate the access key
  • Enter the AWS access key ID AKIAIVLZMKR5WZSQO5ZA

then the rule count for “Generate the access key” is zero, because the statement contains no access key ID for C0 to match, and the rule count for “Enter the AWS access key ID AKIAIVLZMKR5WZSQO5ZA” is one.

The total violation score for this document is 100.
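The identifier half of this example can be checked locally. The sketch below runs the C0 expression against the two statements using Python's re module, which is an assumption for illustration and not Netskope's engine. It covers only C0; the NEAR proximity logic and the dictionary weighting are not modeled here.

```python
import re

# C0 from the example: exactly 20 uppercase letters or digits,
# not preceded or followed by another uppercase letter or digit.
C0 = re.compile(r"(?<![A-Z0-9])[A-Z0-9]{20}(?![A-Z0-9])")

statements = [
    "Generate the access key",
    "Enter the AWS access key ID AKIAIVLZMKR5WZSQO5ZA",
]

matches = {statement: C0.findall(statement) for statement in statements}
for statement, found in matches.items():
    print(f"{statement!r} -> {found}")
# 'Generate the access key' -> []
# 'Enter the AWS access key ID AKIAIVLZMKR5WZSQO5ZA' -> ['AKIAIVLZMKR5WZSQO5ZA']
```

Only the second statement contains a C0 match, which is consistent with the rule counts of zero and one given above.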

To create a new dictionary:

  1. In the Entities tab, click New Entity. The Create Entity dialog box is displayed.
  2. Select Dictionary and then select Keyword Dictionary or RegEx Dictionary.
    dlp_create_dictionary.png
  3. Provide a name for the dictionary and choose whether the new dictionary is case-sensitive or not.
  4. Click Select File. Locate and select your dictionary file, then click Open to upload the file.
  5. Under Advanced Options, you can set various conditions to narrow down the results when this dictionary is used in a DLP rule. For more information, see the Advanced Options section.

Advanced Options

The Advanced Options enable you to set conditions that can help you narrow down the search results for the entity when used in a DLP rule. The following are the Advanced Options.

  • Begins with, Ends with, Does not match: provide options to add conditions that include or exclude specific keywords or regexes.
    dlp_new_entity_advop_1.png
  • Use Filters: provides two filters, “Common-Sense” and “Unlikely Matches”.
    • The “Common-Sense” filter rejects a match that consists primarily of repeating or sequential characters. For example, “aabbcc” or “22222”.
    • The “Unlikely Matches” filter rejects an unlikely match by examining the characters present before or after the matched data. For example, “80*125752000=10060160000” does not likely contain a 9-digit US Social Security Number of any interest, so this match would be rejected.
  • Use Validators For Regex: provides three common validation algorithm options (“Luhn”, “Elfproef”, and “Verhoeff”) to reject matches that do not pass the validation check for the selected algorithm.
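To show what a validator does, here is a minimal sketch of the standard Luhn checksum, the first of the three algorithms listed above. It is a generic textbook version written for illustration; the function name luhn_valid is hypothetical, and how Netskope applies the check internally is not described in this topic.

```python
def luhn_valid(number: str) -> bool:
    """Return True if the digits in the string pass the Luhn checksum."""
    # Keep digits only, so spaces and hyphens in the match are ignored.
    digits = [int(ch) for ch in number if ch.isdigit()]
    if not digits:
        return False
    total = 0
    # Walk from the rightmost digit, doubling every second digit.
    for index, digit in enumerate(reversed(digits)):
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9  # same as summing the two digits of the product
        total += digit
    return total % 10 == 0


print(luhn_valid("4111 1111 1111 1111"))  # True (a well-known test number)
print(luhn_valid("4111 1111 1111 1112"))  # False (last digit changed)
```

A regex alone would accept both strings above as 16-digit numbers; the validator rejects the second one, which reduces false positives.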