Netskope Help

Building Regular Expressions

The DLP engine contains 3000+ predefined data identifiers that can be used in DLP rules. It also supports custom data identifiers that use either a keyword search or a regular expression search. This page describes how to write custom data identifiers for DLP using regular expressions.


This section describes the regular expression syntax that the DLP engine supports. The DLP engine parser interprets regular expressions the same way as the UNIX regular expression syntax.

Supported Operators


Operator    Matched Pattern

\           Quote the next metacharacter.
^           Match the beginning of a line.
$           Match the end of a line.
.           Match any character (except newline).
( )         Used for grouping to force operator precedence.
[xy]        Character x or y.
[x-z]       The range of characters between x and z.
[^z]        Any character except z.
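
The operators above can be tried out with a short script. The following is a minimal sketch that uses Python's re module as a stand-in; the DLP engine itself may differ in details, and the sample strings are hypothetical and chosen only for illustration.

```python
import re

# (pattern, sample text, expected result) triples.
# All sample strings are hypothetical and only illustrate each operator.
OPERATOR_EXAMPLES = [
    (r"\.", "v1.2", True),       # \ quotes the next metacharacter: a literal dot
    (r"\.", "v12", False),
    (r"^ID", "ID 42", True),     # ^ matches at the beginning of a line
    (r"^ID", "my ID", False),
    (r"42$", "ID 42", True),     # $ matches at the end of a line
    (r"a.c", "abc", True),       # . matches any character except newline
    (r"a.c", "a\nc", False),
    (r"(ab)+c", "ababc", True),  # ( ) groups "ab" so that + applies to both letters
    (r"gr[ae]y", "grey", True),  # [xy] matches character x or y
    (r"[a-f]9", "c9", True),     # [x-z] matches a character in the range
    (r"[^0]7", "07", False),     # [^z] matches any character except z
]


def run_examples(examples):
    """Return whether each pattern is found in its sample text, in order."""
    return [re.search(pattern, text) is not None for pattern, text, _ in examples]


results = run_examples(OPERATOR_EXAMPLES)
for (pattern, text, expected), actual in zip(OPERATOR_EXAMPLES, results):
    print(f"{pattern!r} on {text!r}: matched={actual}, expected={expected}")
```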

Supported Quantifiers


Quantifier  Matched Pattern

*           Match 0 or more times.
+           Match 1 or more times.
?           Match 0 or 1 times.
{n}         Match exactly n times.
{n,}        Match at least n times.
{n,m}       Match at least n times, but no more than m times.


Unrestricted greedy quantifiers applied to arbitrary characters, such as .* or .+, are not allowed. If you intend to include these characters in a character class or set, reverse their order, for example *. instead of .*
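
As an illustration of the quantifiers, here is a minimal sketch using Python's re module as a stand-in for the DLP engine. The sample strings are hypothetical. The final pattern shows a bounded quantifier on an explicit character class as one possible alternative to an unrestricted .* ; whether a given bounded form is accepted by the DLP engine is an assumption to verify, not something this page confirms.

```python
import re

# (pattern, sample text, expected result) triples.
# Each pattern must match the whole sample text (re.fullmatch).
QUANTIFIER_EXAMPLES = [
    (r"ab*", "a", True),            # * : 0 or more times
    (r"ab*", "abbb", True),
    (r"ab+", "a", False),           # + : 1 or more times
    (r"ab+", "ab", True),
    (r"colou?r", "color", True),    # ? : 0 or 1 times
    (r"colou?r", "colouur", False),
    (r"\d{4}", "1234", True),       # {n} : exactly n times
    (r"\d{4}", "123", False),
    (r"\d{2,}", "12345", True),     # {n,} : at least n times
    (r"\d{2,}", "1", False),
    (r"\d{2,4}", "123", True),      # {n,m} : between n and m times
    (r"\d{2,4}", "12345", False),
]


def check_all(examples):
    """Return whether each pattern matches its whole sample text, in order."""
    return [re.fullmatch(pattern, text) is not None for pattern, text, _ in examples]


quantifier_results = check_all(QUANTIFIER_EXAMPLES)

# Assumption: instead of "Name:.*ID", bound the gap with an explicit class
# and an upper limit. The class and the limit of 40 are illustrative only.
BOUNDED = r"Name:[A-Za-z ,]{0,40}ID"
bounded_hit = re.search(BOUNDED, "Name: Jane Doe, ID 1234") is not None
print(quantifier_results, bounded_hit)
```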



Metacharacter  Matched Pattern

\t             Match tab.
\n             Match newline.
\r             Match return.
\f             Match form feed.
\a             Match alarm (bell, beep and so on).
\e             Match escape.
\v             Match vertical tab.
\021           Match octal character (in this example, 21 octal).
\xF0           Match hex character (in this example, F0 hex).
\x{263a}       Match wide hex character (Unicode).
\w             Match word character (alphanum plus '_').
\W             Match non-word character.
\s             Match whitespace character. This metacharacter also includes \n and \r.
\S             Match non-whitespace character.
\d             Match digit character.
\D             Match non-digit character.
\b             Match word boundary.
\B             Match non-word boundary.
\A             Match start of string (never match at line breaks).
\Z             Match end of string. Never match at line breaks; only match at the end of the final buffer of text submitted for matching.
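
Several of the metacharacters above can be exercised with a short script. This is a minimal sketch that uses Python's re module as a stand-in, so it covers only escapes that Python also supports (for example, it skips the escape and wide hex forms), and the DLP engine may behave differently in edge cases. All sample strings are hypothetical.

```python
import re


def found(pattern, text, flags=0):
    """True when the pattern matches anywhere in the text."""
    return re.search(pattern, text, flags) is not None


# Each check is written so that it evaluates to True.
checks = {
    "tab": found(r"\t", "a\tb"),
    "octal 21": found(r"\021", chr(0o21)),
    "hex F0": found(r"\xF0", chr(0xF0)),
    "word characters": re.fullmatch(r"\w+", "user_01") is not None,
    "whitespace includes return": found(r"\s", "a\rb"),
    "digit then non-digit": found(r"\d\D", "7x"),
    # \b requires a word boundary, so "cat" inside "concat" does not match.
    "word boundary": found(r"\bcat\b", "a cat sat")
    and not found(r"\bcat\b", "concat"),
    "non-word boundary": found(r"\Bcat", "concat"),
    # In multiline mode ^ matches after a line break, but \A never does.
    "start of string only": found(r"^cd", "ab\ncd", re.MULTILINE)
    and not found(r"\Acd", "ab\ncd", re.MULTILINE),
    # \Z matches only at the very end of the text.
    "end of string only": found(r"ab\Z", "cd\nab")
    and not found(r"ab\Z", "ab\ncd", re.MULTILINE),
}

for name, ok in checks.items():
    print(f"{name}: {ok}")
```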

Examples of Regular Expressions
  • Regex to detect 16-digit credit card number



    \d - Matches a digit character.

    {4} - Matches exactly n times. Here it validates that there are exactly 4 digits.

    -? - Allows the groups of digits to be optionally separated by a hyphen. ? indicates 0 or 1 times.

    This simple regex validates that the number is a 16-digit number, optionally separated by -.

    Example matches

    The regex would match 1234-5678-9123-4567 or 1234567891234567.

  • Regex to validate if the 16-digit credit card number is from a major credit card issuer

    Matches major credit cards, including Visa (length 16, prefix 4) and MasterCard (length 16, prefix 51-55).



    ^ - Matches the beginning of the line.

    4 - Validates that the first digit is 4. Visa card numbers start with 4.

    \d{3} - Followed by 3 digits.

    | - Alternation, used to match one regular expression out of several possible regular expressions.

    (5[1-5]\d{2}) - Matches the MasterCard prefix 51 to 55 followed by 2 digits.

    -? - Allows the groups of digits to be optionally separated by hyphens. ? indicates 0 or 1 times.

    Example matches

    The regex would match 4001123456781234 or 5100123456781234.

  • Regex to check the medical record number

    Assume you have a medical record number that is 16 characters long: the prefix "NWH", which indicates that the patient record is from Northwestern Hospital, followed by the first 3 letters of the first name, the first 3 letters of the last name, and 7 digits.



    \b - Matches a word boundary.

    (NWH) - Looks for the prefix NWH.

    -? - Matches 0 or 1 occurrence of "-".

    [a-zA-Z]{3} - Matches three alphabetic characters. Each can be any character from a-z or A-Z.

    \d{7} - Matches seven digit characters.

    Example matches

    The regex would match NWHCARVAN0000001 or NWH-TIM-BRO-0000002.
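
The three examples above can be sketched in code as follows. The full expressions are not shown on this page, so the patterns below are assumptions assembled from the component breakdowns and the example matches, not the definitive expressions. Python's re module is used as a stand-in for the DLP engine, and the names CARD_16, VISA_OR_MASTERCARD, MEDICAL_RECORD, and matches are hypothetical.

```python
import re

# Assumed pattern: four groups of 4 digits, optionally separated by hyphens.
CARD_16 = re.compile(r"\d{4}-?\d{4}-?\d{4}-?\d{4}")

# Assumed pattern: the first group is either a Visa prefix (4 plus 3 digits)
# or a MasterCard prefix (51-55 plus 2 digits), then three more groups.
VISA_OR_MASTERCARD = re.compile(r"^(4\d{3}|5[1-5]\d{2})-?\d{4}-?\d{4}-?\d{4}$")

# Assumed pattern: NWH, two groups of 3 letters, then 7 digits, with an
# optional hyphen between the parts and word boundaries on both sides.
MEDICAL_RECORD = re.compile(r"\b(NWH)-?[a-zA-Z]{3}-?[a-zA-Z]{3}-?\d{7}\b")


def matches(pattern, text):
    """True when the compiled pattern is found anywhere in the text."""
    return pattern.search(text) is not None


samples = [
    (CARD_16, "1234-5678-9123-4567"),
    (CARD_16, "1234567891234567"),
    (VISA_OR_MASTERCARD, "4001123456781234"),
    (VISA_OR_MASTERCARD, "5100123456781234"),
    (MEDICAL_RECORD, "NWHCARVAN0000001"),
    (MEDICAL_RECORD, "NWH-TIM-BRO-0000002"),
]

for pattern, text in samples:
    print(f"{text}: {matches(pattern, text)}")
```

Note that the character class is written as [a-zA-Z]. A range such as A-z would also accept punctuation characters that sit between the uppercase and lowercase letters in ASCII.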