Special character sequences

Printable characters, such as “a” or “b”, are matched by typing them directly into a regular expression. In addition, shorthand sequences are available for common non-printable characters and character classes.

Special character sequences are listed in the following table:

Table 1. Special character sequences
Sequence Description
\a Bell (BEL) = \x07
\t Horizontal tab (HT) = \x09
\n Linefeed (LF) = \x0A
\f Formfeed (FF) = \x0C
\r Carriage return (CR) = \x0D
\e Escape (ESC) = \x1B
\OOO Octal code OOO of the character.
\xHH Hexadecimal code HH of the character. The hexadecimal digits are case-insensitive. For example, "\xaa" is treated the same as "\xAA".
\c<char> Control character that corresponds to Ctrl+<char>, where <char> is an uppercase letter.
\w "word" class character = [A-Za-z0-9_]
\W Non-"word" class character = [^A-Za-z0-9_]
\s Whitespace character = [ \t\r\n\f]
\S Non-whitespace character = [^ \t\r\n\f]
\d Digit character = [0-9]
\D Non-digit character = [^0-9]
\b Backspace (BS) = \x08
Note: Allowed only in bracket expressions.
\Q<expr>\E Quotes all metacharacters between \Q and \E. Backslashes are regarded as normal characters.

For example, "\QC:\file.exe\E" matches the string "C:\file.exe". Without the quoting, "\f" would be read as a formfeed (\x0C), and the expression would match "C:\x0Cile.exe" instead.
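
The behavior of a few of these sequences can be checked with a short script. The sketch below uses Python's re module only as a stand-in; it is not the engine this document describes, and the two may differ in details. Python's re has no \Q...\E construct, so the hypothetical helper quote_literal uses re.escape to approximate the same effect.

```python
import re

# NOTE: Python's re module is used as a stand-in for illustration only.
# The engine described in this document may behave differently in details.

# \xHH: the hexadecimal digits are case-insensitive.
assert re.fullmatch(r"\xaa", "\xaa")
assert re.fullmatch(r"\xAA", "\xaa")

# \t and \x09 denote the same character (horizontal tab).
assert re.fullmatch(r"\x09", "\t")

# \OOO: octal code 101 is the letter "A".
assert re.fullmatch(r"\101", "A")

# \b inside a bracket expression is a backspace (\x08).
assert re.fullmatch(r"[\b]", "\x08")


def quote_literal(text: str) -> str:
    # Hypothetical helper: approximates \Q<expr>\E by escaping every
    # metacharacter in text, so the result matches text literally.
    return re.escape(text)


path = r"C:\file.exe"

# Quoted: the backslash and the dot are literal characters.
quoted = quote_literal(path)
assert re.fullmatch(quoted, path)
assert not re.fullmatch(quoted, "C:\x0cile.exe")

# Unquoted: "\f" is read as a formfeed, so the literal path does not match.
assert re.fullmatch(path, "C:\x0cile.exe")
assert not re.fullmatch(path, path)
```

Every assertion passes silently when the script runs; a failed assertion would indicate that the stand-in engine disagrees with the table above.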

Example of using special character sequences

# This fingerprint matches HTTP content
# for which the length is >= 10000
# The situation context for this regular expression could be either 
# "HTTP Request Header Line" or "HTTP Reply Header Line"
Content-Length: \d\d\d\d\d

# The regular expression could also be written as shown below
Content-Length: \d{5}
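
To see that the two spellings are equivalent, they can be tried against sample header lines. This is a minimal sketch that again assumes Python's re module as a stand-in engine; the sample header lines and the helper name has_five_digit_length are made up for illustration. The re.ASCII flag restricts \d to [0-9], as in the table above.

```python
import re

# Both spellings from the example; re.ASCII limits \d to the digits 0-9.
LONG_FORM = re.compile(r"Content-Length: \d\d\d\d\d", re.ASCII)
SHORT_FORM = re.compile(r"Content-Length: \d{5}", re.ASCII)


def has_five_digit_length(header_line: str) -> bool:
    # Hypothetical helper: True when the line carries a Content-Length
    # value with at least five consecutive digits.
    return SHORT_FORM.search(header_line) is not None


# Illustrative header lines (not taken from real traffic).
samples = [
    "Content-Length: 9999",
    "Content-Length: 10000",
    "Content-Length: 1234567",
]

for line in samples:
    # The two spellings always agree on whether a line matches.
    assert bool(LONG_FORM.search(line)) == bool(SHORT_FORM.search(line))
    print(line, "->", has_five_digit_length(line))
```

Because the expression only requires five consecutive digits, a zero-padded value such as "Content-Length: 00042" would also match in this sketch.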