Character Escapes in Regular Expressions


The backslash (\) in a regular expression indicates one of the following:

  • The character that follows it is a special character, as shown in the table in the following section. For example, \b is an anchor that indicates that a regular expression match should begin on a word boundary, \t represents a tab, and \x20 represents a space.

  • A character that otherwise would be interpreted as an unescaped language construct should be interpreted literally. For example, a brace ({) begins the definition of a quantifier, but a backslash followed by a brace (\{) indicates that the regular expression engine should match the brace. Similarly, a single backslash marks the beginning of an escaped language construct, but two backslashes (\\) indicate that the regular expression engine should match the backslash.
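
The two roles of the backslash described above can be sketched as follows. This is a minimal illustration, not part of the original sample: the class name EscapeRoles, its helper methods, and the sample strings are assumptions chosen for the example.

```csharp
using System;
using System.Text.RegularExpressions;

public static class EscapeRoles
{
   // \b acts as an anchor: "cat" must begin at a word boundary.
   public static bool StartsAtWordBoundary(string input)
   {
      return Regex.IsMatch(input, @"\bcat");
   }

   // \{ and \} match literal braces instead of delimiting a quantifier.
   public static string FirstBracedToken(string input)
   {
      return Regex.Match(input, @"\{\w+\}").Value;
   }

   // \\ matches a single literal backslash.
   public static int CountBackslashes(string input)
   {
      return Regex.Matches(input, @"\\").Count;
   }

   public static void Main()
   {
      Console.WriteLine(StartsAtWordBoundary("a cat"));         // True
      Console.WriteLine(StartsAtWordBoundary("concat"));        // False
      Console.WriteLine(FirstBracedToken("Hello {name}!"));     // {name}
      Console.WriteLine(CountBackslashes(@"C:\temp\log.txt"));  // 2
   }
}
```

The patterns are written as verbatim strings (@"...") so that the C# compiler passes each backslash through to the regular expression engine unchanged.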

Note

Character escapes are recognized in regular expression patterns but not in replacement patterns.
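
The difference between patterns and replacement patterns can be seen in a short sketch. The class name ReplacementEscapes, the helper ReplaceCommas, and the input "a,b" are hypothetical, chosen only for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class ReplacementEscapes
{
   // Replaces every comma in the input with the given replacement pattern.
   public static string ReplaceCommas(string input, string replacement)
   {
      return Regex.Replace(input, ",", replacement);
   }

   public static void Main()
   {
      // In a replacement pattern, \t is not an escape: a backslash and
      // the letter t are inserted as two separate characters.
      string literal = ReplaceCommas("a,b", @"\t");
      Console.WriteLine(literal);          // a\tb
      Console.WriteLine(literal.Length);   // 4

      // To insert a real tab, let the C# compiler produce the character.
      string tabbed = ReplaceCommas("a,b", "\t");
      Console.WriteLine(tabbed.Length);    // 3

      // In the pattern itself, \t is recognized and matches a tab.
      Console.WriteLine(Regex.Replace("a\tb", @"\t", ","));   // a,b
   }
}
```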

The following table lists the character escapes supported by regular expressions in the .NET Framework.

Character or sequence    Description
All characters except for the following: . $ ^ { [ ( | ) * + ? \    Characters other than those listed in the Character or sequence column have no special meaning in regular expressions; they match themselves. The characters included in the Character or sequence column are special regular expression language elements. To match them in a regular expression, they must be escaped or included in a positive character group. For example, the regular expression \$\d+ or [$]\d+ matches "$1200".
\a                       Matches a bell (alarm) character, \u0007.
\b                       In a [character_group] character class, matches a backspace, \u0008. (See Character Classes.) Outside a character class, \b is an anchor that matches a word boundary. (See Anchors.)
\t                       Matches a tab, \u0009.
\r                       Matches a carriage return, \u000D. Note that \r is not equivalent to the newline character, \n.
\v                       Matches a vertical tab, \u000B.
\f                       Matches a form feed, \u000C.
\n                       Matches a new line, \u000A.
\e                       Matches an escape, \u001B.
\nnn                     Matches an ASCII character, where nnn consists of two or three digits that represent the octal character code. For example, \040 represents a space character. This construct is interpreted as a backreference if it has only one digit (for example, \2) or if it corresponds to the number of a capturing group. (See Backreference Constructs.)
\xnn                     Matches an ASCII character, where nn is a two-digit hexadecimal character code.
\cX                      Matches an ASCII control character, where X is the letter of the control character. For example, \cC is CTRL-C.
\unnnn                   Matches a UTF-16 code unit whose value is nnnn hexadecimal. Note: The Perl 5 character escape that is used to specify Unicode is not supported by the .NET Framework. The Perl 5 character escape has the form \x{####…}, where #### is a series of hexadecimal digits. Instead, use \unnnn.
\                        When followed by a character that is not recognized as an escaped character, matches that character. For example, \* matches an asterisk (*) and is the same as \x2A.
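
As a quick check of several rows in the table, the following sketch matches the same character through different escapes. The class name EscapeTableDemo, its helper methods, and the sample inputs are assumptions made for this illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class EscapeTableDemo
{
   // True if the pattern matches exactly one space character.
   public static bool MatchesSpace(string pattern)
   {
      return Regex.IsMatch(" ", "^" + pattern + "$");
   }

   // Returns the first dollar amount found by the given pattern.
   public static string FindAmount(string pattern)
   {
      return Regex.Match("Total: $1200", pattern).Value;
   }

   public static void Main()
   {
      // Hexadecimal, octal, and UTF-16 escapes for the same space character.
      Console.WriteLine(MatchesSpace(@"\x20"));     // True
      Console.WriteLine(MatchesSpace(@"\040"));     // True
      Console.WriteLine(MatchesSpace(@"\u0020"));   // True

      // \cC matches CTRL-C (U+0003).
      Console.WriteLine(Regex.IsMatch("\u0003", @"^\cC$"));   // True

      // Inside a character class, \b matches a backspace (U+0008).
      Console.WriteLine(Regex.IsMatch("\u0008", @"^[\b]$"));  // True

      // A special character can be escaped or placed in a character group.
      Console.WriteLine(FindAmount(@"\$\d+"));      // $1200
      Console.WriteLine(FindAmount(@"[$]\d+"));     // $1200

      // \* and \x2A both match a literal asterisk.
      Console.WriteLine(Regex.IsMatch("2*3", @"\*"));     // True
      Console.WriteLine(Regex.IsMatch("2*3", @"\x2A"));   // True
   }
}
```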

The following example illustrates the use of character escapes in a regular expression. It parses a string that contains the names of the world's largest cities and their populations in 2009. Each city name is separated from its population by a tab (\t) or a vertical bar (| or \u007c). Individual cities and their populations are separated from each other by a carriage return and line feed.

using System;
using System.Text.RegularExpressions;

public class Example
{
   public static void Main()
   {
      // \G anchors each match at the end of the previous one; the two
      // groups capture the city name and its population.
      string delimited = @"\G(.+)[\t\u007c](.+)\r?\n";
      string input = "Mumbai, India|13,922,125\t\n" + 
                            "Shanghai, China\t13,831,900\n" + 
                            "Karachi, Pakistan|12,991,000\n" + 
                            "Delhi, India\t12,259,230\n" + 
                            "Istanbul, Turkey|11,372,613\n";
      Console.WriteLine("Population of the World's Largest Cities, 2009");
      Console.WriteLine("{0,-20} {1,10}", "City", "Population");
      foreach (Match match in Regex.Matches(input, delimited))
         Console.WriteLine("{0,-20} {1,10}", match.Groups[1].Value, 
                                            match.Groups[2].Value);
   }
}
// The example displays the following output:
//       Population of the World's Largest Cities, 2009
//       City                 Population
//       Mumbai, India        13,922,125
//       Shanghai, China      13,831,900
//       Karachi, Pakistan    12,991,000
//       Delhi, India         12,259,230
//       Istanbul, Turkey     11,372,613

The regular expression \G(.+)[\t\u007c](.+)\r?\n is interpreted as shown in the following table.

Pattern       Description
\G            Begin the match where the last match ended.
(.+)          Match any character one or more times. This is the first capturing group.
[\t\u007c]    Match a tab (\t) or a vertical bar (|).
(.+)          Match any character one or more times. This is the second capturing group.
\r?\n         Match zero or one occurrence of a carriage return followed by a new line.
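
One consequence of this breakdown is worth checking when the input uses carriage return and line feed pairs. Because the period matches every character except \n, the greedy second group can absorb the carriage return before \r? is tried. The sketch below illustrates this. The class name LineEndingDemo, the helper SecondGroup, and the alternative pattern that uses [^\r\n]+ are assumptions for illustration, not part of the original example.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LineEndingDemo
{
   // Returns the text captured by the second group of the first match.
   public static string SecondGroup(string input, string pattern)
   {
      return Regex.Match(input, pattern).Groups[2].Value;
   }

   public static void Main()
   {
      string original = @"\G(.+)[\t\u007c](.+)\r?\n";
      // Hypothetical variant: the second group excludes \r and \n.
      string variant = @"\G(.+)[\t\u007c]([^\r\n]+)\r?\n";
      string windowsLine = "Delhi, India\t12,259,230\r\n";

      // The greedy (.+) consumes the carriage return, so \r? matches nothing.
      string captured = SecondGroup(windowsLine, original);
      Console.WriteLine(captured.EndsWith("\r", StringComparison.Ordinal));   // True

      // With [^\r\n]+ the carriage return is left for \r? to consume.
      Console.WriteLine(SecondGroup(windowsLine, variant));   // 12,259,230
   }
}
```

Whether the trailing carriage return matters depends on how the captured value is used. Calling Trim on the result is another way to remove it.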

See also: Regular Expression Language - Quick Reference