Character Escapes
The backslash (\) in a regular expression indicates one of the following:
The character that follows it is a special character, as shown in the table in the following section. For example, \b is an anchor that indicates that a regular expression match should begin on a word boundary, \t represents a tab, and \x20 represents a space.
A character that otherwise would be interpreted as an unescaped language construct should be interpreted literally. For example, a brace ({) begins the definition of a quantifier, but a backslash followed by a brace (\{) indicates that the regular expression engine should match the brace. Similarly, a single backslash marks the beginning of an escaped language construct, but two backslashes (\\) indicate that the regular expression engine should match the backslash.
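As a minimal sketch of this second case, the following snippet matches a literal brace and a literal backslash. The class name `EscapeLiteralExample`, its helper methods, and the sample strings are illustrative assumptions, not part of the original text. `Regex.Escape` is the standard .NET method that inserts the backslashes for you.

```csharp
using System;
using System.Text.RegularExpressions;

public class EscapeLiteralExample
{
    // Returns true if the input contains a literal opening brace.
    public static bool ContainsBrace(string input)
    {
        // \{ tells the engine to match the brace itself.
        return Regex.IsMatch(input, @"\{");
    }

    // Returns true if the input contains a literal backslash.
    public static bool ContainsBackslash(string input)
    {
        // \\ in the pattern matches a single backslash in the input.
        return Regex.IsMatch(input, @"\\");
    }

    public static void Main()
    {
        Console.WriteLine(ContainsBrace("if (x) { y(); }"));   // True
        Console.WriteLine(ContainsBackslash(@"C:\Temp"));      // True
        Console.WriteLine(ContainsBackslash("C:/Temp"));       // False

        // Regex.Escape escapes metacharacters such as + so that the
        // text can be embedded in a pattern and matched literally.
        Console.WriteLine(Regex.Escape("1+1=2"));              // 1\+1=2
    }
}
```

Verbatim strings (`@"..."`) are used so that the C# compiler does not interpret the backslashes before the regular expression engine sees them.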
Note: Character escapes are recognized in regular expression patterns but not in replacement patterns.
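The difference between patterns and replacement patterns can be sketched as follows. The class name `ReplacementEscapeExample` and the helper `ReplaceTabs` are hypothetical names chosen for illustration. The snippet uses \t as a character escape in the pattern, then passes the two-character sequence \n as a replacement to show that it is inserted literally.

```csharp
using System;
using System.Text.RegularExpressions;

public class ReplacementEscapeExample
{
    // Replaces each tab in the input with the given replacement pattern.
    public static string ReplaceTabs(string input, string replacement)
    {
        // In the pattern, \t is a character escape that matches a tab.
        return Regex.Replace(input, @"\t", replacement);
    }

    public static void Main()
    {
        string input = "a\tb";   // C# string escape: a real tab character

        // The verbatim string @"\n" is a backslash followed by the letter n.
        // The replacement pattern does not treat it as a character escape,
        // so both characters are inserted as they are.
        string literal = ReplaceTabs(input, @"\n");
        Console.WriteLine(literal);          // a\nb
        Console.WriteLine(literal.Length);   // 4

        // To insert a real line feed, let the C# compiler produce it.
        string newline = ReplaceTabs(input, "\n");
        Console.WriteLine(newline.Length);   // 3
    }
}
```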
The following table lists the character escapes supported by regular expressions in the .NET Framework.
Character or sequence | Description |
---|---|
All characters except for the following: . $ ^ { [ ( | ) * + ? \ | These characters have no special meanings in regular expressions; they match themselves. |
\a | Matches a bell (alarm) character, \u0007. |
\b | In a [character_group] character class, matches a backspace, \u0008. (See Character Classes.) Outside a character class, \b is an anchor that matches a word boundary. (See Anchors in Regular Expressions.) |
\t | Matches a tab, \u0009. |
\r | Matches a carriage return, \u000D. Note that \r is not equivalent to the newline character, \n. |
\v | Matches a vertical tab, \u000B. |
\f | Matches a form feed, \u000C. |
\n | Matches a new line, \u000A. |
\e | Matches an escape, \u001B. |
\nnn | Matches an ASCII character, where nnn consists of two or three digits that represent the octal character code. For example, \040 represents a space character. However, this construct is interpreted as a backreference if it has only one digit (for example, \2) or if it corresponds to the number of a capturing group. (See Backreference Constructs.) |
\xnn | Matches an ASCII character, where nn is a two-digit hexadecimal character code. |
\cX | Matches an ASCII control character, where X is the letter of the control character. For example, \cC is CTRL-C. |
\unnnn | Matches a Unicode character, where nnnn is a four-digit hexadecimal UTF-16 code unit. Note: The Perl 5 character escape that is used to specify Unicode is not supported by the .NET Framework. The Perl 5 character escape has the form \x{####…}, where ####… is a series of hexadecimal digits. Instead, use \unnnn. |
\ | When followed by a character that is not recognized as an escaped character, matches that character. For example, \* matches an asterisk (*) and is the same as \x2A. |
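Several rows of the table describe different spellings of the same character. The sketch below checks a few of them. The class name `EscapeTableExample`, the helper `MatchesWhole`, and the sample inputs are assumptions made for illustration only.

```csharp
using System;
using System.Text.RegularExpressions;

public class EscapeTableExample
{
    // Returns true if the pattern matches the entire input string.
    public static bool MatchesWhole(string input, string pattern)
    {
        return Regex.IsMatch(input, "\\A(?:" + pattern + ")\\z");
    }

    public static void Main()
    {
        // Three ways to write a space: hexadecimal, octal, and Unicode.
        Console.WriteLine(MatchesWhole(" ", @"\x20"));     // True
        Console.WriteLine(MatchesWhole(" ", @"\040"));     // True
        Console.WriteLine(MatchesWhole(" ", @"\u0020"));   // True

        // An escaped asterisk is the same as its hexadecimal code.
        Console.WriteLine(MatchesWhole("*", @"\*"));       // True
        Console.WriteLine(MatchesWhole("*", @"\x2A"));     // True

        // \cC matches CTRL-C (U+0003).
        Console.WriteLine(MatchesWhole("\u0003", @"\cC")); // True

        // Inside a character class, \b is a backspace (U+0008).
        Console.WriteLine(MatchesWhole("\u0008", @"[\b]")); // True

        // Outside a character class, \b is a word boundary anchor.
        Console.WriteLine(Regex.IsMatch("a cat", @"\bcat"));   // True
        Console.WriteLine(Regex.IsMatch("concat", @"\bcat"));  // False

        // A single digit after the backslash is a backreference, not octal:
        // \2 repeats whatever the second capturing group matched.
        Console.WriteLine(MatchesWhole("abb", @"(a)(b)\2"));   // True

        // \unnnn is one UTF-16 code unit, so a character outside the Basic
        // Multilingual Plane is written as a surrogate pair.
        Console.WriteLine(MatchesWhole("\U0001F600", @"\uD83D\uDE00")); // True
    }
}
```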
The following example illustrates the use of character escapes in a regular expression. It parses a string that contains the names of the world's largest cities and their populations in 2009. Each city name is separated from its population by a tab (\t) or a vertical bar (| or \u007c). Individual cities and their populations are separated from each other by a line feed, optionally preceded by a carriage return.

    using System;
    using System.Text.RegularExpressions;

    public class Example
    {
       public static void Main()
       {
          // City and population are separated by a tab or a vertical bar;
          // each record ends with an optional carriage return and a line feed.
          string delimited = @"\G(.+)[\t\u007c](.+)\r?\n";
          string input = "Mumbai, India|13,922,125\n" +
                         "Shanghai, China\t13,831,900\n" +
                         "Karachi, Pakistan|12,991,000\n" +
                         "Delhi, India\t12,259,230\n" +
                         "Istanbul, Turkey|11,372,613\n";

          Console.WriteLine("Population of the World's Largest Cities, 2009");
          Console.WriteLine();
          Console.WriteLine("{0,-20} {1,10}", "City", "Population");
          Console.WriteLine();

          // Group 1 holds the city name, group 2 the population.
          foreach (Match match in Regex.Matches(input, delimited))
             Console.WriteLine("{0,-20} {1,10}", match.Groups[1].Value,
                                                match.Groups[2].Value);
       }
    }
    // The example displays the following output:
    //       Population of the World's Largest Cities, 2009
    //
    //       City                 Population
    //
    //       Mumbai, India        13,922,125
    //       Shanghai, China      13,831,900
    //       Karachi, Pakistan    12,991,000
    //       Delhi, India         12,259,230
    //       Istanbul, Turkey     11,372,613

The regular expression \G(.+)[\t\u007c](.+)\r?\n is interpreted as shown in the following table.
Pattern | Description |
---|---|
\G | Begin the match where the last match ended. |
(.+) | Match any character except a new line (\n) one or more times. This is the first capturing group. |
[\t\u007c] | Match a tab (\t) or a vertical bar (|). |
(.+) | Match any character except a new line (\n) one or more times. This is the second capturing group. |
\r?\n | Match zero or one occurrence of a carriage return followed by a new line. |