Export (0) Print
Expand All

Grouping Constructs 

Grouping constructs delineate subexpressions of a regular expression and typically capture substrings of an input string. The following table describes the regular expression grouping constructs.

Grouping construct Description

(subexpression)

Captures the matched subexpression (or noncapturing group; for more information, see the ExplicitCapture option in Regular Expression Options). Captures using () are numbered automatically based on the order of the opening parenthesis, starting from one. The first capture, capture element number zero, is the text matched by the whole regular expression pattern.

(?<name> subexpression)

Captures the matched subexpression into a group name or number name. The string used for name must not contain any punctuation and cannot begin with a number. You can use single quotes instead of angle brackets; for example, (?'name').

(?<name1-name2> subexpression)

(Balancing group definition.) Deletes the definition of the previously defined group name2 and stores in group name1 the interval between the previously defined name2 group and the current group. If no group name2 is defined, the match backtracks. Because deleting the last definition of name2 reveals the previous definition of name2, this construct allows the stack of captures for group name2 to be used as a counter for keeping track of nested constructs such as parentheses. In this construct, name1 is optional. You can use single quotes instead of angle brackets; for example, (?'name1-name2').

For more information, see the example in this topic.

(?: subexpression)

(Noncapturing group.) Does not capture the substring matched by the subexpression.

(?imnsx-imnsx: subexpression)

Applies or disables the specified options within the subexpression. For example, (?i-s: ) turns on case insensitivity and disables single-line mode. For more information, see Regular Expression Options.

(?= subexpression)

(Zero-width positive lookahead assertion.) Continues match only if the subexpression matches at this position on the right. For example, \w+(?=\d) matches a word followed by a digit, without matching the digit. This construct does not backtrack.

(?! subexpression)

(Zero-width negative lookahead assertion.) Continues match only if the subexpression does not match at this position on the right. For example, \b(?!un)\w+\b matches words that do not begin with un.

(?<= subexpression)

(Zero-width positive lookbehind assertion.) Continues match only if the subexpression matches at this position on the left. For example, (?<=19)99 matches instances of 99 that follow 19. This construct does not backtrack.

(?<! subexpression)

(Zero-width negative lookbehind assertion.) Continues match only if the subexpression does not match at the position on the left.

(?> subexpression)

(Nonbacktracking subexpression (also known as a "greedy" subexpression.)) The subexpression is fully matched once, and then does not participate piecemeal in backtracking. (That is, the subexpression matches only strings that would be matched by the subexpression alone.)

By default, if a match does not succeed, backtracking searches for other possible matches. If you know backtracking cannot succeed, you can use a nonbacktracking subexpression to prevent unnecessary searching, which improves performance.

Named captures are numbered sequentially, based on the left-to-right order of the opening parenthesis (like unnamed captures), but the numbering of named captures starts after all unnamed captures have been counted. For example, the pattern ((?<One>abc)/d+)?(?<Two>xyz)(.*) produces the following capturing groups by number and name. (The first capture (number 0) always refers to the entire pattern).

Number Name Pattern

0

0 (default name)

((?<One>abc)/d+)?(?<Two>xyz)(.*)

1

1 (default name)

((?<One>abc)/d+)

2

2 (default name)

(.*)

3

One

(?<One>abc)

4

Two

(?<Two>xyz)

The following code example demonstrates using a balancing group definition to match left and right angle brackets (<>) in an input string. The capture collections of the Open and Close groups in the example are used like a stack to track matching pairs of angle brackets: each captured left angle bracket is pushed into the capture collection of the Open group; each captured right angle bracket is pushed into the capture collection of the Close group; and the balancing group definition ensures there is a matching right angle bracket for each left angle bracket.

// This code example demonstrates using the balancing group definition feature of 
// regular expressions to match balanced left angle bracket (<) and right angle 
// bracket (>) characters in a string. 

using System;
using System.Text.RegularExpressions;

class Sample 
{
    public static void Main() 
    {
/*
    The following expression matches all balanced left and right angle brackets(<>). 
    The expression:
    1)   Matches and discards zero or more non-angle bracket characters.
    2)   Matches zero or more of:
    2a)  One or more of:
    2a1) A group named "Open" that matches a left angle bracket, followed by zero 
         or more non-angle bracket characters. 
         "Open" essentially counts the number of left angle brackets.
    2b) One or more of:
    2b1) A balancing group named "Close" that matches a right angle bracket, 
         followed by zero or more non-angle bracket characters. 
         "Close" essentially counts the number of right angle brackets.
    3)   If the "Open" group contains an unaccounted for left angle bracket, the 
        entire regular expression fails.
*/
    string pattern = "^[^<>]*" +
                     "(" + 
                       "((?'Open'<)[^<>]*)+" +
                       "((?'Close-Open'>)[^<>]*)+" +
                     ")*" +
                     "(?(Open)(?!))$";
    string input = "<abc><mno<xyz>>";
//
    Match m = Regex.Match(input, pattern);
    if (m.Success == true)
        Console.WriteLine("Input: \"{0}\" \nMatch: \"{1}\"", input, m);
    else
        Console.WriteLine("Match failed.");
    }
}

/*
This code example produces the following results:

Input: "<abc><mno<xyz>>"
Match: "<abc><mno<xyz>>"

*/

Community Additions

ADD
Show:
© 2014 Microsoft