Quantifiers in Regular Expressions

Quantifiers specify how many instances of a character, group, or character class must be present in the input for a match to be found. The following table lists the quantifiers supported by the .NET Framework.

Greedy quantifier   Lazy quantifier   Description
*                   *?                Match zero or more times.
+                   +?                Match one or more times.
?                   ??                Match zero or one time.
{n}                 {n}?              Match exactly n times.
{n,}                {n,}?             Match at least n times.
{n,m}               {n,m}?            Match from n to m times.

The quantities n and m are integer constants. Ordinarily, quantifiers are greedy; they cause the regular expression engine to match as many occurrences of particular patterns as possible. Appending the ? character to a quantifier makes it lazy; it causes the regular expression engine to match as few occurrences as possible. For a complete description of the difference between greedy and lazy quantifiers, see the section Greedy and Lazy Quantifiers later in this topic.

Important

Nesting quantifiers (for example, as the regular expression pattern (a*)* does) can increase the number of comparisons that the regular expression engine must perform, as an exponential function of the number of characters in the input string. For more information about this behavior and its workarounds, see Backtracking in Regular Expressions.
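
As a rough illustration of the note above, the following sketch compares a nested-quantifier pattern with an equivalent pattern that has no nesting. The patterns ^(a+)+$ and ^a+$, the helper Accepts, and the class name NestedQuantifierExample are illustrative choices and do not come from this topic. Both patterns accept the same strings, but the nested form gives the engine many more ways to divide a run of "a" characters between the inner and outer quantifiers, which is what can drive up the number of comparisons when a match ultimately fails. The inputs are kept short so the example finishes quickly.

```csharp
using System;
using System.Text.RegularExpressions;

public class NestedQuantifierExample
{
   // Hypothetical helper: true if the pattern matches the input.
   public static bool Accepts(string pattern, string input)
   {
      return Regex.IsMatch(input, pattern);
   }

   public static void Main()
   {
      string nested = @"^(a+)+$";   // a quantifier applied to a quantified group
      string flat = @"^a+$";        // accepts the same strings without nesting

      foreach (string input in new[] { "aaaa", "aaaa!", "" })
         Console.WriteLine("'{0}': nested={1}, flat={2}", input,
                           Accepts(nested, input), Accepts(flat, input));
   }
}
// The example displays the following output:
//       'aaaa': nested=True, flat=True
//       'aaaa!': nested=False, flat=False
//       '': nested=False, flat=False
```

When a nested quantifier can be rewritten as a single quantifier like this, the flat form is usually the safer choice for untrusted input.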

The following sections list the quantifiers supported by .NET Framework regular expressions.

Note

If the *, +, ?, {, and } characters are encountered in a regular expression pattern, the regular expression engine interprets them as quantifiers or part of quantifier constructs unless they are included in a character class. To interpret these as literal characters outside a character class, you must escape them by preceding them with a backslash. For example, the string \* in a regular expression pattern is interpreted as a literal asterisk ("*") character.
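
A small sketch of the escaping rule in the note above. The input strings, the helper CountMatches, and the class name EscapeExample are illustrative. The example counts literal asterisks by using an escaped asterisk and by using a character class, and then uses the Regex.Escape method to build a pattern that matches a string literally.

```csharp
using System;
using System.Text.RegularExpressions;

public class EscapeExample
{
   // Hypothetical helper: number of matches of pattern in input.
   public static int CountMatches(string input, string pattern)
   {
      return Regex.Matches(input, pattern).Count;
   }

   public static void Main()
   {
      string input = "2*3*4";

      // \* is a literal asterisk; without the backslash, * would be a quantifier.
      Console.WriteLine(CountMatches(input, @"\*"));

      // Inside a character class, * needs no escape.
      Console.WriteLine(CountMatches(input, @"[*]"));

      // Regex.Escape escapes the quantifier characters + and ? for us.
      string literal = Regex.Escape("1+1=2?");
      Console.WriteLine(literal);
      Console.WriteLine(Regex.IsMatch("Is 1+1=2?", literal));
   }
}
// The example displays the following output:
//       2
//       2
//       1\+1=2\?
//       True
```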

The * quantifier matches the preceding element zero or more times. It is equivalent to the {0,} quantifier. * is a greedy quantifier whose lazy equivalent is *?.

The following example illustrates this regular expression. Of the nine groups of digits in the input string, five match the pattern and four (95, 929, 9219, and 9919) do not.

string pattern = @"\b91*9*\b";   
string input = "99 95 919 929 9119 9219 999 9919 91119";
foreach (Match match in Regex.Matches(input, pattern))
   Console.WriteLine("'{0}' found at position {1}.", match.Value, match.Index);

// The example displays the following output:    
//       '99' found at position 0. 
//       '919' found at position 6. 
//       '9119' found at position 14. 
//       '999' found at position 24. 
//       '91119' found at position 33.

The regular expression pattern is defined as shown in the following table.

Pattern   Description
\b        Start at a word boundary.
91*       Match a "9" followed by zero or more "1" characters.
9*        Match zero or more "9" characters.
\b        End at a word boundary.

The + quantifier matches the preceding element one or more times. It is equivalent to {1,}. + is a greedy quantifier whose lazy equivalent is +?.

For example, the regular expression \ban+\w*?\b tries to match entire words that begin with the letter a followed by one or more instances of the letter n. The following example illustrates this regular expression. The regular expression matches the words an, annual, announcement, and antique, and correctly fails to match autumn and all.

string pattern = @"\ban+\w*?\b";

string input = "Autumn is a great time for an annual announcement to all antique collectors.";
foreach (Match match in Regex.Matches(input, pattern, RegexOptions.IgnoreCase))
   Console.WriteLine("'{0}' found at position {1}.", match.Value, match.Index);

// The example displays the following output:    
//       'an' found at position 27. 
//       'annual' found at position 30. 
//       'announcement' found at position 37. 
//       'antique' found at position 57.      

The regular expression pattern is defined as shown in the following table.

Pattern   Description
\b        Start at a word boundary.
an+       Match an "a" followed by one or more "n" characters.
\w*?      Match a word character zero or more times, but as few times as possible.
\b        End at a word boundary.

The ? quantifier matches the preceding element zero or one time. It is equivalent to {0,1}. ? is a greedy quantifier whose lazy equivalent is ??.

For example, the regular expression \ban?\b tries to match entire words that begin with the letter a followed by zero or one instance of the letter n. In other words, it tries to match the words a and an. The following example illustrates this regular expression.

string pattern = @"\ban?\b";
string input = "An amiable animal with a large snount and an animated nose.";
foreach (Match match in Regex.Matches(input, pattern, RegexOptions.IgnoreCase))
   Console.WriteLine("'{0}' found at position {1}.", match.Value, match.Index);

// The example displays the following output:    
//        'An' found at position 0. 
//        'a' found at position 23. 
//        'an' found at position 42.

The regular expression pattern is defined as shown in the following table.

Pattern   Description
\b        Start at a word boundary.
an?       Match an "a" followed by zero or one "n" character.
\b        End at a word boundary.
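
The equivalences stated for the *, +, and ? quantifiers ({0,}, {1,}, and {0,1}, respectively) can be checked with a short sketch like the following. The helper MatchList, the class name EquivalenceExample, and the word list are illustrative; the digit string is the one used in the * example.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public class EquivalenceExample
{
   // Hypothetical helper: joins all matched values with "|" for easy comparison.
   public static string MatchList(string input, string pattern)
   {
      return string.Join("|", Regex.Matches(input, pattern)
                                   .Cast<Match>()
                                   .Select(m => m.Value));
   }

   public static void Main()
   {
      string digits = "99 95 919 929 9119 9219 999 9919 91119";
      string words = "a an ann annual all";

      // * written as {0,}
      Console.WriteLine(MatchList(digits, @"\b91*9*\b"));
      Console.WriteLine(MatchList(digits, @"\b91{0,}9{0,}\b"));

      // + written as {1,}
      Console.WriteLine(MatchList(words, @"\ban+\b"));
      Console.WriteLine(MatchList(words, @"\ban{1,}\b"));

      // ? written as {0,1}
      Console.WriteLine(MatchList(words, @"\ban?\b"));
      Console.WriteLine(MatchList(words, @"\ban{0,1}\b"));
   }
}
// The example displays the following output:
//       99|919|9119|999|91119
//       99|919|9119|999|91119
//       an|ann
//       an|ann
//       a|an
//       a|an
```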

The {n} quantifier matches the preceding element exactly n times, where n is any integer. {n} is a greedy quantifier whose lazy equivalent is {n}?.

For example, the regular expression \b\d+\,\d{3}\b tries to match a word boundary followed by one or more decimal digits followed by a comma followed by three decimal digits followed by a word boundary. The following example illustrates this regular expression.

string pattern = @"\b\d+\,\d{3}\b";
string input = "Sales totaled 103,524 million in January, " + 
                      "106,971 million in February, but only " + 
                      "943 million in March.";
foreach (Match match in Regex.Matches(input, pattern))
   Console.WriteLine("'{0}' found at position {1}.", match.Value, match.Index);

//  The example displays the following output:    
//        '103,524' found at position 14. 
//        '106,971' found at position 42.

The regular expression pattern is defined as shown in the following table.

Pattern   Description
\b        Start at a word boundary.
\d+       Match one or more decimal digits.
\,        Match a comma character.
\d{3}     Match three decimal digits.
\b        End at a word boundary.

The {n,} quantifier matches the preceding element at least n times, where n is any integer. {n,} is a greedy quantifier whose lazy equivalent is {n,}?.

For example, the regular expression \b\d{2,}\b\D+ tries to match a word boundary followed by at least two digits followed by a word boundary and one or more non-digit characters. The following example illustrates this regular expression. The regular expression fails to match the phrase "7 days" because it contains just one decimal digit, but it successfully matches the phrases "10 weeks, " and "300 years".

string pattern = @"\b\d{2,}\b\D+";   
string input = "7 days, 10 weeks, 300 years";
foreach (Match match in Regex.Matches(input, pattern))
   Console.WriteLine("'{0}' found at position {1}.", match.Value, match.Index);

//  The example displays the following output: 
//        '10 weeks, ' found at position 8. 
//        '300 years' found at position 18.

The regular expression pattern is defined as shown in the following table.

Pattern   Description
\b        Start at a word boundary.
\d{2,}    Match at least two decimal digits.
\b        Match a word boundary.
\D+       Match at least one non-digit character.

The {n,m} quantifier matches the preceding element at least n times, but no more than m times, where n and m are integers. {n,m} is a greedy quantifier whose lazy equivalent is {n,m}?.

In the following example, the regular expression (00\s){2,4} tries to match between two and four occurrences of two zero digits followed by a space. Note that the final portion of the input string contains five pairs of zeros rather than the maximum of four. However, only the first four pairs, each followed by a space, match the regular expression pattern; the match stops just before the fifth pair of zeros.

string pattern = @"(00\s){2,4}";
string input = "0x00 FF 00 00 18 17 FF 00 00 00 21 00 00 00 00 00";
foreach (Match match in Regex.Matches(input, pattern))
   Console.WriteLine("'{0}' found at position {1}.", match.Value, match.Index);

//  The example displays the following output: 
//        '00 00 ' found at position 8. 
//        '00 00 00 ' found at position 23. 
//        '00 00 00 00 ' found at position 35.
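
To see how many times the quantified group actually repeated in each match, one option is to inspect the Captures collection of the group. The following sketch reuses the pattern and input from the example above; the helper Repetitions and the class name CaptureCountExample are illustrative.

```csharp
using System;
using System.Text.RegularExpressions;

public class CaptureCountExample
{
   // Hypothetical helper: number of times group 1 was captured in a match.
   public static int Repetitions(Match match)
   {
      return match.Groups[1].Captures.Count;
   }

   public static void Main()
   {
      string pattern = @"(00\s){2,4}";
      string input = "0x00 FF 00 00 18 17 FF 00 00 00 21 00 00 00 00 00";
      foreach (Match match in Regex.Matches(input, pattern))
         Console.WriteLine("'{0}' repeats the group {1} times.",
                           match.Value, Repetitions(match));
   }
}
// The example displays the following output:
//       '00 00 ' repeats the group 2 times.
//       '00 00 00 ' repeats the group 3 times.
//       '00 00 00 00 ' repeats the group 4 times.
```

Each count falls within the 2 to 4 range that the {2,4} quantifier allows.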

The *? quantifier matches the preceding element zero or more times, but as few times as possible. It is the lazy counterpart of the greedy quantifier *.

In the following example, the regular expression \b\w*?oo\w*?\b matches all words that contain the string oo.

string pattern = @"\b\w*?oo\w*?\b";
string input = "woof root root rob oof woo woe";
foreach (Match match in Regex.Matches(input, pattern, RegexOptions.IgnoreCase))
   Console.WriteLine("'{0}' found at position {1}.", match.Value, match.Index);

//  The example displays the following output: 
//        'woof' found at position 0. 
//        'root' found at position 5. 
//        'root' found at position 10. 
//        'oof' found at position 19. 
//        'woo' found at position 23.

The regular expression pattern is defined as shown in the following table.

Pattern

Description

\b

Start at a word boundary.

\w*?

Match zero or more word characters, but as few characters as possible.

oo

Match the string "oo".

\w*?

Match zero or more word characters, but as few characters as possible.

\b

End on a word boundary.

The +? quantifier matches the preceding element one or more times, but as few times as possible. It is the lazy counterpart of the greedy quantifier +.

For example, the regular expression \b\w+?\b matches one or more characters separated by word boundaries. The following example illustrates this regular expression.

string pattern = @"\b\w+?\b";
string input = "Aa Bb Cc Dd Ee Ff";
foreach (Match match in Regex.Matches(input, pattern))
   Console.WriteLine("'{0}' found at position {1}.", match.Value, match.Index);

//  The example displays the following output: 
//        'Aa' found at position 0. 
//        'Bb' found at position 3. 
//        'Cc' found at position 6. 
//        'Dd' found at position 9. 
//        'Ee' found at position 12. 
//        'Ff' found at position 15.
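
In the example above, the trailing \b forces the lazy quantifier to keep expanding until it reaches the end of each word, so the result is the same as it would be with the greedy + quantifier. The following sketch, which uses an illustrative helper MatchList, class name LazyPlusExample, and a shortened input, removes the word boundaries to make the difference visible.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public class LazyPlusExample
{
   // Hypothetical helper: joins all matched values with "|".
   public static string MatchList(string input, string pattern)
   {
      return string.Join("|", Regex.Matches(input, pattern)
                                   .Cast<Match>()
                                   .Select(m => m.Value));
   }

   public static void Main()
   {
      string input = "Aa Bb";
      // Lazy, but anchored by \b on both sides: whole words.
      Console.WriteLine(MatchList(input, @"\b\w+?\b"));
      // Lazy with nothing after it: stops after one character.
      Console.WriteLine(MatchList(input, @"\w+?"));
      // Greedy: takes as many word characters as it can.
      Console.WriteLine(MatchList(input, @"\w+"));
   }
}
// The example displays the following output:
//       Aa|Bb
//       A|a|B|b
//       Aa|Bb
```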

The ?? quantifier matches the preceding element zero or one time, but as few times as possible. It is the lazy counterpart of the greedy quantifier ?.

For example, the regular expression ^\s*(System.)??Console.Write(Line)??\(?? attempts to match the strings "Console.Write" or "Console.WriteLine". The string can also include "System." before "Console", and it can be followed by an opening parenthesis. The string must be at the beginning of a line, although it can be preceded by white space. The following example illustrates this regular expression.

string pattern = @"^\s*(System.)??Console.Write(Line)??\(??";
string input = "System.Console.WriteLine(\"Hello!\")\n" + 
                      "Console.Write(\"Hello!\")\n" + 
                      "Console.WriteLine(\"Hello!\")\n" + 
                      "Console.ReadLine()\n" + 
                      "   Console.WriteLine";
foreach (Match match in Regex.Matches(input, pattern, 
                                      RegexOptions.IgnorePatternWhitespace | 
                                      RegexOptions.IgnoreCase | 
                                      RegexOptions.Multiline))
   Console.WriteLine("'{0}' found at position {1}.", match.Value, match.Index);

//  The example displays the following output: 
//        'System.Console.Write' found at position 0. 
//        'Console.Write' found at position 35. 
//        'Console.Write' found at position 59. 
//        '   Console.Write' found at position 106.

The regular expression pattern is defined as shown in the following table.

Pattern         Description
^               Match the start of the input string or, because the RegexOptions.Multiline option is specified, the start of a line.
\s*             Match zero or more white-space characters.
(System.)??     Match zero or one occurrence of the string "System.".
Console.Write   Match the string "Console.Write".
(Line)??        Match zero or one occurrence of the string "Line".
\(??            Match zero or one occurrence of the opening parenthesis.
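
The output above stops at "Console.Write" because the lazy ?? quantifiers prefer to match zero times when the rest of the pattern allows it. The following sketch contrasts the greedy and lazy forms on a single line of input. The class name LazyOptionalExample and the helper FirstMatch are illustrative, and the patterns are simplified variants of the one above in which the period is escaped so that it matches only a literal period.

```csharp
using System;
using System.Text.RegularExpressions;

public class LazyOptionalExample
{
   // Hypothetical helper: value of the first match, or "" if there is none.
   public static string FirstMatch(string input, string pattern)
   {
      return Regex.Match(input, pattern).Value;
   }

   public static void Main()
   {
      string input = "Console.WriteLine(\"Hello!\")";

      // Greedy ?: takes the optional "Line" and "(" when they are present.
      Console.WriteLine(FirstMatch(input, @"Console\.Write(Line)?\(?"));

      // Lazy ??: skips the optional parts because nothing after them requires them.
      Console.WriteLine(FirstMatch(input, @"Console\.Write(Line)??\(??"));
   }
}
// The example displays the following output:
//       Console.WriteLine(
//       Console.Write
```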

The {n}? quantifier matches the preceding element exactly n times, where n is any integer. It is the lazy counterpart of the greedy quantifier {n}.

In the following example, the regular expression \b(\w{3,}?\.){2}?\w{3,}?\b is used to identify a Web site address. Note that it matches "www.microsoft.com" and "msdn.microsoft.com", but does not match "mywebsite" or "mycompany.com".

string pattern = @"\b(\w{3,}?\.){2}?\w{3,}?\b";
string input = "www.microsoft.com msdn.microsoft.com mywebsite mycompany.com";
foreach (Match match in Regex.Matches(input, pattern))
   Console.WriteLine("'{0}' found at position {1}.", match.Value, match.Index);

//  The example displays the following output: 
//        'www.microsoft.com' found at position 0. 
//        'msdn.microsoft.com' found at position 18.

The regular expression pattern is defined as shown in the following table.

Pattern           Description
\b                Start at a word boundary.
(\w{3,}?\.)       Match at least 3 word characters, but as few characters as possible, followed by a dot or period character. This is the first capturing group.
(\w{3,}?\.){2}?   Match the pattern in the first group two times, but as few times as possible.
\w{3,}?           Match at least 3 word characters, but as few characters as possible.
\b                End the match on a word boundary.

The {n,}? quantifier matches the preceding element at least n times, where n is any integer, but as few times as possible. It is the lazy counterpart of the greedy quantifier {n,}.

See the example for the {n}? quantifier in the previous section for an illustration. The regular expression in that example uses the {n,}? quantifier to match a string that has at least three characters followed by a period.
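
To make the role of {n,}? in that pattern more concrete, the following sketch lists what the first capturing group captured for a few host names. The helper GroupCaptures and the class name LazyOpenEndedExample are illustrative. Although \w{3,}? is lazy, it still expands beyond three characters whenever the period that follows it requires it.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public class LazyOpenEndedExample
{
   // Hypothetical helper: values captured by group 1 in the first match
   // (an empty array if the pattern does not match).
   public static string[] GroupCaptures(string input, string pattern)
   {
      Match match = Regex.Match(input, pattern);
      return match.Groups[1].Captures.Cast<Capture>()
                  .Select(c => c.Value).ToArray();
   }

   public static void Main()
   {
      string pattern = @"\b(\w{3,}?\.){2}?\w{3,}?\b";
      foreach (string host in new[] { "www.microsoft.com",
                                      "msdn.microsoft.com",
                                      "mycompany.com" })
         Console.WriteLine("{0}: [{1}]", host,
                           string.Join(" ", GroupCaptures(host, pattern)));
   }
}
// The example displays the following output:
//       www.microsoft.com: [www. microsoft.]
//       msdn.microsoft.com: [msdn. microsoft.]
//       mycompany.com: []
```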

The {n,m}? quantifier matches the preceding element between n and m times, where n and m are integers, but as few times as possible. It is the lazy counterpart of the greedy quantifier {n,m}.

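In the following example, the regular expression \b[A-Z](\w*?\s*?){1,10}[.!?] matches sentences that contain between one and ten words. It matches all the sentences in the input string except for one sentence that contains 18 words. Note that in this pattern the lazy quantifiers are the *? quantifiers inside the group; the {1,10} quantifier that follows the group is written without a trailing ?.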

string pattern = @"\b[A-Z](\w*?\s*?){1,10}[.!?]";
string input = "Hi. I am writing a short note. Its purpose is " + 
                      "to test a regular expression that attempts to find " + 
                      "sentences with ten or fewer words. Most sentences " + 
                      "in this note are short.";
foreach (Match match in Regex.Matches(input, pattern))
   Console.WriteLine("'{0}' found at position {1}.", match.Value, match.Index);

//  The example displays the following output: 
//        'Hi.' found at position 0. 
//        'I am writing a short note.' found at position 4. 
//        'Most sentences in this note are short.' found at position 132.

The regular expression pattern is defined as shown in the following table.

Pattern      Description
\b           Start at a word boundary.
[A-Z]        Match an uppercase character from A to Z.
(\w*?\s*?)   Match zero or more word characters, but as few as possible, followed by zero or more white-space characters, but as few as possible. This is the first capturing group.
{1,10}       Match the previous pattern between 1 and 10 times.
[.!?]        Match any one of the punctuation characters ".", "!", or "?".
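
For a more direct look at the {n,m}? quantifier itself, the following sketch applies the greedy and lazy forms to short strings of digits. The inputs, the helper MatchList, and the class name LazyRangeExample are illustrative. The last call suggests that a lazy quantifier still takes more than its minimum when the rest of the pattern (here, a comma) cannot match otherwise.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public class LazyRangeExample
{
   // Hypothetical helper: joins all matched values with "|".
   public static string MatchList(string input, string pattern)
   {
      return string.Join("|", Regex.Matches(input, pattern)
                                   .Cast<Match>()
                                   .Select(m => m.Value));
   }

   public static void Main()
   {
      // Greedy {1,3}: up to three digits per match.
      Console.WriteLine(MatchList("12345", @"\d{1,3}"));
      // Lazy {1,3}?: one digit per match, because nothing forces more.
      Console.WriteLine(MatchList("12345", @"\d{1,3}?"));
      // Lazy {1,3}? followed by a comma: expands to two digits to reach the comma.
      Console.WriteLine(MatchList("12,345", @"\d{1,3}?,"));
   }
}
// The example displays the following output:
//       123|45
//       1|2|3|4|5
//       12,
```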

Greedy and Lazy Quantifiers

A number of the quantifiers have two versions:

  • A greedy version.

    A greedy quantifier tries to match an element as many times as possible.

  • A non-greedy (or lazy) version.

    A non-greedy quantifier tries to match an element as few times as possible. You can turn a greedy quantifier into a lazy quantifier by simply adding a ?.

Consider a simple regular expression that is intended to extract the last four digits from a string of numbers such as a credit card number. The version of the regular expression that uses the * greedy quantifier is \b.*([0-9]{4})\b. However, if a string contains two numbers, this regular expression matches the last four digits of the second number only, as the following example shows.

string greedyPattern = @"\b.*([0-9]{4})\b";
string input1 = "1112223333 3992991999";
foreach (Match match in Regex.Matches(input1, greedyPattern))
   Console.WriteLine("Account ending in ******{0}.", match.Groups[1].Value);

// The example displays the following output: 
//       Account ending in ******1999.

The regular expression fails to match the first number because the * quantifier tries to match the previous element as many times as possible in the entire string, and so it finds its match at the end of the string.

This is not the desired behavior. Instead, you can use the *? lazy quantifier to extract digits from both numbers, as the following example shows.

string lazyPattern = @"\b.*?([0-9]{4})\b";
string input2 = "1112223333 3992991999";
foreach (Match match in Regex.Matches(input2, lazyPattern))
   Console.WriteLine("Account ending in ******{0}.", match.Groups[1].Value);

// The example displays the following output: 
//       Account ending in ******3333. 
//       Account ending in ******1999.

In most cases, regular expressions with greedy and lazy quantifiers return the same matches. They most commonly return different results when they are used with the wildcard (.) metacharacter, which by default matches any character except a newline.
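
A common place where the wildcard makes greedy and lazy quantifiers diverge is delimited text. The following sketch uses an illustrative input containing angle-bracket tags, an illustrative helper MatchList, and the class name WildcardExample. It also shows a greedy negated character class, which is one frequently used alternative that here produces the same matches as the lazy form.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public class WildcardExample
{
   // Hypothetical helper: joins all matched values with "|".
   public static string MatchList(string input, string pattern)
   {
      return string.Join("|", Regex.Matches(input, pattern)
                                   .Cast<Match>()
                                   .Select(m => m.Value));
   }

   public static void Main()
   {
      string input = "<b>bold</b> and <i>italic</i>";

      // Greedy .+ runs to the last ">" in the string: one long match.
      Console.WriteLine(MatchList(input, @"<.+>"));
      // Lazy .+? stops at the first ">" after each "<".
      Console.WriteLine(MatchList(input, @"<.+?>"));
      // Greedy, but the negated class cannot cross a ">".
      Console.WriteLine(MatchList(input, @"<[^>]+>"));
   }
}
// The example displays the following output:
//       <b>bold</b> and <i>italic</i>
//       <b>|</b>|<i>|</i>
//       <b>|</b>|<i>|</i>
```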

The quantifiers *, +, and {n,m} and their lazy counterparts never repeat after an empty match when the minimum number of captures has been found. This rule prevents quantifiers from entering infinite loops on empty subexpression matches when the maximum number of possible group captures is infinite or near infinite.

For example, the following code shows the result of a call to the Regex.Match method with the regular expression pattern (a?)*, which matches zero or one "a" character zero or more times. Note that the single capturing group captures each "a" as well as String.Empty, but that there is no second empty match, because the first empty match causes the quantifier to stop repeating.

using System;
using System.Text.RegularExpressions;

public class Example
{
   public static void Main()
   {
      string pattern = "(a?)*";
      string input = "aaabbb";
      Match match = Regex.Match(input, pattern);
      Console.WriteLine("Match: '{0}' at index {1}", 
                        match.Value, match.Index);
      if (match.Groups.Count > 1) {
         GroupCollection groups = match.Groups;
         for (int grpCtr = 1; grpCtr <= groups.Count - 1; grpCtr++) {
            Console.WriteLine("   Group {0}: '{1}' at index {2}", 
                              grpCtr, 
                              groups[grpCtr].Value,
                              groups[grpCtr].Index);
            int captureCtr = 0;
            foreach (Capture capture in groups[grpCtr].Captures) {
               captureCtr++;
               Console.WriteLine("      Capture {0}: '{1}' at index {2}", 
                                 captureCtr, capture.Value, capture.Index);
            }
         } 
      }   
   }
}
// The example displays the following output: 
//       Match: 'aaa' at index 0 
//          Group 1: '' at index 3 
//             Capture 1: 'a' at index 0 
//             Capture 2: 'a' at index 1 
//             Capture 3: 'a' at index 2 
//             Capture 4: '' at index 3

To see the practical difference between a capturing group that defines a minimum and a maximum number of captures and one that defines a fixed number of captures, consider the regular expression patterns (a\1|(?(1)\1)){0,2} and (a\1|(?(1)\1)){2}. Both regular expressions consist of a single capturing group, which is defined as shown in the following table.

Pattern   Description
(a\1      Either match "a" along with the value of the first captured group …
|(?(1)    … or test whether the first captured group has been defined. (Note that the (?(1) construct does not define a capturing group.)
\1))      If the first captured group exists, match its value. If the group does not exist, the group will match String.Empty.

The first regular expression tries to match this pattern between zero and two times; the second, exactly two times. Because the first pattern reaches its minimum number of captures with its first capture of String.Empty, it never repeats to try to match a\1; the {0,2} quantifier allows only empty matches in the last iteration. In contrast, the second regular expression does match "a" because it evaluates a\1 a second time; the minimum number of iterations, 2, forces the engine to repeat after an empty match.

using System;
using System.Text.RegularExpressions;

public class Example
{
   public static void Main()
   {
      string pattern, input;

      pattern = @"(a\1|(?(1)\1)){0,2}";
      input = "aaabbb"; 

      Console.WriteLine("Regex pattern: {0}", pattern);
      Match match = Regex.Match(input, pattern);
      Console.WriteLine("Match: '{0}' at position {1}.", 
                        match.Value, match.Index);
      if (match.Groups.Count > 1) {
         for (int groupCtr = 1; groupCtr <= match.Groups.Count - 1; groupCtr++)
         {
            Group group = match.Groups[groupCtr];         
            Console.WriteLine("   Group: {0}: '{1}' at position {2}.", 
                              groupCtr, group.Value, group.Index);
            int captureCtr = 0;
            foreach (Capture capture in group.Captures) {
               captureCtr++;
               Console.WriteLine("      Capture: {0}: '{1}' at position {2}.", 
                                 captureCtr, capture.Value, capture.Index);
            }   
         }
      }
      Console.WriteLine();

      pattern = @"(a\1|(?(1)\1)){2}";
      Console.WriteLine("Regex pattern: {0}", pattern);
      match = Regex.Match(input, pattern);
      Console.WriteLine("Matched '{0}' at position {1}.", 
                        match.Value, match.Index);
      if (match.Groups.Count > 1) {
         for (int groupCtr = 1; groupCtr <= match.Groups.Count - 1; groupCtr++)
         {
            Group group = match.Groups[groupCtr];         
            Console.WriteLine("   Group: {0}: '{1}' at position {2}.", 
                              groupCtr, group.Value, group.Index);
            int captureCtr = 0;
            foreach (Capture capture in group.Captures) {
               captureCtr++;
               Console.WriteLine("      Capture: {0}: '{1}' at position {2}.", 
                                 captureCtr, capture.Value, capture.Index);
            }   
         }
      }
   }
}
// The example displays the following output: 
//       Regex pattern: (a\1|(?(1)\1)){0,2} 
//       Match: '' at position 0. 
//          Group: 1: '' at position 0. 
//             Capture: 1: '' at position 0. 
//        
//       Regex pattern: (a\1|(?(1)\1)){2} 
//       Matched 'a' at position 0. 
//          Group: 1: 'a' at position 0. 
//             Capture: 1: '' at position 0. 
//             Capture: 2: 'a' at position 0.
© 2014 Microsoft