Quantifiers in Regular Expressions

Quantifiers specify how many instances of a character, group, or character class must be present in the input for a match to be found. The following table lists the quantifiers supported by the .NET Framework.

Greedy quantifier   Lazy quantifier   Description
*                   *?                Match zero or more times.
+                   +?                Match one or more times.
?                   ??                Match zero or one time.
{n}                 {n}?              Match exactly n times.
{n,}                {n,}?             Match at least n times.
{n,m}               {n,m}?            Match from n to m times.

The quantities n and m are integer constants. Ordinarily, quantifiers are greedy; they cause the regular expression engine to match as many occurrences of particular patterns as possible. Appending the ? character to a quantifier makes it lazy; it causes the regular expression engine to match as few occurrences as possible. For a complete description of the difference between greedy and lazy quantifiers, see the section Greedy and Lazy Quantifiers later in this topic.

Important

Nesting quantifiers (for example, as the regular expression pattern (a*)* does) can increase the number of comparisons that the regular expression engine must perform, as an exponential function of the number of characters in the input string. For more information about this behavior and its workarounds, see Backtracking in Regular Expressions.
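As a rough illustration of this cost, the following sketch runs a nested-quantifier pattern under a match time-out. It is not taken from this topic: it assumes .NET Framework 4.5 or later (for the Regex constructor that accepts a TimeSpan), and it uses ^(a+)+$, a variant of the pattern in the note, with an input string chosen so that no match is possible. What it prints depends on the runtime version and the machine, so no output is shown.

```vb
Imports System
Imports System.Text.RegularExpressions

Module Example
   Public Sub Main()
      ' Assumed input: forty "a" characters followed by a character that prevents a match.
      Dim input As String = New String("a"c, 40) & "!"
      ' (a+)+ nests one quantifier inside another; the one-second limit is an arbitrary choice.
      Dim rgx As New Regex("^(a+)+$", RegexOptions.None, TimeSpan.FromSeconds(1))
      Try
         Console.WriteLine("Match found: {0}", rgx.IsMatch(input))
      Catch e As RegexMatchTimeoutException
         ' Raised when the engine exceeds the time-out interval.
         Console.WriteLine("No result after {0} second(s).", e.MatchTimeout.TotalSeconds)
      End Try
   End Sub
End Module
```

A time-out only bounds the damage; rewriting the pattern so that quantifiers are not nested addresses the cause.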

The following sections list the quantifiers supported by .NET Framework regular expressions.

Note

If the *, +, ?, {, and } characters are encountered in a regular expression pattern, the regular expression engine interprets them as quantifiers or part of quantifier constructs unless they are included in a character class. To interpret these as literal characters outside a character class, you must escape them by preceding them with a backslash. For example, the string \* in a regular expression pattern is interpreted as a literal asterisk ("*") character.
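The following minimal sketch shows both ways of treating quantifier characters as literals. The input string and variable names are made up for illustration; Regex.Escape is the framework method that inserts the backslashes for you.

```vb
Imports System
Imports System.Text.RegularExpressions

Module Example
   Public Sub Main()
      Dim input As String = "2*3+4"
      ' \* and \+ are escaped, so they match a literal asterisk and plus sign.
      Dim escaped As String = "\d\*\d\+\d"
      ' Inside a character class, * and + are literal without a backslash.
      Dim inClass As String = "\d[*]\d[+]\d"
      Console.WriteLine(Regex.IsMatch(input, escaped))
      Console.WriteLine(Regex.IsMatch(input, inClass))
      ' Regex.Escape escapes the metacharacters in an arbitrary string.
      Console.WriteLine(Regex.Escape("a*b+c?"))
   End Sub
End Module
' The example displays the following output:
'       True
'       True
'       a\*b\+c\?
```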

The * quantifier matches the preceding element zero or more times. It is equivalent to the {0,} quantifier. * is a greedy quantifier whose lazy equivalent is *?.

The following example illustrates the regular expression \b91*9*\b. Of the nine numbers in the input string, five match the pattern and four (95, 929, 9219, and 9919) do not.

Dim pattern As String = "\b91*9*\b"    
Dim input As String = "99 95 919 929 9119 9219 999 9919 91119" 
For Each match As Match In Regex.Matches(input, pattern)
   Console.WriteLine("'{0}' found at position {1}.", match.Value, match.Index)
Next      
' The example displays the following output:    
'       '99' found at position 0. 
'       '919' found at position 6. 
'       '9119' found at position 14. 
'       '999' found at position 24. 
'       '91119' found at position 33.

The regular expression pattern is defined as shown in the following table.

Pattern   Description
\b        Start at a word boundary.
91*       Match a "9" followed by zero or more "1" characters.
9*        Match zero or more "9" characters.
\b        End at a word boundary.

The + quantifier matches the preceding element one or more times. It is equivalent to {1,}. + is a greedy quantifier whose lazy equivalent is +?.

For example, the regular expression \ban+\w*?\b tries to match entire words that begin with the letter a followed by one or more instances of the letter n. The following example illustrates this regular expression. The regular expression matches the words an, annual, announcement, and antique, and correctly fails to match autumn and all.

Dim pattern As String = "\ban+\w*?\b" 

Dim input As String = "Autumn is a great time for an annual announcement to all antique collectors." 
For Each match As Match In Regex.Matches(input, pattern, RegexOptions.IgnoreCase)
   Console.WriteLine("'{0}' found at position {1}.", match.Value, match.Index)
Next    
' The example displays the following output:    
'       'an' found at position 27. 
'       'annual' found at position 30. 
'       'announcement' found at position 37. 
'       'antique' found at position 57.      

The regular expression pattern is defined as shown in the following table.

Pattern   Description
\b        Start at a word boundary.
an+       Match an "a" followed by one or more "n" characters.
\w*?      Match a word character zero or more times, but as few times as possible.
\b        End at a word boundary.

The ? quantifier matches the preceding element zero or one time. It is equivalent to {0,1}. ? is a greedy quantifier whose lazy equivalent is ??.

For example, the regular expression \ban?\b tries to match entire words that begin with the letter a followed by zero or one instance of the letter n. In other words, it tries to match the words a and an. The following example illustrates this regular expression.

Dim pattern As String = "\ban?\b" 
Dim input As String = "An amiable animal with a large snout and an animated nose." 
For Each match As Match In Regex.Matches(input, pattern, RegexOptions.IgnoreCase)
   Console.WriteLine("'{0}' found at position {1}.", match.Value, match.Index)
Next   
' The example displays the following output:    
'       'An' found at position 0. 
'       'a' found at position 23. 
'       'an' found at position 41.

The regular expression pattern is defined as shown in the following table.

Pattern   Description
\b        Start at a word boundary.
an?       Match an "a" followed by zero or one "n" character.
\b        End at a word boundary.
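The equivalences stated for *, +, and ? ({0,}, {1,}, and {0,1}) can be checked side by side. The following is a minimal sketch under assumed inputs; the MatchList helper and the sample string are hypothetical and exist only for this illustration.

```vb
Imports System
Imports System.Collections.Generic
Imports System.Text.RegularExpressions

Module Example
   ' Hypothetical helper: joins the values of all matches into one string.
   Private Function MatchList(input As String, pattern As String) As String
      Dim values As New List(Of String)()
      For Each m As Match In Regex.Matches(input, pattern)
         values.Add(m.Value)
      Next
      Return String.Join(", ", values.ToArray())
   End Function

   Public Sub Main()
      Dim input As String = "9 91 911 99 919"
      ' Each row pairs a shorthand quantifier with its {n,m} equivalent.
      Dim pairs(,) As String = {{"\b91*\b", "\b91{0,}\b"}, _
                                {"\b91+\b", "\b91{1,}\b"}, _
                                {"\b91?\b", "\b91{0,1}\b"}}
      For i As Integer = 0 To pairs.GetUpperBound(0)
         Dim shortForm As String = MatchList(input, pairs(i, 0))
         Dim longForm As String = MatchList(input, pairs(i, 1))
         Console.WriteLine("{0} and {1}: {2} (identical: {3})", _
                           pairs(i, 0), pairs(i, 1), shortForm, shortForm = longForm)
      Next
   End Sub
End Module
' The example displays the following output:
'       \b91*\b and \b91{0,}\b: 9, 91, 911 (identical: True)
'       \b91+\b and \b91{1,}\b: 91, 911 (identical: True)
'       \b91?\b and \b91{0,1}\b: 9, 91 (identical: True)
```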

The {n} quantifier matches the preceding element exactly n times, where n is any integer. {n} is a greedy quantifier whose lazy equivalent is {n}?.

For example, the regular expression \b\d+\,\d{3}\b tries to match a word boundary followed by one or more decimal digits followed by a comma followed by three decimal digits followed by a word boundary. The following example illustrates this regular expression.

Dim pattern As String = "\b\d+\,\d{3}\b" 
Dim input As String = "Sales totaled 103,524 million in January, " + _
                      "106,971 million in February, but only " + _
                      "943 million in March." 
For Each match As Match In Regex.Matches(input, pattern)
   Console.WriteLine("'{0}' found at position {1}.", match.Value, match.Index)
Next      
' The example displays the following output:    
'       '103,524' found at position 14. 
'       '106,971' found at position 42.

The regular expression pattern is defined as shown in the following table.

Pattern   Description
\b        Start at a word boundary.
\d+       Match one or more decimal digits.
\,        Match a comma character.
\d{3}     Match three decimal digits.
\b        End at a word boundary.

The {n,} quantifier matches the preceding element at least n times, where n is any integer. {n,} is a greedy quantifier whose lazy equivalent is {n,}?.

For example, the regular expression \b\d{2,}\b\D+ tries to match a word boundary followed by at least two digits followed by a word boundary and one or more non-digit characters. The following example illustrates this regular expression. The regular expression fails to match the phrase "7 days" because it contains just one decimal digit, but it successfully matches the phrases "10 weeks" and "300 years".

Dim pattern As String = "\b\d{2,}\b\D+"
Dim input As String = "7 days, 10 weeks, 300 years"
For Each match As Match In Regex.Matches(input, pattern)
   Console.WriteLine("'{0}' found at position {1}.", match.Value, match.Index)
Next  
' The example displays the following output: 
'       '10 weeks, ' found at position 8. 
'       '300 years' found at position 18.

The regular expression pattern is defined as shown in the following table.

Pattern   Description
\b        Start at a word boundary.
\d{2,}    Match at least two decimal digits.
\b        Match a word boundary.
\D+       Match at least one non-digit character.

The {n,m} quantifier matches the preceding element at least n times, but no more than m times, where n and m are integers. {n,m} is a greedy quantifier whose lazy equivalent is {n,m}?.

In the following example, the regular expression (00\s){2,4} tries to match between two and four occurrences of two zero digits followed by a space. Note that the final portion of the input string includes the pair of zeros five times rather than the maximum of four. However, only the initial portion of this substring (up to the space that precedes the fifth pair of zeros) matches the regular expression pattern.

Dim pattern As String = "(00\s){2,4}" 
Dim input As String = "0x00 FF 00 00 18 17 FF 00 00 00 21 00 00 00 00 00" 
For Each match As Match In Regex.Matches(input, pattern)
   Console.WriteLine("'{0}' found at position {1}.", match.Value, match.Index)
Next  
' The example displays the following output: 
'       '00 00 ' found at position 8. 
'       '00 00 00 ' found at position 23. 
'       '00 00 00 00 ' found at position 35.
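For comparison, the following sketch runs the greedy pattern and its lazy counterpart, (00\s){2,4}?, against the same input string. The lazy pattern is not part of the original example; it is added here on the assumption that a side-by-side view makes the difference easier to see. The lazy version stops as soon as the minimum of two repetitions is satisfied, so the longer runs of zeros are split into shorter matches.

```vb
Imports System
Imports System.Text.RegularExpressions

Module Example
   Public Sub Main()
      Dim input As String = "0x00 FF 00 00 18 17 FF 00 00 00 21 00 00 00 00 00"
      ' The first pattern is greedy; the second is its lazy counterpart.
      Dim patterns() As String = {"(00\s){2,4}", "(00\s){2,4}?"}
      For Each pattern As String In patterns
         Console.WriteLine("Pattern: {0}", pattern)
         For Each match As Match In Regex.Matches(input, pattern)
            Console.WriteLine("   '{0}' found at position {1}.", match.Value, match.Index)
         Next
      Next
   End Sub
End Module
' The example displays the following output:
'       Pattern: (00\s){2,4}
'          '00 00 ' found at position 8.
'          '00 00 00 ' found at position 23.
'          '00 00 00 00 ' found at position 35.
'       Pattern: (00\s){2,4}?
'          '00 00 ' found at position 8.
'          '00 00 ' found at position 23.
'          '00 00 ' found at position 35.
'          '00 00 ' found at position 41.
```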

The *? quantifier matches the preceding element zero or more times, but as few times as possible. It is the lazy counterpart of the greedy quantifier *.

In the following example, the regular expression \b\w*?oo\w*?\b matches all words that contain the string oo.

Dim pattern As String = "\b\w*?oo\w*?\b"
Dim input As String = "woof root root rob oof woo woe"
For Each match As Match In Regex.Matches(input, pattern, RegexOptions.IgnoreCase)
   Console.WriteLine("'{0}' found at position {1}.", match.Value, match.Index)
Next
' The example displays the following output:
'       'woof' found at position 0. 
'       'root' found at position 5. 
'       'root' found at position 10. 
'       'oof' found at position 19. 
'       'woo' found at position 23.

The regular expression pattern is defined as shown in the following table.

Pattern   Description
\b        Start at a word boundary.
\w*?      Match zero or more word characters, but as few characters as possible.
oo        Match the string "oo".
\w*?      Match zero or more word characters, but as few characters as possible.
\b        End on a word boundary.
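The effect of *? is easiest to see when the element that follows the quantifier can occur more than once in the input. The following sketch is an illustration only, with a made-up input string: the greedy .* runs to the last ">", whereas the lazy .*? stops at the first one.

```vb
Imports System
Imports System.Text.RegularExpressions

Module Example
   Public Sub Main()
      Dim input As String = "<a><b><c>"
      ' Greedy .* versus lazy .*? between angle brackets.
      Dim patterns() As String = {"<.*>", "<.*?>"}
      For Each pattern As String In patterns
         Console.WriteLine("Pattern: {0}", pattern)
         For Each match As Match In Regex.Matches(input, pattern)
            Console.WriteLine("   '{0}' found at position {1}.", match.Value, match.Index)
         Next
      Next
   End Sub
End Module
' The example displays the following output:
'       Pattern: <.*>
'          '<a><b><c>' found at position 0.
'       Pattern: <.*?>
'          '<a>' found at position 0.
'          '<b>' found at position 3.
'          '<c>' found at position 6.
```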

The +? quantifier matches the preceding element one or more times, but as few times as possible. It is the lazy counterpart of the greedy quantifier +.

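For example, the regular expression \b\w+?\b matches one or more word characters delimited by word boundaries. Because the match must end at a word boundary, the lazy quantifier still expands until it reaches the end of each word. The following example illustrates this regular expression.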

Dim pattern As String = "\b\w+?\b"
Dim input As String = "Aa Bb Cc Dd Ee Ff"
For Each match As Match In Regex.Matches(input, pattern)
   Console.WriteLine("'{0}' found at position {1}.", match.Value, match.Index)
Next  
' The example displays the following output: 
'       'Aa' found at position 0. 
'       'Bb' found at position 3. 
'       'Cc' found at position 6. 
'       'Dd' found at position 9. 
'       'Ee' found at position 12. 
'       'Ff' found at position 15.
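To see how much the trailing \b contributes to this result, the following sketch compares the pattern with a variant that omits it. The variant \b\w+? is an assumption added for illustration; with nothing after the lazy quantifier, the engine stops after a single word character.

```vb
Imports System
Imports System.Text.RegularExpressions

Module Example
   Public Sub Main()
      Dim input As String = "Aa Bb Cc"
      ' Without the trailing \b, +? is satisfied by one character.
      Dim patterns() As String = {"\b\w+?", "\b\w+?\b"}
      For Each pattern As String In patterns
         Console.WriteLine("Pattern: {0}", pattern)
         For Each match As Match In Regex.Matches(input, pattern)
            Console.WriteLine("   '{0}' found at position {1}.", match.Value, match.Index)
         Next
      Next
   End Sub
End Module
' The example displays the following output:
'       Pattern: \b\w+?
'          'A' found at position 0.
'          'B' found at position 3.
'          'C' found at position 6.
'       Pattern: \b\w+?\b
'          'Aa' found at position 0.
'          'Bb' found at position 3.
'          'Cc' found at position 6.
```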

The ?? quantifier matches the preceding element zero or one time, but as few times as possible. It is the lazy counterpart of the greedy quantifier ?.

For example, the regular expression ^\s*(System.)??Console.Write(Line)??\(?? attempts to match the strings "Console.Write" or "Console.WriteLine". The string can also include "System." before "Console", and it can be followed by an opening parenthesis. The string must be at the beginning of a line, although it can be preceded by white space. The following example illustrates this regular expression.

Dim pattern As String = "^\s*(System.)??Console.Write(Line)??\(??" 
Dim input As String = "System.Console.WriteLine(""Hello!"")" + vbCrLf + _
                      "Console.Write(""Hello!"")" + vbCrLf + _
                      "Console.WriteLine(""Hello!"")" + vbCrLf + _
                      "Console.ReadLine()" + vbCrLf + _
                      "   Console.WriteLine" 
For Each match As Match In Regex.Matches(input, pattern, _
                                         RegexOptions.IgnorePatternWhitespace Or RegexOptions.IgnoreCase Or RegexOptions.MultiLine)
   Console.WriteLine("'{0}' found at position {1}.", match.Value, match.Index)
Next  
' The example displays the following output: 
'       'System.Console.Write' found at position 0. 
'       'Console.Write' found at position 36. 
'       'Console.Write' found at position 61. 
'       '   Console.Write' found at position 110.

The regular expression pattern is defined as shown in the following table.

Pattern         Description
^               Match the start of the input string or, because the multiline option is set, the start of a line.
\s*             Match zero or more white-space characters.
(System.)??     Match zero or one occurrence of the string "System.".
Console.Write   Match the string "Console.Write".
(Line)??        Match zero or one occurrence of the string "Line".
\(??            Match zero or one occurrence of the opening parenthesis.
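The following sketch isolates the difference between ? and ??. The shortened patterns and the input string are assumptions made for illustration (the dot is escaped here so that it matches only a literal period). The lazy ?? skips "Line" when nothing else is required, but still matches it when the rest of the pattern cannot succeed otherwise.

```vb
Imports System
Imports System.Text.RegularExpressions

Module Example
   Public Sub Main()
      Dim input As String = "Console.WriteLine(x)"
      ' Greedy ?, lazy ??, and lazy ?? followed by a required parenthesis.
      Dim patterns() As String = {"Console\.Write(Line)?", _
                                  "Console\.Write(Line)??", _
                                  "Console\.Write(Line)??\("}
      For Each pattern As String In patterns
         Console.WriteLine("Pattern: {0}", pattern)
         For Each match As Match In Regex.Matches(input, pattern)
            Console.WriteLine("   '{0}' found at position {1}.", match.Value, match.Index)
         Next
      Next
   End Sub
End Module
' The example displays the following output:
'       Pattern: Console\.Write(Line)?
'          'Console.WriteLine' found at position 0.
'       Pattern: Console\.Write(Line)??
'          'Console.Write' found at position 0.
'       Pattern: Console\.Write(Line)??\(
'          'Console.WriteLine(' found at position 0.
```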

The {n}? quantifier matches the preceding element exactly n times, where n is any integer. It is the lazy counterpart of the greedy quantifier {n}.

In the following example, the regular expression \b(\w{3,}?\.){2}?\w{3,}?\b is used to identify a Web site address. Note that it matches "www.microsoft.com" and "msdn.microsoft.com", but does not match "mywebsite" or "mycompany.com".

Dim pattern As String = "\b(\w{3,}?\.){2}?\w{3,}?\b"
Dim input As String = "www.microsoft.com msdn.microsoft.com mywebsite mycompany.com"
For Each match As Match In Regex.Matches(input, pattern)
   Console.WriteLine("'{0}' found at position {1}.", match.Value, match.Index)
Next      
' The example displays the following output: 
'       'www.microsoft.com' found at position 0. 
'       'msdn.microsoft.com' found at position 18.

The regular expression pattern is defined as shown in the following table.

Pattern           Description
\b                Start at a word boundary.
(\w{3,}?\.)       Match at least 3 word characters, but as few characters as possible, followed by a dot or period character. This is the first capturing group.
(\w{3,}?\.){2}?   Match the pattern in the first group two times, but as few times as possible.
\w{3,}?           Match at least 3 word characters, but as few characters as possible.
\b                End the match on a word boundary.

The {n,}? quantifier matches the preceding element at least n times, where n is any integer, but as few times as possible. It is the lazy counterpart of the greedy quantifier {n,}.

See the example for the {n}? quantifier in the previous section for an illustration. The regular expression in that example uses the {n,}? quantifier to match a string that has at least three characters followed by a period.
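As a smaller, self-contained illustration, the following sketch compares {n} with {n}? and {n,} with {n,}? on a made-up string of digits. Because {n} leaves the engine no choice about the number of repetitions, its lazy form returns the same matches; {n,}?, by contrast, stops at the minimum.

```vb
Imports System
Imports System.Text.RegularExpressions

Module Example
   Public Sub Main()
      Dim input As String = "1234567"
      ' Two fixed-count patterns, then a greedy and a lazy open-ended pattern.
      Dim patterns() As String = {"\d{3}", "\d{3}?", "\d{2,}", "\d{2,}?"}
      For Each pattern As String In patterns
         Console.WriteLine("Pattern: {0}", pattern)
         For Each match As Match In Regex.Matches(input, pattern)
            Console.WriteLine("   '{0}' found at position {1}.", match.Value, match.Index)
         Next
      Next
   End Sub
End Module
' The example displays the following output:
'       Pattern: \d{3}
'          '123' found at position 0.
'          '456' found at position 3.
'       Pattern: \d{3}?
'          '123' found at position 0.
'          '456' found at position 3.
'       Pattern: \d{2,}
'          '1234567' found at position 0.
'       Pattern: \d{2,}?
'          '12' found at position 0.
'          '34' found at position 2.
'          '56' found at position 4.
```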

The {n,m}? quantifier matches the preceding element between n and m times, where n and m are integers, but as few times as possible. It is the lazy counterpart of the greedy quantifier {n,m}.

In the following example, the regular expression \b[A-Z](\w*\s?){1,10}?[.!?] matches sentences that contain between one and ten words. It matches all the sentences in the input string except for one sentence that contains 18 words.

Dim pattern As String = "\b[A-Z](\w*\s?){1,10}?[.!?]" 
Dim input As String = "Hi. I am writing a short note. Its purpose is " + _
                      "to test a regular expression that attempts to find " + _
                      "sentences with ten or fewer words. Most sentences " + _
                      "in this note are short." 
For Each match As Match In Regex.Matches(input, pattern)
   Console.WriteLine("'{0}' found at position {1}.", match.Value, match.Index)
Next  
' The example displays the following output: 
'       'Hi.' found at position 0. 
'       'I am writing a short note.' found at position 4. 
'       'Most sentences in this note are short.' found at position 132.

The regular expression pattern is defined as shown in the following table.

Pattern     Description
\b          Start at a word boundary.
[A-Z]       Match an uppercase character from A to Z.
(\w*\s?)    Match zero or more word characters, followed by zero or one white-space character. This is the first capture group.
{1,10}?     Match the previous pattern between 1 and 10 times, but as few times as possible.
[.!?]       Match any one of the punctuation characters ".", "!", or "?".

A number of the quantifiers have two versions:

  • A greedy version.

    A greedy quantifier tries to match an element as many times as possible.

  • A non-greedy (or lazy) version.

    A non-greedy quantifier tries to match an element as few times as possible. You can turn a greedy quantifier into a lazy quantifier by simply adding a ?.

Consider a simple regular expression that is intended to extract the last four digits from a string of numbers such as a credit card number. The version of the regular expression that uses the * greedy quantifier is \b.*([0-9]{4})\b. However, if a string contains two numbers, this regular expression matches the last four digits of the second number only, as the following example shows.

Dim greedyPattern As String = "\b.*([0-9]{4})\b" 
Dim input1 As String = "1112223333 3992991999" 
For Each match As Match In Regex.Matches(input1, greedyPattern)
   Console.WriteLine("Account ending in ******{0}.", match.Groups(1).Value)
Next 
' The example displays the following output: 
'       Account ending in ******1999.

The regular expression fails to match the first number because the * quantifier tries to match the previous element as many times as possible in the entire string, and so it finds its match at the end of the string.

This is not the desired behavior. Instead, you can use the *? lazy quantifier to extract digits from both numbers, as the following example shows.

Dim lazyPattern As String = "\b.*?([0-9]{4})\b" 
Dim input2 As String = "1112223333 3992991999" 
For Each match As Match In Regex.Matches(input2, lazyPattern)
   Console.WriteLine("Account ending in ******{0}.", match.Groups(1).Value)
Next      
' The example displays the following output: 
'       Account ending in ******3333. 
'       Account ending in ******1999.

In most cases, regular expressions with greedy and lazy quantifiers return the same matches. They most commonly return different results when they are used with the wildcard (.) metacharacter, which matches any character.
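The following sketch illustrates the first point with a pattern that contains no wildcard. It reuses the \b\w*?oo\w*?\b pattern and input string from the *? example and compares them with a greedy variant, \b\w*oo\w*\b, which is an assumption added for this comparison. Because the word boundaries pin down where each match starts and ends, both versions return the same matches.

```vb
Imports System
Imports System.Text.RegularExpressions

Module Example
   Public Sub Main()
      Dim input As String = "woof root root rob oof woo woe"
      ' Lazy pattern first, then the greedy variant.
      Dim patterns() As String = {"\b\w*?oo\w*?\b", "\b\w*oo\w*\b"}
      For Each pattern As String In patterns
         Console.WriteLine("Pattern: {0}", pattern)
         For Each match As Match In Regex.Matches(input, pattern)
            Console.WriteLine("   '{0}' found at position {1}.", match.Value, match.Index)
         Next
      Next
   End Sub
End Module
' The example displays the following output:
'       Pattern: \b\w*?oo\w*?\b
'          'woof' found at position 0.
'          'root' found at position 5.
'          'root' found at position 10.
'          'oof' found at position 19.
'          'woo' found at position 23.
'       Pattern: \b\w*oo\w*\b
'          'woof' found at position 0.
'          'root' found at position 5.
'          'root' found at position 10.
'          'oof' found at position 19.
'          'woo' found at position 23.
```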

The quantifiers *, +, and {n,m} and their lazy counterparts never repeat after an empty match when the minimum number of captures has been found. This rule prevents quantifiers from entering infinite loops on empty subexpression matches when the maximum number of possible group captures is infinite or near infinite.

For example, the following code shows the result of a call to the Regex.Match method with the regular expression pattern (a?)*, which matches zero or one "a" character zero or more times. Note that the single capturing group captures each "a" as well as String.Empty, but that there is no second empty match, because the first empty match causes the quantifier to stop repeating.

Imports System.Text.RegularExpressions

Module Example
   Public Sub Main()
      Dim pattern As String = "(a?)*" 
      Dim input As String = "aaabbb" 
      Dim match As Match = Regex.Match(input, pattern)
      Console.WriteLine("Match: '{0}' at index {1}", 
                        match.Value, match.Index)
      If match.Groups.Count > 1 Then 
         Dim groups As GroupCollection = match.Groups
         For grpCtr As Integer = 1 To groups.Count - 1
            Console.WriteLine("   Group {0}: '{1}' at index {2}", 
                              grpCtr, 
                              groups(grpCtr).Value,
                              groups(grpCtr).Index)
            Dim captureCtr As Integer = 0
            For Each capture As Capture In groups(grpCtr).Captures
               captureCtr += 1
               Console.WriteLine("      Capture {0}: '{1}' at index {2}", 
                                 captureCtr, capture.Value, capture.Index)
            Next 
         Next  
      End If    
   End Sub 
End Module 
' The example displays the following output: 
'       Match: 'aaa' at index 0 
'          Group 1: '' at index 3 
'             Capture 1: 'a' at index 0 
'             Capture 2: 'a' at index 1 
'             Capture 3: 'a' at index 2 
'             Capture 4: '' at index 3

To see the practical difference between a capturing group that defines a minimum and a maximum number of captures and one that defines a fixed number of captures, consider the regular expression patterns (a\1|(?(1)\1)){0,2} and (a\1|(?(1)\1)){2}. Both regular expressions consist of a single capturing group, which is defined as shown in the following table.

Pattern   Description
(a\1      Either match "a" along with the value of the first captured group …
|(?(1)    … or test whether the first captured group has been defined. (Note that the (?(1) construct does not define a capturing group.)
\1))      If the first captured group exists, match its value. If the group does not exist, the group will match String.Empty.

The first regular expression tries to match this pattern between zero and two times; the second, exactly two times. Because the first pattern reaches its minimum number of captures with its first capture of String.Empty, it never repeats to try to match a\1; the {0,2} quantifier allows only empty matches in the last iteration. In contrast, the second regular expression does match "a" because it evaluates a\1 a second time; the minimum number of iterations, 2, forces the engine to repeat after an empty match.

Imports System.Text.RegularExpressions

Module Example
   Public Sub Main()
      Dim pattern, input As String

      pattern = "(a\1|(?(1)\1)){0,2}"
      input = "aaabbb" 

      Console.WriteLine("Regex pattern: {0}", pattern)
      Dim match As Match = Regex.Match(input, pattern)
      Console.WriteLine("Match: '{0}' at position {1}.", 
                        match.Value, match.Index)
      If match.Groups.Count > 1 Then 
         For groupCtr As Integer = 1 To match.Groups.Count - 1
            Dim group As Group = match.Groups(groupCtr)         
            Console.WriteLine("   Group: {0}: '{1}' at position {2}.", 
                              groupCtr, group.Value, group.Index)
            Dim captureCtr As Integer = 0
            For Each capture As Capture In group.Captures
               captureCtr += 1
               Console.WriteLine("      Capture: {0}: '{1}' at position {2}.", 
                                 captureCtr, capture.Value, capture.Index)
            Next    
         Next 
      End If
      Console.WriteLine()

      pattern = "(a\1|(?(1)\1)){2}"
      Console.WriteLine("Regex pattern: {0}", pattern)
      match = Regex.Match(input, pattern)
      Console.WriteLine("Matched '{0}' at position {1}.", 
                        match.Value, match.Index)
      If match.Groups.Count > 1 Then 
         For groupCtr As Integer = 1 To match.Groups.Count - 1
            Dim group As Group = match.Groups(groupCtr)         
            Console.WriteLine("   Group: {0}: '{1}' at position {2}.", 
                              groupCtr, group.Value, group.Index)
            Dim captureCtr As Integer = 0
            For Each capture As Capture In group.Captures
               captureCtr += 1
               Console.WriteLine("      Capture: {0}: '{1}' at position {2}.", 
                                 captureCtr, capture.Value, capture.Index)
            Next    
         Next 
      End If 
   End Sub 
End Module 
' The example displays the following output: 
'       Regex pattern: (a\1|(?(1)\1)){0,2} 
'       Match: '' at position 0. 
'          Group: 1: '' at position 0. 
'             Capture: 1: '' at position 0. 
'        
'       Regex pattern: (a\1|(?(1)\1)){2} 
'       Matched 'a' at position 0. 
'          Group: 1: 'a' at position 0. 
'             Capture: 1: '' at position 0. 
'             Capture: 2: 'a' at position 0.
© 2014 Microsoft