.NET Framework 4
Quantifiers
Updated: October 2010
Quantifiers specify how many instances of a character, group, or character class must be present in the input for a match to be found. The following table lists the quantifiers supported by the .NET Framework.
Greedy quantifier | Lazy quantifier | Description |
|---|
* | *? | Match zero or more times. |
+ | +? | Match one or more times. |
? | ?? | Match zero or one time. |
{n} | {n}? | Match exactly n times. |
{n,} | {n,}? | Match at least n times. |
{n,m} | {n,m}? | Match from n to m times. |
The quantities n and m are integer constants. Ordinarily, quantifiers are greedy; they cause the regular expression engine to match as many occurrences of particular patterns as possible. Appending the ? character to a quantifier makes it lazy; it causes the regular expression engine to match as few occurrences as possible. For a complete description of the difference between greedy and lazy quantifiers, see the section Greedy and Lazy Quantifiers later in this topic.
Important |
|---|
Nesting quantifiers (for example, as the regular expression pattern (a*)* does) can increase the number of comparisons that the regular expression engine must perform, as an exponential function of the number of characters in the input string. For more information about this behavior and its workarounds, see Backtracking. |
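As a minimal sketch of the point made in the note above (not a benchmark), the following C# snippet checks that the nested pattern ^(a*)*$ and the flat pattern ^a*$ accept the same strings, so in a case like this the nesting can be removed without changing what is matched. The class name NestedQuantifierSketch, the helper Agree, and the anchored patterns are assumptions made for illustration; the inputs are kept short so that the nested form stays cheap.

```csharp
using System;
using System.Text.RegularExpressions;

public static class NestedQuantifierSketch
{
    // Returns true when the nested pattern and the flat pattern give the
    // same accept/reject answer for the input.
    public static bool Agree(string input)
    {
        bool nested = Regex.IsMatch(input, @"^(a*)*$"); // quantifier inside a quantifier
        bool flat = Regex.IsMatch(input, @"^a*$");      // same language, no nesting
        return nested == flat;
    }

    public static void Main()
    {
        foreach (string input in new[] { "", "aaaa", "aaab" })
            Console.WriteLine("'{0}': nested and flat agree = {1}", input, Agree(input));
    }
}
```

Whether a given nested pattern can be flattened this simply depends on the pattern; the Backtracking topic mentioned in the note covers the general workarounds.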

Regular Expression Quantifiers
The following sections list the quantifiers supported by .NET Framework regular expressions.
Note |
|---|
If the *, +, ?, {, and } characters are encountered in a regular expression pattern, the regular expression engine interprets them as quantifiers or part of quantifier constructs unless they are included in a character class. To interpret these as literal characters outside a character class, you must escape them by preceding them with a backslash. For example, the string \* in a regular expression pattern is interpreted as a literal asterisk ("*") character. |
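The escaping rule in the note above can be sketched as follows. The snippet compares an unescaped and an escaped asterisk, and also uses Regex.Escape to build the escaped form. The class name EscapeSketch, the helper FirstMatch, and the sample input "2*3" are illustrative assumptions.

```csharp
using System;
using System.Text.RegularExpressions;

public static class EscapeSketch
{
    // Returns the text of the first match of pattern in input.
    public static string FirstMatch(string input, string pattern)
    {
        return Regex.Match(input, pattern).Value;
    }

    public static void Main()
    {
        string input = "2*3";

        // Unescaped: "2*" is read as "zero or more '2' characters",
        // so the pattern only finds the trailing "3".
        Console.WriteLine("Unescaped: '{0}'", FirstMatch(input, "2*3"));

        // Escaped: "\*" is a literal asterisk, so the whole input matches.
        Console.WriteLine("Escaped:   '{0}'", FirstMatch(input, @"2\*3"));

        // Regex.Escape produces the escaped pattern from a literal string.
        Console.WriteLine("Regex.Escape: {0}", Regex.Escape(input));
    }
}
```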
Match Zero or More Times: *
The * quantifier matches the preceding element zero or more times. It is equivalent to the {0,} quantifier. * is a greedy quantifier whose lazy equivalent is *?. The following example illustrates this quantifier with the regular expression \b91*9*\b. Of the nine groups of digits in the input string, five match the pattern and four (95, 929, 9219, and 9919) do not.
Dim pattern As String = "\b91*9*\b"
Dim input As String = "99 95 919 929 9119 9219 999 9919 91119"
For Each match As Match In Regex.Matches(input, pattern)
Console.WriteLine("'{0}' found at position {1}.", match.Value, match.Index)
Next
' The example displays the following output:
' '99' found at position 0.
' '919' found at position 6.
' '9119' found at position 14.
' '999' found at position 24.
' '91119' found at position 33.
string pattern = @"\b91*9*\b";
string input = "99 95 919 929 9119 9219 999 9919 91119";
foreach (Match match in Regex.Matches(input, pattern))
Console.WriteLine("'{0}' found at position {1}.", match.Value, match.Index);
// The example displays the following output:
// '99' found at position 0.
// '919' found at position 6.
// '9119' found at position 14.
// '999' found at position 24.
// '91119' found at position 33.
The regular expression pattern is defined as shown in the following table.
Pattern | Description |
|---|
\b | Start at a word boundary. |
91* | Match a "9" followed by zero or more "1" characters. |
9* | Match zero or more "9" characters. |
\b | End at a word boundary. |
Match One or More Times: +
The + quantifier matches the preceding element one or more times. It is equivalent to {1,}. + is a greedy quantifier whose lazy equivalent is +?. For example, the regular expression \ban+\w*?\b tries to match entire words that begin with the letter a followed by one or more instances of the letter n. The following example illustrates this regular expression. The regular expression matches the words an, annual, announcement, and antique, and correctly fails to match autumn and all.
Dim pattern As String = "\ban+\w*?\b"
Dim input As String = "Autumn is a great time for an annual announcement to all antique collectors."
For Each match As Match In Regex.Matches(input, pattern, RegexOptions.IgnoreCase)
Console.WriteLine("'{0}' found at position {1}.", match.Value, match.Index)
Next
' The example displays the following output:
' 'an' found at position 27.
' 'annual' found at position 30.
' 'announcement' found at position 37.
' 'antique' found at position 57.
string pattern = @"\ban+\w*?\b";
string input = "Autumn is a great time for an annual announcement to all antique collectors.";
foreach (Match match in Regex.Matches(input, pattern, RegexOptions.IgnoreCase))
Console.WriteLine("'{0}' found at position {1}.", match.Value, match.Index);
// The example displays the following output:
// 'an' found at position 27.
// 'annual' found at position 30.
// 'announcement' found at position 37.
// 'antique' found at position 57.
The regular expression pattern is defined as shown in the following table.
Pattern | Description |
|---|
\b | Start at a word boundary. |
an+ | Match an "a" followed by one or more "n" characters. |
\w*? | Match a word character zero or more times, but as few times as possible. |
\b | End at a word boundary. |
Match Zero or One Time: ?
The ? quantifier matches the preceding element zero or one time. It is equivalent to {0,1}. ? is a greedy quantifier whose lazy equivalent is ??. For example, the regular expression \ban?\b tries to match entire words that begin with the letter a followed by zero or one instances of the letter n. In other words, it tries to match the words a and an. The following example illustrates this regular expression.
Dim pattern As String = "\ban?\b"
Dim input As String = "An amiable animal with a large snount and an animated nose."
For Each match As Match In Regex.Matches(input, pattern, RegexOptions.IgnoreCase)
Console.WriteLine("'{0}' found at position {1}.", match.Value, match.Index)
Next
' The example displays the following output:
' 'An' found at position 0.
' 'a' found at position 23.
' 'an' found at position 42.
string pattern = @"\ban?\b";
string input = "An amiable animal with a large snount and an animated nose.";
foreach (Match match in Regex.Matches(input, pattern, RegexOptions.IgnoreCase))
Console.WriteLine("'{0}' found at position {1}.", match.Value, match.Index);
// The example displays the following output:
// 'An' found at position 0.
// 'a' found at position 23.
// 'an' found at position 42.
The regular expression pattern is defined as shown in the following table.
Pattern | Description |
|---|
\b | Start at a word boundary. |
an? | Match an "a" followed by zero or one "n" character. |
\b | End at a word boundary. |
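The three preceding sections state that *, +, and ? are equivalent to {0,}, {1,}, and {0,1}. As a quick check of those equivalences, the sketch below rewrites each example pattern with the brace form and compares the resulting matches. The class name ShorthandSketch and the helper SameMatches are hypothetical names introduced for this illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class ShorthandSketch
{
    // Returns true when both patterns produce the same sequence of
    // match values and match positions for the input.
    public static bool SameMatches(string input, string a, string b, RegexOptions options)
    {
        MatchCollection ma = Regex.Matches(input, a, options);
        MatchCollection mb = Regex.Matches(input, b, options);
        if (ma.Count != mb.Count) return false;
        for (int i = 0; i < ma.Count; i++)
        {
            if (ma[i].Index != mb[i].Index || ma[i].Value != mb[i].Value)
                return false;
        }
        return true;
    }

    public static void Main()
    {
        string digits = "99 95 919 929 9119 9219 999 9919 91119";
        string words = "Autumn is a great time for an annual announcement to all antique collectors.";

        // * written as {0,}
        Console.WriteLine(SameMatches(digits, @"\b91*9*\b", @"\b91{0,}9{0,}\b", RegexOptions.None));
        // + written as {1,} (and *? written as {0,}?)
        Console.WriteLine(SameMatches(words, @"\ban+\w*?\b", @"\ban{1,}\w{0,}?\b", RegexOptions.IgnoreCase));
        // ? written as {0,1}
        Console.WriteLine(SameMatches(words, @"\ban?\b", @"\ban{0,1}\b", RegexOptions.IgnoreCase));
    }
}
```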
Match Exactly n Times: {n}
The {n} quantifier matches the preceding element exactly n times, where n is any integer. {n} is a greedy quantifier whose lazy equivalent is {n}?. For example, the regular expression \b\d+\,\d{3}\b tries to match a word boundary followed by one or more decimal digits, followed by a comma, followed by three decimal digits, followed by a word boundary. The following example illustrates this regular expression.
Dim pattern As String = "\b\d+\,\d{3}\b"
Dim input As String = "Sales totaled 103,524 million in January, " + _
"106,971 million in February, but only " + _
"943 million in March."
For Each match As Match In Regex.Matches(input, pattern)
Console.WriteLine("'{0}' found at position {1}.", match.Value, match.Index)
Next
' The example displays the following output:
' '103,524' found at position 14.
' '106,971' found at position 45.
string pattern = @"\b\d+\,\d{3}\b";
string input = "Sales totaled 103,524 million in January, " +
"106,971 million in February, but only " +
"943 million in March.";
foreach (Match match in Regex.Matches(input, pattern))
Console.WriteLine("'{0}' found at position {1}.", match.Value, match.Index);
// The example displays the following output:
// '103,524' found at position 14.
// '106,971' found at position 45.
The regular expression pattern is defined as shown in the following table.
Pattern | Description |
|---|
\b | Start at a word boundary. |
\d+ | Match one or more decimal digits. |
\, | Match a comma character. |
\d{3} | Match three decimal digits. |
\b | End at a word boundary. |
Match at Least n Times: {n,}
The {n,} quantifier matches the preceding element at least n times, where n is any integer. {n,} is a greedy quantifier whose lazy equivalent is {n,}?. For example, the regular expression \b\d{2,}\b\D+ tries to match a word boundary followed by at least two digits, followed by a word boundary and one or more non-digit characters. The following example illustrates this regular expression. The regular expression fails to match the phrase "7 days" because it contains just one decimal digit, but it successfully matches the phrases "10 weeks" and "300 years".
Dim pattern As String = "\b\d{2,}\b\D+"
Dim input As String = "7 days, 10 weeks, 300 years"
For Each match As Match In Regex.Matches(input, pattern)
Console.WriteLine("'{0}' found at position {1}.", match.Value, match.Index)
Next
' The example displays the following output:
' '10 weeks, ' found at position 8.
' '300 years' found at position 18.
string pattern = @"\b\d{2,}\b\D+";
string input = "7 days, 10 weeks, 300 years";
foreach (Match match in Regex.Matches(input, pattern))
Console.WriteLine("'{0}' found at position {1}.", match.Value, match.Index);
// The example displays the following output:
// '10 weeks, ' found at position 8.
// '300 years' found at position 18.
The regular expression pattern is defined as shown in the following table.
Pattern | Description |
|---|
\b | Start at a word boundary. |
\d{2,} | Match at least two decimal digits. |
\b | Match a word boundary. |
\D+ | Match one or more non-digit characters. |
Match Between n and m Times: {n,m}
The {n,m} quantifier matches the preceding element at least n times, but no more than m times, where n and m are integers. {n,m} is a greedy quantifier whose lazy equivalent is {n,m}?. In the following example, the regular expression (00\s){2,4} tries to match between two and four occurrences of two zero digits followed by a space. Note that the final portion of the input string contains five pairs of zeros rather than the maximum of four. However, only the initial portion of this substring (up to the space that precedes the fifth pair of zeros) matches the regular expression pattern.
Dim pattern As String = "(00\s){2,4}"
Dim input As String = "0x00 FF 00 00 18 17 FF 00 00 00 21 00 00 00 00 00"
For Each match As Match In Regex.Matches(input, pattern)
Console.WriteLine("'{0}' found at position {1}.", match.Value, match.Index)
Next
' The example displays the following output:
' '00 00 ' found at position 8.
' '00 00 00 ' found at position 23.
' '00 00 00 00 ' found at position 35.
string pattern = @"(00\s){2,4}";
string input = "0x00 FF 00 00 18 17 FF 00 00 00 21 00 00 00 00 00";
foreach (Match match in Regex.Matches(input, pattern))
Console.WriteLine("'{0}' found at position {1}.", match.Value, match.Index);
// The example displays the following output:
// '00 00 ' found at position 8.
// '00 00 00 ' found at position 23.
// '00 00 00 00 ' found at position 35.
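To make the repetition counts in the preceding example visible, the following sketch reads the number of captures recorded for the group in each match; with {2,4} that number should fall between two and four. The class name RepetitionCountSketch and the helper RepetitionCounts are assumptions introduced for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class RepetitionCountSketch
{
    // For each match of (00\s){2,4}, returns how many times the group repeated.
    public static int[] RepetitionCounts(string input)
    {
        MatchCollection matches = Regex.Matches(input, @"(00\s){2,4}");
        int[] counts = new int[matches.Count];
        for (int i = 0; i < matches.Count; i++)
        {
            // Each iteration of the quantified group adds one capture.
            counts[i] = matches[i].Groups[1].Captures.Count;
        }
        return counts;
    }

    public static void Main()
    {
        string input = "0x00 FF 00 00 18 17 FF 00 00 00 21 00 00 00 00 00";
        int[] counts = RepetitionCounts(input);
        for (int i = 0; i < counts.Length; i++)
            Console.WriteLine("Match {0} repeats the group {1} times.", i + 1, counts[i]);
    }
}
```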
Match Zero or More Times (Lazy Match): *?
The *? quantifier matches the preceding element zero or more times, but as few times as possible. It is the lazy counterpart of the greedy quantifier *. In the following example, the regular expression \b\w*?oo\w*?\b matches all words that contain the string oo.
Dim pattern As String = "\b\w*?oo\w*?\b"
Dim input As String = "woof root root rob oof woo woe"
For Each match As Match In Regex.Matches(input, pattern, RegexOptions.IgnoreCase)
Console.WriteLine("'{0}' found at position {1}.", match.Value, match.Index)
Next
' The example displays the following output:
' 'woof' found at position 0.
' 'root' found at position 5.
' 'root' found at position 10.
' 'oof' found at position 19.
' 'woo' found at position 23.
string pattern = @"\b\w*?oo\w*?\b";
string input = "woof root root rob oof woo woe";
foreach (Match match in Regex.Matches(input, pattern, RegexOptions.IgnoreCase))
Console.WriteLine("'{0}' found at position {1}.", match.Value, match.Index);
// The example displays the following output:
// 'woof' found at position 0.
// 'root' found at position 5.
// 'root' found at position 10.
// 'oof' found at position 19.
// 'woo' found at position 23.
The regular expression pattern is defined as shown in the following table.
Pattern | Description |
|---|
\b | Start at a word boundary. |
\w*? | Match zero or more word characters, but as few characters as possible. |
oo | Match the string "oo". |
\w*? | Match zero or more word characters, but as few characters as possible. |
\b | End on a word boundary. |
Match One or More Times (Lazy Match): +?
The +? quantifier matches the preceding element one or more times, but as few times as possible. It is the lazy counterpart of the greedy quantifier +. For example, the regular expression \b\w+?\b matches one or more characters separated by word boundaries. The following example illustrates this regular expression.
Dim pattern As String = "\b\w+?\b"
Dim input As String = "Aa Bb Cc Dd Ee Ff"
For Each match As Match In Regex.Matches(input, pattern)
Console.WriteLine("'{0}' found at position {1}.", match.Value, match.Index)
Next
' The example displays the following output:
' 'Aa' found at position 0.
' 'Bb' found at position 3.
' 'Cc' found at position 6.
' 'Dd' found at position 9.
' 'Ee' found at position 12.
' 'Ff' found at position 15.
string pattern = @"\b\w+?\b";
string input = "Aa Bb Cc Dd Ee Ff";
foreach (Match match in Regex.Matches(input, pattern))
Console.WriteLine("'{0}' found at position {1}.", match.Value, match.Index);
// The example displays the following output:
// 'Aa' found at position 0.
// 'Bb' found at position 3.
// 'Cc' found at position 6.
// 'Dd' found at position 9.
// 'Ee' found at position 12.
// 'Ff' found at position 15.
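In the preceding example the trailing \b forces \w+? to consume each whole word, so the result is the same as with the greedy \w+. The following sketch drops the word boundaries to show where the two quantifiers diverge. The class name LazyPlusSketch, the helper JoinMatches, and the shortened input are illustrative assumptions.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class LazyPlusSketch
{
    // Returns all match values for the pattern, joined with commas.
    public static string JoinMatches(string input, string pattern)
    {
        List<string> values = new List<string>();
        foreach (Match match in Regex.Matches(input, pattern))
            values.Add(match.Value);
        return string.Join(",", values.ToArray());
    }

    public static void Main()
    {
        string input = "Aa Bb";
        // Greedy: each match takes as many word characters as it can.
        Console.WriteLine("Greedy: {0}", JoinMatches(input, @"\w+"));
        // Lazy: each match stops after the single required character.
        Console.WriteLine("Lazy:   {0}", JoinMatches(input, @"\w+?"));
    }
}
```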
Match Zero or One Time (Lazy Match): ??
The ?? quantifier matches the preceding element zero or one time, but as few times as possible. It is the lazy counterpart of the greedy quantifier ?. For example, the regular expression ^\s*(System.)??Console.Write(Line)??\(?? attempts to match the strings "Console.Write" or "Console.WriteLine". The string can also include "System." before "Console", and it can be followed by an opening parenthesis. The string must be at the beginning of a line, although it can be preceded by white space. The following example illustrates this regular expression.
Dim pattern As String = "^\s*(System.)??Console.Write(Line)??\(??"
Dim input As String = "System.Console.WriteLine(""Hello!"")" + vbCrLf + _
"Console.Write(""Hello!"")" + vbCrLf + _
"Console.WriteLine(""Hello!"")" + vbCrLf + _
"Console.ReadLine()" + vbCrLf + _
" Console.WriteLine"
For Each match As Match In Regex.Matches(input, pattern, _
RegexOptions.IgnorePatternWhitespace Or RegexOptions.IgnoreCase Or RegexOptions.MultiLine)
Console.WriteLine("'{0}' found at position {1}.", match.Value, match.Index)
Next
' The example displays the following output:
' 'System.Console.Write' found at position 0.
' 'Console.Write' found at position 36.
' 'Console.Write' found at position 61.
' ' Console.Write' found at position 110.
string pattern = @"^\s*(System.)??Console.Write(Line)??\(??";
string input = "System.Console.WriteLine(\"Hello!\")\n" +
"Console.Write(\"Hello!\")\n" +
"Console.WriteLine(\"Hello!\")\n" +
"Console.ReadLine()\n" +
" Console.WriteLine";
foreach (Match match in Regex.Matches(input, pattern,
RegexOptions.IgnorePatternWhitespace |
RegexOptions.IgnoreCase |
RegexOptions.Multiline))
Console.WriteLine("'{0}' found at position {1}.", match.Value, match.Index);
// The example displays the following output:
// 'System.Console.Write' found at position 0.
// 'Console.Write' found at position 36.
// 'Console.Write' found at position 61.
// ' Console.Write' found at position 110.
The regular expression pattern is defined as shown in the following table.
Pattern | Description |
|---|
^ | Match the start of a line (the example sets the RegexOptions.Multiline option). |
\s* | Match zero or more white-space characters. |
(System.)?? | Match zero or one occurrence of the string "System.". |
Console.Write | Match the string "Console.Write". |
(Line)?? | Match zero or one occurrence of the string "Line". |
\(?? | Match zero or one occurrence of the opening parenthesis. |
Match Exactly n Times (Lazy Match): {n}?
The {n}? quantifier matches the preceding element exactly n times, where n is any integer. It is the lazy counterpart of the greedy quantifier {n}. In the following example, the regular expression \b(\w{3,}?\.){2}?\w{3,}?\b is used to identify a Web site address. Note that it matches "www.microsoft.com" and "msdn.microsoft.com", but does not match "mywebsite" or "mycompany.com".
Dim pattern As String = "\b(\w{3,}?\.){2}?\w{3,}?\b"
Dim input As String = "www.microsoft.com msdn.microsoft.com mywebsite mycompany.com"
For Each match As Match In Regex.Matches(input, pattern)
Console.WriteLine("'{0}' found at position {1}.", match.Value, match.Index)
Next
' The example displays the following output:
' 'www.microsoft.com' found at position 0.
' 'msdn.microsoft.com' found at position 18.
string pattern = @"\b(\w{3,}?\.){2}?\w{3,}?\b";
string input = "www.microsoft.com msdn.microsoft.com mywebsite mycompany.com";
foreach (Match match in Regex.Matches(input, pattern))
Console.WriteLine("'{0}' found at position {1}.", match.Value, match.Index);
// The example displays the following output:
// 'www.microsoft.com' found at position 0.
// 'msdn.microsoft.com' found at position 18.
The regular expression pattern is defined as shown in the following table.
Pattern | Description |
|---|
\b | Start at a word boundary. |
(\w{3,}?\.) | Match at least 3 word characters, but as few characters as possible, followed by a dot or period character. This is the first capturing group. |
(\w{3,}?\.){2}? | Match the pattern in the first group two times, but as few times as possible. |
\w{3,}? | Match at least 3 word characters, but as few characters as possible. |
\b | End the match on a word boundary. |
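Because {n} allows exactly one repetition count, making it lazy leaves nothing to minimize. The sketch below compares {3} and {3}? on the same input to illustrate this. The class name ExactCountSketch, the helper First, and the digit string are assumptions chosen for the illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class ExactCountSketch
{
    // Returns the text of the first match of pattern in input.
    public static string First(string input, string pattern)
    {
        return Regex.Match(input, pattern).Value;
    }

    public static void Main()
    {
        string input = "123456";
        // The patterns are passed as arguments so that their braces are not
        // interpreted as composite-format placeholders.
        foreach (string pattern in new[] { @"\d{3}", @"\d{3}?" })
            Console.WriteLine("{0} -> '{1}'", pattern, First(input, pattern));
    }
}
```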
Match at Least n Times (Lazy Match): {n,}?
The {n,}? quantifier matches the preceding element at least n times, where n is any integer, but as few times as possible. It is the lazy counterpart of the greedy quantifier {n,}. See the example for the {n}? quantifier in the previous section for an illustration. The regular expression in that example uses the {n,}? quantifier to match a string that has at least three characters followed by a period.
Match Between n and m Times (Lazy Match): {n,m}?
The {n,m}? quantifier matches the preceding element between n and m times, where n and m are integers, but as few times as possible. It is the lazy counterpart of the greedy quantifier {n,m}. In the following example, the regular expression \b[A-Z](\w*\s?){1,10}?[.!?] matches sentences that contain between one and ten words. It matches all the sentences in the input string except for one sentence that contains 18 words.
Dim pattern As String = "\b[A-Z](\w*\s?){1,10}?[.!?]"
Dim input As String = "Hi. I am writing a short note. Its purpose is " + _
"to test a regular expression that attempts to find " + _
"sentences with ten or fewer words. Most sentences " + _
"in this note are short."
For Each match As Match In Regex.Matches(input, pattern)
Console.WriteLine("'{0}' found at position {1}.", match.Value, match.Index)
Next
' The example displays the following output:
' 'Hi.' found at position 0.
' 'I am writing a short note.' found at position 4.
' 'Most sentences in this note are short.' found at position 132.
string pattern = @"\b[A-Z](\w*\s?){1,10}?[.!?]";
string input = "Hi. I am writing a short note. Its purpose is " +
"to test a regular expression that attempts to find " +
"sentences with ten or fewer words. Most sentences " +
"in this note are short.";
foreach (Match match in Regex.Matches(input, pattern))
Console.WriteLine("'{0}' found at position {1}.", match.Value, match.Index);
// The example displays the following output:
// 'Hi.' found at position 0.
// 'I am writing a short note.' found at position 4.
// 'Most sentences in this note are short.' found at position 132.
The regular expression pattern is defined as shown in the following table.
Pattern | Description |
|---|
\b | Start at a word boundary. |
[A-Z] | Match an uppercase character from A to Z. |
(\w*\s?) | Match zero or more word characters, followed by zero or one white-space character. This is the first capture group. |
{1,10}? | Match the previous pattern between 1 and 10 times, but as few times as possible. |
[.!?] | Match any one of the punctuation characters ".", "!", or "?". |
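The effect of the lazy forms {n,}? and {n,m}? is easiest to see when nothing after the quantifier forces it to expand. The following sketch compares each greedy brace quantifier with its lazy counterpart on a run of identical characters. The class name LazyBraceSketch, the helper First, and the input "aaaaa" are assumptions made for this illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LazyBraceSketch
{
    // Returns the text of the first match of pattern in input.
    public static string First(string input, string pattern)
    {
        return Regex.Match(input, pattern).Value;
    }

    public static void Main()
    {
        string input = "aaaaa";
        string[] patterns = { "a{2,}", "a{2,}?", "a{1,3}", "a{1,3}?" };
        // Greedy forms take the maximum allowed; lazy forms stop at the minimum.
        foreach (string pattern in patterns)
            Console.WriteLine("{0} -> '{1}'", pattern, First(input, pattern));
    }
}
```

When an element that follows the quantifier must also match, as [.!?] does in the sentence example, the lazy quantifier expands one repetition at a time until the rest of the pattern succeeds.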

Greedy and Lazy Quantifiers
A number of the quantifiers have two versions:
- A greedy version. A greedy quantifier tries to match an element as many times as possible.
- A non-greedy (or lazy) version. A non-greedy quantifier tries to match an element as few times as possible.
You can turn a greedy quantifier into a lazy quantifier by simply adding a ?.
Consider a simple regular expression that is intended to extract the last four digits from a string of numbers such as a credit card number. The version of the regular expression that uses the * greedy quantifier is \b.*([0-9]{4})\b. However, if a string contains two numbers, this regular expression matches the last four digits of the second number only, as the following example shows.
Dim greedyPattern As String = "\b.*([0-9]{4})\b"
Dim input1 As String = "1112223333 3992991999"
For Each match As Match In Regex.Matches(input1, greedyPattern)
Console.WriteLine("Account ending in ******{0}.", match.Groups(1).Value)
Next
' The example displays the following output:
' Account ending in ******1999.
string greedyPattern = @"\b.*([0-9]{4})\b";
string input1 = "1112223333 3992991999";
foreach (Match match in Regex.Matches(input1, greedyPattern))
Console.WriteLine("Account ending in ******{0}.", match.Groups[1].Value);
// The example displays the following output:
// Account ending in ******1999.
The regular expression fails to match the first number because the * quantifier tries to match the previous element as many times as possible in the entire string: .* consumes the first number and the space that follows it, and so the pattern finds its match at the end of the string. This is not the desired behavior. Instead, you can use the *? lazy quantifier to extract digits from both numbers, as the following example shows.
Dim lazyPattern As String = "\b.*?([0-9]{4})\b"
Dim input2 As String = "1112223333 3992991999"
For Each match As Match In Regex.Matches(input2, lazyPattern)
Console.WriteLine("Account ending in ******{0}.", match.Groups(1).Value)
Next
' The example displays the following output:
' Account ending in ******3333.
' Account ending in ******1999.
string lazyPattern = @"\b.*?([0-9]{4})\b";
string input2 = "1112223333 3992991999";
foreach (Match match in Regex.Matches(input2, lazyPattern))
Console.WriteLine("Account ending in ******{0}.", match.Groups[1].Value);
// The example displays the following output:
// Account ending in ******3333.
// Account ending in ******1999.
In most cases, regular expressions with greedy and lazy quantifiers return the same matches. They most commonly return different results when they are used with the wildcard (.) metacharacter, which matches any character.
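A common case in which the wildcard makes the two versions diverge is matching delimited text. The sketch below counts the matches of a greedy and a lazy pattern over a string that contains several angle-bracket tags. The class name GreedyLazySketch, the helper CountMatches, and the sample markup are illustrative assumptions; this is not meant as a robust way to parse markup.

```csharp
using System;
using System.Text.RegularExpressions;

public static class GreedyLazySketch
{
    // Returns the number of matches of pattern in input.
    public static int CountMatches(string input, string pattern)
    {
        return Regex.Matches(input, pattern).Count;
    }

    public static void Main()
    {
        string input = "<b>bold</b> and <i>italic</i>";

        // Greedy: .+ runs to the last '>' in the string, giving a single match.
        Console.WriteLine("Greedy matches: {0}", CountMatches(input, "<.+>"));

        // Lazy: .+? stops at the first '>' it can reach, giving one match per tag.
        Console.WriteLine("Lazy matches:   {0}", CountMatches(input, "<.+?>"));
    }
}
```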

Quantifiers and Empty Matches
The quantifiers *, +, and {n,m} and their lazy counterparts never repeat after an empty match when the minimum number of captures has been found. This rule prevents quantifiers from entering infinite loops on empty subexpression matches when the maximum number of possible group captures is infinite or near infinite. For example, the following code shows the result of a call to the Regex.Match method with the regular expression pattern (a?)*, which matches zero or one "a" character zero or more times. Note that the single capturing group captures each "a" as well as String.Empty, but that there is no second empty match, because the first empty match causes the quantifier to stop repeating.
Imports System.Text.RegularExpressions
Module Example
Public Sub Main()
Dim pattern As String = "(a?)*"
Dim input As String = "aaabbb"
Dim match As Match = Regex.Match(input, pattern)
Console.WriteLine("Match: '{0}' at index {1}",
match.Value, match.Index)
If match.Groups.Count > 1 Then
Dim groups As GroupCollection = match.Groups
For grpCtr As Integer = 1 To groups.Count - 1
Console.WriteLine(" Group {0}: '{1}' at index {2}",
grpCtr,
groups(grpCtr).Value,
groups(grpCtr).Index)
Dim captureCtr As Integer = 0
For Each capture As Capture In groups(grpCtr).Captures
captureCtr += 1
Console.WriteLine(" Capture {0}: '{1}' at index {2}",
captureCtr, capture.Value, capture.Index)
Next
Next
End If
End Sub
End Module
' The example displays the following output:
' Match: 'aaa' at index 0
' Group 1: '' at index 3
' Capture 1: 'a' at index 0
' Capture 2: 'a' at index 1
' Capture 3: 'a' at index 2
' Capture 4: '' at index 3
using System;
using System.Text.RegularExpressions;
public class Example
{
public static void Main()
{
string pattern = "(a?)*";
string input = "aaabbb";
Match match = Regex.Match(input, pattern);
Console.WriteLine("Match: '{0}' at index {1}",
match.Value, match.Index);
if (match.Groups.Count > 1) {
GroupCollection groups = match.Groups;
for (int grpCtr = 1; grpCtr <= groups.Count - 1; grpCtr++) {
Console.WriteLine(" Group {0}: '{1}' at index {2}",
grpCtr,
groups[grpCtr].Value,
groups[grpCtr].Index);
int captureCtr = 0;
foreach (Capture capture in groups[grpCtr].Captures) {
captureCtr++;
Console.WriteLine(" Capture {0}: '{1}' at index {2}",
captureCtr, capture.Value, capture.Index);
}
}
}
}
}
// The example displays the following output:
// Match: 'aaa' at index 0
// Group 1: '' at index 3
// Capture 1: 'a' at index 0
// Capture 2: 'a' at index 1
// Capture 3: 'a' at index 2
// Capture 4: '' at index 3
To see the practical difference between a capturing group that defines a minimum and a maximum number of captures and one that defines a fixed number of captures, consider the regular expression patterns (a\1|(?(1)\1)){0,2} and (a\1|(?(1)\1)){2}. Both regular expressions consist of a single capturing group, which is defined as shown in the following table.
Pattern | Description |
|---|
(a\1 | Either match "a" along with the value of the first captured group … |
|(?(1) | … or test whether the first captured group has been defined. (Note that the (?(1) construct does not define a capturing group.) |
\1)) | If the first captured group exists, match its value. If the group does not exist, the group will match String.Empty. |
The first regular expression tries to match this pattern between zero and two times; the second, exactly two times. Because the first pattern reaches its minimum number of captures with its first capture of String.Empty, it never repeats to try to match a\1; the {0,2} quantifier allows only empty matches in the last iteration. In contrast, the second regular expression does match "a" because it evaluates a\1 a second time; the minimum number of iterations, 2, forces the engine to repeat after an empty match.
Imports System.Text.RegularExpressions
Module Example
Public Sub Main()
Dim pattern, input As String
pattern = "(a\1|(?(1)\1)){0,2}"
input = "aaabbb"
Console.WriteLine("Regex pattern: {0}", pattern)
Dim match As Match = Regex.Match(input, pattern)
Console.WriteLine("Match: '{0}' at position {1}.",
match.Value, match.Index)
If match.Groups.Count > 1 Then
For groupCtr As Integer = 1 To match.Groups.Count - 1
Dim group As Group = match.Groups(groupCtr)
Console.WriteLine(" Group: {0}: '{1}' at position {2}.",
groupCtr, group.Value, group.Index)
Dim captureCtr As Integer = 0
For Each capture As Capture In group.Captures
captureCtr += 1
Console.WriteLine(" Capture: {0}: '{1}' at position {2}.",
captureCtr, capture.Value, capture.Index)
Next
Next
End If
Console.WriteLine()
pattern = "(a\1|(?(1)\1)){2}"
Console.WriteLine("Regex pattern: {0}", pattern)
match = Regex.Match(input, pattern)
Console.WriteLine("Matched '{0}' at position {1}.",
match.Value, match.Index)
If match.Groups.Count > 1 Then
For groupCtr As Integer = 1 To match.Groups.Count - 1
Dim group As Group = match.Groups(groupCtr)
Console.WriteLine(" Group: {0}: '{1}' at position {2}.",
groupCtr, group.Value, group.Index)
Dim captureCtr As Integer = 0
For Each capture As Capture In group.Captures
captureCtr += 1
Console.WriteLine(" Capture: {0}: '{1}' at position {2}.",
captureCtr, capture.Value, capture.Index)
Next
Next
End If
End Sub
End Module
' The example displays the following output:
' Regex pattern: (a\1|(?(1)\1)){0,2}
' Match: '' at position 0.
' Group: 1: '' at position 0.
' Capture: 1: '' at position 0.
'
' Regex pattern: (a\1|(?(1)\1)){2}
' Matched 'a' at position 0.
' Group: 1: 'a' at position 0.
' Capture: 1: '' at position 0.
' Capture: 2: 'a' at position 0.
using System;
using System.Text.RegularExpressions;
public class Example
{
public static void Main()
{
string pattern, input;
pattern = @"(a\1|(?(1)\1)){0,2}";
input = "aaabbb";
Console.WriteLine("Regex pattern: {0}", pattern);
Match match = Regex.Match(input, pattern);
Console.WriteLine("Match: '{0}' at position {1}.",
match.Value, match.Index);
if (match.Groups.Count > 1) {
for (int groupCtr = 1; groupCtr <= match.Groups.Count - 1; groupCtr++)
{
Group group = match.Groups[groupCtr];
Console.WriteLine(" Group: {0}: '{1}' at position {2}.",
groupCtr, group.Value, group.Index);
int captureCtr = 0;
foreach (Capture capture in group.Captures) {
captureCtr++;
Console.WriteLine(" Capture: {0}: '{1}' at position {2}.",
captureCtr, capture.Value, capture.Index);
}
}
}
Console.WriteLine();
pattern = @"(a\1|(?(1)\1)){2}";
Console.WriteLine("Regex pattern: {0}", pattern);
match = Regex.Match(input, pattern);
Console.WriteLine("Matched '{0}' at position {1}.",
match.Value, match.Index);
if (match.Groups.Count > 1) {
for (int groupCtr = 1; groupCtr <= match.Groups.Count - 1; groupCtr++)
{
Group group = match.Groups[groupCtr];
Console.WriteLine(" Group: {0}: '{1}' at position {2}.",
groupCtr, group.Value, group.Index);
int captureCtr = 0;
foreach (Capture capture in group.Captures) {
captureCtr++;
Console.WriteLine(" Capture: {0}: '{1}' at position {2}.",
captureCtr, capture.Value, capture.Index);
}
}
}
}
}
// The example displays the following output:
// Regex pattern: (a\1|(?(1)\1)){0,2}
// Match: '' at position 0.
// Group: 1: '' at position 0.
// Capture: 1: '' at position 0.
//
// Regex pattern: (a\1|(?(1)\1)){2}
// Matched 'a' at position 0.
// Group: 1: 'a' at position 0.
// Capture: 1: '' at position 0.
// Capture: 2: 'a' at position 0.

Change History
Date | History | Reason |
|---|
October 2010 | Added the "Quantifiers and Empty Matches" section. | Information enhancement. |
.NET Framework 4 量指定子 更新 : 2010 年 10 月 量指定子は、一致と見なすために、入力中に文字、グループ、または文字クラスがいくつ存在しなければならないかを指定します。 .NET Framework でサポートされている量指定子を次の表に示します。 最長一致の量指定子 | 最短一致の量指定子 | 説明 |
|---|
* | *? | 0 回以上の繰り返しに一致します。 | + | +? | 1 回以上の繰り返しに一致します。 | ? | ?? | 0 回または 1 回の繰り返しに一致します。 | {}n | {}?n | ちょうど n 回の繰り返しに一致します。 | {,}n | {,}?n | n 回以上の繰り返しに一致します。 | {n,m} | {n,m}? | n 回以上 m 回以下の繰り返しに一致します。 |
量 n および m は、整数定数です。 通常、量指定子は最長一致です。最長一致の場合、正規表現エンジンでは、特定のパターンの繰り返しができるだけ多くなるように照合が行われます。 量指定子に ? 文字を付けると最短一致になります。最短一致の場合は、繰り返しができるだけ少なくなるように照合が行われます。 最長一致の量指定子と最短一致の量指定子の違いの詳細については、このトピックの「最長一致の量指定子と最短一致の量指定子」のセクションを参照してください。 重要 |
|---|
量指定子を入れ子にすると (正規表現パターン (a*)* などの場合)、入力文字列の文字数に応じて指数関数的に、正規表現エンジンで実行する必要がある比較の回数が増加する可能性があります。 この動作と回避方法の詳細については、「バックトラッキング」を参照してください。 |

正規表現の量指定子
以降のセクションでは、.NET Framework の正規表現でサポートされている量指定子について説明します。 メモ |
|---|
正規表現エンジンでは、正規表現パターンで *、+、?、{、および } の各文字を検出すると、文字クラスに含まれているもの以外は量指定子または量指定子構造の一部として解釈します。 文字クラスの外側でこれらをリテラル文字として解釈するには、文字の前に円記号を付けてエスケープする必要があります。 たとえば、正規表現パターン内の \* という文字列は、リテラルのアスタリスク ("*") 文字と解釈されます。 |
0 回以上の繰り返しに一致: ** 量指定子は、直前の要素の 0 回以上の繰り返しに一致します。 これは {0,} 指定子と同じです。 * は最長一致の量指定子であり、最短一致でこれに対応するのは *? です。 以下の例はこの正規表現を示しています。 入力文字列の 9 つの数字のうち、5 つがパターンに一致し、4 つ (95、929、9129、および 9919) が一致しません。
Dim pattern As String = "\b91*9*\b"
Dim input As String = "99 95 919 929 9119 9219 999 9919 91119"
For Each match As Match In Regex.Matches(input, pattern)
Console.WriteLine("'{0}' found at position {1}.", match.Value, match.Index)
Next
' The example displays the following output:
' '99' found at position 0.
' '919' found at position 6.
' '9119' found at position 14.
' '999' found at position 24.
' '91119' found at position 33.
string pattern = @"\b91*9*\b";
string input = "99 95 919 929 9119 9219 999 9919 91119";
foreach (Match match in Regex.Matches(input, pattern))
Console.WriteLine("'{0}' found at position {1}.", match.Value, match.Index);
// The example displays the following output:
// '99' found at position 0.
// '919' found at position 6.
// '9119' found at position 14.
// '999' found at position 24.
// '91119' found at position 33.
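The regular expression pattern is defined as shown in the following table.

| Pattern | Description |
|---|---|
| \b | Start at a word boundary. |
| 91* | Match a "9" followed by zero or more "1" characters. |
| 9* | Match zero or more "9" characters. |
| \b | End at a word boundary. |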
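
Match One or More Times: +
The + quantifier matches the preceding element one or more times. It is equivalent to {1,}. + is a greedy quantifier whose lazy equivalent is +?. For example, the regular expression \ban+\w*?\b tries to match entire words that begin with the letter a followed by one or more instances of the letter n. The following example illustrates this regular expression. The regular expression matches the words an, annual, announcement, and antique, and correctly fails to match autumn and all.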
Dim pattern As String = "\ban+\w*?\b"
Dim input As String = "Autumn is a great time for an annual announcement to all antique collectors."
For Each match As Match In Regex.Matches(input, pattern, RegexOptions.IgnoreCase)
Console.WriteLine("'{0}' found at position {1}.", match.Value, match.Index)
Next
' The example displays the following output:
' 'an' found at position 27.
' 'annual' found at position 30.
' 'announcement' found at position 37.
' 'antique' found at position 57.
string pattern = @"\ban+\w*?\b";
string input = "Autumn is a great time for an annual announcement to all antique collectors.";
foreach (Match match in Regex.Matches(input, pattern, RegexOptions.IgnoreCase))
Console.WriteLine("'{0}' found at position {1}.", match.Value, match.Index);
// The example displays the following output:
// 'an' found at position 27.
// 'annual' found at position 30.
// 'announcement' found at position 37.
// 'antique' found at position 57.
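The regular expression pattern is defined as shown in the following table.

| Pattern | Description |
|---|---|
| \b | Start at a word boundary. |
| an+ | Match an "a" followed by one or more "n" characters. |
| \w*? | Match a word character zero or more times, but as few times as possible. |
| \b | End at a word boundary. |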
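
Match Zero or One Time: ?
The ? quantifier matches the preceding element zero or one time. It is equivalent to {0,1}. ? is a greedy quantifier whose lazy equivalent is ??. For example, the regular expression \ban?\b tries to match entire words that begin with the letter a followed by zero or one instances of the letter n. In other words, it tries to match the words a and an. The following example illustrates this regular expression.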
Dim pattern As String = "\ban?\b"
Dim input As String = "An amiable animal with a large snount and an animated nose."
For Each match As Match In Regex.Matches(input, pattern, RegexOptions.IgnoreCase)
Console.WriteLine("'{0}' found at position {1}.", match.Value, match.Index)
Next
' The example displays the following output:
' 'An' found at position 0.
' 'a' found at position 23.
' 'an' found at position 42.
string pattern = @"\ban?\b";
string input = "An amiable animal with a large snount and an animated nose.";
foreach (Match match in Regex.Matches(input, pattern, RegexOptions.IgnoreCase))
Console.WriteLine("'{0}' found at position {1}.", match.Value, match.Index);
// The example displays the following output:
// 'An' found at position 0.
// 'a' found at position 23.
// 'an' found at position 42.
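The regular expression pattern is defined as shown in the following table.

| Pattern | Description |
|---|---|
| \b | Start at a word boundary. |
| an? | Match an "a" followed by zero or one "n" character. |
| \b | End at a word boundary. |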
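
Match Exactly n Times: {n}
The {n} quantifier matches the preceding element exactly n times, where n is any integer. {n} is a greedy quantifier whose lazy equivalent is {n}?. For example, the regular expression \b\d+\,\d{3}\b tries to match a word boundary followed by one or more decimal digits, a comma, three decimal digits, and a word boundary. The following example illustrates this regular expression.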
Dim pattern As String = "\b\d+\,\d{3}\b"
Dim input As String = "Sales totaled 103,524 million in January, " + _
"106,971 million in February, but only " + _
"943 million in March."
For Each match As Match In Regex.Matches(input, pattern)
Console.WriteLine("'{0}' found at position {1}.", match.Value, match.Index)
Next
' The example displays the following output:
' '103,524' found at position 14.
' '106,971' found at position 45.
string pattern = @"\b\d+\,\d{3}\b";
string input = "Sales totaled 103,524 million in January, " +
"106,971 million in February, but only " +
"943 million in March.";
foreach (Match match in Regex.Matches(input, pattern))
Console.WriteLine("'{0}' found at position {1}.", match.Value, match.Index);
// The example displays the following output:
// '103,524' found at position 14.
// '106,971' found at position 45.
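The regular expression pattern is defined as shown in the following table.

| Pattern | Description |
|---|---|
| \b | Start at a word boundary. |
| \d+ | Match one or more decimal digits. |
| \, | Match a comma character. |
| \d{3} | Match three decimal digits. |
| \b | End at a word boundary. |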
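
Match at Least n Times: {n,}
The {n,} quantifier matches the preceding element at least n times, where n is any integer. {n,} is a greedy quantifier whose lazy equivalent is {n,}?. For example, the regular expression \b\d{2,}\b\D+ tries to match a word boundary followed by at least two digits, a word boundary, and one or more non-digit characters. The following example illustrates this regular expression. The regular expression fails to match the phrase "7 days" because it contains just one decimal digit, but it successfully matches the phrases "10 weeks, " and "300 years".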
Dim pattern As String = "\b\d{2,}\b\D+"
Dim input As String = "7 days, 10 weeks, 300 years"
For Each match As Match In Regex.Matches(input, pattern)
Console.WriteLine("'{0}' found at position {1}.", match.Value, match.Index)
Next
' The example displays the following output:
' '10 weeks, ' found at position 8.
' '300 years' found at position 18.
string pattern = @"\b\d{2,}\b\D+";
string input = "7 days, 10 weeks, 300 years";
foreach (Match match in Regex.Matches(input, pattern))
Console.WriteLine("'{0}' found at position {1}.", match.Value, match.Index);
// The example displays the following output:
// '10 weeks, ' found at position 8.
// '300 years' found at position 18.
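The regular expression pattern is defined as shown in the following table.

| Pattern | Description |
|---|---|
| \b | Start at a word boundary. |
| \d{2,} | Match at least two decimal digits. |
| \b | Match a word boundary. |
| \D+ | Match at least one non-decimal digit. |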
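
Match Between n and m Times: {n,m}
The {n,m} quantifier matches the preceding element at least n times, but no more than m times, where n and m are integers. {n,m} is a greedy quantifier whose lazy equivalent is {n,m}?. In the following example, the regular expression (00\s){2,4} tries to match between two and four occurrences of two zero digits followed by a white-space character. Note that the final portion of the input string contains five pairs of zeros rather than the maximum of four. Only the first four pairs, up to the space that precedes the fifth pair, match the regular expression pattern. The fifth pair is also not followed by a white-space character, so it could not match (00\s) in any case.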
Dim pattern As String = "(00\s){2,4}"
Dim input As String = "0x00 FF 00 00 18 17 FF 00 00 00 21 00 00 00 00 00"
For Each match As Match In Regex.Matches(input, pattern)
Console.WriteLine("'{0}' found at position {1}.", match.Value, match.Index)
Next
' The example displays the following output:
' '00 00 ' found at position 8.
' '00 00 00 ' found at position 23.
' '00 00 00 00 ' found at position 35.
string pattern = @"(00\s){2,4}";
string input = "0x00 FF 00 00 18 17 FF 00 00 00 21 00 00 00 00 00";
foreach (Match match in Regex.Matches(input, pattern))
Console.WriteLine("'{0}' found at position {1}.", match.Value, match.Index);
// The example displays the following output:
// '00 00 ' found at position 8.
// '00 00 00 ' found at position 23.
// '00 00 00 00 ' found at position 35.
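
To see how many repetitions the {2,4} quantifier actually consumed in each match of the example above, you can inspect the Captures collection of the first group. The following is a small sketch under the assumption that the same pattern and input are used; the class name CaptureCountDemo and the method RepetitionCounts are hypothetical names introduced for illustration.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical demo class; it reuses the pattern and input from the example above.
public static class CaptureCountDemo
{
    // Returns, for each match, the number of times group 1 was repeated.
    public static int[] RepetitionCounts(string input)
    {
        MatchCollection matches = Regex.Matches(input, @"(00\s){2,4}");
        int[] counts = new int[matches.Count];
        for (int i = 0; i < matches.Count; i++)
            counts[i] = matches[i].Groups[1].Captures.Count;
        return counts;
    }

    public static void Main()
    {
        string input = "0x00 FF 00 00 18 17 FF 00 00 00 21 00 00 00 00 00";
        string[] text = RepetitionCounts(input).Select(c => c.ToString()).ToArray();
        Console.WriteLine(string.Join(", ", text));   // 2, 3, 4
    }
}
```

Each count stays within the bounds set by the quantifier: never fewer than two repetitions and never more than four.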
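
Match Zero or More Times (Lazy Match): *?
The *? quantifier matches the preceding element zero or more times, but as few times as possible. It is the lazy counterpart of the greedy quantifier *. In the following example, the regular expression \b\w*?oo\w*?\b matches all words that contain the string oo.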
Dim pattern As String = "\b\w*?oo\w*?\b"
Dim input As String = "woof root root rob oof woo woe"
For Each match As Match In Regex.Matches(input, pattern, RegexOptions.IgnoreCase)
Console.WriteLine("'{0}' found at position {1}.", match.Value, match.Index)
Next
' The example displays the following output:
' 'woof' found at position 0.
' 'root' found at position 5.
' 'root' found at position 10.
' 'oof' found at position 19.
' 'woo' found at position 23.
string pattern = @"\b\w*?oo\w*?\b";
string input = "woof root root rob oof woo woe";
foreach (Match match in Regex.Matches(input, pattern, RegexOptions.IgnoreCase))
Console.WriteLine("'{0}' found at position {1}.", match.Value, match.Index);
// The example displays the following output:
// 'woof' found at position 0.
// 'root' found at position 5.
// 'root' found at position 10.
// 'oof' found at position 19.
// 'woo' found at position 23.
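The regular expression pattern is defined as shown in the following table.

| Pattern | Description |
|---|---|
| \b | Start at a word boundary. |
| \w*? | Match zero or more word characters, but as few characters as possible. |
| oo | Match the string "oo". |
| \w*? | Match zero or more word characters, but as few characters as possible. |
| \b | End at a word boundary. |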
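
Match One or More Times (Lazy Match): +?
The +? quantifier matches the preceding element one or more times, but as few times as possible. It is the lazy counterpart of the greedy quantifier +. For example, the regular expression \b\w+?\b matches one or more characters separated by word boundaries. The following example illustrates this regular expression.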
Dim pattern As String = "\b\w+?\b"
Dim input As String = "Aa Bb Cc Dd Ee Ff"
For Each match As Match In Regex.Matches(input, pattern)
Console.WriteLine("'{0}' found at position {1}.", match.Value, match.Index)
Next
' The example displays the following output:
' 'Aa' found at position 0.
' 'Bb' found at position 3.
' 'Cc' found at position 6.
' 'Dd' found at position 9.
' 'Ee' found at position 12.
' 'Ff' found at position 15.
string pattern = @"\b\w+?\b";
string input = "Aa Bb Cc Dd Ee Ff";
foreach (Match match in Regex.Matches(input, pattern))
Console.WriteLine("'{0}' found at position {1}.", match.Value, match.Index);
// The example displays the following output:
// 'Aa' found at position 0.
// 'Bb' found at position 3.
// 'Cc' found at position 6.
// 'Dd' found at position 9.
// 'Ee' found at position 12.
// 'Ff' found at position 15.
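
In the example above, the lazy +? quantifier still returns whole words, because the trailing \b forces it to keep expanding until a word boundary is reached. The sketch below makes the laziness visible by removing that constraint. The class name LazyPlusDemo, the helper MatchValues, and the shortened input are assumptions made for illustration.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical demo class contrasting +? with and without a trailing anchor.
public static class LazyPlusDemo
{
    // Returns the text of every match of pattern in input.
    public static string[] MatchValues(string input, string pattern)
    {
        return Regex.Matches(input, pattern)
                    .Cast<Match>()
                    .Select(m => m.Value)
                    .ToArray();
    }

    public static void Main()
    {
        string input = "Aa Bb";
        // Lazy with nothing after it: each match stops after one character.
        Console.WriteLine(string.Join("|", MatchValues(input, @"\w+?")));       // A|a|B|b
        // Lazy, but \b forces expansion to the end of the word.
        Console.WriteLine(string.Join("|", MatchValues(input, @"\b\w+?\b")));   // Aa|Bb
        // Greedy, for comparison.
        Console.WriteLine(string.Join("|", MatchValues(input, @"\w+")));        // Aa|Bb
    }
}
```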
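
Match Zero or One Time (Lazy Match): ??
The ?? quantifier matches the preceding element zero or one time, but as few times as possible. It is the lazy counterpart of the greedy quantifier ?. For example, the regular expression ^\s*(System.)??Console.Write(Line)??\(?? attempts to match the strings "Console.Write" or "Console.WriteLine". The string can also include "System." before "Console", and it can be followed by an opening parenthesis. The string must be at the beginning of a line, although it can be preceded by white space. The following example illustrates this regular expression.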
Dim pattern As String = "^\s*(System.)??Console.Write(Line)??\(??"
Dim input As String = "System.Console.WriteLine(""Hello!"")" + vbCrLf + _
"Console.Write(""Hello!"")" + vbCrLf + _
"Console.WriteLine(""Hello!"")" + vbCrLf + _
"Console.ReadLine()" + vbCrLf + _
" Console.WriteLine"
For Each match As Match In Regex.Matches(input, pattern, _
RegexOptions.IgnorePatternWhitespace Or RegexOptions.IgnoreCase Or RegexOptions.MultiLine)
Console.WriteLine("'{0}' found at position {1}.", match.Value, match.Index)
Next
' The example displays the following output:
' 'System.Console.Write' found at position 0.
' 'Console.Write' found at position 36.
' 'Console.Write' found at position 61.
' ' Console.Write' found at position 110.
string pattern = @"^\s*(System.)??Console.Write(Line)??\(??";
string input = "System.Console.WriteLine(\"Hello!\")\n" +
"Console.Write(\"Hello!\")\n" +
"Console.WriteLine(\"Hello!\")\n" +
"Console.ReadLine()\n" +
" Console.WriteLine";
foreach (Match match in Regex.Matches(input, pattern,
RegexOptions.IgnorePatternWhitespace |
RegexOptions.IgnoreCase |
RegexOptions.Multiline))
Console.WriteLine("'{0}' found at position {1}.", match.Value, match.Index);
// The example displays the following output:
// 'System.Console.Write' found at position 0.
// 'Console.Write' found at position 36.
// 'Console.Write' found at position 61.
// ' Console.Write' found at position 110.
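The regular expression pattern is defined as shown in the following table.

| Pattern | Description |
|---|---|
| ^ | Match the start of the input or, because the example uses RegexOptions.Multiline, the start of a line. |
| \s* | Match zero or more white-space characters. |
| (System.)?? | Match zero or one occurrence of the string "System.". |
| Console.Write | Match the string "Console.Write". |
| (Line)?? | Match zero or one occurrence of the string "Line". |
| \(?? | Match zero or one occurrence of the opening parenthesis. |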
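
Match Exactly n Times (Lazy Match): {n}?
The {n}? quantifier matches the preceding element exactly n times, where n is any integer. It is the lazy counterpart of the greedy quantifier {n}. In the following example, the regular expression \b(\w{3,}?\.){2}?\w{3,}?\b is used to identify a Web site address. Note that it matches "www.microsoft.com" and "msdn.microsoft.com", but does not match "mywebsite" or "mycompany.com".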
Dim pattern As String = "\b(\w{3,}?\.){2}?\w{3,}?\b"
Dim input As String = "www.microsoft.com msdn.microsoft.com mywebsite mycompany.com"
For Each match As Match In Regex.Matches(input, pattern)
Console.WriteLine("'{0}' found at position {1}.", match.Value, match.Index)
Next
' The example displays the following output:
' 'www.microsoft.com' found at position 0.
' 'msdn.microsoft.com' found at position 18.
string pattern = @"\b(\w{3,}?\.){2}?\w{3,}?\b";
string input = "www.microsoft.com msdn.microsoft.com mywebsite mycompany.com";
foreach (Match match in Regex.Matches(input, pattern))
Console.WriteLine("'{0}' found at position {1}.", match.Value, match.Index);
// The example displays the following output:
// 'www.microsoft.com' found at position 0.
// 'msdn.microsoft.com' found at position 18.
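The regular expression pattern is defined as shown in the following table.

| Pattern | Description |
|---|---|
| \b | Start at a word boundary. |
| (\w{3,}?\.) | Match at least three word characters, but as few characters as possible, followed by a dot (period) character. This is the first capturing group. |
| (\w{3,}?\.){2}? | Match the pattern in the first group two times, but as few times as possible. |
| \w{3,}? | Match at least three word characters, but as few characters as possible. |
| \b | End the match on a word boundary. |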
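
Because {n} allows only one repetition count, the engine has no shorter alternative to prefer, so {n} and {n}? return the same matches. The following sketch checks this with a deliberately simple pattern. The class name ExactCountDemo, the helper MatchValues, and the input string are assumptions made for illustration.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical demo class comparing {n} with its lazy counterpart {n}?.
public static class ExactCountDemo
{
    // Returns the text of every match of pattern in input.
    public static string[] MatchValues(string input, string pattern)
    {
        return Regex.Matches(input, pattern)
                    .Cast<Match>()
                    .Select(m => m.Value)
                    .ToArray();
    }

    public static void Main()
    {
        string input = "12345678";
        // Greedy: exactly three digits per match.
        Console.WriteLine(string.Join("|", MatchValues(input, @"\d{3}")));    // 123|456
        // Lazy: still exactly three digits per match.
        Console.WriteLine(string.Join("|", MatchValues(input, @"\d{3}?")));   // 123|456
    }
}
```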
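
Match at Least n Times (Lazy Match): {n,}?
The {n,}? quantifier matches the preceding element at least n times, where n is any integer, but as few times as possible. It is the lazy counterpart of the greedy quantifier {n,}. For an illustration, see the example for the {n}? quantifier in the previous section. The regular expression in that example uses the {n,}? quantifier to match a string that has at least three characters followed by a period.

Match Between n and m Times (Lazy Match): {n,m}?
The {n,m}? quantifier matches the preceding element between n and m times, where n and m are integers, but as few times as possible. It is the lazy counterpart of the greedy quantifier {n,m}. In the following example, the regular expression \b[A-Z](\w*\s?){1,10}?[.!?] matches sentences that contain between one and ten words. It matches all the sentences in the input string except for one sentence that contains 18 words.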
Dim pattern As String = "\b[A-Z](\w*\s?){1,10}?[.!?]"
Dim input As String = "Hi. I am writing a short note. Its purpose is " + _
"to test a regular expression that attempts to find " + _
"sentences with ten or fewer words. Most sentences " + _
"in this note are short."
For Each match As Match In Regex.Matches(input, pattern)
Console.WriteLine("'{0}' found at position {1}.", match.Value, match.Index)
Next
' The example displays the following output:
' 'Hi.' found at position 0.
' 'I am writing a short note.' found at position 4.
' 'Most sentences in this note are short.' found at position 132.
string pattern = @"\b[A-Z](\w*\s?){1,10}?[.!?]";
string input = "Hi. I am writing a short note. Its purpose is " +
"to test a regular expression that attempts to find " +
"sentences with ten or fewer words. Most sentences " +
"in this note are short.";
foreach (Match match in Regex.Matches(input, pattern))
Console.WriteLine("'{0}' found at position {1}.", match.Value, match.Index);
// The example displays the following output:
// 'Hi.' found at position 0.
// 'I am writing a short note.' found at position 4.
// 'Most sentences in this note are short.' found at position 132.
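The regular expression pattern is defined as shown in the following table.

| Pattern | Description |
|---|---|
| \b | Start at a word boundary. |
| [A-Z] | Match an uppercase character from A to Z. |
| (\w*\s?) | Match zero or more word characters, followed by zero or one white-space character. This is the first capturing group. |
| {1,10}? | Match the previous pattern between 1 and 10 times, but as few times as possible. |
| [.!?] | Match any one of the punctuation characters ".", "!", or "?". |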
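
The effect of laziness on a bounded quantifier is easiest to see on a trivial input. The sketch below is an illustration only: the class name BoundedLazyDemo, the helper MatchValues, and the input "aaaaa" are assumptions. It applies a{2,4} and a{2,4}? to the same string.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical demo class comparing {n,m} with its lazy counterpart {n,m}?.
public static class BoundedLazyDemo
{
    // Returns the text of every match of pattern in input.
    public static string[] MatchValues(string input, string pattern)
    {
        return Regex.Matches(input, pattern)
                    .Cast<Match>()
                    .Select(m => m.Value)
                    .ToArray();
    }

    public static void Main()
    {
        string input = "aaaaa";
        // Greedy: takes the maximum of four; the single "a" left over cannot match.
        Console.WriteLine(string.Join("|", MatchValues(input, "a{2,4}")));    // aaaa
        // Lazy: takes the minimum of two each time, so two matches are found.
        Console.WriteLine(string.Join("|", MatchValues(input, "a{2,4}?")));   // aa|aa
    }
}
```

In the sentence example above, the lazy quantifier is followed by [.!?], so the engine adds repetitions one at a time until the punctuation is reached or the limit of ten is exceeded.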

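Greedy and Lazy Quantifiers
A number of the quantifiers have two versions: a greedy version, which tries to match an element as many times as possible, and a lazy (non-greedy) version, which tries to match an element as few times as possible. Consider a simple regular expression that is intended to extract the last four digits from a string of numbers such as a credit card number. The version of the regular expression that uses the greedy quantifier * is \b.*([0-9]{4})\b. However, if a string contains two numbers, this regular expression matches the last four digits of the second number only, as the following example shows.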
Dim greedyPattern As String = "\b.*([0-9]{4})\b"
Dim input1 As String = "1112223333 3992991999"
For Each match As Match In Regex.Matches(input1, greedypattern)
Console.WriteLine("Account ending in ******{0}.", match.Groups(1).Value)
Next
' The example displays the following output:
' Account ending in ******1999.
string greedyPattern = @"\b.*([0-9]{4})\b";
string input1 = "1112223333 3992991999";
foreach (Match match in Regex.Matches(input1, greedyPattern))
Console.WriteLine("Account ending in ******{0}.", match.Groups[1].Value);
// The example displays the following output:
// Account ending in ******1999.
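The regular expression fails to match the first number because the * quantifier tries to match the previous element as many times as possible in the entire string, and so it finds its match at the end of the string. This is not the desired behavior. Instead, you can use the lazy *? quantifier to extract digits from both numbers, as the following example shows.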
Dim lazyPattern As String = "\b.*?([0-9]{4})\b"
Dim input2 As String = "1112223333 3992991999"
For Each match As Match In Regex.Matches(input2, lazypattern)
Console.WriteLine("Account ending in ******{0}.", match.Groups(1).Value)
Next
' The example displays the following output:
' Account ending in ******3333.
' Account ending in ******1999.
string lazyPattern = @"\b.*?([0-9]{4})\b";
string input2 = "1112223333 3992991999";
foreach (Match match in Regex.Matches(input2, lazyPattern))
Console.WriteLine("Account ending in ******{0}.", match.Groups[1].Value);
// The example displays the following output:
// Account ending in ******3333.
// Account ending in ******1999.
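In most cases, regular expressions with greedy and lazy quantifiers return the same matches. They most commonly return different results when they are used with the wildcard (.) metacharacter, which matches any character.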
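
Both halves of that statement can be checked with a short sketch. The class name WildcardDemo, the helper MatchValues, and the two input strings are assumptions made for illustration. The first pair of patterns uses the wildcard and returns different matches; the second pair uses \d, which cannot run past the comma, and returns identical matches.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical demo class showing when greedy and lazy quantifiers differ.
public static class WildcardDemo
{
    // Returns the text of every match of pattern in input.
    public static string[] MatchValues(string input, string pattern)
    {
        return Regex.Matches(input, pattern)
                    .Cast<Match>()
                    .Select(m => m.Value)
                    .ToArray();
    }

    public static void Main()
    {
        string markup = "<b>bold</b>";
        // Greedy wildcard: runs to the last ">" in the string.
        Console.WriteLine(string.Join(" ", MatchValues(markup, "<.+>")));     // <b>bold</b>
        // Lazy wildcard: stops at the first ">" each time.
        Console.WriteLine(string.Join(" ", MatchValues(markup, "<.+?>")));    // <b> </b>

        string numbers = "10, 200,";
        // Without the wildcard, greedy and lazy agree.
        Console.WriteLine(string.Join(" ", MatchValues(numbers, @"\d+,")));   // 10, 200,
        Console.WriteLine(string.Join(" ", MatchValues(numbers, @"\d+?,")));  // 10, 200,
    }
}
```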

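Quantifiers and Empty Matches
The quantifiers *, +, and {n,m} and their lazy counterparts never repeat after an empty match when the minimum number of captures has been found. This rule prevents quantifiers from entering infinite loops on empty subexpression matches when the maximum number of possible group captures is infinite or near infinite. For example, the following code shows the result of a call to the Regex.Match method with the regular expression pattern (a?)*, which matches zero or one "a" character zero or more times. Note that the single capturing group captures each "a" as well as String.Empty, but that there is no second empty match, because the first empty match causes the quantifier to stop repeating.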
Imports System.Text.RegularExpressions
Module Example
Public Sub Main()
Dim pattern As String = "(a?)*"
Dim input As String = "aaabbb"
Dim match As Match = Regex.Match(input, pattern)
Console.WriteLine("Match: '{0}' at index {1}",
match.Value, match.Index)
If match.Groups.Count > 1 Then
Dim groups As GroupCollection = match.Groups
For grpCtr As Integer = 1 To groups.Count - 1
Console.WriteLine(" Group {0}: '{1}' at index {2}",
grpCtr,
groups(grpCtr).Value,
groups(grpCtr).Index)
Dim captureCtr As Integer = 0
For Each capture As Capture In groups(grpCtr).Captures
captureCtr += 1
Console.WriteLine(" Capture {0}: '{1}' at index {2}",
captureCtr, capture.Value, capture.Index)
Next
Next
End If
End Sub
End Module
' The example displays the following output:
' Match: 'aaa' at index 0
' Group 1: '' at index 3
' Capture 1: 'a' at index 0
' Capture 2: 'a' at index 1
' Capture 3: 'a' at index 2
' Capture 4: '' at index 3
using System;
using System.Text.RegularExpressions;
public class Example
{
public static void Main()
{
string pattern = "(a?)*";
string input = "aaabbb";
Match match = Regex.Match(input, pattern);
Console.WriteLine("Match: '{0}' at index {1}",
match.Value, match.Index);
if (match.Groups.Count > 1) {
GroupCollection groups = match.Groups;
for (int grpCtr = 1; grpCtr <= groups.Count - 1; grpCtr++) {
Console.WriteLine(" Group {0}: '{1}' at index {2}",
grpCtr,
groups[grpCtr].Value,
groups[grpCtr].Index);
int captureCtr = 0;
foreach (Capture capture in groups[grpCtr].Captures) {
captureCtr++;
Console.WriteLine(" Capture {0}: '{1}' at index {2}",
captureCtr, capture.Value, capture.Index);
}
}
}
}
}
// The example displays the following output:
// Match: 'aaa' at index 0
// Group 1: '' at index 3
// Capture 1: 'a' at index 0
// Capture 2: 'a' at index 1
// Capture 3: 'a' at index 2
// Capture 4: '' at index 3
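To see the practical difference between a capturing group that defines a minimum and a maximum number of captures and one that defines a fixed number of captures, consider the regular expression patterns (a\1|(?(1)\1)){0,2} and (a\1|(?(1)\1)){2}. Both regular expressions consist of a single capturing group, which is defined as shown in the following table.

| Pattern | Description |
|---|---|
| (a\1 | Either match "a" along with the value of the first captured group, |
| \|(?(1) | or test whether the first captured group has been defined. (Note that the (?(1) construct does not define a capturing group.) |
| \1)) | If the first captured group exists, match its value. If the group does not exist, the group matches String.Empty. |
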
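The first regular expression tries to match this pattern between zero and two times; the second, exactly two times. Because the first pattern reaches its minimum number of captures with its first capture of String.Empty, it never repeats to try to match a\1; the {0,2} quantifier allows only empty matches in the last iteration. In contrast, the second regular expression does match "a" because it evaluates a\1 a second time; the minimum number of iterations, 2, forces the engine to repeat after an empty match.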
Imports System.Text.RegularExpressions
Module Example
Public Sub Main()
Dim pattern, input As String
pattern = "(a\1|(?(1)\1)){0,2}"
input = "aaabbb"
Console.WriteLine("Regex pattern: {0}", pattern)
Dim match As Match = Regex.Match(input, pattern)
Console.WriteLine("Match: '{0}' at position {1}.",
match.Value, match.Index)
If match.Groups.Count > 1 Then
For groupCtr As Integer = 1 To match.Groups.Count - 1
Dim group As Group = match.Groups(groupCtr)
Console.WriteLine(" Group: {0}: '{1}' at position {2}.",
groupCtr, group.Value, group.Index)
Dim captureCtr As Integer = 0
For Each capture As Capture In group.Captures
captureCtr += 1
Console.WriteLine(" Capture: {0}: '{1}' at position {2}.",
captureCtr, capture.Value, capture.Index)
Next
Next
End If
Console.WriteLine()
pattern = "(a\1|(?(1)\1)){2}"
Console.WriteLine("Regex pattern: {0}", pattern)
match = Regex.Match(input, pattern)
Console.WriteLine("Matched '{0}' at position {1}.",
match.Value, match.Index)
If match.Groups.Count > 1 Then
For groupCtr As Integer = 1 To match.Groups.Count - 1
Dim group As Group = match.Groups(groupCtr)
Console.WriteLine(" Group: {0}: '{1}' at position {2}.",
groupCtr, group.Value, group.Index)
Dim captureCtr As Integer = 0
For Each capture As Capture In group.Captures
captureCtr += 1
Console.WriteLine(" Capture: {0}: '{1}' at position {2}.",
captureCtr, capture.Value, capture.Index)
Next
Next
End If
End Sub
End Module
' The example displays the following output:
' Regex pattern: (a\1|(?(1)\1)){0,2}
' Match: '' at position 0.
' Group: 1: '' at position 0.
' Capture: 1: '' at position 0.
'
' Regex pattern: (a\1|(?(1)\1)){2}
' Matched 'a' at position 0.
' Group: 1: 'a' at position 0.
' Capture: 1: '' at position 0.
' Capture: 2: 'a' at position 0.
using System;
using System.Text.RegularExpressions;
public class Example
{
public static void Main()
{
string pattern, input;
pattern = @"(a\1|(?(1)\1)){0,2}";
input = "aaabbb";
Console.WriteLine("Regex pattern: {0}", pattern);
Match match = Regex.Match(input, pattern);
Console.WriteLine("Match: '{0}' at position {1}.",
match.Value, match.Index);
if (match.Groups.Count > 1) {
for (int groupCtr = 1; groupCtr <= match.Groups.Count - 1; groupCtr++)
{
Group group = match.Groups[groupCtr];
Console.WriteLine(" Group: {0}: '{1}' at position {2}.",
groupCtr, group.Value, group.Index);
int captureCtr = 0;
foreach (Capture capture in group.Captures) {
captureCtr++;
Console.WriteLine(" Capture: {0}: '{1}' at position {2}.",
captureCtr, capture.Value, capture.Index);
}
}
}
Console.WriteLine();
pattern = @"(a\1|(?(1)\1)){2}";
Console.WriteLine("Regex pattern: {0}", pattern);
match = Regex.Match(input, pattern);
Console.WriteLine("Matched '{0}' at position {1}.",
match.Value, match.Index);
if (match.Groups.Count > 1) {
for (int groupCtr = 1; groupCtr <= match.Groups.Count - 1; groupCtr++)
{
Group group = match.Groups[groupCtr];
Console.WriteLine(" Group: {0}: '{1}' at position {2}.",
groupCtr, group.Value, group.Index);
int captureCtr = 0;
foreach (Capture capture in group.Captures) {
captureCtr++;
Console.WriteLine(" Capture: {0}: '{1}' at position {2}.",
captureCtr, capture.Value, capture.Index);
}
}
}
}
}
// The example displays the following output:
// Regex pattern: (a\1|(?(1)\1)){0,2}
// Match: '' at position 0.
// Group: 1: '' at position 0.
// Capture: 1: '' at position 0.
//
// Regex pattern: (a\1|(?(1)\1)){2}
// Matched 'a' at position 0.
// Group: 1: 'a' at position 0.
// Capture: 1: '' at position 0.
// Capture: 2: 'a' at position 0.

Change History

| Date | History | Reason |
|---|---|---|
| October 2010 | Added the "Quantifiers and Empty Matches" section. | Information enhancement. |