The following sections describe the .NET Framework regular expression classes.
The Regex class represents an immutable (read-only) regular expression. It also contains static methods that allow use of other regular expression classes without explicitly creating instances of the other classes.
The following code example creates an instance of the Regex class and defines a simple regular expression when the object is initialized. Note the use of an additional backslash as an escape character that designates the backslash in the \s matching character class as a literal character.
' Declare object variable of type Regex.
Dim r As Regex
' Create a Regex object and define its regular expression.
r = New Regex("\s2000")
// Declare object variable of type Regex.
Regex r;
// Create a Regex object and define its regular expression.
r = new Regex("\\s2000");
The Match class represents the results of a regular expression matching operation. The following example uses the Match method of the Regex class to return an object of type Match in order to find the first match in the input string. The example uses the Match.Success property of the Match class to indicate whether a match was found.
' Create a new Regex object.
Dim r As New Regex("abc")
' Find a single match in the input string.
Dim m As Match = r.Match("123abc456")
If m.Success Then
' Print out the character position where a match was found.
Console.WriteLine("Found match at position " & m.Index.ToString())
End If
' The example displays the following output:
' Found match at position 3
// Create a new Regex object.
Regex r = new Regex("abc");
// Find a single match in the string.
Match m = r.Match("123abc456");
if (m.Success)
{
// Print out the character position where a match was found.
Console.WriteLine("Found match at position " + m.Index);
}
// The example displays the following output:
// Found match at position 3
The MatchCollection class represents a sequence of successful non-overlapping matches. The collection is immutable (read-only) and has no public constructor. Instances of MatchCollection are returned by the Regex.Matches method.
The following example uses the Matches method of the Regex class to fill a MatchCollection with all the matches found in the input string. The example copies the collection to a string array that holds each match and an integer array that indicates the position of each match.
Dim mc As MatchCollection
Dim results As New List(Of String)
Dim matchposition As New List(Of Integer)
' Create a new Regex object and define the regular expression.
Dim r As New Regex("abc")
' Use the Matches method to find all matches in the input string.
mc = r.Matches("123abc4abcd")
' Loop through the match collection to retrieve all
' matches and positions.
For i As Integer = 0 To mc.Count - 1
' Add the match string to the string array.
results.Add(mc(i).Value)
' Record the character position where the match was found.
matchposition.Add(mc(i).Index)
Next i
' List the results.
For ctr As Integer = 0 To Results.Count - 1
Console.WriteLine("'{0}' found at position {1}.", _
results(ctr), matchposition(ctr))
Next
' The example displays the following output:
' 'abc' found at position 3.
' 'abc' found at position 7.
MatchCollection mc;
List<string> results = new List<string>();
List<int> matchposition = new List<int>();
// Create a new Regex object and define the regular expression.
Regex r = new Regex("abc");
// Use the Matches method to find all matches in the input string.
mc = r.Matches("123abc4abcd");
// Loop through the match collection to retrieve all
// matches and positions.
for (int i = 0; i < mc.Count; i++)
{
// Add the match string to the string array.
results.Add(mc[i].Value);
// Record the character position where the match was found.
matchposition.Add(mc[i].Index);
}
// List the results.
for(int ctr = 0; ctr <= results.Count - 1; ctr++)
Console.WriteLine("'{0}' found at position {1}.", results[ctr], matchposition[ctr]);
// The example displays the following output:
// 'abc' found at position 3.
// 'abc' found at position 7.
The GroupCollection class represents a collection of captured groups and returns the set of captured groups in a single match. The collection is immutable (read-only) and has no public constructor. Instances of GroupCollection are returned in the collection that the Match.Groups property returns.
The following example finds and prints the number of groups captured by a regular expression. For an example of how to extract the individual captures in each member of a group collection, see the Capture Collection example in the following section.
' Define groups "abc", "ab", and "b".
Dim r As New Regex("(a(b))c")
Dim m As Match = r.Match("abdabc")
Console.WriteLine("Number of groups found = " _
& m.Groups.Count)
' The example displays the following output:
' Number of groups found = 3
// Define groups "abc", "ab", and "b".
Regex r = new Regex("(a(b))c");
Match m = r.Match("abdabc");
Console.WriteLine("Number of groups found = " + m.Groups.Count);
// The example displays the following output:
// Number of groups found = 3
The CaptureCollection class represents a sequence of captured substrings and returns the set of captures done by a single capturing group. A capturing group can capture more than one string in a single match because of quantifiers. The Captures property, an object of the CaptureCollection class, is provided as a member of the Match and Group classes to facilitate access to the set of captured substrings.
For instance, if you use the regular expression ((a(b))c)+ (where the + quantifier specifies one or more matches) to capture matches from the string "abcabcabc", the CaptureCollection for each matching Group of substrings will contain three members.
The following example uses the regular expression (Abc)+ to find one or more matches in the string "XYZAbcAbcAbcXYZAbcAb". The example illustrates the use of the Captures property to return multiple groups of captured substrings.
Dim counter As Integer
Dim m As Match
Dim cc As CaptureCollection
Dim gc As GroupCollection
' Look for groupings of "Abc".
Dim r As New Regex("(Abc)+")
' Define the string to search.
m = r.Match("XYZAbcAbcAbcXYZAbcAb")
gc = m.Groups
' Display the number of groups.
Console.WriteLine("Captured groups = " & gc.Count.ToString())
' Loop through each group.
Dim i, ii As Integer
For i = 0 To gc.Count - 1
cc = gc(i).Captures
counter = cc.Count
' Display the number of captures in this group.
Console.WriteLine("Captures count = " & counter.ToString())
' Loop through each capture in the group.
For ii = 0 To counter - 1
' Display the capture and its position.
Console.WriteLine(cc(ii).ToString() _
& " Starts at character " & cc(ii).Index.ToString())
Next ii
Next i
' The example displays the following output:
' Captured groups = 2
' Captures count = 1
' AbcAbcAbc Starts at character 3
' Captures count = 3
' Abc Starts at character 3
' Abc Starts at character 6
' Abc Starts at character 9
int counter;
Match m;
CaptureCollection cc;
GroupCollection gc;
// Look for groupings of "Abc".
Regex r = new Regex("(Abc)+");
// Define the string to search.
m = r.Match("XYZAbcAbcAbcXYZAbcAb");
gc = m.Groups;
// Display the number of groups.
Console.WriteLine("Captured groups = " + gc.Count.ToString());
// Loop through each group.
for (int i=0; i < gc.Count; i++)
{
cc = gc[i].Captures;
counter = cc.Count;
// Display the number of captures in this group.
Console.WriteLine("Captures count = " + counter.ToString());
// Loop through each capture in the group.
for (int ii = 0; ii < counter; ii++)
{
// Display the capture and its position.
Console.WriteLine(cc[ii] + " Starts at character " +
cc[ii].Index);
}
}
}
// The example displays the following output:
// Captured groups = 2
// Captures count = 1
// AbcAbcAbc Starts at character 3
// Captures count = 3
// Abc Starts at character 3
// Abc Starts at character 6
// Abc Starts at character 9
The Group class represents the results from a single capturing group. Because Group can capture zero, one, or more strings in a single match (using quantifiers), it contains a collection of Capture objects. Because Group inherits from Capture, the last substring captured can be accessed directly (the Group instance itself is equivalent to the last item of the collection returned by the Captures property).
Instances of Group are returned by indexing the GroupCollection object returned by the Groups property. The indexer can be a group number or the name of a capture group if the "(?<groupname>)" grouping construct is used. For example, in C# code you can use Match.Groups[groupnum] or Match.Groups["groupname"], or in Visual Basiccode you can use Match.Groups(groupnum) or Match.Groups("groupname").
The following code example uses nested grouping constructs to capture substrings into groups.
Dim matchposition As New List(Of Integer)
Dim results As New List(Of String)
' Define substrings abc, ab, b.
Dim r As New Regex("(a(b))c")
Dim m As Match = r.Match("abdabc")
Dim i As Integer = 0
While Not (m.Groups(i).Value = "")
' Add groups to string array.
results.Add(m.Groups(i).Value)
' Record character position.
matchposition.Add(m.Groups(i).Index)
i += 1
End While
' Display the capture groups.
For ctr As Integer = 0 to results.Count - 1
Console.WriteLine("{0} at position {1}", _
results(ctr), matchposition(ctr))
Next
' The example displays the following output:
' abc at position 3
' ab at position 3
' b at position 4
List<int> matchposition = new List<int>();
List<string> results = new List<string>();
// Define substrings abc, ab, b.
Regex r = new Regex("(a(b))c");
Match m = r.Match("abdabc");
for (int i = 0; m.Groups[i].Value != ""; i++)
{
// Add groups to string array.
results.Add(m.Groups[i].Value);
// Record character position.
matchposition.Add(m.Groups[i].Index);
}
// Display the capture groups.
for (int ctr = 0; ctr < results.Count; ctr++)
Console.WriteLine("{0} at position {1}",
results[ctr], matchposition[ctr]);
// The example displays the following output:
// abc at position 3
// ab at position 3
// b at position 4
The following code example uses named grouping constructs to capture substrings from a string containing data in a "DATANAME:VALUE" format that the regular expression splits at the colon (:).
Dim r As New Regex("^(?<name>\w+):(?<value>\w+)")
Dim m As Match = r.Match("Section1:119900")
Console.WriteLine(m.Groups("name").Value)
Console.WriteLine(m.Groups("value").Value)
' The example displays the following output:
' Section1
' 119900
Regex r = new Regex("^(?<name>\\w+):(?<value>\\w+)");
Match m = r.Match("Section1:119900");
Console.WriteLine(m.Groups["name"].Value);
Console.WriteLine(m.Groups["value"].Value);
// The example displays the following output:
// Section1
// 119900
The Capture class contains the results from a single subexpression capture.
The following example loops through a Group collection, extracts the Capture collection from each member of Group, and assigns the variables posn and length to the character position in the original string where each string was found and the length of each string, respectively.
Dim r As Regex
Dim m As Match
Dim cc As CaptureCollection
Dim posn, length As Integer
r = New Regex("(abc)+")
m = r.Match("bcabcabc")
Dim i, j As Integer
i = 0
Do While m.Groups(i).Value <> ""
Console.WriteLine(m.Groups(i).Value)
' Grab the Collection for Group(i).
cc = m.Groups(i).Captures
For j = 0 To cc.Count - 1
Console.WriteLine(" Capture at position {0} for {1} characters.", _
cc(j).Index, cc(j).Length)
' Position of Capture object.
posn = cc(j).Index
' Length of Capture object.
length = cc(j).Length
Next j
i += 1
Loop
' The example displays the following output:
' abcabc
' Capture at position 2 for 6 characters.
' abc
' Capture at position 2 for 3 characters.
' Capture at position 5 for 3 characters.
Regex r;
Match m;
CaptureCollection cc;
int posn, length;
r = new Regex("(abc)+");
m = r.Match("bcabcabc");
for (int i=0; m.Groups[i].Value != ""; i++)
{
Console.WriteLine(m.Groups[i].Value);
// Capture the Collection for Group(i).
cc = m.Groups[i].Captures;
for (int j = 0; j < cc.Count; j++)
{
Console.WriteLine(" Capture at position {0} for {1} characters.",
cc[j].Index, cc[j].Length);
// Position of Capture object.
posn = cc[j].Index;
// Length of Capture object.
length = cc[j].Length;
}
}
// The example displays the following output:
// abcabc
// Capture at position 2 for 6 characters.
// abc
// Capture at position 2 for 3 characters.
// Capture at position 5 for 3 characters.
Reference
Other Resources