Regular Expression Classes
The following sections describe the .NET Framework regular expression classes.
The Regex class represents an immutable (read-only) regular expression. It also contains static methods that allow use of other regular expression classes without explicitly creating instances of the other classes.
The following code example creates an instance of the Regex class and defines a simple regular expression when the object is initialized. In C#, an additional backslash is needed as an escape character so that the backslash in the \s matching character class is treated as a literal character. Visual Basic string literals do not treat the backslash as an escape character, so no additional backslash is needed there.
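The code for this example does not appear in the text. The following is only a minimal sketch of what such an example might look like in Visual Basic. The pattern \s2000, the input strings, and the class and function names are assumptions chosen for illustration. The sketch also calls the static Regex.IsMatch method to show use of the class without creating an instance.

```vb
Imports System
Imports System.Text.RegularExpressions

Public Class RegexCreationSketch
    ' Hypothetical helper: returns True if the source string contains a
    ' white-space character followed by "2000".
    Public Shared Function ContainsYear(source As String) As Boolean
        ' The pattern is defined when the Regex object is initialized.
        ' Visual Basic passes "\s" to the regex engine unchanged.
        Dim r As New Regex("\s2000")
        Return r.IsMatch(source)
    End Function

    Public Shared Sub Main()
        ' Instance method: "December 2000" contains a space before "2000".
        Console.WriteLine(ContainsYear("December 2000"))
        ' Static method: no white space precedes "2000" here.
        Console.WriteLine(Regex.IsMatch("December2000", "\s2000"))
    End Sub
End Class
```

Under these assumptions, the first call prints True and the second prints False.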
The Match class represents the results of a regular expression matching operation. The following example uses the Match method of the Regex class to return an object of type Match that represents the first match in the input string. The example checks the Match.Success property to determine whether a match was found.
    ' Create a new Regex object.
    Dim r As New Regex("abc")
    ' Find a single match in the input string.
    Dim m As Match = r.Match("123abc456")
    If m.Success Then
        ' Print out the character position where a match was found.
        ' (Character position 3 in this case.)
        Console.WriteLine("Found match at position " & m.Index.ToString())
    End If
The MatchCollection class represents a sequence of successful non-overlapping matches. The collection is immutable (read-only) and has no public constructor. Instances of MatchCollection are returned by the Regex.Matches method.
The following example uses the Matches method of the Regex class to fill a MatchCollection with all the matches found in the input string. The example copies the collection to a string array that holds each match and an integer array that indicates the position of each match.
    Dim mc As MatchCollection
    Dim results(20) As String
    Dim matchposition(20) As Integer

    ' Create a new Regex object and define the regular expression.
    Dim r As New Regex("abc")
    ' Use the Matches method to find all matches in the input string.
    mc = r.Matches("123abc4abcd")
    ' Loop through the match collection to retrieve all
    ' matches and positions.
    Dim i As Integer
    For i = 0 To mc.Count - 1
        ' Add the match string to the string array.
        results(i) = mc(i).Value
        ' Record the character position where the match was found.
        matchposition(i) = mc(i).Index
    Next i
The GroupCollection class represents the set of groups captured in a single match. The collection is immutable (read-only) and has no public constructor. Instances of GroupCollection are returned by the Match.Groups property.
The following console application example finds and prints the number of groups captured by a regular expression. For an example of how to extract the individual captures in each member of a group collection, see the Capture Collection example in the following section.
    Imports System
    Imports System.Text.RegularExpressions

    Public Class RegexTest
        Public Shared Sub RunTest()
            ' Define groups "abc", "ab", and "b".
            Dim r As New Regex("(a(b))c")
            Dim m As Match = r.Match("abdabc")
            Console.WriteLine("Number of groups found = " _
                & m.Groups.Count.ToString())
        End Sub

        Public Shared Sub Main()
            RunTest()
        End Sub
    End Class
This example produces the following output.

    Number of groups found = 3
The CaptureCollection class represents the sequence of substrings captured by a single capturing group. Because of quantifiers, a capturing group can capture more than one string in a single match. The Captures property, which returns a CaptureCollection object, is a member of the Match and Group classes and provides access to the set of captured substrings.
For instance, if you use the regular expression ((a(b))c)+ (where the + quantifier specifies one or more matches) to capture matches from the string "abcabcabc", the CaptureCollection for each of the three capturing groups will contain three members.
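The capture counts described above can be checked with a short sketch such as the following. The class and function names are hypothetical and exist only for illustration. Note that index 0 of the group collection is the whole match, which records a single capture, while the parenthesized groups at indexes 1 through 3 each record three.

```vb
Imports System
Imports System.Text.RegularExpressions

Public Class CaptureCountSketch
    ' Hypothetical helper: returns the number of captures recorded by each
    ' group in the first match (index 0 is the whole match).
    Public Shared Function CaptureCounts(pattern As String, source As String) As Integer()
        Dim m As Match = Regex.Match(source, pattern)
        Dim counts(m.Groups.Count - 1) As Integer
        Dim i As Integer
        For i = 0 To m.Groups.Count - 1
            counts(i) = m.Groups(i).Captures.Count
        Next i
        Return counts
    End Function

    Public Shared Sub Main()
        Dim counts As Integer() = CaptureCounts("((a(b))c)+", "abcabcabc")
        Dim i As Integer
        For i = 0 To counts.Length - 1
            ' Print the capture count for each group.
            Console.WriteLine("Group " & i.ToString() & ": " _
                & counts(i).ToString() & " capture(s)")
        Next i
    End Sub
End Class
```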
The following console application example uses the regular expression (Abc)+ to find one or more matches in the string "XYZAbcAbcAbcXYZAbcAb". The example illustrates the use of the Captures property to return multiple groups of captured substrings.
    Imports System
    Imports System.Text.RegularExpressions

    Public Class RegexTest
        Public Shared Sub RunTest()
            Dim counter As Integer
            Dim m As Match
            Dim cc As CaptureCollection
            Dim gc As GroupCollection

            ' Look for groupings of "Abc".
            Dim r As New Regex("(Abc)+")
            ' Define the string to search.
            m = r.Match("XYZAbcAbcAbcXYZAbcAb")
            gc = m.Groups

            ' Print the number of groups.
            Console.WriteLine("Captured groups = " & gc.Count.ToString())

            ' Loop through each group.
            Dim i, ii As Integer
            For i = 0 To gc.Count - 1
                cc = gc(i).Captures
                counter = cc.Count

                ' Print number of captures in this group.
                Console.WriteLine("Captures count = " & counter.ToString())

                ' Loop through each capture in group.
                For ii = 0 To counter - 1
                    ' Print capture and position.
                    Console.WriteLine(cc(ii).ToString() _
                        & " Starts at character " & cc(ii).Index.ToString())
                Next ii
            Next i
        End Sub

        Public Shared Sub Main()
            RunTest()
        End Sub
    End Class
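This example returns the following output.

    Captured groups = 2
    Captures count = 1
    AbcAbcAbc Starts at character 3
    Captures count = 3
    Abc Starts at character 3
    Abc Starts at character 6
    Abc Starts at character 9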
The Group class represents the results from a single capturing group. Because Group can capture zero, one, or more strings in a single match (using quantifiers), it contains a collection of Capture objects. Because Group inherits from Capture, the last substring captured can be accessed directly (the Group instance itself is equivalent to the last item of the collection returned by the Captures property).
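The relationship between a Group and its last Capture can be illustrated with a small sketch. The pattern (ab|cd)+, the input "abcd", and the class and function names are assumptions chosen for illustration. The group matches twice, so its Captures collection holds "ab" and "cd", while the Group instance itself reports only the last of these.

```vb
Imports System
Imports System.Text.RegularExpressions

Public Class LastCaptureSketch
    ' Hypothetical helper: value reported by group 1 itself.
    Public Shared Function GroupValue(pattern As String, source As String) As String
        Return Regex.Match(source, pattern).Groups(1).Value
    End Function

    ' Hypothetical helper: value of the last item in group 1's Captures collection.
    Public Shared Function LastCaptureValue(pattern As String, source As String) As String
        Dim cc As CaptureCollection = Regex.Match(source, pattern).Groups(1).Captures
        Return cc(cc.Count - 1).Value
    End Function

    Public Shared Sub Main()
        Dim m As Match = Regex.Match("abcd", "(ab|cd)+")
        Dim g As Group = m.Groups(1)
        ' The first capture is "ab"; the group itself reports "cd".
        Console.WriteLine("First capture: " & g.Captures(0).Value)
        Console.WriteLine("Group value:   " & GroupValue("(ab|cd)+", "abcd"))
        Console.WriteLine("Last capture:  " & LastCaptureValue("(ab|cd)+", "abcd"))
    End Sub
End Class
```

Here the group value and the last capture are both "cd", which is the equivalence the paragraph above describes.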
Instances of Group are returned by indexing the GroupCollection object returned by the Match.Groups property. The indexer can be a group number or the name of a capture group if the "(?<groupname>)" grouping construct is used. For example, in C# code you can use Match.Groups[groupnum] or Match.Groups["groupname"], or in Visual Basic code you can use Match.Groups(groupnum) or Match.Groups("groupname").
The following code example uses nested grouping constructs to capture substrings into groups.
    Dim matchposition(20) As Integer
    Dim results(20) As String

    ' Define substrings abc, ab, b.
    Dim r As New Regex("(a(b))c")
    Dim m As Match = r.Match("abdabc")
    Dim i As Integer = 0
    While Not (m.Groups(i).Value = "")
        ' Copy groups to string array.
        results(i) = m.Groups(i).Value
        ' Record character position.
        matchposition(i) = m.Groups(i).Index
        i = i + 1
    End While
This example returns the following output.
    results(0) = "abc"   matchposition(0) = 3
    results(1) = "ab"    matchposition(1) = 3
    results(2) = "b"     matchposition(2) = 4
The following code example uses named grouping constructs to capture substrings from a string containing data in a "DATANAME:VALUE" format that the regular expression splits at the colon (:).
    Dim r As New Regex("^(?<name>\w+):(?<value>\w+)")
    Dim m As Match = r.Match("Section1:119900")
This regular expression captures the following values.

    m.Groups("name").Value = "Section1"
    m.Groups("value").Value = "119900"
The Capture class contains the results from a single subexpression capture.
The following example loops through a GroupCollection, extracts the CaptureCollection from each Group, and assigns to the variables posn and length the character position in the original string where each captured substring was found and the length of that substring, respectively.
    Dim r As Regex
    Dim m As Match
    Dim cc As CaptureCollection
    Dim posn, length As Integer

    r = New Regex("(abc)+")
    m = r.Match("bcabcabc")
    Dim i, j As Integer
    i = 0
    While m.Groups(i).Value <> ""
        ' Grab the Collection for Group(i).
        cc = m.Groups(i).Captures
        For j = 0 To cc.Count - 1
            ' Position of Capture object.
            posn = cc(j).Index
            ' Length of Capture object.
            length = cc(j).Length
        Next j
        i += 1
    End While