Troubleshooting Regular Expressions in Visual Basic

This topic discusses some common problems that can occur when working with regular expressions, offers tips for solving them, and provides a procedure for accessing captured strings.

Not Matching the Intended Pattern

The following are some common tasks you can perform with regular expressions, along with troubleshooting tips for when you do not get the results you expect:

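  • Validating strings. A string-validation regular expression must start with the ^ character, which instructs the regular-expression engine to start matching the specified pattern at the start of the string. To validate the entire string rather than only its beginning, the pattern should normally also end with the $ character, which anchors the match at the end of the string. For more information, see Constructing a Validation Function in Visual Basic and Anchors in Regular Expressions.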

  • Matching with quantifiers. The regular-expression quantifiers (*, +, ?, {}) are greedy, meaning they match as much of the input as possible while still allowing the rest of the pattern to match. For some patterns, however, you may want lazy matching, which matches as little of the input as possible. The lazy regular-expression quantifiers are *?, +?, ??, and {}?. For more information, see Quantifiers.

  • Matching with nested quantifiers. When you use nested quantifiers, make sure all the quantifiers are either greedy or lazy. Otherwise, the results of the match will be hard to predict.
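The first two tips above can be illustrated with a minimal sketch. The module name, the IsFiveDigits helper, and the sample strings are hypothetical and chosen only for illustration; the sketch assumes a console application rather than the MsgBox-based style used elsewhere in this topic.

```vb
Imports System.Text.RegularExpressions

Module AnchorAndQuantifierDemo
    ' Hypothetical helper: True only if the whole string is exactly five digits.
    Function IsFiveDigits(ByVal input As String) As Boolean
        ' ^ anchors the match at the start of the string; $ anchors it at
        ' the end (or just before a final newline).
        Return Regex.IsMatch(input, "^\d{5}$")
    End Function

    Sub Main()
        ' Without anchors, a match anywhere inside the string succeeds.
        Console.WriteLine(Regex.IsMatch("abc12345xyz", "\d{5}"))   ' True
        ' With both anchors, the surrounding text causes the match to fail.
        Console.WriteLine(IsFiveDigits("abc12345xyz"))             ' False
        Console.WriteLine(IsFiveDigits("12345"))                   ' True

        Dim text As String = "<b>one</b> and <b>two</b>"
        ' Greedy: .* runs to the last </b> in the input.
        Console.WriteLine(Regex.Match(text, "<b>.*</b>").Value)
        ' Lazy: .*? stops at the first </b> that lets the pattern match.
        Console.WriteLine(Regex.Match(text, "<b>.*?</b>").Value)
    End Sub
End Module
```

In this sketch the greedy pattern returns the entire sample string, while the lazy pattern returns only "<b>one</b>". If a pattern matches more or less text than you expect, comparing the greedy and lazy forms in this way is often a quick diagnostic.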

Accessing the Captured Strings

Once a regular expression matches the strings you intend, you may want to capture those strings and then access what you have captured. You can use grouping constructs to capture the strings that match groups of subexpressions in your regular expression. For more information, see Grouping Constructs.

There are several steps involved in accessing the strings captured with nested groups.

To access captured text

  1. Create a Regex object for a regular expression.

  2. Call the Match method to get the Match object.

    The Match object contains information about how the regular expression matches the string.

  3. Iterate over the Group objects stored in the Match object's Groups collection.

    Each Group object contains information about the results from a single capturing group.

  4. Iterate over the Capture objects stored in each Group object's Captures collection.

    Each Capture object contains information about a single captured subexpression, including the matched substring and location.

For example, the following code demonstrates how to access the strings captured with a regular expression containing three capturing groups:

    ' Requires "Imports System.Text.RegularExpressions" at the top of the file.
    ''' <summary>
    ''' Parses an e-mail address into its parts.
    ''' </summary>
    ''' <param name="emailString">E-mail address to parse.</param>
    ''' <remarks> For example, this method displays the following 
    ''' text when called with "someone@mail.contoso.com":
    ''' User name: someone
    ''' Address part: mail
    ''' Address part: contoso
    ''' Address part: com
    ''' </remarks>
    Sub ParseEmailAddress(ByVal emailString As String)
        Dim emailRegEx As New Regex("(\S+)@([^\.\s]+)(?:\.([^\.\s]+))+")
        Dim m As Match = emailRegEx.Match(emailString)
        If m.Success Then
            Dim output As String = ""
            output &= "User name: " & m.Groups(1).Value & vbCrLf
            For i As Integer = 2 To m.Groups.Count - 1
                Dim g As Group = m.Groups(i)
                For Each c As Capture In g.Captures
                    output &= "Address part: " & c.Value & vbCrLf
                Next
            Next
            MsgBox(output)
        Else
            MsgBox("The e-mail address cannot be parsed.")
        End If
    End Sub
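The reason for iterating over the Captures collection in step 4, rather than reading only the Value property of each Group, is that a capturing group inside a repeated construct can capture more than once, and Group.Value returns only the last capture. The following minimal sketch shows the difference for the third capturing group of the pattern used above. The module name and the console output are assumptions made for illustration.

```vb
Imports System.Text.RegularExpressions

Module CaptureCollectionDemo
    Sub Main()
        Dim pattern As String = "(\S+)@([^\.\s]+)(?:\.([^\.\s]+))+"
        Dim m As Match = Regex.Match("someone@mail.contoso.com", pattern)

        ' Group 3 sits inside the repeated (?: ... )+ construct,
        ' so it captures once per repetition.
        Dim g As Group = m.Groups(3)

        ' Value exposes only the last capture made by the group.
        Console.WriteLine(g.Value)            ' com
        Console.WriteLine(g.Captures.Count)   ' 2

        ' The Captures collection keeps every capture, in order,
        ' together with its position in the input string.
        For Each c As Capture In g.Captures
            Console.WriteLine(c.Index & ": " & c.Value)
        Next
    End Sub
End Module
```

For this input the loop prints "13: contoso" and then "21: com". If an expected substring seems to be missing from a group, checking the Captures collection is a reasonable first step.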

See Also

Tasks

Troubleshooting Strings in Visual Basic

Concepts

Constructing a Validation Function in Visual Basic