正则表达式中的定位点

定位点(原子零宽度断言)指定字符串中必须出现匹配的位置。 在搜索表达式中使用定位点时,正则表达式引擎不在字符串中前进或使用字符,它仅在指定位置查找匹配。 例如, ^ 指定必须从行或字符串的开头开始匹配。 因此,正则表达式 ^http: 仅当 "http:" 出现在行开头时才与之匹配。 下表列出了 .NET 中正则表达式支持的定位点。

定位点 说明
^ 默认情况下,匹配必须出现在字符串的开头;在多行模式中,必须出现在该行的开头。 有关详细信息,请参阅 字符串或行的开头
$ 默认情况下,匹配必须出现在字符串的末尾,或在字符串末尾的 \n 之前;在多行模式中,必须出现在该行的末尾,或在该行末尾的 \n 之前。 有关详细信息,请参阅 字符串或行的末尾
\A 匹配必须仅出现在字符串的开头位置(无多行支持)。 有关详细信息,请参阅 仅字符串的开头
\Z 匹配必须出现在字符串的末尾,或出现在字符串末尾的 \n 之前。 有关详细信息,请参阅 字符串的末尾或结束换行之前
\z 匹配必须仅出现在字符串的末尾。 有关详细信息,请参阅 仅字符串的末尾
\G 匹配必须从上一个匹配结束的位置开始;如果以前没有匹配项,则从开始进行匹配的字符串中的位置开始。 有关详细信息,请参阅 连续匹配
\b 匹配必须出现在字边界。 有关详细信息,请参阅 字边界
\B 匹配不得出现在字边界上。 有关详细信息,请参阅 非字边界

字符串或行的开头:^

默认情况下,^ 定位点指定以下模式必须从字符串的第一个字符位置开始。 如果结合使用 ^RegexOptions.Multiline 选项(请参阅正则表达式选项),匹配必须出现在每行的开头。

以下示例在正则表达式中使用 ^ 定位点,可提取有关某些职业棒球队存在年限的信息。 该示例调用 Regex.Matches 方法的两个重载:

using System;
using System.Text.RegularExpressions;

public class Example
{
    public static void Main()
    {
        string input = "Brooklyn Dodgers, National League, 1911, 1912, 1932-1957\n" +
                     "Chicago Cubs, National League, 1903-present\n" +
                     "Detroit Tigers, American League, 1901-present\n" +
                     "New York Giants, National League, 1885-1957\n" +
                     "Washington Senators, American League, 1901-1960\n";
        string pattern = @"^((\w+(\s?)){2,}),\s(\w+\s\w+),(\s\d{4}(-(\d{4}|present))?,?)+";
        Match match;

        match = Regex.Match(input, pattern);
        while (match.Success)
        {
            Console.Write("The {0} played in the {1} in",
                              match.Groups[1].Value, match.Groups[4].Value);
            foreach (Capture capture in match.Groups[5].Captures)
                Console.Write(capture.Value);

            Console.WriteLine(".");
            match = match.NextMatch();
        }
        Console.WriteLine();

        match = Regex.Match(input, pattern, RegexOptions.Multiline);
        while (match.Success)
        {
            Console.Write("The {0} played in the {1} in",
                              match.Groups[1].Value, match.Groups[4].Value);
            foreach (Capture capture in match.Groups[5].Captures)
                Console.Write(capture.Value);

            Console.WriteLine(".");
            match = match.NextMatch();
        }
        Console.WriteLine();
    }
}
// The example displays the following output:
//    The Brooklyn Dodgers played in the National League in 1911, 1912, 1932-1957.
//
//    The Brooklyn Dodgers played in the National League in 1911, 1912, 1932-1957.
//    The Chicago Cubs played in the National League in 1903-present.
//    The Detroit Tigers played in the American League in 1901-present.
//    The New York Giants played in the National League in 1885-1957.
//    The Washington Senators played in the American League in 1901-1960.
Imports System.Text.RegularExpressions

Module Example
    Public Sub Main()
        Dim input As String = "Brooklyn Dodgers, National League, 1911, 1912, 1932-1957" + vbCrLf +
                              "Chicago Cubs, National League, 1903-present" + vbCrLf +
                              "Detroit Tigers, American League, 1901-present" + vbCrLf +
                              "New York Giants, National League, 1885-1957" + vbCrLf +
                              "Washington Senators, American League, 1901-1960" + vbCrLf

        Dim pattern As String = "^((\w+(\s?)){2,}),\s(\w+\s\w+),(\s\d{4}(-(\d{4}|present))?,?)+"
        Dim match As Match

        match = Regex.Match(input, pattern)
        Do While match.Success
            Console.Write("The {0} played in the {1} in",
                              match.Groups(1).Value, match.Groups(4).Value)
            For Each capture As Capture In match.Groups(5).Captures
                Console.Write(capture.Value)
            Next
            Console.WriteLine(".")
            match = match.NextMatch()
        Loop
        Console.WriteLine()

        match = Regex.Match(input, pattern, RegexOptions.Multiline)
        Do While match.Success
            Console.Write("The {0} played in the {1} in",
                              match.Groups(1).Value, match.Groups(4).Value)
            For Each capture As Capture In match.Groups(5).Captures
                Console.Write(capture.Value)
            Next
            Console.WriteLine(".")
            match = match.NextMatch()
        Loop
        Console.WriteLine()
    End Sub
End Module
' The example displays the following output:
'    The Brooklyn Dodgers played in the National League in 1911, 1912, 1932-1957.
'    
'    The Brooklyn Dodgers played in the National League in 1911, 1912, 1932-1957.
'    The Chicago Cubs played in the National League in 1903-present.
'    The Detroit Tigers played in the American League in 1901-present.
'    The New York Giants played in the National League in 1885-1957.
'    The Washington Senators played in the American League in 1901-1960.

正则表达式模式 ^((\w+(\s?)){2,}),\s(\w+\s\w+),(\s\d{4}(-(\d{4}|present))?,?)+ 的定义如下表所示。

模式 说明
^ 从输入字符串的开头开始匹配(如果在调用该方法时选择了 RegexOptions.Multiline 选项,则从行的开头开始匹配)。
((\w+(\s?)){2,} 匹配至少两次后跟零个或一个空格的一个或多个单词字符。 这是第一个捕获组。 此表达式还定义第二个和第三个捕获组:第二组包括捕获的单词,第三组包括捕获的空格。
,\s 匹配后跟一个空白字符的逗号。
(\w+\s\w+) 匹配后跟一个空格再后跟一个或多个单词字符的一个或多个单词字符。 这是第四个捕获组。
, 匹配逗号。
\s\d{4} 匹配后跟四个十进制数字的空格。
(-(\d{4}|present))? 匹配连字符后跟四个十进制数字或字符串“present”的零或一个匹配项。 这是第六个捕获组。 还包括第七个捕获组。
,? 匹配逗号的零个或一个匹配项。
(\s\d{4}(-(\d{4}|present))?,?)+ 匹配以下内容的一个或多个匹配项:空格、四个十进制数字、连字符后跟四个十进制数字或字符串“present”的零个或一个匹配项以及零个或一个逗号。 这是第五个捕获组。

字符串或行的末尾:$

$ 定位点指定前面的模式必须出现在输入字符串的末尾,或出现在输入字符串末尾的 \n 之前。

如果结合使用 $RegexOptions.Multiline 选项,则匹配也可能出现在行的末尾。 请注意,$\n 处符合条件,但在 \r\n 处(回车符和换行符的组合,或称 CR/LF)不符合条件。 若要处理 CR/LF 字符组合,请将 \r?$ 包括到正则表达式模式中。 请注意,\r?$ 将在匹配中包含 \r

以下示例将 $ 定位点添加到 字符串或行的开头 部分的示例中所使用的正则表达式模式中。 配合包括五行文本的原始输入字符串使用时, Regex.Matches(String, String) 方法找不到匹配项,因为第一行的末尾与 $ 模式不匹配。 当原始输入字符串被拆分为字符串数组时, Regex.Matches(String, String) 方法会成功匹配五行中的每一行。 如果调用 Regex.Matches(String, String, RegexOptions) 方法时将 options 参数设置为 RegexOptions.Multiline,则找不到匹配项,因为正则表达式模式不考虑回车符元素 \r。 但是,如果通过用将 $ 替换成 \r?$修改了正则表达式模式,则调用 Regex.Matches(String, String, RegexOptions) 方法并将 options 参数设置为 RegexOptions.Multiline 将再次找到五个匹配项。

using System;
using System.Text.RegularExpressions;

public class Example
{
    public static void Main()
    {
        string cr = Environment.NewLine;
        string input = "Brooklyn Dodgers, National League, 1911, 1912, 1932-1957" + cr +
                       "Chicago Cubs, National League, 1903-present" + cr +
                       "Detroit Tigers, American League, 1901-present" + cr +
                       "New York Giants, National League, 1885-1957" + cr +
                       "Washington Senators, American League, 1901-1960" + cr;
        Match match;

        string basePattern = @"^((\w+(\s?)){2,}),\s(\w+\s\w+),(\s\d{4}(-(\d{4}|present))?,?)+";
        string pattern = basePattern + "$";
        Console.WriteLine("Attempting to match the entire input string:");
        match = Regex.Match(input, pattern);
        while (match.Success)
        {
            Console.Write("The {0} played in the {1} in",
                              match.Groups[1].Value, match.Groups[4].Value);
            foreach (Capture capture in match.Groups[5].Captures)
                Console.Write(capture.Value);

            Console.WriteLine(".");
            match = match.NextMatch();
        }
        Console.WriteLine();

        string[] teams = input.Split(new String[] { cr }, StringSplitOptions.RemoveEmptyEntries);
        Console.WriteLine("Attempting to match each element in a string array:");
        foreach (string team in teams)
        {
            match = Regex.Match(team, pattern);
            if (match.Success)
            {
                Console.Write("The {0} played in the {1} in",
                              match.Groups[1].Value, match.Groups[4].Value);
                foreach (Capture capture in match.Groups[5].Captures)
                    Console.Write(capture.Value);
                Console.WriteLine(".");
            }
        }
        Console.WriteLine();

        Console.WriteLine("Attempting to match each line of an input string with '$':");
        match = Regex.Match(input, pattern, RegexOptions.Multiline);
        while (match.Success)
        {
            Console.Write("The {0} played in the {1} in",
                              match.Groups[1].Value, match.Groups[4].Value);
            foreach (Capture capture in match.Groups[5].Captures)
                Console.Write(capture.Value);

            Console.WriteLine(".");
            match = match.NextMatch();
        }
        Console.WriteLine();

        pattern = basePattern + "\r?$";
        Console.WriteLine(@"Attempting to match each line of an input string with '\r?$':");
        match = Regex.Match(input, pattern, RegexOptions.Multiline);
        while (match.Success)
        {
            Console.Write("The {0} played in the {1} in",
                              match.Groups[1].Value, match.Groups[4].Value);
            foreach (Capture capture in match.Groups[5].Captures)
                Console.Write(capture.Value);

            Console.WriteLine(".");
            match = match.NextMatch();
        }
        Console.WriteLine();
    }
}
// The example displays the following output:
//    Attempting to match the entire input string:
//
//    Attempting to match each element in a string array:
//    The Brooklyn Dodgers played in the National League in 1911, 1912, 1932-1957.
//    The Chicago Cubs played in the National League in 1903-present.
//    The Detroit Tigers played in the American League in 1901-present.
//    The New York Giants played in the National League in 1885-1957.
//    The Washington Senators played in the American League in 1901-1960.
//
//    Attempting to match each line of an input string with '$':
//
//    Attempting to match each line of an input string with '\r?$':
//    The Brooklyn Dodgers played in the National League in 1911, 1912, 1932-1957.
//    The Chicago Cubs played in the National League in 1903-present.
//    The Detroit Tigers played in the American League in 1901-present.
//    The New York Giants played in the National League in 1885-1957.
//    The Washington Senators played in the American League in 1901-1960.
Imports System.Text.RegularExpressions

Module Example
    Public Sub Main()
        Dim input As String = "Brooklyn Dodgers, National League, 1911, 1912, 1932-1957" + vbCrLf +
                              "Chicago Cubs, National League, 1903-present" + vbCrLf +
                              "Detroit Tigers, American League, 1901-present" + vbCrLf +
                              "New York Giants, National League, 1885-1957" + vbCrLf +
                              "Washington Senators, American League, 1901-1960" + vbCrLf

        Dim basePattern As String = "^((\w+(\s?)){2,}),\s(\w+\s\w+),(\s\d{4}(-(\d{4}|present))?,?)+"
        Dim match As Match

        Dim pattern As String = basePattern + "$"
        Console.WriteLine("Attempting to match the entire input string:")
        match = Regex.Match(input, pattern)
        Do While match.Success
            Console.Write("The {0} played in the {1} in",
                              match.Groups(1).Value, match.Groups(4).Value)
            For Each capture As Capture In match.Groups(5).Captures
                Console.Write(capture.Value)
            Next
            Console.WriteLine(".")
            match = match.NextMatch()
        Loop
        Console.WriteLine()

        Dim teams() As String = input.Split(New String() {vbCrLf}, StringSplitOptions.RemoveEmptyEntries)
        Console.WriteLine("Attempting to match each element in a string array:")
        For Each team As String In teams
            match = Regex.Match(team, pattern)
            If match.Success Then
                Console.Write("The {0} played in the {1} in",
                               match.Groups(1).Value, match.Groups(4).Value)
                For Each capture As Capture In match.Groups(5).Captures
                    Console.Write(capture.Value)
                Next
                Console.WriteLine(".")
            End If
        Next
        Console.WriteLine()

        Console.WriteLine("Attempting to match each line of an input string with '$':")
        match = Regex.Match(input, pattern, RegexOptions.Multiline)
        Do While match.Success
            Console.Write("The {0} played in the {1} in",
                              match.Groups(1).Value, match.Groups(4).Value)
            For Each capture As Capture In match.Groups(5).Captures
                Console.Write(capture.Value)
            Next
            Console.WriteLine(".")
            match = match.NextMatch()
        Loop
        Console.WriteLine()

        pattern = basePattern + "\r?$"
        Console.WriteLine("Attempting to match each line of an input string with '\r?$':")
        match = Regex.Match(input, pattern, RegexOptions.Multiline)
        Do While match.Success
            Console.Write("The {0} played in the {1} in",
                              match.Groups(1).Value, match.Groups(4).Value)
            For Each capture As Capture In match.Groups(5).Captures
                Console.Write(capture.Value)
            Next
            Console.WriteLine(".")

            match = match.NextMatch()
        Loop
        Console.WriteLine()
    End Sub
End Module
' The example displays the following output:
'    Attempting to match the entire input string:
'    
'    Attempting to match each element in a string array:
'    The Brooklyn Dodgers played in the National League in 1911, 1912, 1932-1957.
'    The Chicago Cubs played in the National League in 1903-present.
'    The Detroit Tigers played in the American League in 1901-present.
'    The New York Giants played in the National League in 1885-1957.
'    The Washington Senators played in the American League in 1901-1960.
'    
'    Attempting to match each line of an input string with '$':
'    
'    Attempting to match each line of an input string with '\r?$':
'    The Brooklyn Dodgers played in the National League in 1911, 1912, 1932-1957.
'    The Chicago Cubs played in the National League in 1903-present.
'    The Detroit Tigers played in the American League in 1901-present.
'    The New York Giants played in the National League in 1885-1957.
'    The Washington Senators played in the American League in 1901-1960.

仅字符串的开头:\A

\A 定位点指定匹配必须出现在输入字符串的开头。 它等同于 ^ 定位点,只不过 \A 忽略了 RegexOptions.Multiline 选项。 因此,在多行的输入字符串中,它只能匹配第一行的开头。

以下示例与 ^$ 定位点的示例类似。 它在正则表达式中使用 \A 定位点,可提取有关某些职业棒球队存在年限的信息。 输入字符串包括五行。 调用 Regex.Matches(String, String, RegexOptions) 方法仅找到输入字符串中与正则表达式模式匹配的第一个子字符串。 如示例所示, Multiline 选项不起作用。

using System;
using System.Text.RegularExpressions;

public class Example
{
    public static void Main()
    {
        string input = "Brooklyn Dodgers, National League, 1911, 1912, 1932-1957\n" +
                       "Chicago Cubs, National League, 1903-present\n" +
                       "Detroit Tigers, American League, 1901-present\n" +
                       "New York Giants, National League, 1885-1957\n" +
                       "Washington Senators, American League, 1901-1960\n";

        string pattern = @"\A((\w+(\s?)){2,}),\s(\w+\s\w+),(\s\d{4}(-(\d{4}|present))?,?)+";

        Match match = Regex.Match(input, pattern, RegexOptions.Multiline);
        while (match.Success)
        {
            Console.Write("The {0} played in the {1} in",
                              match.Groups[1].Value, match.Groups[4].Value);
            foreach (Capture capture in match.Groups[5].Captures)
                Console.Write(capture.Value);

            Console.WriteLine(".");
            match = match.NextMatch();
        }
        Console.WriteLine();
    }
}
// The example displays the following output:
//    The Brooklyn Dodgers played in the National League in 1911, 1912, 1932-1957.
Imports System.Text.RegularExpressions

Module Example
    Public Sub Main()
        Dim input As String = "Brooklyn Dodgers, National League, 1911, 1912, 1932-1957" + vbCrLf +
                            "Chicago Cubs, National League, 1903-present" + vbCrLf +
                            "Detroit Tigers, American League, 1901-present" + vbCrLf +
                            "New York Giants, National League, 1885-1957" + vbCrLf +
                            "Washington Senators, American League, 1901-1960" + vbCrLf

        Dim pattern As String = "\A((\w+(\s?)){2,}),\s(\w+\s\w+),(\s\d{4}(-(\d{4}|present))?,?)+"

        Dim match As Match = Regex.Match(input, pattern, RegexOptions.Multiline)
        Do While match.Success
            Console.Write("The {0} played in the {1} in",
                              match.Groups(1).Value, match.Groups(4).Value)
            For Each capture As Capture In match.Groups(5).Captures
                Console.Write(capture.Value)
            Next
            Console.WriteLine(".")
            match = match.NextMatch()
        Loop
        Console.WriteLine()
    End Sub
End Module
' The example displays the following output:
'    The Brooklyn Dodgers played in the National League in 1911, 1912, 1932-1957.

字符串末尾或结束换行之前:\Z

\Z 定位点指定匹配项必须出现在输入字符串的末尾,或出现在输入字符串末尾的 \n 之前。 它等同于 $ 定位点,只不过 \Z 忽略了 RegexOptions.Multiline 选项。 因此,在多行字符串中,它只能符合最后一行的末尾,或 \n 前的最后一行。

请注意,\Z\n 处符合条件,但在 \r\n 处(CR/LF 字符组合)不符合条件。 若要将 CR/LF 视为 \n,请在正则表达式模式中包含 \r?\Z。 请注意,这将构成匹配的 \r 部分。

以下示例在正则表达式中使用 \Z 定位点,与 字符串或行的开头 部分的示例类似,可提取有关某些职业棒球队存在年限的信息。 正则表达式 ^((\w+(\s?)){2,}),\s(\w+\s\w+),(\s\d{4}(-(\d{4}|present))?,?)+\r?\Z 中的子表达式 \r?\Z 在字符串末尾符合条件,且在以 \n\r\n 结尾的字符串的末尾处符合条件。 因此,数组中的每个元素都与正则表达式模式匹配。

using System;
using System.Text.RegularExpressions;

public class Example
{
    public static void Main()
    {
        string[] inputs = { "Brooklyn Dodgers, National League, 1911, 1912, 1932-1957",
                          "Chicago Cubs, National League, 1903-present" + Environment.NewLine,
                          "Detroit Tigers, American League, 1901-present" + Regex.Unescape(@"\n"),
                          "New York Giants, National League, 1885-1957",
                          "Washington Senators, American League, 1901-1960" + Environment.NewLine};
        string pattern = @"^((\w+(\s?)){2,}),\s(\w+\s\w+),(\s\d{4}(-(\d{4}|present))?,?)+\r?\Z";

        foreach (string input in inputs)
        {
            Console.WriteLine(Regex.Escape(input));
            Match match = Regex.Match(input, pattern);
            if (match.Success)
                Console.WriteLine("   Match succeeded.");
            else
                Console.WriteLine("   Match failed.");
        }
    }
}
// The example displays the following output:
//    Brooklyn\ Dodgers,\ National\ League,\ 1911,\ 1912,\ 1932-1957
//       Match succeeded.
//    Chicago\ Cubs,\ National\ League,\ 1903-present\r\n
//       Match succeeded.
//    Detroit\ Tigers,\ American\ League,\ 1901-present\n
//       Match succeeded.
//    New\ York\ Giants,\ National\ League,\ 1885-1957
//       Match succeeded.
//    Washington\ Senators,\ American\ League,\ 1901-1960\r\n
//       Match succeeded.
Imports System.Text.RegularExpressions

Module Example
    Public Sub Main()
        Dim inputs() As String = {"Brooklyn Dodgers, National League, 1911, 1912, 1932-1957",
                              "Chicago Cubs, National League, 1903-present" + vbCrLf,
                              "Detroit Tigers, American League, 1901-present" + vbLf,
                              "New York Giants, National League, 1885-1957",
                              "Washington Senators, American League, 1901-1960" + vbCrLf}
        Dim pattern As String = "^((\w+(\s?)){2,}),\s(\w+\s\w+),(\s\d{4}(-(\d{4}|present))?,?)+\r?\Z"

        For Each input As String In inputs
            Console.WriteLine(Regex.Escape(input))
            Dim match As Match = Regex.Match(input, pattern)
            If match.Success Then
                Console.WriteLine("   Match succeeded.")
            Else
                Console.WriteLine("   Match failed.")
            End If
        Next
    End Sub
End Module
' The example displays the following output:
'    Brooklyn\ Dodgers,\ National\ League,\ 1911,\ 1912,\ 1932-1957
'       Match succeeded.
'    Chicago\ Cubs,\ National\ League,\ 1903-present\r\n
'       Match succeeded.
'    Detroit\ Tigers,\ American\ League,\ 1901-present\n
'       Match succeeded.
'    New\ York\ Giants,\ National\ League,\ 1885-1957
'       Match succeeded.
'    Washington\ Senators,\ American\ League,\ 1901-1960\r\n
'       Match succeeded.

仅字符串末尾:\z

\z 定位点指定匹配必须出现在输入字符串末尾。 与 $ 语言元素类似, \z 忽略了 RegexOptions.Multiline 选项。 与 \Z 语言元素不同,\z 不满足于字符串末尾的 \n 字符。 因此,它只能匹配输入字符串的末尾。

以下示例在正则表达式中使用 \z 定位点,与上一部分的示例中使用的定位点在其他方面相同,用于提取有关某些职业棒球队存在年限的信息。 此示例尝试使用正则表达式模式 ^((\w+(\s?)){2,}),\s(\w+\s\w+),(\s\d{4}(-(\d{4}|present))?,?)+\r?\z匹配字符串数组中五个元素的每一个。 两个字符串以回车符和换行符结尾,一个字符串以换行符结尾,另外两个既不以回车符也不以换行符结尾。 如输出所示,只有不包含回车符或换行符的字符串与模式匹配。

using System;
using System.Text.RegularExpressions;

public class Example
{
    public static void Main()
    {
        string[] inputs = { "Brooklyn Dodgers, National League, 1911, 1912, 1932-1957",
                          "Chicago Cubs, National League, 1903-present" + Environment.NewLine,
                          "Detroit Tigers, American League, 1901-present\n",
                          "New York Giants, National League, 1885-1957",
                          "Washington Senators, American League, 1901-1960" + Environment.NewLine };
        string pattern = @"^((\w+(\s?)){2,}),\s(\w+\s\w+),(\s\d{4}(-(\d{4}|present))?,?)+\r?\z";

        foreach (string input in inputs)
        {
            Console.WriteLine(Regex.Escape(input));
            Match match = Regex.Match(input, pattern);
            if (match.Success)
                Console.WriteLine("   Match succeeded.");
            else
                Console.WriteLine("   Match failed.");
        }
    }
}
// The example displays the following output:
//    Brooklyn\ Dodgers,\ National\ League,\ 1911,\ 1912,\ 1932-1957
//       Match succeeded.
//    Chicago\ Cubs,\ National\ League,\ 1903-present\r\n
//       Match failed.
//    Detroit\ Tigers,\ American\ League,\ 1901-present\n
//       Match failed.
//    New\ York\ Giants,\ National\ League,\ 1885-1957
//       Match succeeded.
//    Washington\ Senators,\ American\ League,\ 1901-1960\r\n
//       Match failed.
Imports System.Text.RegularExpressions

Module Example
    Public Sub Main()
        Dim inputs() As String = {"Brooklyn Dodgers, National League, 1911, 1912, 1932-1957",
                              "Chicago Cubs, National League, 1903-present" + vbCrLf,
                              "Detroit Tigers, American League, 1901-present" + vbLf,
                              "New York Giants, National League, 1885-1957",
                              "Washington Senators, American League, 1901-1960" + vbCrLf}
        Dim pattern As String = "^((\w+(\s?)){2,}),\s(\w+\s\w+),(\s\d{4}(-(\d{4}|present))?,?)+\r?\z"

        For Each input As String In inputs
            Console.WriteLine(Regex.Escape(input))
            Dim match As Match = Regex.Match(input, pattern)
            If match.Success Then
                Console.WriteLine("   Match succeeded.")
            Else
                Console.WriteLine("   Match failed.")
            End If
        Next
    End Sub
End Module
' The example displays the following output:
'    Brooklyn\ Dodgers,\ National\ League,\ 1911,\ 1912,\ 1932-1957
'       Match succeeded.
'    Chicago\ Cubs,\ National\ League,\ 1903-present\r\n
'       Match failed.
'    Detroit\ Tigers,\ American\ League,\ 1901-present\n
'       Match failed.
'    New\ York\ Giants,\ National\ League,\ 1885-1957
'       Match succeeded.
'    Washington\ Senators,\ American\ League,\ 1901-1960\r\n
'       Match failed.

连续匹配:\G

\G 定位点指定,匹配必须在上一个匹配结束的位置进行;如果以前没有匹配项,则从开始进行匹配的字符串中的位置进行。 将此定位点与 Regex.MatchesMatch.NextMatch 方法配合使用时,它可确保所有匹配项是连续的。

提示

通常,将 \G 定位点放置在模式的左端。 在不常见的情况下,需要执行从右到左的搜索,此时请将 \G 定位点放置在模式的右端。

以下示例使用正则表达式从一个以逗号分隔的字符串中提取啮齿类动物的名称。

using System;
using System.Text.RegularExpressions;

public class Example
{
    public static void Main()
    {
        string input = "capybara,squirrel,chipmunk,porcupine,gopher," +
                       "beaver,groundhog,hamster,guinea pig,gerbil," +
                       "chinchilla,prairie dog,mouse,rat";
        string pattern = @"\G(\w+\s?\w*),?";
        Match match = Regex.Match(input, pattern);
        while (match.Success)
        {
            Console.WriteLine(match.Groups[1].Value);
            match = match.NextMatch();
        }
    }
}
// The example displays the following output:
//       capybara
//       squirrel
//       chipmunk
//       porcupine
//       gopher
//       beaver
//       groundhog
//       hamster
//       guinea pig
//       gerbil
//       chinchilla
//       prairie dog
//       mouse
//       rat
Imports System.Text.RegularExpressions

Module Example
    Public Sub Main()
        Dim input As String = "capybara,squirrel,chipmunk,porcupine,gopher," +
                              "beaver,groundhog,hamster,guinea pig,gerbil," +
                              "chinchilla,prairie dog,mouse,rat"
        Dim pattern As String = "\G(\w+\s?\w*),?"
        Dim match As Match = Regex.Match(input, pattern)
        Do While match.Success
            Console.WriteLine(match.Groups(1).Value)
            match = match.NextMatch()
        Loop
    End Sub
End Module
' The example displays the following output:
'       capybara
'       squirrel
'       chipmunk
'       porcupine
'       gopher
'       beaver
'       groundhog
'       hamster
'       guinea pig
'       gerbil
'       chinchilla
'       prairie dog
'       mouse
'       rat

正则表达式 \G(\w+\s?\w*),? 可以解释为下表中所示内容。

模式 说明
\G 从上次匹配结束的位置开始。
\w+ 匹配一个或多个单词字符。
\s? 匹配零个或一个空格。
\w* 匹配零个或多个单词字符。
(\w+\s?\w*) 匹配后跟零个或一个空格再后跟零个或多个单词字符的一个或多个单词字符。 这是第一个捕获组。
,? 匹配文本逗号字符的零个或一个匹配项。

字边界:\b

\b 定位符指定匹配必须出现单词字符( \w 语言元素)和非单词字符( \W 语言元素)之间的边界上。 单词字符包括字母数字字符和下划线;非单词字符包括不为字母数字字符或下划线的任何字符。 (有关详细信息,请参阅字符类。)匹配项也可以出现在字符串开头或结尾处的字边界上。

\b 定位点经常用于确保子表达式与整个单词而不仅与单词的开头或结尾匹配。 以下示例中的正则表达式 \bare\w*\b 阐释了这种用法。 它与任何以子字符串“are”开头的单词匹配。 该示例的输出也阐释了 \b 与输入字符串的开头和结尾均匹配。

using System;
using System.Text.RegularExpressions;

public class Example
{
    public static void Main()
    {
        string input = "area bare arena mare";
        string pattern = @"\bare\w*\b";
        Console.WriteLine("Words that begin with 'are':");
        foreach (Match match in Regex.Matches(input, pattern))
            Console.WriteLine("'{0}' found at position {1}",
                              match.Value, match.Index);
    }
}
// The example displays the following output:
//       Words that begin with 'are':
//       'area' found at position 0
//       'arena' found at position 10
Imports System.Text.RegularExpressions

Module Example
    Public Sub Main()
        Dim input As String = "area bare arena mare"
        Dim pattern As String = "\bare\w*\b"
        Console.WriteLine("Words that begin with 'are':")
        For Each match As Match In Regex.Matches(input, pattern)
            Console.WriteLine("'{0}' found at position {1}",
                              match.Value, match.Index)
        Next
    End Sub
End Module
' The example displays the following output:
'       Words that begin with 'are':
'       'area' found at position 0
'       'arena' found at position 10

正则表达式模式可以解释为下表中所示内容。

模式 描述
\b 在单词边界处开始匹配。
are 匹配子字符串“are”。
\w* 匹配零个或多个单词字符。
\b 在单词边界处结束匹配。

非字边界:\B

\B 定位符指定匹配不得出现在单词边界上。 它与 \b 定位点截然相反。

以下示例使用 \B 定位点定位单词中的子字符串“qu”匹配项。 正则表达式模式 \Bqu\w+ 与以“qu”开头(但“qu”并不位于单词之首)且延续到单词末尾的子字符串匹配。

using System;
using System.Text.RegularExpressions;

public class Example
{
    public static void Main()
    {
        string input = "equity queen equip acquaint quiet";
        string pattern = @"\Bqu\w+";
        foreach (Match match in Regex.Matches(input, pattern))
            Console.WriteLine("'{0}' found at position {1}",
                              match.Value, match.Index);
    }
}
// The example displays the following output:
//       'quity' found at position 1
//       'quip' found at position 14
//       'quaint' found at position 21
Imports System.Text.RegularExpressions

Module Example
    Public Sub Main()
        Dim input As String = "equity queen equip acquaint quiet"
        Dim pattern As String = "\Bqu\w+"
        For Each match As Match In Regex.Matches(input, pattern)
            Console.WriteLine("'{0}' found at position {1}",
                              match.Value, match.Index)
        Next
    End Sub
End Module
' The example displays the following output:
'       'quity' found at position 1
'       'quip' found at position 14
'       'quaint' found at position 21

正则表达式模式可以解释为下表中所示内容。

模式 说明
\B 不在单词边界处开始匹配。
qu 匹配子字符串“qu”。
\w+ 匹配一个或多个单词字符。

请参阅