Backreference Constructs

Article
08/09/2011

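Backreferences provide a convenient way to identify a repeated character or substring within a string. For example, if the input string contains multiple occurrences of an arbitrary substring, you can match the first occurrence with a capturing group, and then use a backreference to match subsequent occurrences of the substring.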

Note
A separate syntax is used to refer to named and numbered capturing groups in replacement strings. For more information, see Substitutions.

The .NET Framework defines separate language elements to refer to numbered and named capturing groups. For more information about capturing groups, see Grouping Constructs.

Numbered Backreferences

A numbered backreference uses the following syntax:

\number

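where number is the ordinal position of the capturing group in the regular expression. For example, \4 matches the contents of the fourth capturing group. If number is not defined in the regular expression pattern, a parsing error occurs, and the regular expression engine throws an ArgumentException. For example, the regular expression \b(\w+)\s\1 is valid, because (\w+) is the first and only capturing group in the expression. On the other hand, \b(\w+)\s\2 is invalid and throws an argument exception, because there is no capturing group numbered \2.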
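The validity rule above can be checked directly. The following is a minimal sketch, assuming the default regular expression options; the class name BackreferenceValidation and the helper IsValidPattern are hypothetical names introduced only for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class BackreferenceValidation
{
   // Hypothetical helper: returns true if the Regex constructor accepts the
   // pattern, and false if it rejects the pattern with an ArgumentException.
   public static bool IsValidPattern(string pattern)
   {
      try
      {
         new Regex(pattern);
         return true;
      }
      catch (ArgumentException)
      {
         return false;
      }
   }

   public static void Main()
   {
      // \1 refers to the only capturing group, so the pattern parses.
      Console.WriteLine(IsValidPattern(@"\b(\w+)\s\1"));
      // \2 refers to a group that is not defined, so parsing fails.
      Console.WriteLine(IsValidPattern(@"\b(\w+)\s\2"));
   }
}
```

Catching ArgumentException at construction time is one way to validate a pattern that comes from user input before it is used for matching.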

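Note the ambiguity between octal escape codes (such as \16) and \number backreferences that use the same notation. This ambiguity is resolved as follows:

The expressions \1 through \9 are always interpreted as backreferences, and not as octal codes.
If the first digit of a multidigit expression is 8 or 9 (such as \80 or \91), the expression is interpreted as a literal.
Expressions from \10 and greater are considered backreferences if there is a capturing group that corresponds to that number; otherwise, they are interpreted as octal codes.
If a regular expression contains a backreference to an undefined group number, a parsing error occurs, and the regular expression engine throws an ArgumentException.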

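If the ambiguity is a problem, you can use the \k<name> notation, which is unambiguous and cannot be confused with octal character codes. Similarly, hexadecimal codes such as \xdd are unambiguous and cannot be confused with backreferences.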
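The resolution rules can be illustrated with a short sketch. It assumes the default regular expression options and a pattern that has fewer than 16 capturing groups, so that \16 falls back to the octal code 16 (the character U+000E). The class and method names are hypothetical and serve only this illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class OctalVersusBackreference
{
   // \1 is always a backreference: the input must repeat the captured "a".
   public static bool RepeatsFirstGroup(string input)
   {
      return Regex.IsMatch(input, @"^(a)\1$");
   }

   // No group 16 is defined here, so \16 is read as an octal character code.
   public static bool MatchesOctal16(string input)
   {
      return Regex.IsMatch(input, @"^\16$");
   }

   // \k<ch> can only be a backreference, whatever the number of groups.
   public static bool RepeatsNamedGroup(string input)
   {
      return Regex.IsMatch(input, @"^(?<ch>a)\k<ch>$");
   }

   public static void Main()
   {
      Console.WriteLine(RepeatsFirstGroup("aa"));
      Console.WriteLine(MatchesOctal16("\u000E"));
      Console.WriteLine(RepeatsNamedGroup("aa"));
   }
}
```

Because the meaning of \16 depends on how many groups the pattern defines, adding groups to a pattern later can silently change it; the named form avoids that dependency.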

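The following example finds doubled word characters in a string. It defines a regular expression, (\w)\1, which consists of the following elements.

Element	Description
(\w)	Match a word character and assign it to the first capturing group.
\1	Match the next character that is the same as the value of the first capturing group.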

Imports System.Text.RegularExpressions

Module Example
   Public Sub Main()
      Dim pattern As String = "(\w)\1"
      Dim input As String = "trellis llama webbing dresser swagger"
      For Each match As Match In Regex.Matches(input, pattern)
         Console.WriteLine("Found '{0}' at position {1}.", _
                           match.Value, match.Index)
      Next   
   End Sub
End Module
' The example displays the following output:
'       Found 'll' at position 3.
'       Found 'll' at position 8.
'       Found 'bb' at position 16.
'       Found 'ss' at position 25.
'       Found 'gg' at position 33.

using System;
using System.Text.RegularExpressions;

public class Example
{
   public static void Main()
   {
      string pattern = @"(\w)\1";
      string input = "trellis llama webbing dresser swagger";
      foreach (Match match in Regex.Matches(input, pattern))
         Console.WriteLine("Found '{0}' at position {1}.", 
                           match.Value, match.Index);
   }
}
// The example displays the following output:
//       Found 'll' at position 3.
//       Found 'll' at position 8.
//       Found 'bb' at position 16.
//       Found 'ss' at position 25.
//       Found 'gg' at position 33.

Named Backreferences

A named backreference is defined by using the following syntax:

\k<name>

or:

\k'name'

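where name is the name of a capturing group defined in the regular expression pattern. If name is not defined in the regular expression pattern, a parsing error occurs, and the regular expression engine throws an ArgumentException.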

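The following example finds doubled word characters in a string. It defines a regular expression, (?<char>\w)\k<char>, which consists of the following elements.

Element	Description
(?<char>\w)	Match a word character and assign it to a capturing group named char.
\k<char>	Match the next character that is the same as the value of the char capturing group.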

Imports System.Text.RegularExpressions

Module Example
   Public Sub Main()
      Dim pattern As String = "(?<char>\w)\k<char>"
      Dim input As String = "trellis llama webbing dresser swagger"
      For Each match As Match In Regex.Matches(input, pattern)
         Console.WriteLine("Found '{0}' at position {1}.", _
                           match.Value, match.Index)
      Next   
   End Sub
End Module
' The example displays the following output:
'       Found 'll' at position 3.
'       Found 'll' at position 8.
'       Found 'bb' at position 16.
'       Found 'ss' at position 25.
'       Found 'gg' at position 33.

using System;
using System.Text.RegularExpressions;

public class Example
{
   public static void Main()
   {
      string pattern = @"(?<char>\w)\k<char>";
      string input = "trellis llama webbing dresser swagger";
      foreach (Match match in Regex.Matches(input, pattern))
         Console.WriteLine("Found '{0}' at position {1}.", 
                           match.Value, match.Index);
   }
}
// The example displays the following output:
//       Found 'll' at position 3.
//       Found 'll' at position 8.
//       Found 'bb' at position 16.
//       Found 'ss' at position 25.
//       Found 'gg' at position 33.

Note that name can also be the string representation of a number. For example, the following example uses the regular expression (?<2>\w)\k<2> to find doubled word characters in a string.

Imports System.Text.RegularExpressions

Module Example
   Public Sub Main()
      Dim pattern As String = "(?<2>\w)\k<2>"
      Dim input As String = "trellis llama webbing dresser swagger"
      For Each match As Match In Regex.Matches(input, pattern)
         Console.WriteLine("Found '{0}' at position {1}.", _
                           match.Value, match.Index)
      Next   
   End Sub
End Module
' The example displays the following output:
'       Found 'll' at position 3.
'       Found 'll' at position 8.
'       Found 'bb' at position 16.
'       Found 'ss' at position 25.
'       Found 'gg' at position 33.

using System;
using System.Text.RegularExpressions;

public class Example
{
   public static void Main()
   {
      string pattern = @"(?<2>\w)\k<2>";
      string input = "trellis llama webbing dresser swagger";
      foreach (Match match in Regex.Matches(input, pattern))
         Console.WriteLine("Found '{0}' at position {1}.", 
                           match.Value, match.Index);
   }
}
// The example displays the following output:
//       Found 'll' at position 3.
//       Found 'll' at position 8.
//       Found 'bb' at position 16.
//       Found 'ss' at position 25.
//       Found 'gg' at position 33.

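What Backreferences Match

A backreference refers to the most recent definition of a group (the definition most immediately to the left, when matching left to right). When a group makes multiple captures, a backreference refers to the most recent capture.

The following example includes a regular expression pattern, (?<1>a)(?<1>\1b)*, which redefines the group named 1. The following table describes each pattern in the regular expression.

Pattern	Description
(?<1>a)	Match the character "a" and assign the result to the capturing group named 1.
(?<1>\1b)*	Match zero or more occurrences of the current value of the group named 1 followed by "b", and assign each result to the capturing group named 1.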

Imports System.Text.RegularExpressions

Module Example
   Public Sub Main()
      Dim pattern As String = "(?<1>a)(?<1>\1b)*"
      Dim input As String = "aababb"
      For Each match As Match In Regex.Matches(input, pattern)
         Console.WriteLine("Match: " + match.Value)
         For Each group As Group In match.Groups
            Console.WriteLine("   Group: " + group.Value)
         Next
      Next
   End Sub
End Module
' The example displays the following output:
'       Match: aababb
'          Group: aababb
'          Group: abb

using System;
using System.Text.RegularExpressions;

public class Example
{
   public static void Main()
   {
      string pattern = @"(?<1>a)(?<1>\1b)*";
      string input = "aababb";
      foreach (Match match in Regex.Matches(input, pattern))
      {
         Console.WriteLine("Match: " + match.Value);
         foreach (Group group in match.Groups)
            Console.WriteLine("   Group: " + group.Value);
      }
   }
}
// The example displays the following output:
//       Match: aababb
//          Group: aababb
//          Group: abb

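In comparing the regular expression with the input string ("aababb"), the regular expression engine performs the following operations:

It starts at the beginning of the string and successfully matches "a" with the expression (?<1>a). The value of the 1 group is now "a".
It advances to the second character and successfully matches the string "ab" with the expression \1b, that is, the current value of the group ("a") followed by "b". It then assigns the result, "ab", to \1.
It advances to the fourth character. The expression (?<1>\1b) is to be matched zero or more times, so it successfully matches the string "abb" with the expression \1b. It assigns the result, "abb", back to \1.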

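In this example, * is a looping quantifier: it is evaluated repeatedly until the regular expression engine cannot match the pattern it defines. Looping quantifiers do not clear group definitions.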
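The sequence of values that the group takes on can be inspected through the Captures collection of the group. The following is a minimal sketch that reuses the pattern and input from the example above; the class name CaptureHistory and the method CapturesOfGroup1 are hypothetical names introduced for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class CaptureHistory
{
   // Returns every capture recorded for the group named 1, in the order
   // in which the captures were made.
   public static string[] CapturesOfGroup1(string input)
   {
      Match match = Regex.Match(input, @"(?<1>a)(?<1>\1b)*");
      List<string> values = new List<string>();
      foreach (Capture capture in match.Groups[1].Captures)
         values.Add(capture.Value);
      return values.ToArray();
   }

   public static void Main()
   {
      foreach (string value in CapturesOfGroup1("aababb"))
         Console.WriteLine("Capture: " + value);
   }
}
// The example displays the following output:
//       Capture: a
//       Capture: ab
//       Capture: abb
```

Each iteration of the quantifier adds a capture, and \1 refers to whichever capture is the most recent at that point in the match.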

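If a group has not captured any substrings, a backreference to that group is undefined and never matches. This is illustrated by the regular expression pattern \b(\p{Lu}{2})(\d{2})?(\p{Lu}{2})\b, which is defined as follows:

Pattern	Description
\b	Begin the match at a word boundary.
(\p{Lu}{2})	Match two uppercase letters. This is the first capturing group.
(\d{2})?	Match zero or one occurrence of two decimal digits. This is the second capturing group.
(\p{Lu}{2})	Match two uppercase letters. This is the third capturing group.
\b	End the match at a word boundary.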

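An input string can match this regular expression even if the two decimal digits that are defined by the second capturing group are not present. The following example shows that even though the match is successful, an empty capturing group is found between two successful capturing groups.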

Imports System.Text.RegularExpressions

Module Example
   Public Sub Main()
      Dim pattern As String = "\b(\p{Lu}{2})(\d{2})?(\p{Lu}{2})\b"
      Dim inputs() As String = { "AA22ZZ", "AABB" }
      For Each input As String In inputs
         Dim match As Match = Regex.Match(input, pattern)
         If match.Success Then
            Console.WriteLine("Match in {0}: {1}", input, match.Value)
            If match.Groups.Count > 1 Then
               For ctr As Integer = 1 To match.Groups.Count - 1
                  If match.Groups(ctr).Success Then
                     Console.WriteLine("Group {0}: {1}", _
                                       ctr, match.Groups(ctr).Value)
                  Else
                     Console.WriteLine("Group {0}: <no match>", ctr)
                  End If      
               Next
            End If
         End If
         Console.WriteLine()
      Next      
   End Sub
End Module
' The example displays the following output:
'       Match in AA22ZZ: AA22ZZ
'       Group 1: AA
'       Group 2: 22
'       Group 3: ZZ
'       
'       Match in AABB: AABB
'       Group 1: AA
'       Group 2: <no match>
'       Group 3: BB

using System;
using System.Text.RegularExpressions;

public class Example
{
   public static void Main()
   {
      string pattern = @"\b(\p{Lu}{2})(\d{2})?(\p{Lu}{2})\b";
      string[] inputs = { "AA22ZZ", "AABB" };
      foreach (string input in inputs)
      {
         Match match = Regex.Match(input, pattern);
         if (match.Success)
         {
            Console.WriteLine("Match in {0}: {1}", input, match.Value);
            if (match.Groups.Count > 1)
            {
               for (int ctr = 1; ctr <= match.Groups.Count - 1; ctr++)
               {
                  if (match.Groups[ctr].Success)
                     Console.WriteLine("Group {0}: {1}", 
                                       ctr, match.Groups[ctr].Value);
                  else
                     Console.WriteLine("Group {0}: <no match>", ctr);
               }
            }
         }
         Console.WriteLine();
      }      
   }
}
// The example displays the following output:
//       Match in AA22ZZ: AA22ZZ
//       Group 1: AA
//       Group 2: 22
//       Group 3: ZZ
//       
//       Match in AABB: AABB
//       Group 1: AA
//       Group 2: <no match>
//       Group 3: BB
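The example above shows a group that does not capture, but its pattern contains no backreference to that group. The following is a minimal sketch of the "never matches" behavior itself, assuming the default options (not RegexOptions.ECMAScript). The pattern ^(a)?\1b$ and the names UnsetGroupBackreference and Matches are hypothetical and introduced only for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class UnsetGroupBackreference
{
   // The first group is optional. When it captures nothing, \1 is
   // undefined and cannot match, so the whole pattern fails.
   public static bool Matches(string input)
   {
      return Regex.IsMatch(input, @"^(a)?\1b$");
   }

   public static void Main()
   {
      // The group captures "a", and \1 matches the second "a".
      Console.WriteLine(Matches("aab"));
      // The group captures nothing, so \1 has nothing to refer to.
      Console.WriteLine(Matches("b"));
   }
}
// The example displays the following output:
//       True
//       False
```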

See Also

Concepts

Regular Expression Language Elements
