Updated: January 2010
Examples in previous topics in this section have only been concerned with finding chapter headings. Any occurrence of the string Chapter followed by a space and a number could be an actual chapter heading, or it could also be a cross-reference to another chapter. Since true chapter headings always appear at the beginning of a line, it may be useful to devise a way to find only the headings and not the cross-references.
Anchors provide that capability. Anchors allow you to fix a regular expression to either the beginning or end of a line. They also allow you to create regular expressions that match within a word, at the beginning of a word, or at the end of a word. The following table contains the list of regular expression anchors and their meanings:
|Character|Description|
|^|Matches the position at the beginning of the input string. If the m (multiline search) character is included with the flags, ^ also matches the position following \n or \r.|
|$|Matches the position at the end of the input string. If the m (multiline search) character is included with the flags, $ also matches the position preceding \n or \r.|
|\b|Matches a word boundary, that is, the position between a word character and a nonword character such as a space, or the start or end of the string when it adjoins a word character.|
|\B|Matches a nonword boundary.|
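The effect of the m flag on ^ and $ can be sketched as follows. The sample text and variable names are assumptions made for illustration only.

```javascript
// Hypothetical sample text: two lines separated by a newline.
const sampleText = "Introduction\nChapter 1";

// Without the m flag, ^ matches only at the start of the whole string.
const startOnly = /^Chapter/.test(sampleText);

// With the m flag, ^ also matches immediately after the line break.
const startEachLine = /^Chapter/m.test(sampleText);

// Likewise, $ matches just before the line break only when m is set.
const endOnly = /Introduction$/.test(sampleText);
const endEachLine = /Introduction$/m.test(sampleText);

console.log(startOnly, startEachLine, endOnly, endEachLine);
// prints: false true false true
```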
You cannot use a quantifier with an anchor. Since you cannot have more than one position immediately before or after a newline or word boundary, expressions such as ^* are not permitted.
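A quick way to observe this restriction, sketched here with a hypothetical helper named compiles, is to build the pattern at run time and check whether the engine rejects it.

```javascript
// Hypothetical helper: reports whether a pattern source compiles.
function compiles(source) {
  try {
    new RegExp(source);
    return true;
  } catch (err) {
    // An invalid pattern raises a SyntaxError; rethrow anything else.
    if (err instanceof SyntaxError) {
      return false;
    }
    throw err;
  }
}

console.log(compiles("^*"));  // false: a quantifier applied to an anchor
console.log(compiles("^a*")); // true: the quantifier applies to "a"
```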
To match text at the beginning of a line of text, use the ^ character at the beginning of the regular expression. Do not confuse this use of the ^ with the use within a bracket expression.
To match text at the end of a line of text, use the $ character at the end of the regular expression.
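The two roles of ^, and the use of $ at the end of an expression, can be contrasted in a short sketch. The patterns and test strings below are illustrative assumptions.

```javascript
// ^ outside brackets is an anchor: "a" must be the first character.
const anchoredA = /^a/;

// ^ as the first character inside brackets negates the set:
// this matches any single character other than "a".
const notA = /[^a]/;

// $ at the end of the expression anchors the match to the end of the text.
const endsWithDigit = /[0-9]$/;

console.log(anchoredA.test("abc"));           // true
console.log(anchoredA.test("cba"));           // false
console.log(notA.test("aaa"));                // false
console.log(notA.test("cba"));                // true
console.log(endsWithDigit.test("Chapter 7")); // true
console.log(endsWithDigit.test("7 chapters")); // false
```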
Anchors can be applied to the search for chapter headings. The following regular expression matches the word Chapter followed by a space and a one- or two-digit number, but only when it occurs at the beginning of a line:
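As a sketch in JavaScript, one expression that fits this description is ^Chapter [1-9][0-9]{0,1}. The g and m flags, the sample text, and the variable names are assumptions made for illustration.

```javascript
// Hypothetical text: a heading on line 1, a cross-reference on line 2.
const bookText = "Chapter 12\nAs noted in Chapter 3, anchors matter.";

// m lets ^ match at the start of every line; g collects all matches.
const headingStart = /^Chapter [1-9][0-9]{0,1}/gm;

const startMatches = bookText.match(headingStart);
console.log(startMatches); // [ 'Chapter 12' ]
```

The cross-reference on the second line is skipped because it does not begin the line.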
Not only does a true chapter heading occur at the beginning of a line, it is also the only text on the line. It therefore occurs at the beginning of the line and also ends at the end of the same line. The following expression ensures that the match finds only chapter headings and not cross-references. It does so by matching only text that spans from the beginning to the end of a line.
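One expression consistent with this description adds $ to the end of the pattern: ^Chapter [1-9][0-9]{0,1}$. The sketch below assumes the g and m flags and a hypothetical three-line sample.

```javascript
// Hypothetical text: line 2 starts with "Chapter 2" but continues on.
const manuscript =
  "Chapter 1\nChapter 2 covers anchors in detail.\nChapter 10";

// ^ and $ together require the heading to be the only text on its line.
const headingOnly = /^Chapter [1-9][0-9]{0,1}$/gm;

const wholeLineMatches = manuscript.match(headingOnly);
console.log(wholeLineMatches); // [ 'Chapter 1', 'Chapter 10' ]
```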
Matching word boundaries is a little different but adds a very important capability to regular expressions. A word boundary is the position between a word character and a nonword character such as a space; the start or end of the text also counts when it adjoins a word character. A nonword boundary is any other position. The following expression matches the first three characters of the word Chapter because the characters appear following a word boundary:
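An expression that behaves this way is \bCha. The sketch below uses hypothetical sample strings to show where it does and does not match.

```javascript
const wordStart = /\bCha/;

// "Cha" follows a space, so a word boundary precedes it.
const startHit = wordStart.exec("See Chapter 4");
console.log(startHit[0], startHit.index); // Cha 4

// With the g flag, only the first "Cha" in "ChaCha" qualifies;
// the second one sits in the middle of the word.
const repeated = "ChaCha".match(/\bCha/g);
console.log(repeated.length); // 1
```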
The position of the \b operator is critical. If it is at the beginning of a string to be matched, it looks for the match at the beginning of the word. If it is at the end of the string, it looks for the match at the end of the word. For example, the following expression matches the string ter in the word Chapter because it appears before a word boundary:
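An expression that fits this description is ter\b. In the sketch below, the second test string is an assumption chosen to show a case where the boundary is absent.

```javascript
const wordEnd = /ter\b/;

// "ter" ends the word Chapter, so a word boundary follows it.
const endHit = wordEnd.exec("Chapter");
console.log(endHit[0], endHit.index); // ter 4

// In "terminal", "ter" is followed by "m", so no boundary follows it.
console.log(wordEnd.test("terminal")); // false
```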
The following expression matches the string apt as it occurs in Chapter but not as it occurs in aptitude:
The string apt occurs on a nonword boundary in the word Chapter but on a word boundary in the word aptitude. For the \B nonword boundary operator, position is not important because the match is not relative to the beginning or end of a word.
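An expression with this behavior is \Bapt. The following sketch checks it against the two words discussed above.

```javascript
const insideWord = /\Bapt/;

// In "Chapter", "apt" is preceded by "h": a nonword boundary.
console.log(insideWord.test("Chapter"));  // true

// In "aptitude", "apt" starts the word: a word boundary, so no match.
console.log(insideWord.test("aptitude")); // false
```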