Dial Plan Regular Expression Engine - Language Summary

Windows Mobile 6.5
4/8/2010

This section summarizes the regular expression language used to create Dial Plan rule patterns and formatting templates. A rule pattern is a regular expression that specifies the set of strings against which a string provided by the user or the network is compared to find a match. A formatting template specifies the format of the string that is sent to the SIP server or displayed to the user.

The parser recognizes the following operations in order of precedence:

Item Description

Quantifiers

A quantifier occurring immediately after an expression specifies the number of times the expression may occur. The following list shows the accepted quantifiers:

  • * (Asterisk) Indicates that the preceding expression can occur zero or more times. For example, a*b represents b, ab, aab, aaaaa…b, etc.
  • + (Plus) Indicates that the preceding expression can occur one or more times. For example, a+b represents ab, aab, aaab…
  • ? (Question Mark) Indicates that the preceding expression can occur zero times or one time. For example, a?b matches ab and b.
  • {} (Braces) Indicate an interval for the preceding expression. Repeat or interval expressions take one of the following forms:
    • {count} - the preceding expression must repeat exactly «count» times. For example, ab{3} matches abbb.
    • {min, max} - the preceding expression must occur at least «min» times but no more than «max» times. For example, a{1,3}b matches ab, aab, and aaab.
    1. «count», «min» and «max» must be in the range of zero to EG_EX_MAX_OCCURRENCES (127).
    2. «max» must be greater than «min» and cannot be equal to zero. For example, \d{3,5} matches any string of 3, 4, or 5 digits.
    3. Interval expressions are not supported for groups.
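
The quantifier examples above can be checked with a short script. This is only a sketch: it uses Python's `re` module as a stand-in for the Dial Plan engine, which is an assumption, and the two engines may differ in details not covered here. The helper names `matches` and `failed_cases` are made up for this illustration, and the rejected strings are added for contrast. The interval is written without a space (`{1,3}`), as in the examples above.

```python
import re

# Patterns and accepted strings come from the quantifier examples above;
# the rejected strings are added here for contrast.
# ASSUMPTION: Python's `re` stands in for the Dial Plan engine.
QUANTIFIER_CASES = {
    r"a*b": (["b", "ab", "aab"], ["a", "ba"]),
    r"a+b": (["ab", "aab", "aaab"], ["b"]),
    r"a?b": (["b", "ab"], ["aab"]),
    r"ab{3}": (["abbb"], ["abb", "abbbb"]),
    r"a{1,3}b": (["ab", "aab", "aaab"], ["b", "aaaab"]),
    r"\d{3,5}": (["123", "1234", "12345"], ["12", "123456"]),
}


def matches(pattern, text):
    """Return True when the whole of `text` matches `pattern`."""
    return re.fullmatch(pattern, text) is not None


def failed_cases(cases):
    """Return (pattern, text) pairs whose result differs from the expectation."""
    failures = []
    for pattern, (accepted, rejected) in cases.items():
        failures += [(pattern, t) for t in accepted if not matches(pattern, t)]
        failures += [(pattern, t) for t in rejected if matches(pattern, t)]
    return failures


if __name__ == "__main__":
    # An empty list means every case behaved as described.
    print(failed_cases(QUANTIFIER_CASES))
```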

Concatenation

Expressions placed directly next to each other are concatenated to form a new expression. For example, placing the expression abc directly after the expression xyz creates the new expression xyzabc.

Alternation

Two expressions separated by a vertical bar specify alternation. For example, aba | bbc (read "aba" or "bbc") matches either the string aba or the string bbc.

Grouping

Parentheses are used to refine the scope and precedence of other operations and to make the boundaries of expressions explicit. For example (ab)* forces the star quantifier to apply to the whole expression ‘ab’ as opposed to just ‘b’.

Parentheses may be nested within other groups. For example: (a(b|c)d)+ represents (abd | acd)+
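
The following sketch illustrates how precedence and grouping interact, again using Python's `re` module as an assumed stand-in for the Dial Plan engine. The helper `matches` and the list `GROUPING_CASES` are hypothetical names. The alternation is written without spaces because Python would treat the spaces as literal characters.

```python
import re


def matches(pattern, text):
    """Return True when the whole of `text` matches `pattern`."""
    return re.fullmatch(pattern, text) is not None


# (pattern, text, expected result)
# ASSUMPTION: Python's `re` stands in for the Dial Plan engine.
GROUPING_CASES = [
    ("aba|bbc", "aba", True),
    ("aba|bbc", "bbc", True),
    # Alternation has the lowest precedence, so this is not ab(a|b)bc.
    ("aba|bbc", "ababc", False),
    # With parentheses, the star applies to the whole of "ab".
    ("(ab)*", "abab", True),
    ("(ab)*", "abb", False),
    # Without parentheses, the star applies to "b" only.
    ("ab*", "abb", True),
    ("ab*", "abab", False),
    # Nested groups: equivalent to (abd|acd)+.
    ("(a(b|c)d)+", "abd", True),
    ("(a(b|c)d)+", "abdacd", True),
    ("(a(b|c)d)+", "ad", False),
]

if __name__ == "__main__":
    for pattern, text, expected in GROUPING_CASES:
        assert matches(pattern, text) == expected, (pattern, text)
    print("all grouping cases behave as described")
```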

To determine a match, the parser does a character by character comparison of the input string and the rule pattern. However, the parser recognizes a set of meta-characters which cause it to alter this behavior. For example, you can embed a meta-character in the pattern expression which will cause the parser to accept any numeric character at a specified position in the input string.

The dot character ‘.’ matches any single character. For example, ab.de matches abcde, abdde, ab1de, ab3de, etc.

Characters preceded by a backslash ‘\’ are known as delimited characters and have special meaning. The following list shows the delimited characters:

Item Description

\d

Matches any digit 0-9

\D

Matches any non-digit character

\w

Matches any word character; same as [a-zA-Z_0-9]

\W

Matches any non-word character

\s

Matches any whitespace character (space, tab, newline, etc.)

\S

Matches any non-whitespace character

\t

Matches a tab character

\\, \(, \), \[, \., \+, \*, \?

Matches the literal value of the character. For example: \+ matches the + character.
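
As a rough illustration of the meta-characters listed above, the sketch below runs a few of them through Python's `re` module. This is an assumption made for demonstration: Python is not the Dial Plan engine. The `re.ASCII` flag is passed so that \d, \w and \s stay within the ASCII ranges described here, since Python otherwise also accepts Unicode digits and letters. Note too that in Python the dot does not match a newline by default. `matches` and `META_CASES` are hypothetical names.

```python
import re


def matches(pattern, text):
    """Return True when the whole of `text` matches `pattern` (ASCII classes)."""
    return re.fullmatch(pattern, text, flags=re.ASCII) is not None


# (pattern, text, expected result)
# ASSUMPTION: Python's `re` stands in for the Dial Plan engine.
META_CASES = [
    (r"ab.de", "abcde", True),
    (r"ab.de", "ab1de", True),
    (r"ab.de", "abde", False),   # the dot must consume exactly one character
    (r"\d", "7", True),
    (r"\D", "7", False),
    (r"\w", "_", True),
    (r"\W", "-", True),
    (r"\s", "\t", True),
    (r"\S", " ", False),
    (r"\t", "\t", True),
    (r"1\+1", "1+1", True),      # \+ is a literal plus sign
    (r"1+1", "1+1", False),      # unescaped, + is a quantifier
    (r"\(\d\)", "(5)", True),    # \( and \) are literal parentheses
]

if __name__ == "__main__":
    for pattern, text, expected in META_CASES:
        assert matches(pattern, text) == expected, (pattern, text)
    print("all meta-character cases behave as described")
```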

A character class matches any single character contained inside square brackets []. For example, [abc] matches either "a", "b" or "c".

Item Description

Literals

All characters inside the square brackets are treated as literals, meaning [\d] matches either of the two characters ‘\’ or ‘d’. This implies there can be no grouping or quantification inside the character class.

Example: ab[cd]e matches "abce" or "abde"

Negation

The ^ character, specified at the beginning of the character class, negates the whole class. For example, [^abc] matches every character except "a", "b", and "c".

Example: a[^b]cd matches all strings of the form a#cd where # is any character other than "b" ("afcd", "accd", "azcd", etc.).

Ranges

A range is indicated by two characters separated by a ‘-’ sign (e.g. a-z). One or more ranges can be specified inside the character class. This notation matches all characters within the range (inclusive). For example, a-z matches all lowercase letters, and a-zA-CE-Z matches all lowercase and uppercase letters except for D.

Exceptions

If you want to express the characters "[", "-", or "]" as literal values, you must place them at the beginning of the character class. For example:

Correct: [-abc], []abc], [[abc]

Incorrect: [abc[], [a-bc]
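
The character class rules above can be tried out with the sketch below, which again assumes Python's `re` module as a stand-in for the Dial Plan engine. One known difference matters here: Python treats \d inside a class as "any digit", whereas the engine described above treats the backslash and the d as two literals. To reproduce the literal reading in Python, the backslash is doubled (`[\\d]`). `matches` and `CLASS_CASES` are hypothetical names.

```python
import re


def matches(pattern, text):
    """Return True when the whole of `text` matches `pattern`."""
    return re.fullmatch(pattern, text) is not None


# (pattern, text, expected result)
# ASSUMPTION: Python's `re` stands in for the Dial Plan engine.
CLASS_CASES = [
    ("ab[cd]e", "abce", True),
    ("ab[cd]e", "abde", True),
    ("ab[cd]e", "abfe", False),
    ("a[^b]cd", "afcd", True),      # negated class
    ("a[^b]cd", "abcd", False),
    ("[a-zA-CE-Z]", "q", True),     # several ranges in one class
    ("[a-zA-CE-Z]", "E", True),
    ("[a-zA-CE-Z]", "D", False),    # D falls between the ranges A-C and E-Z
    ("[-abc]", "-", True),          # leading "-" is a literal
    ("[]abc]", "]", True),          # leading "]" is a literal
    # Python-specific spelling of the literal class described for [\d]:
    (r"[\\d]", "d", True),
    (r"[\\d]", "\\", True),
    (r"[\\d]", "5", False),
]

if __name__ == "__main__":
    for pattern, text, expected in CLASS_CASES:
        assert matches(pattern, text) == expected, (pattern, text)
    print("all character class cases behave as described")
```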

Match Groups enable you to identify and extract portions of a matched string. A match group is an expression enclosed in a set of parentheses. For example, the expression ab(cd)ef(g*) contains two match groups "cd" and "g*".

Note:
Nested Match Groups are not supported currently. While nested sets of parentheses are supported for precedence and scoping, these nested expressions inside a parent match group are all part of the same match group. For example: ab(cd(e|f)g)+ contains one match group: “cd(e|f)g”

A Match Group is referenced by a number which indicates its order, relative to other Match Groups, within a larger expression. In a dial plan rule the reference number is prefixed by a "\".

In the following example, the rule pattern contains 3 Match Groups - "\d{3}", "\d{3}" and "\d{4}". They are referenced in the dial and display templates as "\1", "\2", "\3" respectively. The table in the example illustrates the result of applying the rule to an input string that is a member of the set defined by the pattern. Assume $host$ is replaced by "157.54.9.161".

<rule pattern='1\s*-?\s*(\d{3})\s*(\d{3})\s*-?\s*(\d{4})'
        dial='sip:91\1\2\3@$host$'
        display='1 (\1) \2-\3'
        />
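
To make the effect of the rule concrete, the sketch below applies the same pattern and templates using Python's `re` module. Several things here are assumptions rather than documented behavior: Python stands in for the Dial Plan engine, the pattern is assumed to have to match the whole input string, and `$host$` is substituted by plain text replacement before the group references are expanded. The function `apply_rule` and the sample phone numbers are hypothetical.

```python
import re

# Pattern and templates copied from the rule above.
PATTERN = r"1\s*-?\s*(\d{3})\s*(\d{3})\s*-?\s*(\d{4})"
DIAL_TEMPLATE = r"sip:91\1\2\3@$host$"
DISPLAY_TEMPLATE = r"1 (\1) \2-\3"
HOST = "157.54.9.161"


def apply_rule(user_input, host=HOST):
    """Return (dial, display) for a matching input, or None if it does not match.

    ASSUMPTIONS: whole-string matching, and $host$ handled by text replacement.
    """
    match = re.fullmatch(PATTERN, user_input)
    if match is None:
        return None
    # Match.expand substitutes \1, \2 and \3 with the captured groups.
    dial = match.expand(DIAL_TEMPLATE.replace("$host$", host))
    display = match.expand(DISPLAY_TEMPLATE.replace("$host$", host))
    return dial, display


if __name__ == "__main__":
    # Hypothetical input strings, not taken from the original documentation.
    print(apply_rule("1-425 555-0100"))
    # ('sip:914255550100@157.54.9.161', '1 (425) 555-0100')
    print(apply_rule("425-555-0100"))
    # None (the leading 1 is missing)
```

In this sketch, inputs that differ only in optional whitespace or dashes, such as "14255550100" and "1 425 555 0100", produce the same dial and display strings.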
