.NET Framework Regular Expressions
Regular expressions provide a powerful, flexible, and efficient method for processing text. The extensive pattern-matching notation of regular expressions allows you to quickly parse large amounts of text to find specific character patterns; to extract, edit, replace, or delete text substrings; or to add the extracted strings to a collection in order to generate a report. For many applications that deal with strings (such as HTML processing, log file parsing, and HTTP header parsing), regular expressions are an indispensable tool.
Microsoft .NET Framework regular expressions incorporate the most popular features of other regular expression implementations such as those in Perl and awk. Designed to be compatible with Perl 5 regular expressions, .NET Framework regular expressions include features not yet seen in other implementations, such as right-to-left matching and on-the-fly compilation.
The .NET Framework regular expression classes are part of the base class library and can be used with any language or tool that targets the common language runtime, including ASP.NET and Visual Studio 2005.
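The right-to-left matching and on-the-fly compilation mentioned above can be sketched roughly as follows. This is a minimal illustration, not a recommendation; the sample text and the class name are made up for the example.

```csharp
using System;
using System.Text.RegularExpressions;

class RegexOptionsSketch
{
    static void Main()
    {
        // Hypothetical sample input.
        string text = "error 17, error 42, error 99";

        // Default: the engine scans left to right, so the first match is the leftmost number.
        Match first = Regex.Match(text, @"\d+");
        Console.WriteLine(first.Value);   // 17

        // RightToLeft: the engine scans from the end of the input,
        // so the first match is the rightmost number.
        Match last = Regex.Match(text, @"\d+", RegexOptions.RightToLeft);
        Console.WriteLine(last.Value);    // 99

        // Compiled: the pattern is compiled to MSIL when the Regex is constructed
        // instead of being interpreted. Construction is slower, repeated matching can be faster.
        Regex compiled = new Regex(@"\d+", RegexOptions.Compiled);
        Console.WriteLine(compiled.Matches(text).Count);   // 3
    }
}
```

Whether RegexOptions.Compiled pays off depends on how often the same Regex instance is reused, so it is worth measuring before adopting it.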
In This Section
- Regular Expressions as a Language
Provides an overview of the programming language aspect of regular expressions.
- Regular Expression Language Elements
Provides information on the set of characters, operators, and constructs that you can use to define regular expressions.
- Regular Expression Classes
Provides information and code examples illustrating how to use the regular expression classes.
- Details of Regular Expression Behavior
Provides information about the capabilities and behavior of .NET Framework regular expressions.
- Regular Expression Examples
Provides code examples illustrating typical uses of regular expressions.
Reference
- System.Text.RegularExpressions
Provides class library reference information for the .NET Framework System.Text.RegularExpressions namespace.
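A warning about the patterns listed below: by default, the decimal digit class \d matches any Unicode decimal digit, not only the ASCII digits 0-9.
As a result, '໘໘໘໘໘' (five Lao digits) passes the US Zip Code pattern in the following list.
To prevent this, pass RegexOptions.ECMAScript to the Regex constructor or to the static Regex.Match or Regex.IsMatch call, or write [0-9] in place of \d. Note that RegexOptions.ECMAScript also restricts \w to ASCII letters, digits, and the underscore.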
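The following sketch shows the behavior described in this warning. It builds the Lao-digit string from a character escape so that the result does not depend on the source file encoding; the class and variable names are illustrative only.

```csharp
using System;
using System.Text.RegularExpressions;

class DigitMatchingSketch
{
    static void Main()
    {
        // Five copies of U+0ED8 (LAO DIGIT EIGHT), a non-ASCII decimal digit.
        string laoDigits = new string('\u0ED8', 5);
        string zipPattern = @"^\d{5}$";

        // Default options: \d matches any Unicode decimal digit.
        Console.WriteLine(Regex.IsMatch(laoDigits, zipPattern));                            // True

        // ECMAScript option: \d is restricted to the ASCII digits 0-9.
        Console.WriteLine(Regex.IsMatch(laoDigits, zipPattern, RegexOptions.ECMAScript));   // False

        // An explicit character class gives the same restriction without the option.
        Console.WriteLine(Regex.IsMatch(laoDigits, @"^[0-9]{5}$"));                         // False

        // ASCII digits still match under the ECMAScript option.
        Console.WriteLine(Regex.IsMatch("12345", zipPattern, RegexOptions.ECMAScript));     // True
    }
}
```

Using [0-9] is the narrower change, because it leaves the meaning of every other construct in the pattern untouched.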
Some Common Regular Expressions:
- Phone Number
^\d{3}-\d{3}-\d{4}$
Example: 123-456-7890
- US Zip Code
^\d{5}$
Example: 12345
- US Zip Code +4
^\d{5}-\d{4}$
Example: 12345-7890
- Email Address
^([0-9a-zA-Z]([-.\w]*[0-9a-zA-Z])*@([0-9a-zA-Z][-\w]*[0-9a-zA-Z]\.)+[a-zA-Z]{2,9})$
Example: william.rawls@gmail.com
- Email Address
\w+([-+.]\w+)*@\w+([-.]\w+)*\.\w+([-.]\w+)*
This alternative pattern also works well and can also be used in Java. Note that it is not anchored with ^ and $, so it also matches an address embedded in longer text.
- USA Social Security Number (SSN)
^\d{3}-\d{2}-\d{4}$
Example: 555-22-9999
- 6/9/2006
- William M. Rawls
- 5/26/2011
- James McCormack
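As a rough sketch of how the patterns in the list above might be applied, the following program checks each listed example with Regex.IsMatch. The constant names and the IsValid helper are hypothetical, and passing RegexOptions.ECMAScript is one possible way to act on the \d warning, not the only one.

```csharp
using System;
using System.Text.RegularExpressions;

class CommonPatternsSketch
{
    // Patterns copied from the list above; the constant names are illustrative.
    const string Phone = @"^\d{3}-\d{3}-\d{4}$";
    const string Zip = @"^\d{5}$";
    const string ZipPlus4 = @"^\d{5}-\d{4}$";
    const string Email = @"^([0-9a-zA-Z]([-.\w]*[0-9a-zA-Z])*@([0-9a-zA-Z][-\w]*[0-9a-zA-Z]\.)+[a-zA-Z]{2,9})$";
    const string Ssn = @"^\d{3}-\d{2}-\d{4}$";

    // Hypothetical helper: ECMAScript limits \d to 0-9 and \w to ASCII word characters.
    static bool IsValid(string input, string pattern)
    {
        return Regex.IsMatch(input, pattern, RegexOptions.ECMAScript);
    }

    static void Main()
    {
        Console.WriteLine(IsValid("123-456-7890", Phone));              // True
        Console.WriteLine(IsValid("12345", Zip));                       // True
        Console.WriteLine(IsValid("12345-7890", ZipPlus4));             // True
        Console.WriteLine(IsValid("william.rawls@gmail.com", Email));   // True
        Console.WriteLine(IsValid("555-22-9999", Ssn));                 // True

        // Too few digits for a Zip Code.
        Console.WriteLine(IsValid("1234", Zip));                        // False
    }
}
```

These patterns check only the shape of the input. They do not confirm that a phone number, Zip Code, email address, or SSN actually exists.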
In PowerShell, the -match operator tests a string against a regular expression:
<string> -match <regular expression>
For example:
"Thomas Lee" -match "Lee"
This returns $true, because the string contains "Lee".
The pages within this section include additional examples of regular expressions in PowerShell.
- 5/14/2008
- Thomas Lee
- 5/21/2008
- Thomas Lee
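For comparison, the PowerShell example above could be written against the .NET Regex class roughly as follows. One difference to keep in mind: PowerShell's -match is case-insensitive by default, whereas Regex.IsMatch is case-sensitive unless RegexOptions.IgnoreCase is passed. The class name is illustrative.

```csharp
using System;
using System.Text.RegularExpressions;

class MatchOperatorSketch
{
    static void Main()
    {
        // Same test as "Thomas Lee" -match "Lee".
        Console.WriteLine(Regex.IsMatch("Thomas Lee", "Lee"));                            // True

        // Regex.IsMatch is case-sensitive by default.
        Console.WriteLine(Regex.IsMatch("Thomas Lee", "lee"));                            // False

        // IgnoreCase reproduces the default behavior of PowerShell's -match.
        Console.WriteLine(Regex.IsMatch("Thomas Lee", "lee", RegexOptions.IgnoreCase));   // True
    }
}
```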
Regular expressions are one of those places where people "re-invent the wheel" a lot. The Regular Expression Examples page (http://msdnwiki.microsoft.com/en-us/mtpswiki/kweb790z.aspx) is nice, but what I would really like to see is a page with a lengthy, categorized list of commonly needed regular expressions, e.g. parsing a time string into hours, minutes, and seconds, matching floating-point numbers, or splitting a string on whitespace.
There are some good websites out there with regular expression libraries, but given how many different regex engines exist, it can be difficult to know whether a regex advertised as doing what you want will behave the same way in the .NET Regex class. This topic seems like the right place for Microsoft to establish a library of .NET-compatible regular expressions.
-- JasonBl, MSFT
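As a sketch of what entries in such a library might look like, the following program covers the three tasks named in the comment above. The patterns are assumptions written for this example, not vetted library entries, and they accept only simple input shapes (for instance, the time pattern does not range-check hours or minutes).

```csharp
using System;
using System.Text.RegularExpressions;

class LibrarySketch
{
    static void Main()
    {
        // Parse a time string into hours, minutes, and seconds with named groups.
        Match time = Regex.Match("14:05:59", @"^(?<h>[0-9]{1,2}):(?<m>[0-9]{2}):(?<s>[0-9]{2})$");
        if (time.Success)
        {
            Console.WriteLine("{0} {1} {2}",
                time.Groups["h"].Value,
                time.Groups["m"].Value,
                time.Groups["s"].Value);   // 14 05 59
        }

        // Match a floating-point number with an optional sign and exponent.
        string floatPattern = @"^[-+]?([0-9]+\.?[0-9]*|\.[0-9]+)([eE][-+]?[0-9]+)?$";
        Console.WriteLine(Regex.IsMatch("-3.14e10", floatPattern));   // True
        Console.WriteLine(Regex.IsMatch("abc", floatPattern));        // False

        // Split a string on runs of whitespace.
        string[] parts = Regex.Split("alpha  beta\tgamma", @"\s+");
        Console.WriteLine(parts.Length);   // 3
    }
}
```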