System.Text Namespace

The System.Text namespace contains classes representing ASCII, Unicode, UTF-7, and UTF-8 character encodings; abstract base classes for converting blocks of characters to and from blocks of bytes; and a helper class that manipulates and formats String objects without creating intermediate instances of String.

Classes

ASCIIEncoding: Represents an ASCII character encoding of Unicode characters.
Decoder: Converts a sequence of encoded bytes into a set of characters.
DecoderExceptionFallback: Throws DecoderFallbackException if an encoded input byte sequence cannot be converted to a decoded output character. This class cannot be inherited.
DecoderExceptionFallbackBuffer: Throws DecoderFallbackException when an encoded input byte sequence cannot be converted to a decoded output character. This class cannot be inherited.
DecoderFallback: Provides a failure-handling mechanism, called a fallback, for an encoded input byte sequence that cannot be converted to an output character.
DecoderFallbackBuffer: Passes a string to a decoding operation that is emitted instead of an output character because an input byte sequence cannot be decoded.
DecoderFallbackException: The exception that is thrown when a decoder fallback operation fails. This class cannot be inherited.
DecoderReplacementFallback: Provides a failure-handling mechanism, called a fallback, for an encoded input byte sequence that cannot be converted to an output character. The fallback emits a user-specified replacement string instead of a decoded input byte sequence. This class cannot be inherited.
DecoderReplacementFallbackBuffer: Represents a substitute output string that is emitted when the original input byte sequence cannot be decoded. This class cannot be inherited.
Encoder: Converts a set of characters into a sequence of bytes.
EncoderExceptionFallback: Throws EncoderFallbackException if an input character cannot be converted to an encoded output byte sequence. This class cannot be inherited.
EncoderExceptionFallbackBuffer: Throws EncoderFallbackException when an input character cannot be converted to an encoded output byte sequence. This class cannot be inherited.
EncoderFallback: Provides a failure-handling mechanism, called a fallback, for an input character that cannot be converted to an encoded output byte sequence.
EncoderFallbackBuffer: Passes a substitute string to an encoding operation. The string is used in place of any input character that cannot be encoded.
EncoderFallbackException: The exception that is thrown when an encoder fallback operation fails. This class cannot be inherited.
EncoderReplacementFallback: Provides a failure-handling mechanism, called a fallback, for an input character that cannot be converted to an output byte sequence. The fallback provides a user-specified replacement string in place of the original input character. This class cannot be inherited.
EncoderReplacementFallbackBuffer: Represents a substitute input string that is used when the original input character cannot be encoded. This class cannot be inherited.
Encoding: Represents a character encoding.
EncodingInfo: Provides basic information about an encoding.
MLangCodePageEncoding
StringBuilder: Represents a mutable string of characters. This class cannot be inherited.
UnicodeEncoding: Represents a UTF-16 encoding of Unicode characters.
UTF32Encoding: Represents a UTF-32 encoding of Unicode characters.
UTF7Encoding: Represents a UTF-7 encoding of Unicode characters.
UTF8Encoding: Represents a UTF-8 encoding of Unicode characters.

Enumerations

NormalizationForm: Defines the type of normalization to perform.
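
As a rough illustration of how two of the types listed above fit together, the following sketch uses a Decoder to convert bytes that arrive in separate chunks and a StringBuilder to accumulate the result. The class name ChunkedDecoding and the method DecodeChunks are hypothetical names chosen for this example. It relies on a Decoder keeping partial byte sequences between calls, so a multi-byte character split across two chunks should still decode as one character.

```csharp
using System;
using System.Text;

public static class ChunkedDecoding
{
    // Hypothetical helper: decodes a series of byte chunks with one Decoder
    // so that a character split across chunk boundaries is not lost.
    public static string DecodeChunks(Encoding encoding, params byte[][] chunks)
    {
        Decoder decoder = encoding.GetDecoder();
        StringBuilder result = new StringBuilder();

        foreach (byte[] chunk in chunks)
        {
            // Ask the decoder how many characters this chunk will produce,
            // taking any bytes buffered from the previous call into account.
            int count = decoder.GetCharCount(chunk, 0, chunk.Length);
            char[] chars = new char[count];
            int written = decoder.GetChars(chunk, 0, chunk.Length, chars, 0);

            // Append without creating an intermediate String.
            result.Append(chars, 0, written);
        }
        return result.ToString();
    }

    public static void Main()
    {
        // UTF-8 for "c" followed by U+00E9, with the two-byte sequence
        // for U+00E9 (0xC3 0xA9) split across two chunks.
        byte[] first = { 0x63, 0xC3 };
        byte[] second = { 0xA9 };

        string text = DecodeChunks(Encoding.UTF8, first, second);
        Console.WriteLine(text == "c\u00E9");
    }
}
```

Calling Encoding.UTF8.GetString on each chunk separately would instead treat the split bytes as two invalid sequences, which is the case a stateful Decoder is meant to avoid.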

Community Content

Use Unicode

The Encoding classes are primarily intended to convert between different encodings (code pages) and Unicode. Often one of the Unicode encodings is the "right" choice:

  • Encoding.UTF8
  • Encoding.Unicode
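
To illustrate why a Unicode encoding is usually the safer choice, here is a minimal sketch that round-trips a string through Encoding.UTF8, Encoding.Unicode, and, for contrast, Encoding.ASCII. The class name UnicodeRoundTrip and the helper RoundTrip are hypothetical names used only for this example.

```csharp
using System;
using System.Text;

public static class UnicodeRoundTrip
{
    // Hypothetical helper: encodes text to bytes and decodes it back
    // using the same encoding.
    public static string RoundTrip(Encoding encoding, string text)
    {
        byte[] bytes = encoding.GetBytes(text);
        return encoding.GetString(bytes);
    }

    public static void Main()
    {
        // "caf" + U+00E9, a space, then U+03A9 (Greek capital omega).
        string text = "caf\u00E9 \u03A9";

        // Both Unicode encodings can represent every character in the string.
        Console.WriteLine(RoundTrip(Encoding.UTF8, text) == text);
        Console.WriteLine(RoundTrip(Encoding.Unicode, text) == text);

        // ASCII cannot represent the two non-ASCII characters,
        // so the round trip does not return the original string.
        Console.WriteLine(RoundTrip(Encoding.ASCII, text) == text);

        // The byte counts differ between the two Unicode encodings.
        Console.WriteLine(Encoding.UTF8.GetByteCount(text));
        Console.WriteLine(Encoding.Unicode.GetByteCount(text));
    }
}
```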

Another consideration is how your application should respond to data errors. The System.Text.Encoding classes can silently replace invalid data with ? or a "best-fit" character, or apply a behavior of your choice (using the EncoderFallback and DecoderFallback classes). You can also choose to throw exceptions on data errors, either by using a throw-on-error flag in some classes (for example, the throwOnInvalidBytes constructor parameter of UTF8Encoding) or by using the EncoderExceptionFallback and DecoderExceptionFallback classes.
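
The difference between the silent and the throwing behaviors can be sketched as follows. The class FallbackDemo and its three methods are hypothetical names for this example; the encodings are obtained with the Encoding.GetEncoding overload that accepts an EncoderFallback and a DecoderFallback, and the last method assumes the UTF8Encoding constructor whose second argument asks for exceptions on invalid bytes.

```csharp
using System;
using System.Text;

public static class FallbackDemo
{
    // Silent behavior: characters that ASCII cannot represent become "?".
    public static string EncodeLossy(string text)
    {
        Encoding lossy = Encoding.GetEncoding(
            "us-ascii",
            new EncoderReplacementFallback("?"),
            new DecoderReplacementFallback("?"));
        return lossy.GetString(lossy.GetBytes(text));
    }

    // Throwing behavior: returns false if any character cannot be encoded.
    public static bool CanEncodeAsAscii(string text)
    {
        Encoding strict = Encoding.GetEncoding(
            "us-ascii",
            new EncoderExceptionFallback(),
            new DecoderExceptionFallback());
        try
        {
            strict.GetBytes(text);
            return true;
        }
        catch (EncoderFallbackException)
        {
            return false;
        }
    }

    // Throw-on-error flag: the second constructor argument of UTF8Encoding
    // requests an exception when invalid bytes are decoded.
    public static bool IsValidUtf8(byte[] bytes)
    {
        UTF8Encoding strictUtf8 = new UTF8Encoding(false, true);
        try
        {
            strictUtf8.GetString(bytes);
            return true;
        }
        catch (DecoderFallbackException)
        {
            return false;
        }
    }

    public static void Main()
    {
        Console.WriteLine(EncodeLossy("caf\u00E9"));
        Console.WriteLine(CanEncodeAsAscii("caf\u00E9"));
        Console.WriteLine(IsValidUtf8(new byte[] { 0x48, 0x69 }));
        Console.WriteLine(IsValidUtf8(new byte[] { 0xFF }));
    }
}
```

Which variant is appropriate depends on whether a silently altered character is acceptable for the data being processed.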

If you are concerned about the integrity of the data stream, throwing an exception is recommended. Otherwise, replacing with ? or a similar substitute may be acceptable. You may also want to consider an EncoderReplacementFallback/DecoderReplacementFallback with the U+FFFD Unicode Replacement Character. Best-fit fallback is often not recommended because it can cause data loss and confusion, and it is slower than simple ? character replacement. For ANSI code pages, however, best-fit behavior is the default.
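
A minimal sketch of the U+FFFD option on the decoding side is shown below. The names ReplacementCharDemo and DecodeAsciiWithReplacementChar are hypothetical. The example uses U+FFFD only for the decoder; the encoder fallback stays at "?" because, as an assumption worth checking for your target encoding, the replacement string must itself be encodable, and U+FFFD is not representable in ASCII.

```csharp
using System;
using System.Text;

public static class ReplacementCharDemo
{
    // Hypothetical helper: decodes bytes as ASCII and marks every byte that
    // is not valid ASCII with U+FFFD instead of "?".
    public static string DecodeAsciiWithReplacementChar(byte[] bytes)
    {
        Encoding ascii = Encoding.GetEncoding(
            "us-ascii",
            new EncoderReplacementFallback("?"),
            new DecoderReplacementFallback("\uFFFD"));
        return ascii.GetString(bytes);
    }

    public static void Main()
    {
        // "Hi" followed by a byte outside the ASCII range.
        byte[] data = { 0x48, 0x69, 0xFF };
        string decoded = DecodeAsciiWithReplacementChar(data);

        // The invalid byte is visible in the output as U+FFFD,
        // which cannot be mistaken for a literal question mark in the data.
        Console.WriteLine(decoded == "Hi\uFFFD");
    }
}
```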

 

Blogging about System.Text

For more discussion about System.Text, you can visit http://blogs.msdn.com/shawnste/archive/category/9711.aspx.