EDI Character Sets

BizTalk uses a character set to validate an entire EDI interchange. The character sets used for an X12-encoded message and an EDIFACT- or KEDIFACT-encoded message are determined in different ways.

EDIFACT Character Set

An EDIFACT-encoded interchange is self-describing in terms of its character set: the UNB1 data element identifies it. EDIFACT requires that tag names and separators/delimiters be ASCII characters, so UNB1 can be located before the code page is known, and the relevant code page can then be applied to the rest of the interchange.

When processing an incoming EDIFACT message, BizTalk Server determines the character set to use for that message from the UNB1 data element. No setting in the party or global properties is necessary.
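As an illustration of why an ASCII-only header makes this possible, the following Python sketch reads UNB1 from the raw bytes before choosing a code page. It is not BizTalk's implementation: the names `read_unb1`, `decode_interchange`, `CODE_PAGES`, and `DEFAULT_CODE_PAGE` are hypothetical, and the only mapping taken from this topic is UNOC to Latin-1. Treating every other identifier as UTF-8 is an assumption made for the example.

```python
# Hypothetical lookup table: only the UNOC -> Latin-1 pairing comes from this topic.
CODE_PAGES = {"UNOC": "latin-1"}
DEFAULT_CODE_PAGE = "utf-8"  # assumption for every other syntax identifier


def read_unb1(raw):
    """Return (syntax_identifier, syntax_version) read from the raw header bytes."""
    # Default EDIFACT separators; an optional UNA segment overrides them.
    component_sep, element_sep, terminator = b":", b"+", b"'"
    if raw.startswith(b"UNA"):
        # UNA is 9 bytes long: "UNA" followed by six service characters.
        component_sep = raw[3:4]
        element_sep = raw[4:5]
        terminator = raw[8:9]
        raw = raw[9:].lstrip(b"\r\n")
    if not raw.startswith(b"UNB" + element_sep):
        raise ValueError("interchange does not start with a UNB segment")
    # UNB1 is the first data element after the segment tag.
    unb1 = raw[4:].split(element_sep, 1)[0].split(terminator, 1)[0]
    parts = unb1.split(component_sep)
    identifier = parts[0].decode("ascii")
    version = parts[1].decode("ascii") if len(parts) > 1 else ""
    return identifier, version


def decode_interchange(raw):
    """Decode the whole interchange with the code page implied by UNB1.1."""
    identifier, _ = read_unb1(raw)
    return raw.decode(CODE_PAGES.get(identifier, DEFAULT_CODE_PAGE))
```

Because the tag and separators are plain ASCII bytes, `read_unb1` never needs to decode anything beyond the header to find the syntax identifier.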

When processing an outgoing EDIFACT message, BizTalk uses the character set in the party or global properties. You set the UNB1 data element in the UNB Segment Definition page of either the EDI Properties dialog box (if a party has been established for an interchange) or the EDI Global Properties dialog box (if no party has been established). UNB1 is a composite data element: UNB1.1, the Syntax Identifier, is mandatory and names the character set, and UNB1.2 is the syntax version number. BizTalk also uses the UNB1 data element to validate the values entered for properties in the Partner Agreement Manager; this validation runs when the entire property set is saved, not when you tab out of a field or display a different page.

The available character sets are KECA, UNOA, UNOB, UNOC, UNOD, UNOE, UNOF, UNOG, UNOH, UNOI, UNOJ, UNOK, UNOX, and UNOY. The default value is UNOB. The full character set for these levels is specified in ISO 9735 EDIFACT Syntax Rules.

Note
If the UNOC character set is encountered on an inbound or outbound interchange, the EDI Disassembler or EDI Assembler will use the Latin-1 code page instead of the UTF-8 code page. This is required because the UTF-8 encoding is not byte-compatible with UNOC (ISO 8859-1): UNOC encodes characters above the ASCII range as single bytes, and those bytes are not valid UTF-8 on their own. Some characters that are acceptable in UNOC will therefore cause an interchange to be suspended when processed as UTF-8.
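The incompatibility can be reproduced outside BizTalk. The Python sketch below is illustrative only; `try_decode` is a hypothetical helper that returns `None` where a pipeline would reject the data.

```python
# "ü" is a single byte (0xFC) in Latin-1, the code page used for UNOC.
raw = "Müller".encode("latin-1")


def try_decode(data, code_page):
    """Return the decoded text, or None if the bytes are invalid for code_page."""
    try:
        return data.decode(code_page)
    except UnicodeDecodeError:
        return None


as_latin1 = try_decode(raw, "latin-1")  # decodes cleanly
as_utf8 = try_decode(raw, "utf-8")      # fails: 0xFC is not valid UTF-8
```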

Characters in some EDIFACT character sets may be double-byte characters, whereas in other EDIFACT character sets they are single-byte characters. As a result, when you set the release criteria for batches based on the number of characters in the interchange, the number of bytes in a released interchange may differ depending on the character set used.
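A small Python sketch makes the difference between character count and byte count concrete. The helper `interchange_sizes` is hypothetical, and the encodings chosen (Latin-1, UTF-8, and UTF-16 as a stand-in for a double-byte encoding) are assumptions for illustration, not a statement of which code pages BizTalk maps each EDIFACT character set to.

```python
def interchange_sizes(text, code_page):
    """Return (character_count, byte_count) for text encoded with code_page."""
    return len(text), len(text.encode(code_page))


segment = "NAD+BY+++Müller'"

single_byte = interchange_sizes(segment, "latin-1")    # one byte per character
variable = interchange_sizes(segment, "utf-8")         # "ü" takes two bytes
double_byte = interchange_sizes(segment, "utf-16-le")  # two bytes per character
```

The character count is the same in all three cases; only the byte count changes.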

The UNA segment and the UNB segment name are limited to characters in the ASCII character set.

KEDIFACT Character Set

As with EDIFACT, the character set for a KEDIFACT-encoded interchange is established in the UNB1 data element, which you set in the UNB Segment Definition page of the EDI Properties dialog box. The value of the UNB1.1 element must be set to KECA.

X12 Character Set

When the BizTalk receive pipeline or send pipeline performs EDI validation of an X12-encoded message, it uses the X12 character set selected in the CharacterSet property of the pipeline. To set this property, open the Properties dialog box for the receive location or send port, click the ellipsis (...) button next to the receive or send pipeline, and then set the CharacterSet property for the Disassembler or Assembler.

The CharacterSet property of the pipeline is used to validate an X12 interchange because, unlike EDIFACT or KEDIFACT, an X12-encoded interchange is not self-describing in terms of its character set. Reading the ISA header with an ISO encoding rather than a UTF encoding may yield different values for party lookup. BizTalk must therefore know which character set to apply before party lookup takes place, which is earlier than the point at which it could obtain the character set configured for the party.
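The effect on party lookup can be sketched in Python. The helper `isa_sender_id` and the sample header are hypothetical; the header is truncated after ISA08 and contains a non-ASCII sender ID purely to show how the same bytes yield different lookup keys under different encodings.

```python
def isa_sender_id(raw, code_page):
    """Return ISA06 (interchange sender ID) after decoding raw with code_page."""
    text = raw.decode(code_page)
    element_sep = text[3]  # the character immediately after "ISA"
    fields = text.split(element_sep)
    return fields[6].strip()


# Truncated, illustrative ISA header (ISA01 through ISA08), written as UTF-8.
elements = ["ISA", "00", " " * 10, "00", " " * 10,
            "ZZ", "MÜLLER".ljust(15), "ZZ", "RECEIVER".ljust(15)]
header = "*".join(elements).encode("utf-8")

read_as_utf8 = isa_sender_id(header, "utf-8")
read_as_latin1 = isa_sender_id(header, "latin-1")  # same bytes, different ID
```

A party registered under the first value would not be found when the header is read with the second encoding.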

You specify the X12 character set to be used for party property validation in the X12 Interchange Envelope Generation page of either the EDI Properties dialog box (if a party has been established for an interchange) or the EDI Global Properties dialog box (if no party has been established). However, BizTalk uses these settings only to validate the values entered for the related properties, and only when the entire property set is saved (not when you tab out of a field or display a different page). The receive pipeline and send pipeline ignore these character set properties.

Note
If the character set selected in the party's EDI properties or the global properties does not match the character set selected for the receive or send pipeline, message validation errors could result. For example, this can happen if the X12 character set property in the EDI Properties dialog box is set to Extended while the CharacterSet property in the pipeline properties is set to Basic.

The available character sets are Basic and Extended (as documented in the X12 Specifications/Implementation Guides), and UTF8/Unicode.

Note
The values entered for the data-element separator, component-element separator, and segment terminator in party or global properties are limited to the values in the ASCII character set. These properties are not validated against the X12 character set.

The Basic character set includes the following uppercase letters, digits, space, and special characters: A through Z, 0 through 9, ! " & ' ( ) * + , - . / : ; ? = " " (space).

The Extended character set includes the characters in the Basic character set, and lowercase letters, select language characters, and other special characters: a through z, % @ [ ] _ { } \ | < > ~ # $.
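The two lists above translate directly into a membership check. The Python sketch below is illustrative rather than BizTalk's validation logic: `BASIC`, `EXTENDED`, and `invalid_characters` are hypothetical names, and the "select language characters" of the Extended set are left out because this topic does not enumerate them.

```python
import string

# Basic: uppercase letters, digits, space, and the listed special characters.
BASIC = set(string.ascii_uppercase + string.digits + "!\"&'()*+,-./:;?= ")

# Extended: Basic plus lowercase letters and the additional special characters.
# The "select language characters" are omitted; they are not listed in this topic.
EXTENDED = BASIC | set(string.ascii_lowercase + "%@[]_{}\\|<>~#$")


def invalid_characters(text, charset):
    """Return the sorted characters in text that fall outside charset."""
    return sorted(set(text) - charset)


rejected_by_basic = invalid_characters("Acme Corp", BASIC)
rejected_by_extended = invalid_characters("Acme Corp", EXTENDED)
```

The same value passes under Extended but fails under Basic, which is the kind of mismatch that arises when the party properties and the pipeline are set to different character sets.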

Other Resources

EDI Schemas
