EDT permits only character sets that are supported by the current XHCS installation. This applies to both batch mode and interactive mode.
The communication character set must be compatible with the terminal. However, it can be different from the data character set.
The necessary conversions are handled via XHCS and the properties of the characters (uppercase/lowercase, special characters) are provided by XHCS.
By default, the Unicode character sets, all the EBCDIC character sets and the ISO character sets are permitted.
Handling invalid characters
Illegal byte sequences may occur in Unicode character sets. Thus, for example, in UTFE or UTF8 several multibyte start characters may occur in sequence. EDT always rejects the entry of this type of illegal byte sequence both when reading files or variables and in the case of input in hex mode.
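The following minimal sketch illustrates what an illegal UTF8 byte sequence of this kind looks like. It uses Python's own strict UTF-8 decoder as a stand-in and does not show how EDT performs the check internally; the function name `is_valid_utf8` is hypothetical. UTFE is not covered because Python has no built-in codec for it.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes as strict UTF-8, False otherwise."""
    try:
        data.decode("utf-8", errors="strict")
        return True
    except UnicodeDecodeError:
        return False


# xC3 is the start byte of a 2-byte sequence and must be followed
# by a continuation byte (x80-xBF).
legal = b"\xc3\xa4"        # start byte + continuation byte
illegal = b"\xc3\xc3\xa4"  # two start bytes in sequence

print(is_valid_utf8(legal))    # True
print(is_valid_utf8(illegal))  # False
```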
In UTF16, only characters from the surrogate range (xD800-xDFFF) are illegal.
All other characters, i.e. all other 2-byte sequences, are accepted even if they cannot be displayed at the terminal. Special EDT semantic considerations are also ignored, e.g. the requirement that a character should not cause a line feed. In particular, even a Byte Order Mark (BOM) has no effect but is simply transferred as a character.
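The UTF16 rule can be sketched as a simple range check on each 2-byte unit. This is an illustration only, not EDT's implementation: the function name `find_illegal_units` is hypothetical, and the sketch assumes big-endian 2-byte units.

```python
import struct

SURROGATE_FIRST = 0xD800
SURROGATE_LAST = 0xDFFF


def find_illegal_units(data: bytes) -> list:
    """Return the byte offsets of 2-byte units in the surrogate range.

    Assumption: data consists of big-endian 2-byte units.
    """
    if len(data) % 2 != 0:
        raise ValueError("data must consist of whole 2-byte units")
    offsets = []
    for offset in range(0, len(data), 2):
        (unit,) = struct.unpack_from(">H", data, offset)
        if SURROGATE_FIRST <= unit <= SURROGATE_LAST:
            offsets.append(offset)
    return offsets


# A BOM (xFEFF) followed by 'A' (x0041): both units are ordinary characters.
print(find_illegal_units(b"\xfe\xff\x00\x41"))  # []
# 'A' followed by a unit from the surrogate range (xD800).
print(find_illegal_units(b"\x00\x41\xd8\x00"))  # [2]
```

Note that the BOM receives no special treatment in the check: it passes through like any other 2-byte unit.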
In the case of 7-bit character sets or incompletely defined 8-bit character sets, all bytes are accepted and are kept unchanged for reasons of compatibility. However, these undefined characters can never be converted into another character set.
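The difference between accepting a byte and converting it can be sketched with a deliberately incomplete mapping table. Everything here is hypothetical: the table `PARTIAL_CHARSET`, the function names, and the use of `ValueError` to signal a failed conversion. The source does not state how EDT reports such a failure.

```python
# Hypothetical, incompletely defined 8-bit character set:
# only three byte values have a defined character.
PARTIAL_CHARSET = {0x41: "A", 0x42: "B", 0x43: "C"}


def accept(data: bytes) -> bytes:
    """Accept every byte unchanged, whether it is defined or not."""
    return bytes(data)


def convert_to_unicode(data: bytes) -> str:
    """Convert via the table; undefined bytes cannot be converted."""
    chars = []
    for offset, byte in enumerate(data):
        if byte not in PARTIAL_CHARSET:
            # Assumption: a failed conversion is signalled as an error.
            raise ValueError(f"byte x{byte:02X} at offset {offset} is undefined")
        chars.append(PARTIAL_CHARSET[byte])
    return "".join(chars)


stored = accept(b"\x41\xff")                # kept unchanged, including xFF
print(convert_to_unicode(b"\x41\x42\x43"))  # ABC
```

Calling `convert_to_unicode(stored)` raises the error in this sketch, because xFF has no entry in the table.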
Handling of national 7-bit character sets
In Unicode mode, all the 7-bit character sets that are defined in XHCS are permitted (for the handling of national 7-bit character sets, see section “The character set EDF03DRV”).
The @SHOW CCS statement can be used to query the currently supported character sets.