In Unicode mode, EDT permits the entry of characters which cannot be typed directly, for example because they are not defined in the character set used by the source of the input. Such characters are entered in the form of a substitute representation in which the UTF16 code of the character is specified directly. To do this, @PAR ESCAPE-CHARACTER is used to declare a global or work file-specific escape character which introduces the substitute representation. By default, no substitution is performed (ESCAPE-CHARACTER=*NONE). The DATA-REPLACEMENT operand in the @PAR statement can be used to define the context in which substitution takes place.
By default, the substitute representation is only evaluated within statements, and there only in literals (DATA-REPLACEMENT=OFF). Setting DATA-REPLACEMENT=ON causes it to be evaluated in data input as well.
The substitute representation has the form specUxxxx, i.e. the escape character (spec) is followed by a U or u (for Unicode) and exactly 4 hexadecimal digits which specify the code of the character. If, for example, the escape character % has been specified using @PAR ESCAPE-CHARACTER='%', then the Greek Ω can be entered in the form %U03A9 or %u03a9 (the input is not case-sensitive).
If @PAR ESCAPE-CHARACTER=*NONE (default setting) applies, either globally or for the current work file, if the substitute representation is formally incorrect, or if no valid UTF16 character corresponds to the entered code, then the substitute representation is treated as a normal string. In F mode, the substitute representation is also not converted on data entry if its string extends beyond a column position for which a hardware tab has been defined.
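The rules above can be illustrated with a short Python sketch. It is a model of the described behavior, not EDT code: the function name expand_substitutes is hypothetical, passing None stands in for ESCAPE-CHARACTER=*NONE, and treating lone surrogate code units (D800 to DFFF) as "no valid UTF16 character" is an assumption, since the text does not say which codes EDT rejects. The hardware tab rule for F mode is not modeled.

```python
import re


def expand_substitutes(text, escape_char=None):
    """Replace <escape>Uxxxx sequences with the character they encode.

    Anything that does not match the form is left unchanged, i.e. it is
    treated as a normal string.
    """
    if escape_char is None:
        # Models ESCAPE-CHARACTER=*NONE: no substitution at all.
        return text

    # Escape character, then U or u, then exactly four hexadecimal digits.
    pattern = re.compile(re.escape(escape_char) + r"[Uu]([0-9A-Fa-f]{4})")

    def replace(match):
        code = int(match.group(1), 16)
        # Assumption: a lone surrogate code unit is not a valid character,
        # so the sequence is kept as typed.
        if 0xD800 <= code <= 0xDFFF:
            return match.group(0)
        return chr(code)

    return pattern.sub(replace, text)
```

With % as the escape character, both %U03A9 and %u03a9 yield Ω, whereas a formally incorrect sequence such as %U03G9 is returned unchanged.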
If a valid UTF16 character has been specified but cannot be converted into the target character set, then the procedure is the same as if an invalid character had been entered directly (e.g. via a corresponding keyboard).
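Whether a character fits the target character set can be illustrated in the same way. The following Python sketch uses Python codec names as stand-ins for EDT character sets, which is an assumption made purely for illustration. The helper name convertible is hypothetical.

```python
def convertible(char, target_charset):
    """Return True if char can be encoded in the given target character set."""
    try:
        char.encode(target_charset)
        return True
    except UnicodeEncodeError:
        # The character is valid Unicode but has no place in the target set.
        return False
```

For example, Ω (U+03A9) is convertible to the Greek set iso8859-7 but not to iso8859-1, so in the second case it would be handled like an invalid character typed directly.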
The interpretation is independent of whether the characters entered using the substitute representation can be displayed on the screen. As described in the previous section, characters which cannot be displayed are represented by the smudge character.