PERCON V3.0 (en)


In Unicode, the encoding of a base character with a diacritic can vary. A diacritic is an additional mark (e.g. an accent) that defines how a letter is pronounced or stressed. Consequently, several encodings can exist for one character. The character “Ä”, for example, can also be written as a string consisting of “A” and “¨”. Under certain circumstances this characteristic of Unicode can be a hindrance in programming. To allow identical characters with different encodings to be brought into a uniform format, PERCON offers the normalization function COMPOSED. COMPOSED combines a base character with the associated diacritic to form a single character. However, normalization can take place only if the Unicode variant UTF-16 is assigned to the input file, the output file, or both.
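The effect of composing a base character with its diacritic can be illustrated outside PERCON. The following is a minimal sketch in Python, assuming that Unicode normalization form NFC (as provided by the standard `unicodedata` module) is a reasonable stand-in for what COMPOSED does; it is not PERCON's own implementation, and the variable names are chosen for illustration only.

```python
import unicodedata

# Assumption: NFC from Python's standard library is used here only as an
# analogy for the COMPOSED function; it is not PERCON code.
decomposed = "A\u0308"   # "A" followed by COMBINING DIAERESIS (two code points)
composed = unicodedata.normalize("NFC", decomposed)

# Both strings are displayed as the same character but are encoded differently.
print(len(decomposed), len(composed))
print(composed == "\u00c4")   # U+00C4 is the precomposed character

# The UTF-16 (big-endian) byte sequences differ before and after composition.
print(decomposed.encode("utf-16-be").hex())
print(composed.encode("utf-16-be").hex())
```

A byte-wise comparison of the two UTF-16 sequences would report a mismatch even though both represent the same character, which is the kind of problem normalization is meant to avoid.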

The following format combinations are possible:

The Unicode variant UTF-16 is assigned only to the input file. When normalization is requested, normalization takes place first, followed by conversion.
The Unicode variant UTF-16 is assigned only to the output file. When normalization is requested, conversion takes place first, followed by normalization.
The Unicode variant UTF-16 is assigned to both the input file and the output file. Conversion serves only for the purpose of normalization.
The Unicode variant UTF-16 is assigned to neither the input file nor the output file. The requested normalization is ignored.
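The four combinations above can be summarized as a small decision function. This is only an illustrative sketch in Python: the function name `processing_order` and the step labels are hypothetical and do not correspond to any PERCON statement or operand. It covers only the case in which normalization has been requested.

```python
from typing import List


def processing_order(input_is_utf16: bool, output_is_utf16: bool) -> List[str]:
    """Return the order of steps when normalization has been requested.

    Hypothetical helper that mirrors the four format combinations;
    it is not part of PERCON.
    """
    if input_is_utf16 and output_is_utf16:
        # Conversion serves only for the purpose of normalization.
        return ["normalize"]
    if input_is_utf16:
        # UTF-16 on the input side only: normalize first, then convert.
        return ["normalize", "convert"]
    if output_is_utf16:
        # UTF-16 on the output side only: convert first, then normalize.
        return ["convert", "normalize"]
    # No UTF-16 on either side: the requested normalization is ignored.
    return ["convert"]


for inp in (True, False):
    for out in (True, False):
        print(f"input UTF-16={inp}, output UTF-16={out}: {processing_order(inp, out)}")
```

In each case the normalization step is applied on the side where the data is available as UTF-16, which is consistent with the requirement that at least one of the two files is assigned UTF-16.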

Note

Normalization does not take place automatically; it must always be requested by the user (see UNICODE-NORMALIZE in the ASSIGN-OUTPUT-FILE statement). The normalization procedure is very time-consuming, and the user should consequently only request it when it is really necessary.


Normalization