
Re-encode

This statement re-encodes the contents of one or more fields or a named variable. When a named variable is specified, the function is only performed once, regardless of the execution mode. The named variable field may contain any of the escape sequences described in Escape Sequences.

For all functions, the source field is not modified if the function fails for any reason.

The following functions are available:

Cyrillic
Greek
ISO Latin-2
Turkish
WinLatin-1
WinLatin-2

The ID3 specification uses ISO Latin-1 as its 8-bit text encoding. Before UTF-8 was supported, many people wrote their mp3 fields in languages whose characters are not supported in ISO Latin-1. When Yate reads these files, fields which specify an encoding of ISO Latin-1 may not display the correct characters if the data was not in fact ISO Latin-1.

This statement allows you to specify the original encoding and attempt to re-encode to the actual encoding. Modifications will be made wherever possible. Note that if a field currently contains characters which cannot be represented in ISO Latin-1, no modifications will occur.

The algorithm essentially re-encodes the Mac's internal representation of a string back to ISO Latin-1 and then interprets the resulting raw data using your specified encoding.
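The round trip described above can be sketched in Python. This is only an illustration, not Yate's implementation: the `reencode` helper is hypothetical, and the choice of `cp1251` to stand in for the Cyrillic function is an assumption, since the exact code page Yate uses is not stated here.

```python
def reencode(text: str, original_encoding: str) -> str:
    """Reinterpret text that was wrongly read as ISO Latin-1 (hypothetical helper)."""
    try:
        # Step 1: recover the raw bytes by encoding back to ISO Latin-1.
        raw = text.encode("latin-1")
    except UnicodeEncodeError:
        # The field contains characters outside ISO Latin-1: leave it unmodified.
        return text
    try:
        # Step 2: interpret those bytes using the encoding they were written in.
        return raw.decode(original_encoding)
    except UnicodeDecodeError:
        # The function failed, so the source is returned unchanged.
        return text


# Simulate a Cyrillic field that was stored as cp1251 bytes but read as Latin-1.
garbled = "Привет".encode("cp1251").decode("latin-1")
fixed = reencode(garbled, "cp1251")
```

Passing a string that already contains non-Latin-1 characters returns it untouched, which mirrors the rule that such fields are not modified.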

Unicode UNFC

Unicode supports the encoding of most accented characters either as a single precomposed character or as a decomposed sequence. For example, a precomposed É has a string length of 1, while a decomposed É has a string length of 2. The string displays correctly regardless of the encoding. When Unicode UNFC is selected, the associated fields are converted to their precomposed encoding. UNFC stands for Unicode Normalization Form C. Note that this transformation should rarely be required.
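The length difference can be illustrated with Python's standard `unicodedata` module, where Normalization Form C is named `"NFC"`. This shows the Unicode transformation itself, not Yate's code.

```python
import unicodedata

# "E" followed by U+0301 COMBINING ACUTE ACCENT: the decomposed form of É.
decomposed = "E\u0301"

# Normalization Form C composes the pair into the single character U+00C9.
precomposed = unicodedata.normalize("NFC", decomposed)

decomposed_length = len(decomposed)    # two code points
precomposed_length = len(precomposed)  # one code point
```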

Force ISO Latin-1

This function attempts to ensure that every character in the result can be represented as an ISO Latin-1 character. It does so by changing various characters to their similar ISO Latin-1 equivalents, removing accents if necessary and, as a last resort, changing characters which cannot be represented as ISO Latin-1 to underscore characters. Unicode UNFC and Fold Characters are applied.
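A rough sketch of this fallback order in Python follows. The `force_latin1` function and the small `FOLD` table are hypothetical stand-ins: Yate's actual substitution list and the precise order of its steps are not given here, so treat this purely as an illustration of the idea.

```python
import unicodedata

# Illustrative folding table only; not Yate's real substitution list.
FOLD = {
    "\u2018": "'", "\u2019": "'",   # curly single quotes
    "\u201c": '"', "\u201d": '"',   # curly double quotes
    "\u2013": "-", "\u2014": "-",   # en dash, em dash
}


def force_latin1(text: str) -> str:
    """Return a string whose characters all fit in ISO Latin-1 (hypothetical)."""
    result = []
    for ch in unicodedata.normalize("NFC", text):
        ch = FOLD.get(ch, ch)
        if ord(ch) < 256:
            # Already representable in ISO Latin-1.
            result.append(ch)
            continue
        # Try the unaccented base character(s).
        base = "".join(
            c for c in unicodedata.normalize("NFD", ch)
            if not unicodedata.combining(c)
        )
        if all(ord(c) < 256 for c in base):
            result.append(base)
        else:
            # Last resort: substitute an underscore.
            result.append("_")
    return "".join(result)
```

With this sketch, `ř` loses its caron because it has no Latin-1 form, `á` is kept because it does, and a Greek `Ω` becomes an underscore.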

Remove Accents

This function re-encodes all accented characters to their baseline unaccented characters, wherever possible.
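One common way to strip accents, shown here as an assumption about the general technique rather than Yate's method, is to decompose the text, drop the combining marks, and recompose. The `remove_accents` name is hypothetical.

```python
import unicodedata


def remove_accents(text: str) -> str:
    """Replace accented characters with their unaccented base (hypothetical)."""
    # Split each accented character into base character + combining marks.
    decomposed = unicodedata.normalize("NFD", text)
    # Keep everything except the combining marks.
    stripped = "".join(c for c in decomposed if not unicodedata.combining(c))
    return unicodedata.normalize("NFC", stripped)


plain = remove_accents("Édith Piaf à Zürich")
```

Characters with no decomposition, such as `ø`, pass through unchanged under this approach, which is one reading of "wherever possible".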

Fold Characters

This function changes various characters to their similar Latin-1 equivalents. Currently this includes single and double quote equivalents as well as dash/hyphen equivalents. Unicode UNFC is applied. A complete list of the current substitutions can be found here.

ASCII (Lossy)

This function re-encodes as ASCII, discarding characters which cannot be represented.

ISO Latin-1 (Lossy)

This function re-encodes as ISO Latin-1, discarding characters which cannot be represented.
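The effect of the two lossy functions can be approximated in Python with the `errors="ignore"` handler, which drops any character the target encoding cannot represent. The `lossy` helper is hypothetical and only illustrates the behavior described above.

```python
def lossy(text: str, encoding: str) -> str:
    """Drop every character that the given encoding cannot represent."""
    return text.encode(encoding, errors="ignore").decode(encoding)


sample = "naïve café €5"
as_ascii = lossy(sample, "ascii")      # accented letters and the euro sign are dropped
as_latin1 = lossy(sample, "latin-1")   # only the euro sign is dropped
```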

Remove RTF Formatting

If the data is properly structured RTF, the formatting will be removed, leaving only the text.

Note that the lossy variants should be used instead of the Lossy Encode As statement, which has been deprecated as of v3.14.