Switches the character set of all XML files in a fileset to the desired encoding. The transcoding is performed using XSLT, so any encoding supported by the XSLT processor (currently Saxon8) should be supported. Characters that can't be represented in the output encoding will be converted to numeric character references. For example, '√' will be converted to &#8730; if not supported by the output character set.
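The behaviour described above can be approximated with a few lines of Java. This is only a minimal sketch: it uses the identity transformer that ships with the JDK (JAXP) rather than Saxon8, and the class and method names (TranscodeSketch, transcode) are hypothetical, not part of this transformer.

```java
import java.io.ByteArrayOutputStream;
import java.io.StringReader;
import java.nio.charset.Charset;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper, not the transformer's actual code.
final class TranscodeSketch {

    // Re-serializes an XML document in the requested encoding.
    // Assumption: the JDK's default XSLT implementation is used, not Saxon8.
    static String transcode(String xml, String encoding) {
        try {
            // A Transformer created without a stylesheet performs an identity copy.
            Transformer identity = TransformerFactory.newInstance().newTransformer();
            // The serializer escapes characters the encoding cannot represent.
            identity.setOutputProperty(OutputKeys.ENCODING, encoding);

            ByteArrayOutputStream out = new ByteArrayOutputStream();
            identity.transform(new StreamSource(new StringReader(xml)),
                               new StreamResult(out));
            // Decode the bytes again only so the result is easy to inspect.
            return new String(out.toByteArray(), Charset.forName(encoding));
        } catch (TransformerException e) {
            throw new IllegalStateException("Transcoding failed", e);
        }
    }

    public static void main(String[] args) {
        // U+221A (square root sign) has no representation in US-ASCII.
        System.out.println(transcode("<root>\u221A</root>", "US-ASCII"));
        System.out.println(transcode("<root>\u221A</root>", "UTF-8"));
    }
}
```

With US-ASCII as the target, the square root sign should come out as a numeric character reference; with UTF-8 it is written as the character itself.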
It is also possible to specify whether the XML documents should use unix, dos or mac line breaks.
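As an illustration of the three line break styles, the following sketch converts a string to unix (LF), dos (CR LF) or mac (CR) line breaks. It is an assumption about what the conversion amounts to, not the transformer's implementation. The names LineBreakSketch and convert are hypothetical, and the "default" value is deliberately left out because its exact meaning is not described here.

```java
// Hypothetical helper, not the transformer's actual code.
final class LineBreakSketch {

    // Converts every line break in the text to the requested style.
    static String convert(String text, String style) {
        String target;
        switch (style) {
            case "unix": target = "\n";   break; // LF
            case "dos":  target = "\r\n"; break; // CR LF
            case "mac":  target = "\r";   break; // CR (classic Mac OS)
            default:
                throw new IllegalArgumentException("Unknown style: " + style);
        }
        // Normalize to LF first so that CR LF is not counted as two breaks.
        String normalized = text.replace("\r\n", "\n").replace('\r', '\n');
        return normalized.replace("\n", target);
    }

    public static void main(String[] args) {
        String mixed = "a\r\nb\rc\nd";
        // Print the number of characters to make the difference visible.
        System.out.println(convert(mixed, "unix").length());
        System.out.println(convert(mixed, "dos").length());
    }
}
```

Normalizing to a single style before substituting keeps mixed input from producing doubled line breaks.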
The transformer is written to work on any file/fileset that can be represented by the org.daisy.util.fileset package.
Character set transcoding will only be done on XML members of the input fileset; all other types of members pass through untouched.
If no file in the fileset is of type XML, then the whole fileset will pass through untouched. It is therefore safe to place this transformer in contexts whose dataflow varies considerably.
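The pass-through rule can be pictured with a small model. The sketch below does not use the org.daisy.util.fileset API. It represents a fileset as a map from file name to content, and it assumes, purely for illustration, that XML members can be recognized by a .xml suffix. All names in it (PassThroughSketch, looksLikeXml, process) are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.function.UnaryOperator;

// Hypothetical model of the pass-through rule, not the transformer's actual code.
final class PassThroughSketch {

    // Assumption: a member counts as XML if its name ends in ".xml".
    static boolean looksLikeXml(String fileName) {
        return fileName.toLowerCase(Locale.ROOT).endsWith(".xml");
    }

    // Applies the transcoder to XML members only; everything else is
    // handed on unchanged (the very same byte array is reused).
    static Map<String, byte[]> process(Map<String, byte[]> members,
                                       UnaryOperator<byte[]> transcoder) {
        Map<String, byte[]> result = new LinkedHashMap<>();
        for (Map.Entry<String, byte[]> member : members.entrySet()) {
            byte[] content = member.getValue();
            result.put(member.getKey(),
                       looksLikeXml(member.getKey())
                           ? transcoder.apply(content)
                           : content);
        }
        return result;
    }
}
```

If no member matches, the loop simply copies every entry, which corresponds to the whole fileset passing through untouched.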
A file/fileset whose XML members have been transcoded, and optionally have had certain characters substituted by replacement strings. See parameters.
No specific recovery scheme. On error, this transformer will send a fatal message, then throw an exception and abort.
Output encoding: utf-8 is used as default.
Line breaks: unix, dos, mac and default. The default value is (unsurprisingly) default.
Most of the functionality of this transformer could also be performed using the int_daisy_unicodeTranscoder. This transformer can probably be deprecated when some third party packages used by the int_daisy_unicodeTranscoder become more stable.
Currently the Saxon8 XSLT processor is used to perform the actual transcoding.
Linus Ericson, TPB
LGPL