How XLIFF/XLF Files are Handled as Source Files in XTM Cloud (v.1.2)

Introduction

This article provides information about how XTM Cloud handles XLIFF/XLF files.

Note that Source XLIFF files and Offline XLIFF files are two different things! To learn more about Offline XLIFF files, read the following article: How to translate offline (use of XLIFF files).


XLIFF/XLF files in XTM Cloud

XTM Cloud processes source XLIFF files in accordance with the XLIFF 1.2 Specification. This means that, when an XLIFF source file is being read in XTM Workbench, the following can be imported:

  1. The translation contained within the target element of a trans-unit.

IMPORTANT!

The target element always has the highest priority when it comes to populating translations in XTM Workbench.

  2. The state attribute of the target element in a trans-unit, which is imported as a segment state in XTM Workbench. By default, signed-off is imported as completed. Other state values are imported as incomplete, but a different mapping can be configured.

  3. The state-qualifier attribute of the target element in a trans-unit, which can be used to show which TM match type has been used to translate the segment. Supported values:

  • leveraged-inherited → for a Repetition.

  • fuzzy-match → for a Fuzzy match.

  • leveraged-tm → for a Leveraged match.

  • x-fuzzy-forward → for a Fuzzy repetition.

  • exact-match → for an ICE match.

  • mt-suggestion → for a Machine translation match.

  • x-alphanum → for a Non-translatable alphanumeric segment.

  • x-numeric → for a Non-translatable numeric segment.

  4. TM matches from alt-trans elements, with their quality taken from the match-quality attribute and their match type taken from the extype attribute. Supported values:

  • The match-quality attribute takes a percentage value (e.g. "95%"):

    • If it is missing, XTM Cloud assumes there is no match.

    • If it is below 75%, XTM Cloud also assumes there is no match.

  • The extype attribute takes one of the following values:

    • leveraged-inherited → for a Repetition.

    • fuzzy-match → for a Fuzzy match (it will be shown as 100% match in XTM Workbench if no match-quality is set).

    • leveraged-match → for a Leveraged match.

    • fuzzy-forward → for a Fuzzy repetition (it will be shown as 100% match in XTM Workbench if no match-quality is set).

    • exact-match → for an ICE match.

    • MACHINE-TRANSLATION → for Machine translation; it is also possible to add the state-qualifier attribute with the mt-suggestion value to the <target> element of a <trans-unit>.

  5. Comments from the note element in a trans-unit.

  6. Length limitations from the maxwidth or minwidth attributes of a trans-unit element.

  7. Segment IDs and additional custom columns, which can be created with additional configuration.
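To make the list above more concrete, the following Python sketch reads the same pieces of information from a small XLIFF 1.2 sample using only the standard library. It is an illustration of where the data lives in the file, not a description of how XTM Cloud is implemented. The sample content, the function names (read_trans_unit, read_alt_trans, parse_match_quality) and the returned dictionary layout are assumptions made for this example. For alt-trans elements, the sketch only models the match-quality rule (missing or below 75% means no match) and does not model how fuzzy-match or fuzzy-forward entries without a match-quality are displayed.

```python
import xml.etree.ElementTree as ET

# XLIFF 1.2 namespace, bound to the hypothetical prefix "x" for lookups.
NS = {"x": "urn:oasis:names:tc:xliff:document:1.2"}

# state-qualifier values and the match types they stand for, as listed above.
MATCH_TYPES = {
    "leveraged-inherited": "Repetition",
    "fuzzy-match": "Fuzzy match",
    "leveraged-tm": "Leveraged match",
    "x-fuzzy-forward": "Fuzzy repetition",
    "exact-match": "ICE match",
    "mt-suggestion": "Machine translation match",
    "x-alphanum": "Non-translatable alphanumeric segment",
    "x-numeric": "Non-translatable numeric segment",
}

# Invented sample file, used only for this illustration.
SAMPLE = """<xliff version="1.2" xmlns="urn:oasis:names:tc:xliff:document:1.2">
  <file original="demo.txt" source-language="en" target-language="fr" datatype="plaintext">
    <body>
      <trans-unit id="1" maxwidth="40" size-unit="char">
        <source>Save changes</source>
        <target state="signed-off" state-qualifier="exact-match">Enregistrer</target>
        <note>Button label</note>
        <alt-trans match-quality="95%" extype="fuzzy-match">
          <source>Save change</source>
          <target>Enregistrer la modification</target>
        </alt-trans>
        <alt-trans match-quality="60%" extype="fuzzy-match">
          <source>Save</source>
          <target>Sauver</target>
        </alt-trans>
      </trans-unit>
    </body>
  </file>
</xliff>"""


def parse_match_quality(value):
    """Turn a value such as "95%" into 95.0; return None if missing or invalid."""
    if not value:
        return None
    try:
        return float(value.strip().rstrip("%"))
    except ValueError:
        return None


def read_alt_trans(alt):
    """Return a TM match dict, or None when the entry counts as no match."""
    quality = parse_match_quality(alt.get("match-quality"))
    if quality is None or quality < 75:
        return None  # missing or below 75%: treated as no match
    target = alt.find("x:target", NS)
    return {
        "extype": alt.get("extype"),
        "quality": quality,
        "target": "".join(target.itertext()) if target is not None else "",
    }


def read_trans_unit(tu):
    """Collect the data of one trans-unit that the list above describes."""
    target = tu.find("x:target", NS)
    qualifier = target.get("state-qualifier") if target is not None else None
    matches = (read_alt_trans(alt) for alt in tu.findall("x:alt-trans", NS))
    return {
        "id": tu.get("id"),
        "target": "".join(target.itertext()) if target is not None else None,
        "state": target.get("state") if target is not None else None,
        "match_type": MATCH_TYPES.get(qualifier),
        "notes": [note.text for note in tu.findall("x:note", NS)],
        "maxwidth": tu.get("maxwidth"),
        "minwidth": tu.get("minwidth"),
        "tm_matches": [m for m in matches if m is not None],
    }


root = ET.fromstring(SAMPLE)
unit = read_trans_unit(root.find(".//x:trans-unit", NS))
print(unit)
```

In this sample, the second alt-trans entry is dropped because its match-quality is below 75%, while the first one is kept as a TM match.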

If the source XLIFF files store metadata in elements such as iws:segment-metadata, that metadata is not read by default. The XTM International Support team can try to create a configuration that extracts these attributes as custom columns, to make them visible to Linguists.

Additional information about TM matches cannot be displayed, because custom columns are extracted per segment: if more than one TM match were present, it would not be possible to see which TM match the information refers to.

A segment in an XLIFF file is only excluded from translation if the translate attribute of its trans-unit is set to no. An alternative approach is to set the state attribute of the target element to a value that is imported as completed (signed-off by default), which causes the segment to be marked as Completed in XTM Workbench.
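The difference between excluding a segment and importing it as completed can be sketched as follows. This is a simplified model of the rules described in this section, not XTM Cloud's actual logic. The function name workbench_status, the status labels ("excluded", "completed", "incomplete") and the sample file are assumptions for illustration, and the sketch only looks at the translate attribute on the trans-unit itself.

```python
import xml.etree.ElementTree as ET

XLIFF_NS = "urn:oasis:names:tc:xliff:document:1.2"
NS = {"x": XLIFF_NS}

# Default mapping described above: signed-off is imported as completed,
# every other state as incomplete. A custom mapping can be passed instead.
DEFAULT_STATE_MAPPING = {"signed-off": "completed"}

# Invented sample file, used only for this illustration.
SAMPLE = """<xliff version="1.2" xmlns="urn:oasis:names:tc:xliff:document:1.2">
  <file original="demo.txt" source-language="en" target-language="fr" datatype="plaintext">
    <body>
      <trans-unit id="1" translate="no"><source>XTM-2024</source></trans-unit>
      <trans-unit id="2"><source>Open</source><target state="signed-off">Ouvrir</target></trans-unit>
      <trans-unit id="3"><source>Close</source><target state="translated">Fermer</target></trans-unit>
      <trans-unit id="4"><source>Help</source></trans-unit>
    </body>
  </file>
</xliff>"""


def workbench_status(tu, mapping=DEFAULT_STATE_MAPPING):
    """Return an illustrative status label for one trans-unit."""
    if tu.get("translate") == "no":
        return "excluded"  # the only case in which a segment is excluded
    target = tu.find("x:target", NS)
    state = target.get("state") if target is not None else None
    # Assumption: a unit without a target or state counts as incomplete.
    return mapping.get(state, "incomplete")


root = ET.fromstring(SAMPLE)
statuses = {
    tu.get("id"): workbench_status(tu)
    for tu in root.iter(f"{{{XLIFF_NS}}}trans-unit")
}
print(statuses)
```

Passing a different mapping, for example {"signed-off": "completed", "final": "completed"}, mimics a customized state mapping.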


What can be customized within the XLF/XLIFF parser?

As mentioned in the previous section, XTM Cloud lets you decide which specific XLF/XLIFF elements are extracted and which are not:

  • The standard XTM Cloud XLF/XLIFF parser can be customized to extract segment IDs or custom columns.

  • The comments (from note elements) and maxlength and minlength (from maxwidth and minwidth attributes of trans-unit) are extracted by default.

  • The language-based segmentation can also be adjusted (sentence-based or paragraph-based segmentation).

  • The content-based segmentation (on inline tags) and the translation rules are not customizable.

  • The XLIFF state attribute can be mapped to a preferred XTM Workbench status, and the XTM Workbench status can be mapped to a preferred target state attribute for XLIFF.

  • The target language code can be mapped to a preferred one for a particular language.

The advanced XTM Cloud XLF/XLIFF parser supports everything the standard parser does, and adds the ability to exclude content from translation, break segments on inline elements, and hide or lock already translated segments. For more information, contact XTM International Support.

Custom segmentation

It is usually not possible to change the segmentation for XLIFF source files due to a technical limitation inherent to the bilingual nature of those files.

In the vast majority of cases, if XLIFF has already been pre-translated, with each <trans-unit> containing both <source> and <target> elements, the segmentation cannot be altered, as the translation within the <target> element has already been aligned with the corresponding <source> element. Modifying the segmentation would result in discrepancies between the provided translation and the source text, making it impossible to properly match them. This is a standard behavior of XLIFF files.

In general, the custom segmentation works only if:

  1. There is no <seg-source> element in a particular <trans-unit>.

  2. There is no <target> element or it is equal to <source> in a particular <trans-unit>.

  3. There are no <mrk> elements inside <source> or <target> elements.
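The three conditions above can be checked before submitting a file, for example with a short Python script such as the one below. This is a sketch under assumptions: the function name custom_segmentation_possible and the sample file are invented for this example, and "equal to <source>" is interpreted here as a comparison of the plain text content of both elements, which may differ from the comparison XTM Cloud performs.

```python
import xml.etree.ElementTree as ET

XLIFF_NS = "urn:oasis:names:tc:xliff:document:1.2"
NS = {"x": XLIFF_NS}

# Invented sample file, used only for this illustration.
SAMPLE = """<xliff version="1.2" xmlns="urn:oasis:names:tc:xliff:document:1.2">
  <file original="demo.txt" source-language="en" target-language="fr" datatype="plaintext">
    <body>
      <trans-unit id="a"><source>One. Two.</source></trans-unit>
      <trans-unit id="b"><source>One. Two.</source><seg-source><mrk mtype="seg" mid="1">One.</mrk> <mrk mtype="seg" mid="2">Two.</mrk></seg-source></trans-unit>
      <trans-unit id="c"><source>One. Two.</source><target>Un. Deux.</target></trans-unit>
      <trans-unit id="d"><source>One. Two.</source><target>One. Two.</target></trans-unit>
      <trans-unit id="e"><source><mrk mtype="protected">One.</mrk> Two.</source></trans-unit>
    </body>
  </file>
</xliff>"""


def plain_text(element):
    """Text content of an element, ignoring inline markup."""
    return "".join(element.itertext())


def custom_segmentation_possible(tu):
    """Check the three conditions listed above for one trans-unit."""
    # 1. No <seg-source> element in the trans-unit.
    if tu.find("x:seg-source", NS) is not None:
        return False
    source = tu.find("x:source", NS)
    target = tu.find("x:target", NS)
    if source is None:
        return False
    # 2. No <target>, or a <target> equal to <source> (text comparison).
    if target is not None and plain_text(target) != plain_text(source):
        return False
    # 3. No <mrk> elements inside <source> or <target>.
    for element in (source, target):
        if element is not None and element.find(".//x:mrk", NS) is not None:
            return False
    return True


root = ET.fromstring(SAMPLE)
result = {
    tu.get("id"): custom_segmentation_possible(tu)
    for tu in root.iter(f"{{{XLIFF_NS}}}trans-unit")
}
print(result)
```

Only units "a" (no target) and "d" (target identical to source) pass all three checks in this sample.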


How to get target elements with the same values as the source written into the target files

Sometimes, your XLIFF source file might contain translation units with the attribute translate="no", which are therefore not displayed in XTM Workbench.

Nonetheless, you would still like to include such units in the generated target files, with values of the source and target elements being the same.

Unfortunately, it is not possible for a target XLIFF file to contain the <target> element within a <trans-unit> that has the attribute translate="no" and only contains the <source> element.
This limitation stems from the XLIFF file specification. If a <trans-unit> contains only the <source> element and that element has not been modified in any way, no <target> element will be created.

There are two potential ways to achieve the behavior you're looking for:

  1. Remove the translate="no" Attribute: This would allow the relevant elements to appear in Workbench. You could then "skip" these elements, which would result in the creation of a <target> element, where the <source> and <target> are the same.

  2. Create a Configuration to Display the translate="no" Elements in Workbench: XTM Support can set up a configuration that would display <trans-unit> elements with the translate="no" attribute in XTM Workbench. Once displayed, you would again be able to "skip" them, which would trigger the creation of the <target> element.

Both options require the relevant <trans-unit> elements to be displayed in XTM Workbench and confirmed in order for the <target> elements to be created.
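If you choose the first option, the translate="no" attribute can be removed from the source file before it is uploaded. The Python sketch below shows one possible way to do this outside XTM Cloud with the standard library; it is not an XTM Cloud feature. The function name remove_translate_no and the sample file are assumptions for this example. Note that re-serializing a file with ElementTree can change details such as the XML declaration, comments or attribute quoting, so the result should be reviewed before use.

```python
import xml.etree.ElementTree as ET

XLIFF_NS = "urn:oasis:names:tc:xliff:document:1.2"

# Invented sample file, used only for this illustration.
SAMPLE = """<xliff version="1.2" xmlns="urn:oasis:names:tc:xliff:document:1.2">
  <file original="demo.txt" source-language="en" target-language="fr" datatype="plaintext">
    <body>
      <trans-unit id="1" translate="no"><source>XTM-2024</source></trans-unit>
      <trans-unit id="2"><source>Open</source></trans-unit>
    </body>
  </file>
</xliff>"""


def remove_translate_no(xliff_text):
    """Drop translate="no" from every trans-unit.

    Returns the modified XML as a string and the number of units changed.
    """
    # Serialize the XLIFF namespace as the default namespace again,
    # instead of an auto-generated prefix such as "ns0".
    ET.register_namespace("", XLIFF_NS)
    root = ET.fromstring(xliff_text)
    changed = 0
    for tu in root.iter(f"{{{XLIFF_NS}}}trans-unit"):
        if tu.get("translate") == "no":
            del tu.attrib["translate"]
            changed += 1
    return ET.tostring(root, encoding="unicode"), changed


cleaned, count = remove_translate_no(SAMPLE)
print(count)
print(cleaned)
```

After the cleaned file is uploaded, the affected units still need to be displayed and confirmed in XTM Workbench, as described above, for the <target> elements to be created.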


XLIFF files from other CAT tools

  • Most Trados file formats are not recognized by XTM Cloud.

  • .sdlxliff and .sdlproj files are analyzable in XTM Cloud.

  • .sdlppx, .mdf, .mtf, and .sdltm file formats are not supported in XTM Cloud.