Translation memory (TM) matching and its expected behavior

Introduction

This article provides you with detailed information about translation memory (TM) matching and its expected behavior in XTM Cloud.


Priority of TM matching

Whenever segment ID is used, it takes priority over the ‘normal matching rules’. Segment ID matching can be set up in a particular filter template, per customer or at project template level. ICE matches can be penalized so that matches are only leveraged at project level. Otherwise, depending on whether or not a particular TM match exists, the default order in which matching is performed in a project is as follows:

  1. Segment ID.

  2. In-context exact (ICE) matches.

  3. Repetitions.

  4. Leveraged matches.

  5. Cross-file repetitions.

  6. Fuzzy matches.

  7. Machine translation (MT) matches.

  8. Fuzzy repetitions.

Keep in mind that this order can always be changed if required, by means of, for example, TM settings (global/customer-level/project-level), or TM penalty profiles.
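
As a rough illustration of the default order listed above, the following Python sketch picks the highest-priority match type from whatever candidates exist for a segment. The names DEFAULT_ORDER and pick_match are hypothetical and are not part of XTM Cloud; the sketch also ignores penalties and settings, which can change the order.

```python
# Default matching order as listed in this article (highest priority first).
# The identifiers are illustrative labels, not XTM Cloud names.
DEFAULT_ORDER = [
    "segment_id",
    "ice",
    "repetition",
    "leveraged",
    "cross_file_repetition",
    "fuzzy",
    "mt",
    "fuzzy_repetition",
]


def pick_match(available, order=DEFAULT_ORDER):
    """Return the first match type in `order` that is available, or None."""
    for match_type in order:
        if match_type in available:
            return match_type
    return None


print(pick_match({"fuzzy", "mt", "leveraged"}))  # leveraged
```

Passing a different list as the order argument mimics the effect of TM settings or penalty profiles that change the priority.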

IMPORTANT!

If there are multiple matches with exactly the same score available for a particular segment, for instance three Leveraged matches, the TM match that is prioritized is the one with the most recent Date modified. This date can be found in the metadata of the TM record in your TM base in the XTM Cloud UI.
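
The tie-break described above can be sketched in a few lines of Python. This is only an illustration under assumed data: the dictionary keys score and date_modified and the function pick_most_recent are hypothetical and do not reflect how XTM Cloud stores TM records.

```python
from datetime import datetime


def pick_most_recent(matches):
    """Among the matches with the top score, return the most recently modified one."""
    best_score = max(m["score"] for m in matches)
    tied = [m for m in matches if m["score"] == best_score]
    return max(tied, key=lambda m: m["date_modified"])


matches = [
    {"target": "A", "score": 100, "date_modified": datetime(2024, 1, 5)},
    {"target": "B", "score": 100, "date_modified": datetime(2024, 10, 24)},
    {"target": "C", "score": 100, "date_modified": datetime(2023, 7, 1)},
]
print(pick_most_recent(matches)["target"])  # B
```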


Internal repetitions

Segments containing internal repetitions are populated automatically when no match has been inserted and one of the repetitions is translated for the first time. Depending on the settings in XTM Workbench, these segments can be populated with updated versions of the relevant TM matches if the internal repetition translation is changed.

In such a scenario, provided that the repetition settings are correct in XTM Workbench, if a Linguist performs some changes in the first occurrence of a text, the other identical segments will be updated: a repetition translation will be inserted in them.


Fuzzy match instead of an ICE match

If you find that an ICE match has been converted into a Fuzzy match in XTM Workbench, this is due to the Set matches with inline differences as fuzzy setting, which you can enable when you select Configuration > Settings > Translation > TM > Matches - general.

This situation is usually caused by a difference between the inline tags in the file and the inline tags in the TM record.

If the original TM record contains inline tags that differ from those in the segment, the TM match is downgraded from a 100% match (or ICE match) to a 99% Fuzzy match.
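
The downgrade can be pictured with the following minimal Python sketch. It assumes the text of the segment and the TM record is identical and only the inline tags are compared; it also assumes that the match keeps its 100% score when the setting is disabled. The function name and parameters are hypothetical, not XTM Cloud identifiers.

```python
def score_text_identical_match(segment_tags, tm_tags, inline_diff_as_fuzzy=True):
    """Return the score for a TM record whose text equals the segment's text.

    If the inline tags differ and the setting is enabled, the match is
    treated as a 99% Fuzzy match instead of a 100% (or ICE) match.
    """
    if inline_diff_as_fuzzy and list(segment_tags) != list(tm_tags):
        return 99
    return 100


print(score_text_identical_match(["b"], ["i"]))  # 99
print(score_text_identical_match(["b"], ["b"]))  # 100
```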


Unmatched or MT-matched segments turn up as Leveraged (100%) or ICE matches

The parent of a repetition (the first occurrence of the segment in the file) is not flagged as a repetition. Because of this, it cannot be included in repetition ("R") matching. Keep in mind that the matching for the segment is refreshed every time the segment is activated in XTM Workbench. For instance, when it is activated in the next workflow step, the matching is performed again. Then, the TM will already contain the entries from the previous workflow step in which the file was actually translated. This can cause new TM matches to be found and displayed in XTM Workbench.

Furthermore, the segment cannot be matched from itself (unless reanalysis is performed). It can, however, be matched to other segments originating from the same file. This is shown in the example in the screenshots below – the parent of the repetitions has been matched from the TM entry that was created from one of its “children”. Here, segment 72 was a parent for segments 74, 76 and 78. Since it has a different context from its children, they have been saved as separate TM entries. This has resulted in a TM match being found when segment 72 was revisited later.

For some segments, this behavior is not caused by 100% repetitions but by a setting that promotes Fuzzy matches to 100% or ICE matches when the segment contains a number and that number is the only difference between the segment and its TM match. The setting in question is If segment has one number, which is different, substitute number & try to promote fuzzy to leveraged match. To access the relevant checkbox, select Configuration > Settings > Translation > TM > Fuzzy matches.

IMPORTANT!

Keep in mind that the If segment has one number, which is different, substitute number & try to promote fuzzy to leveraged match setting only works when:

  • there is only one number in a segment and this number is the only difference between the current source segment and the one in the translation memory.

  • the decimals in the current source segment are separated with a full stop or a comma, e.g. 10,5 or 10.5.

  • the currency or measurement units come after or before a number in the current source segment and are written without space, e.g. 10cm, $15, $10,50.

The setting does NOT work when:

  • there are two or more numbers in a segment.

  • a number in the current source segment is a combination of digits and letters, e.g. 43AB32.

  • a different character is used to separate digits within a number, e.g. 1'000'000, 1'000'000,50, 10 000.
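
To make the conditions above easier to test against your own content, here is a minimal Python sketch that checks whether a source segment contains exactly one number in a supported form. It is not XTM Cloud's implementation: the function may_promote and the regular expression are assumptions that only model the rules listed in this article, and cases the article does not mention (for example 1,000.50) are rejected here by assumption.

```python
import re

# One run of digits with an optional "." or "," decimal part, optionally
# wrapped in non-digit characters such as "$", "cm" or trailing punctuation.
NUMBER_TOKEN = re.compile(r"\D*\d+(?:[.,]\d+)?\D*")


def may_promote(segment):
    """Return True if the segment contains exactly one supported number."""
    # Whitespace-separated tokens that contain at least one digit.
    tokens = [t for t in segment.split() if any(c.isdigit() for c in t)]
    if len(tokens) != 1:
        return False  # no number, or two or more numbers (including "10 000")
    return NUMBER_TOKEN.fullmatch(tokens[0]) is not None


for text in ["Cut the wire to 10cm.", "Price: $10,50",
             "Part 43AB32 is missing", "Total: 1'000'000 units"]:
    print(text, "->", may_promote(text))
```

The first two example segments pass the check; the last two are rejected because the number mixes digits with letters or uses an unsupported separator.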


TM import with statuses

When nothing is selected from the Import or set approved/not approved status dropdown (TM > TM import > Import TM), XTM Cloud applies the Set as not approved option by default. To fix this, import the TM again with the Import statuses and set as not approved where missing option selected. The statuses will then be overwritten.


Rollback in XTM Cloud project workflow

In an XTM Cloud project’s workflow, there is an option called Rollback. You can use it to revert the workflow to the previous step. If you do so, the progress made after the step to which the workflow is being reverted is lost irretrievably.

For example, in a workflow with the steps Translate, Correct and Review, rolling back to the Translate step will remove all the data, LQA errors, comments, etc. that were created in the Correct and Review steps.


Duplicated segments in TM

It is not possible to prevent segments with the same source and target from being saved in the TM. Nonetheless, to avoid duplicates, the most recent translation saved will overwrite the one you currently have saved in the TM if the following settings are enabled:

  1. Select Configuration > Settings > Translation > TM > Matches - general.

  2. Modify the existing TM record if the project segment has the same setting: select the Context and Tags checkboxes.

  3. Modify the existing TM record if the project segment with Segment ID has the same setting: select the Segment ID and Tags checkboxes.
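
One way to picture the effect of these checkboxes is a TM keyed by source text, context and tags: a new translation with the same key replaces the old one, while a different context creates a separate record. The Python sketch below is only a mental model with hypothetical names (save_to_tm, a plain dictionary as the TM); it does not describe XTM Cloud's internal storage.

```python
def save_to_tm(tm, source, target, context, tags):
    """Store a translation; a record with the same source, context and tags is overwritten."""
    tm[(source, context, tuple(tags))] = target


tm = {}
save_to_tm(tm, "Open file", "Datei öffnen", "ctx-1", ["b"])
save_to_tm(tm, "Open file", "Datei oeffnen", "ctx-1", ["b"])  # same key: overwrites
save_to_tm(tm, "Open file", "Datei öffnen", "ctx-2", ["b"])   # different context: new record
print(len(tm))  # 2
```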


Do not save segments in the TM

In XTM Workbench, there is an option that prevents a particular segment from being saved in the TM: Do not save translation in TM.

If you select this option, the segment will be marked distinctively with a crossed out TM (in the Status column).


Workaround to set not approved TM matches to Done

You can specify that not approved 100% matches are only to be confirmed in the first step but ICE matches are to be confirmed in all steps. To do so:

  1. Select Configuration > Settings > Translation > TM > ICE matches/Leveraged matches.

  2. Set ICE matches from not approved TM to done setting: select the Default setting checkbox and All steps radio button.

  3. Set leveraged matches from not approved TM to done setting: do not select the Default setting checkbox but do select the First step only radio button.


Saving the same TM with a different XTM Cloud customer

Follow the steps below to streamline the process for saving the same TM with a different XTM Cloud customer:

  1. Export the required TM from one XTM Cloud customer and then import it into the new customer. You can choose the project from which you want to export this TM. To exclude some of the TM entries, you can delete them directly in the file.

  2. If you only want this TM to be used in the relevant project, and exporting the TM from the original project would require too many changes in the file, you can also change the relevant segments directly in XTM Workbench. Once the changes have been made, the segments will be saved for the new customer. You can easily change the statuses for all segments in the XTM Workbench settings:

  3. Select File > Change status.

  4. Select the Change segment status option.

  5. In the Change segment status dialog, change the statuses of segments first to Incomplete (so that XTM Cloud will register a change) and then to Completed.


Translation and alternative translations

The way alternative translations are handled is determined by the global settings for your XTM Cloud instance:

  1. Select Configuration > Settings > Translation > TM > Matches - general.

  2. Modify the existing TM record if the project segment has the same setting: select the Context and Tags checkboxes.

  3. Modify the existing TM record if the project segment with Segment ID has the same setting: select the Segment ID and Tags checkboxes.

The translation and alternative translation have the same source, inlines, context and tags, so both of them will modify the existing TM if they are changed or confirmed in XTM Workbench. When the alternative translation is changed or confirmed after the main translation, it will override the translation.

The only way to disable this behavior is by disabling the settings mentioned above. As a result, a separate TM entry will be created for the alternative translations. The consequence might be that, if XTM Cloud provides different translations for the same segment, the existing entry will not be modified. Instead, a new TM entry will be created, potentially causing unwanted creation of very similar TM entries.

The current translation can be changed from the alternative translation to the original one by changing to a different TM (by selecting TM > Manage), or by making changes directly in XTM Workbench. The change can be small, such as adding a letter, approving the change, and then reversing it. The original target will then be saved, as it is the most recent one for this source.


Use not approved memory option

If the Use not approved memory option is selected in the global TM settings (Configuration > Settings > Translation > TM > Matches - general), not approved TM matches will be suggested in the Matches docked panel in XTM Workbench. If this option is not selected, these matches will not be suggested. It is advisable to enable this setting per project: the assigned Linguist can then decide whether to make use of it or not.
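
The effect of the option can be summarized as a simple filter, sketched below in Python. The record layout (an approved flag per match) and the function suggested_matches are hypothetical and serve only to illustrate the behavior described above.

```python
def suggested_matches(matches, use_not_approved):
    """Return the matches that would be suggested to the Linguist."""
    return [m for m in matches if m["approved"] or use_not_approved]


matches = [
    {"target": "Speichern", "approved": True},
    {"target": "Sichern", "approved": False},
]
print(len(suggested_matches(matches, use_not_approved=True)))   # 2
print(len(suggested_matches(matches, use_not_approved=False)))  # 1
```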

You can tell that a particular TM match is not approved when it has the following symbol in XTM Workbench (on the bottom right):


Repetitions and ICE matches

The first occurrence of a repetition is not marked as a repetition. This is because a file is analyzed from the top, that is, from the very beginning of the file, so the first occurrence is not a repetition at the time it is analyzed. When the analysis reaches the second occurrence of the same segment (and every subsequent occurrence), that segment is marked as a repetition.
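
The top-down analysis described above can be illustrated with a short Python sketch. It is a simplified model, not XTM Cloud's analysis code: the function mark_repetitions is hypothetical and compares segments by plain text only.

```python
def mark_repetitions(segments):
    """Return one flag per segment: True if it repeats an earlier segment."""
    seen = set()
    flags = []
    for text in segments:
        flags.append(text in seen)  # the first occurrence is never flagged
        seen.add(text)
    return flags


segments = ["Save", "Cancel", "Save", "Save"]
print(mark_repetitions(segments))  # [False, False, True, True]
```

The first "Save" is the parent: it is not flagged, which is why it can later receive an ICE or 100% match instead of a repetition match.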

When the first occurrence of the repeated segment (the so-called parent) is translated, it is propagated to every other repetition. The repetitions are then assigned the segment status R to indicate that they have been populated on the basis of repetition matching. If the translation is changed, the change will either be propagated to other repetitions, or not, depending on the Linguist’s individual repetition update settings (Settings > Segments > General > Update text and status of repeated segments in the current file).

If the parent is not translated and the Linguist translates any of the other repetitions instead, the parent will be assigned the segment status ICE or 100%, depending on the context. This is caused by the fact that the parent was not marked as a repetition when the file was analyzed.

The best way to approach a file with repetitions would be to translate it from the first segment rather than from the middle. Furthermore, you should instruct your Linguists to ensure that they pay special attention to checking the parents of repetitions. Parents can be found easily, using a suitable filter (see screenshot below). To set the filter, a Linguist can select Filter > Apply filters > Match > Repeated segments: first occurrence and repetitions.


ICE match matched to the same segment

When an ICE match has been modified in a different project, its docid and tuid parameters are changed to match the file in which they were changed, enabling them to be matched to the segment in which they were originally created.

To configure this:

  1. Select Configuration > Settings > Translation > TM > ICE matches/Leveraged matches.

  2. Set ICE matches from not approved TM to done setting: select the Default setting checkbox and All steps radio button.

  3. Set leveraged matches from not approved TM to done setting: do not select the Default setting checkbox but do select the First step only radio button.

If the project is ongoing:

  1. In XTM Workbench, select Filter > Apply filters > Match > Specific match types > Leveraged, to only show Leveraged matches.

  2. Apply the filter.

  3. Select File > Change status on the top left-hand side of the screen.

  4. Change the status for Completed segments to Incomplete or Draft. Keep in mind that only the segments shown by the filter applied in step 1 will have their status changed. The status will also be changed for “approved 100% matches”. Do not forget to apply the changes.