Introduction
This article describes how XTM Cloud processes ICU plural syntax: what is possible and what is not.
What is ICU plurals syntax?
While English has only two plural forms (one – month, other – months), other languages can have up to six. Polish, for example, has four (one – miesiąc, few – miesiące, many – miesięcy, other – miesiąca). The ICU plurals syntax allows you to encode multiple versions of a particular sentence, depending on the value of a number variable. This is especially useful in the localization industry.
Example syntax looks as follows:
{num, plural, zero {Selected {num} items} one {Selected {num} item} two {Selected {num} items} few {Selected {num} items} many {Selected {num} items} other {Selected {num} items}}
The first argument (“num” in this example) is the name of a variable. Based on the value of this variable, the application interpreting the syntax will select the appropriate version of the sentence.
The second argument (“plural” in this example) indicates that the variable holds a numeric value. Alternatively, “select” can be used, which specifies that the variable holds a text value corresponding to a keyword.
Following the two arguments there is a list of keywords: zero, one, two, few, many, other when using plural, or a custom set of keywords when using select.
{gender, select, female {She is here} male {He is here} other {They are here}}
The “other” keyword must always be present, as it is the fallback used when no other keyword matches.
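The selection described above can be sketched in a few lines. This is only an illustration, not how XTM Cloud or any ICU library is implemented: the function names `plural_category` and `choose_form` are hypothetical, only English and Polish are covered, and only whole numbers are handled.

```python
def plural_category(lang: str, n: int) -> str:
    """Return the cardinal plural keyword for a non-negative whole number."""
    if lang == "en":
        return "one" if n == 1 else "other"
    if lang == "pl":
        if n == 1:
            return "one"
        if n % 10 in (2, 3, 4) and n % 100 not in (12, 13, 14):
            return "few"
        return "many"  # whole numbers only; fractions would map to "other"
    return "other"  # assumption: unknown languages use the default form


def choose_form(forms: dict[str, str], keyword: str) -> str:
    """Pick the sentence for a keyword, falling back to the mandatory 'other'."""
    return forms.get(keyword, forms["other"])


months_pl = {
    "one": "{num} miesiąc",
    "few": "{num} miesiące",
    "many": "{num} miesięcy",
    "other": "{num} miesiąca",
}
gender = {
    "female": "She is here",
    "male": "He is here",
    "other": "They are here",
}

print(choose_form(months_pl, plural_category("pl", 22)))
print(choose_form(gender, "unknown"))
```

The same lookup serves both plural and select: for plural the keyword is derived from the number, while for select the variable's text value is used as the keyword directly.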
Each keyword is followed by a sentence in curly brackets. These sentences can have variables in curly brackets, but do not have to have them. You can use '#' as a number variable in plural (instead of writing the full name in curly brackets). If you do not want something (e.g. curly brackets) to be treated as a syntax element, you can escape them using an apostrophe.
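To illustrate how '#' and apostrophe escaping interact, here is a simplified sketch. It assumes the common ICU convention that an apostrophe starts quoted text only when it directly precedes a syntax character, and that a doubled apostrophe stands for a literal one. The function name `render_plural_text` is hypothetical, and the sketch handles only the sentence text of a single plural form, not nested arguments.

```python
SYNTAX_CHARS = {"{", "}", "#"}


def render_plural_text(text: str, number: int) -> str:
    """Replace unquoted '#' with the number and resolve apostrophe escapes."""
    out = []
    quoted = False
    i = 0
    while i < len(text):
        ch = text[i]
        nxt = text[i + 1 : i + 2]  # empty string at the end of the text
        if ch == "'":
            if nxt == "'":
                out.append("'")  # doubled apostrophe is a literal apostrophe
                i += 2
                continue
            if quoted:
                quoted = False  # closing apostrophe of a quoted run
            elif nxt in SYNTAX_CHARS:
                quoted = True  # opening apostrophe before a syntax character
            else:
                out.append("'")  # ordinary apostrophe, as in "don't"
            i += 1
            continue
        if ch == "#" and not quoted:
            out.append(str(number))
        else:
            out.append(ch)
        i += 1
    return "".join(out)


print(render_plural_text("Selected # items", 5))
print(render_plural_text("Item '#'# of '{'list'}'", 3))
```

In the second call the quoted '#' and the quoted curly brackets are kept as literal characters, while the unquoted '#' is replaced by the number.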
You can find out which plural form keyword corresponds to which numbers in a given language here: Language Plural Rules.
While this syntax allows nesting, it is recommended to limit it if possible.
It is also recommended to always have a full sentence inside plural syntax, instead of just the word that changes in your source language. This is because the target languages may need to change more words within the sentence to properly translate it.
These recommendations (explained in more detail) can be found here: Formatting Messages.
When creating an ICU plural message, you can use Online ICU Message Editor to check if it works as you expect.
How does XTM Cloud handle it?
To have the syntax processed efficiently and the translation process optimized, create a ticket for the XTM Support team requesting the appropriate configuration.
JSON files
In JSON files, we can activate our special ICU plurals parser, which interprets the syntax and adds or removes plural forms for translation, depending on the target file.
The parser (usually a project-level filter template) needs to be selected before a file is uploaded (during project creation, or before a new file is added to the project), because the number of plural forms is adjusted during the initial analysis of the file.
The number of target plural forms is preconfigured based on the documentation for cardinal plural rules for languages: Language Plural Rules. If needed, the default plural forms can be changed for any given language.
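As a rough illustration of adjusting plural forms per target language, the sketch below expands or reduces a set of source forms to the cardinal keywords a target language uses. It is not XTM Cloud's implementation: the table `CARDINAL_FORMS` is a small hand-written excerpt of the cardinal plural rules, and the function `target_forms`, as well as the choice to pre-fill missing forms with the source "other" text, are assumptions made for the example.

```python
# Cardinal plural keywords for a few languages (excerpt, not a full list).
CARDINAL_FORMS = {
    "en": ["one", "other"],
    "pl": ["one", "few", "many", "other"],
    "ar": ["zero", "one", "two", "few", "many", "other"],
    "ja": ["other"],
}


def target_forms(source_forms: dict[str, str], target_lang: str) -> dict[str, str]:
    """Build the forms to translate for target_lang from the source forms.

    Keywords the target language does not use are dropped; keywords missing
    from the source are pre-filled with the source "other" text.
    """
    fallback = source_forms["other"]
    return {
        keyword: source_forms.get(keyword, fallback)
        for keyword in CARDINAL_FORMS[target_lang]
    }


source = {"one": "Selected # item", "other": "Selected # items"}
for keyword, text in target_forms(source, "pl").items():
    print(keyword, "->", text)
```

With an English source, a Polish target gets four forms to translate, while a Japanese target gets only "other".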
To help with translation, we can extract the plural (or select) form name as part of the segment ID, which improves matching and ensures the linguists know which form they are translating.
As with any JSON file, other metadata can also be extracted.
Currently, when a JSON file is processed with the ICU parser enabled, you need to avoid the following:
- Putting variables in a double curly bracket ({{name}}), even outside ICU syntax.
- Using characters other than letters, numbers, underscores, and commas inside variables in curly brackets.
This is because the whole file is read by the ICU parser, and these kinds of variables cause syntax errors during the analysis.
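A pre-upload check for these two cases might look like the sketch below. It is a hypothetical helper, not an XTM Cloud tool: the function name `find_problems` is made up, apostrophe escaping is ignored, and the allowed character set is taken literally from the list above, so a space inside a simple variable is also reported. It relies on the fact that curly brackets alternate between variables and sentences as nesting gets deeper.

```python
import re

# Letters, numbers, underscores, and commas only (taken literally from the text).
ALLOWED = re.compile(r"[\w,]+")


def find_problems(text: str) -> list[str]:
    """Report double curly brackets and unsupported characters in variables."""
    problems = []
    if "{{" in text:
        problems.append("double curly bracket")
    starts = []  # positions of currently open brackets
    has_child = []  # whether each open bracket contains a nested bracket
    for pos, ch in enumerate(text):
        if ch == "{":
            if has_child:
                has_child[-1] = True
            starts.append(pos)
            has_child.append(False)
        elif ch == "}" and starts:
            start = starts.pop()
            nested = has_child.pop()
            depth = len(starts)  # nesting depth of the bracket just closed
            # Even depth means a variable; odd depth means a sentence.
            if depth % 2 == 0 and not nested:
                content = text[start + 1 : pos]
                if not ALLOWED.fullmatch(content):
                    problems.append(f"unsupported characters in {{{content}}}")
    return problems


print(find_problems("Hello {{name}}"))
print(find_problems("Hello {user.name}"))
print(find_problems("{num, plural, one {Selected {num} item} other {Selected {num} items}}"))
```

Running a check like this on the string values of a JSON file before upload can point to the entries most likely to fail during analysis.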
XTM Cloud does not yet support numbered plural forms, that is, a plural version chosen for one specific number. For example, =7 lets 7 days be called “a week”, and =0 lets 0 days be called “no days” instead of “0 days”.
When a numbered form is used in the source file, it is not carried over to the target file.
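Because numbered forms are lost, it can help to find them in the source before upload. The sketch below is a hypothetical helper (`explicit_number_forms` is not part of XTM Cloud) that uses a simple pattern match rather than a full ICU parser, so ordinary text that happens to look like "=5 {" would also be reported.

```python
import re

# An equals sign, digits, optional whitespace, then an opening curly bracket.
EXPLICIT_FORM = re.compile(r"=(\d+)\s*\{")


def explicit_number_forms(message: str) -> list[int]:
    """Return the numbers used as numbered plural forms in an ICU message."""
    return [int(number) for number in EXPLICIT_FORM.findall(message)]


message = "{days, plural, =0 {no days} =7 {a week} one {# day} other {# days}}"
print(explicit_number_forms(message))
```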
Files other than JSON
In files other than JSON, there is no ICU syntax parser available yet in XTM Cloud. The XTM Support team can only set up custom variables to convert the syntax elements into inline tags.
Thanks to this, the syntax is partially protected from modification, but the inline tags must be carefully placed, to ensure everything works as expected.
The target file will have exactly the same plural forms as the source file. Missing forms can only be added manually by linguists, which requires knowledge of the syntax. Furthermore, the segmentation will not be ideal, as all the forms often end up in a single segment.