Inter-language Vector Space
Introduction
Inter-Language Vector Space is an advanced neural network-based technology that, together with Neural MT, forms the foundation of XTM Cloud’s AI strategy. It is a mathematical, algorithmic approach to language technology built on massive neural networks: in essence, it estimates how closely distinct source and target words within a segment correspond to each other.
The technology is used to enhance the productivity of translators, reviewers and post-editors working in XTM Workbench.
Inter-Language Vector Space supports these XTM Cloud-related operations:
Auto-insertion of inline tags.
Automatic bilingual term extraction.
Automatic subsegment matching.
Automatic fuzzy correction.
How does the Inter-Language Vector Space technology work?
Inter-Language Vector Space enables direct alignment at word and phrase level when a particular segment is translated in XTM Workbench. It is built on extensive resources obtained from Google and Facebook, with XTM Cloud contributing key elements from a multilingual localization (L10N) perspective.
The Inter-Language Vector Space technology is AI-based and draws on massive Big Data resources, including content from across the entire Internet and XTM Cloud’s extensive bilingual dictionaries, to calculate the probability that a particular target-language word is the correct translation of a source word, for over 250 language pairs. The purpose of this technology is to help Linguists perform simple tasks through algorithm-driven automation, improving their productivity and user experience.
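To make the idea of scoring translation candidates concrete, the following sketch shows one common way such probabilities can be approximated: comparing word vectors in a shared cross-lingual embedding space with cosine similarity and turning the similarities into a probability-like distribution. The vectors and the softmax scoring below are illustrative assumptions only, not XTM Cloud’s actual model.

```python
# Toy sketch: score how likely each target word is to translate a source word,
# using cosine similarity in a shared cross-lingual embedding space.
# The 4-dimensional vectors are invented for illustration.
import numpy as np

source_vectors = {"house": np.array([0.9, 0.1, 0.3, 0.0])}
target_vectors = {
    "maison": np.array([0.88, 0.12, 0.28, 0.02]),  # French "house"
    "chien":  np.array([0.05, 0.80, 0.10, 0.40]),  # French "dog"
    "livre":  np.array([0.20, 0.10, 0.85, 0.30]),  # French "book"
}

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def translation_probabilities(src_word):
    """Turn cosine similarities into a probability-like distribution (softmax)."""
    src = source_vectors[src_word]
    words = list(target_vectors)
    sims = np.array([cosine(src, target_vectors[w]) for w in words])
    probs = np.exp(sims) / np.exp(sims).sum()
    return dict(zip(words, probs.round(3)))

print(translation_probabilities("house"))
# The highest-probability candidate ("maison") would be proposed as the match.
```

The same similarity scores can also drive word- and phrase-level alignment within a segment: each source word is paired with the target word it matches most closely.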
Manually transferring inline elements to target segments and extracting bilingual terms take an inordinate amount of time and effort, negatively impacting Linguists' productivity and creativity. For this reason, automatic insertion of inline elements and automatic bilingual term extraction, supported by the AI-based Inter-Language Vector Space, help eliminate these mundane tasks from translation processes. This, in turn, brings quicker turnaround times, significant cost reductions and a marked increase in quality.
Finding equivalents of source-language words and phrases when no dictionaries or existing translations are available is also a laborious and time-consuming activity. Inter-Language Vector Space, which is based on extensive neural network analysis of the entire Internet and encompasses 150+ languages, takes a new approach: language is represented as a set of relationships between words, with a vector pointing from one word to another in vector space. This yields simple answers to questions about word relationships, such as ‘king’ is to ‘man’ as ‘woman’ is to ‘queen’. As a result, Inter-Language Vector Space is able to automatically work out how words relate to one another.
It knows, for instance, that if ‘king’ relates to ‘man’, the equivalent word for ‘woman’ is ‘queen’. What makes this technology unique is that it can identify these relationships across different languages.
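The sketch below illustrates the underlying idea: relationships are encoded as directions in vector space, so subtracting and adding word vectors ("king" − "man" + "woman") lands near "queen". The 3-dimensional vectors are invented for illustration; real embeddings have hundreds of dimensions and are learned from data.

```python
# Minimal word-analogy sketch with toy vectors (not a real embedding model).
import numpy as np

vectors = {
    "king":  np.array([0.8, 0.7, 0.1]),
    "man":   np.array([0.6, 0.1, 0.1]),
    "woman": np.array([0.6, 0.1, 0.9]),
    "queen": np.array([0.8, 0.7, 0.9]),
}

def nearest(vec, exclude=()):
    """Return the vocabulary word whose vector is closest (cosine) to vec."""
    best, best_sim = None, -1.0
    for word, v in vectors.items():
        if word in exclude:
            continue
        sim = float(vec @ v / (np.linalg.norm(vec) * np.linalg.norm(v)))
        if sim > best_sim:
            best, best_sim = word, sim
    return best

analogy = vectors["king"] - vectors["man"] + vectors["woman"]
print(nearest(analogy, exclude={"king", "man", "woman"}))  # -> "queen"
```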
Multilingual example:
王 (‘king’ in Japanese) is to ‘man’ as ‘woman’ is to 女王 (‘queen’ in Japanese).
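A hedged sketch of this cross-lingual case: once Japanese and English vectors have been mapped into one shared space (the toy vectors below are invented for illustration), the same vector arithmetic links 王 to 女王 through the English words ‘man’ and ‘woman’.

```python
# Cross-lingual analogy in a shared (aligned) toy embedding space.
import numpy as np

shared_space = {
    "王":    np.array([0.81, 0.69, 0.12]),  # Japanese "king"
    "man":   np.array([0.60, 0.10, 0.10]),
    "woman": np.array([0.60, 0.10, 0.90]),
    "女王":  np.array([0.80, 0.70, 0.90]),  # Japanese "queen"
    "犬":    np.array([0.05, 0.80, 0.15]),  # Japanese "dog", a distractor
}

query = shared_space["王"] - shared_space["man"] + shared_space["woman"]
sims = {
    w: float(query @ v / (np.linalg.norm(query) * np.linalg.norm(v)))
    for w, v in shared_space.items()
    if w not in ("王", "man", "woman")
}
print(max(sims, key=sims.get))  # -> 女王
```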