What affects the time that the project analysis takes? An issue with projects getting stuck in analysis

Introduction

Project analysis is a complex process in which multiple actions take place consecutively in the background. To learn about it, we recommend that you read this article: A step-by-step description of project analysis.

You might sometimes wonder why the analysis of a newly created project takes so long, to the point that the project appears to be stuck in analysis. Keep in mind that several factors directly affect how long the analysis process takes. They are all listed in the next section.


Project analysis factors

Number of source files

To state the obvious, the more source files are added to a single project, the longer it takes to process them, which causes a significant increase in project analysis time.

If you are dealing with a large number of source files (hundreds or even thousands), we recommend that you break them up into smaller batches and create several separate projects. This not only helps project analysis finish more quickly but also prevents individual jobs from piling up in the project workflow, which might ultimately cause the entire project editor to freeze in the XTM Cloud UI.
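The batching recommended above can be prepared before any project is created. The following is a minimal sketch in Python, not an XTM feature: the function name `batch_files`, the file names, and the batch size of 100 are illustrative assumptions, and the right batch size depends on your content and server.

```python
from typing import Iterator, Sequence


def batch_files(files: Sequence[str], batch_size: int) -> Iterator[list[str]]:
    """Yield consecutive groups of at most `batch_size` files."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(files), batch_size):
        yield list(files[start:start + batch_size])


# Hypothetical input: 250 source files that would otherwise go into one project.
source_files = [f"chapter_{n:03}.docx" for n in range(1, 251)]

# Each batch is intended to become its own, smaller project.
batches = list(batch_files(source_files, batch_size=100))
for number, batch in enumerate(batches, start=1):
    print(f"Project {number}: {len(batch)} files")
```

Each resulting batch can then be uploaded as a separate project, so that no single project has to analyze every file.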

Large source files

Although the number of source files determines the overall size of the content that needs to be processed, an individual file can also be enormous, and this affects project analysis time in the same way as a huge number of files does.

As with multiple source files, it is best to minimize the number of large files in a single project by keeping them in separate projects. Alternatively, a large file can be split into smaller parts that are uploaded to a project separately.

Use of machine translation engines (MT)

Many clients are unaware that use of machine translation at the project creation stage can also prolong its analysis time. This is because, after the analysis filter has been used to extract the file’s translatable content, each segment is also sent off to an external provider for MT matching. That just adds to the overall analysis time.

While this is in progress, an information message is also displayed in the project workflow in the XTM Cloud UI.

There is no particular way to speed things up other than just to wait patiently until the whole process is finished. You might also want to use the solutions suggested in the above cases, i.e. splitting up numerous source files into smaller projects.

Large translation memory (TM)

Just as segments are sent for MT matching, they are also matched internally against the client's TM, which adds yet another action to the analysis stage. The more extensive a translation memory is (i.e. the more TM entries exist in the database for the customers whose resources are applied to a project), the longer it takes to match all the segments against TM entries.

It is good practice to minimize the number of TM entries that can be included in the matching process during project analysis by carefully specifying which match types are to apply in segments at global or project level.

Another solution is to reduce the number of customer TM and terminology resources applied during project creation.

Adobe InDesign files

While the majority of files are converted into XML format directly in XTM Cloud, the process is slightly different for IDML and INDD files. Generally speaking, they are converted on an InDesign server. An InDesign server is a dedicated, modern, standalone data server running on Windows. The majority of clients use an InDesign server that is connected to XTM Cloud.

That being said, as many clients might be translating InDesign files at the same time, those files are queued, and the server resources might not be able to process all of them as seamlessly as might be expected. This, in turn, results in longer analysis times in the XTM Cloud UI.

Number of target languages

An increase in the number of target languages increases the project analysis time proportionally. Keep in mind that each source file is analyzed once for every target language, so the total number of files to be analyzed is the number of source files multiplied by the number of target languages in the project. For instance, if there is a project with two source files and two target languages, the total number of files to be analyzed is four (two files for each target language).
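The count described above is a simple product, which a few lines of Python make concrete. This is an illustrative sketch only: the function name `files_to_analyze` is an assumption, and the figure it returns is a count of files, not a prediction of analysis time.

```python
def files_to_analyze(source_files: int, target_languages: int) -> int:
    """Return how many files are analyzed: one per source file per target language."""
    if source_files < 0 or target_languages < 0:
        raise ValueError("counts must not be negative")
    return source_files * target_languages


# The example from the text: two source files, two target languages.
example_total = files_to_analyze(2, 2)
print(example_total)

# The same 20 source files with a growing number of target languages.
growth = {langs: files_to_analyze(20, langs) for langs in (1, 2, 5, 10)}
for langs, total in growth.items():
    print(f"{langs} target language(s): {total} files to analyze")
```

With 20 source files, moving from 1 to 10 target languages raises the count from 20 to 200 files, which is why projects with many languages can appear to be stuck.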

In the case of a project with multiple source files and target languages, the solution is to create multiple smaller projects instead, each with only one target language.
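As a rough sketch of that split, the Python snippet below plans one project per target language. The function name `plan_projects`, the dictionary layout, the file names and the language codes are all hypothetical; the snippet only plans the split and does not call XTM Cloud.

```python
def plan_projects(source_files: list[str], target_languages: list[str]) -> list[dict]:
    """Plan one project per target language, each containing all source files."""
    return [
        {"target_language": language, "files": list(source_files)}
        for language in target_languages
    ]


# Hypothetical input: three source files to be translated into three languages.
files = ["manual.docx", "faq.docx", "release_notes.docx"]
languages = ["de_DE", "fr_FR", "ja_JP"]

plan = plan_projects(files, languages)
for project in plan:
    print(f"{project['target_language']}: {len(project['files'])} files to analyze")
```

The total amount of work stays the same, but each project now analyzes three files instead of nine, so no single project has to wait for every language to finish.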

XTM server variant

The project analysis time also depends on the type of server on which a particular client is installed (see Public Cloud, Private Cloud, Suite – server differences for more information).

For example, project analysis might take longer for a Cloud-based client because the XTM Cloud server is a multi-tenant system in which resources are shared across all clients. This might lead to variations in performance, including project analysis time, at peak times. The XTM Cloud system administration team constantly monitors and optimizes the performance of the servers.

Private Cloud (PVC) clients, on the other hand, benefit from faster analysis completion because their system resources are not shared with other clients.

Multiple projects created at the same time

Last but not least, the prolonged analysis time might stem from the fact that the project in question is simply “waiting” its turn in the analysis queue, especially if multiple projects were created at the same time. Therefore, if it is possible, and projects are not created via API but in the XTM Cloud UI, we highly recommend creating projects one by one – waiting for the first project to finish analyzing, and then creating the second one.

Analysis/decoration of terminology

The time needed to analyze a particular XTM project might also depend on the volume of terminology that you apply to the project. Keep in mind that the analysis and subsequent decoration of terminology in XTM Workbench is as important as the analysis of translation memory and might consume as many XTM resources as the latter.