The life cycle of a translation resource
One of the core features Transifex provides is handling files with translatable content (resources) in various localization formats, like XML or PO files.
Part of that functionality is importing such files into its internal storage and exporting them whenever users request them, either to ship with their software or to translate into another language on their local computer. Although both operations may use custom code to handle certain formats (especially when importing resources), each one follows a specific sequence of steps.
Importing a file
Whenever you upload a file with strings in it, Transifex will try to parse it, extract the necessary information and then store that information in the database.
Parsing
Since each format is different, there are specialized parsers for each
one. In some cases, Transifex uses a third-party parser, like
polib for PO files. In other
cases, we have developed custom parsers.
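The idea of one specialized parser per format can be sketched as a small registry keyed by file extension. This is only an illustration: the names `PARSERS`, `parse_properties` and `get_parser` are hypothetical and are not part of Transifex's actual code.

```python
import os


def parse_properties(content):
    """Hypothetical custom parser for simple `key = value` files."""
    entries = {}
    for line in content.splitlines():
        line = line.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        entries[key.strip()] = value.strip()
    return entries


# Hypothetical registry: one specialized parser per format.
# A ".po" entry could wrap a third-party parser such as polib.
PARSERS = {
    ".properties": parse_properties,
}


def get_parser(filename):
    """Pick the parser for a file based on its extension."""
    ext = os.path.splitext(filename)[1].lower()
    try:
        return PARSERS[ext]
    except KeyError:
        raise ValueError("No parser registered for %r" % ext)
```

Supporting a new format in this sketch only requires adding another entry to the registry.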
Extracting the information
The main responsibility of a parser is to extract the necessary
information from the imported file.
In case the file is the source file (that is, it is the file with
the strings in the source language), we are interested in three
things:
- The keys for the translatable strings (like the msgid entries in
  a PO file). The keys are used to uniquely match the strings in the
  source language with those in translations. We also generate a
  unique hash for each key as an identifier.
- The translatable strings in the source language, if there are any
  (like the msgstr entries in a PO file). These are the actual strings
  of the source language.
- The template of the file. The template is a skeleton of the source
  file: it is mostly the same, except that the translatable
  strings have been replaced with the hashes of the corresponding
  keys, acting as placeholders. This is necessary for the export
  operation.
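The three pieces of information can be illustrated with a minimal sketch for a simple `key = value` format. The function names are hypothetical, and using the MD5 digest of the key as the hash is an assumption made for this example, not necessarily the scheme Transifex uses.

```python
import hashlib


def key_hash(key):
    """Identifier for a key (assumption: MD5 hex digest of the key)."""
    return hashlib.md5(key.encode("utf-8")).hexdigest()


def parse_source(content):
    """Return (entities, template) for a `key = value` source file.

    entities maps each hash to a (key, source string) pair; the
    template is the file with every string replaced by its key's hash.
    """
    entities = {}
    template_lines = []
    for line in content.splitlines():
        stripped = line.strip()
        # Comments, blank lines and anything else are kept verbatim.
        if not stripped or stripped.startswith("#") or "=" not in stripped:
            template_lines.append(line)
            continue
        key, _, value = line.partition("=")
        key, value = key.strip(), value.strip()
        digest = key_hash(key)
        entities[digest] = (key, value)
        # The hash acts as a placeholder for the string.
        template_lines.append("%s = %s" % (key, digest))
    return entities, "\n".join(template_lines)
```

Everything that is not a translatable string (here, comments and blank lines) survives in the template unchanged, which is what makes it usable as a skeleton later.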
If the file is a translation of the resource into some language, we
are only interested in the translations themselves; any other changes
in the file are ignored.
Storing
As soon as we have the necessary information from the previous step,
we store it in the database as source entities, translations and
templates.
Exporting a file
Whenever a user asks to download a translation file in a particular
language, the file has to be exported from the database.
The procedure is largely the same for all formats. After fetching the
template and the translation strings in the requested language, we do a
search-and-replace in the template, replacing each hash in it with the
actual string that corresponds to that hash.
Next, any format-specific operations are performed (like adding the
translator copyrights in PO files) and the result is delivered to the
user.
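The export steps can be sketched as follows. The function name `compile_template` and the optional `postprocess` hook for format-specific operations are hypothetical, introduced only for illustration; the placeholder hashes in the usage example are made up as well.

```python
import re


def compile_template(template, translations, postprocess=None):
    """Replace each hash placeholder in `template` with its translation.

    translations maps hash -> translated string. `postprocess` is an
    optional callable for format-specific steps (hypothetical hook).
    """
    result = template
    if translations:
        # Longest hashes first, so no hash is shadowed by a shorter prefix.
        hashes = sorted(translations, key=len, reverse=True)
        pattern = re.compile("|".join(re.escape(h) for h in hashes))
        # A single pass: text inserted for one hash is never re-scanned.
        result = pattern.sub(lambda m: translations[m.group(0)], template)
    if postprocess is not None:
        result = postprocess(result)
    return result


template = "hello = 1111aaaa\nbye = 2222bbbb"
translations = {"1111aaaa": "Hola", "2222bbbb": "Adios"}
exported = compile_template(template, translations)
```

Hashes without a translation are left untouched in this sketch; a real exporter has to decide how to handle untranslated strings.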
You can find more details about the storage engine of Transifex in the
docs.