Organisations worldwide are expected to spend around $1.5bn a year on machine translation by 2024, and adoption continues on an upward trend. With all the hype, it is important to understand the key concepts related to machine translation, and this article aims to explain the most frequently used terms.
As a freelance linguist, you might want to ask some questions before accepting a post-editing project from one of your clients, so you know exactly what you are signing up for. The problem is that the answers can often be filled with machine translation jargon, which is not always easy to understand.
As a first step, it is useful to understand some general principles:
- Machine translations are generated by machine translation engines or models.
- Providers like Google, Microsoft and DeepL offer generic machine translations via well-known public-facing services.
- There are lots of services and technologies available that allow you to load up your own data, in the form of translation memories, and create custom machine translation engines. Google and Microsoft both offer this as well as the generic service.
- The generally accepted principle is that if you have large enough translation memories in a specific subject area and language pair, it is possible to use these custom services to create a domain-specific machine translation engine that will outperform the publicly available generic services from Google, etc.
With the above in mind, one of the first questions you might have for any client asking you to post-edit machine translations is how the machine translations were generated – by a generic system, or by a custom system trained in the specific subject matter?
Types of MT systems
It’s also useful to understand the different types of machine translation systems that your clients might be using to generate machine translations. There are two dominant kinds currently used – statistical machine translation (SMT) and neural machine translation (NMT).
Statistical machine translation systems are trained to translate by being fed large amounts of bilingual human translations, typically voluminous translation memories. The vast majority of modern SMT systems are phrase-based, which means they use statistical techniques to analyse sequences of words (phrases) found in the bilingual sentence pairs they are trained on.
Training of SMT systems can take some time, and can be calibrated in detail. Once trained, the SMT engine will “guess” the translation output, based on statistical probability.
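To make the "guessing" idea concrete, here is a toy sketch of the phrase-based approach: a phrase table maps source phrases to candidate translations with probabilities learned from bilingual data, and decoding picks the most probable option. The phrases and probability values below are invented for illustration; real SMT decoders also combine language-model scores, reordering models and more.

```python
# Invented toy phrase table: source phrase -> (translation, probability)
phrase_table = {
    "the contract": [("el contrato", 0.7), ("la contrata", 0.1)],
    "enters into force": [("entra en vigor", 0.8), ("entra en fuerza", 0.05)],
}

def translate(phrases):
    # Greedy decoding: take the highest-probability option for each phrase.
    return " ".join(max(phrase_table[p], key=lambda t: t[1])[0] for p in phrases)

print(translate(["the contract", "enters into force"]))
# -> "el contrato entra en vigor"
```

A real engine would weigh many overlapping phrase segmentations against each other, but the principle is the same: the output is the statistically most probable combination, not a linguistically "understood" translation.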
SMT systems typically perform well in terms of accuracy and allow significant control over terminology, but their MT output tends to sound less fluent than that of neural machine translation systems. Therefore, output from an SMT engine might need more polishing in terms of fluency and style.
Neural machine translation is a newer approach, but it also relies heavily on bilingual corpora for initial training. The structure of NMT engines is loosely inspired by the human brain: information passes through various “layers” of an artificial neural network to be processed before output. These engines use deep learning techniques to learn how to translate from the examples they are given.
The key difference between SMT and NMT is that the former requires regular human manipulation and manual re-training of the engines in order to update, while the latter is able to use algorithms to learn linguistic rules on its own from newly added data. That means that unlike SMT engines, NMT engines do not need to undergo lengthy re-training when there are new translations to learn from.
NMT engines can be improved on the fly, quickly, and automatically.
Output from NMT systems often sounds more natural than that from SMT systems, but it can contain issues such as omissions or additions. Because the output is usually fluent, a reader may be tricked into believing it is a good translation even when its content differs wildly from the source text.
You can read more about the differences between statistical and neural machine translation in this post.
You may also hear machine translation being described as adaptive. If an MT model is adaptive, it means that it has the ability to learn automatically and in real time from the corrections being made by the post-editor and also adapt to the content being translated using any translation memories provided by the users.
It is crucial for linguists to know whether they are working with an adaptive or non-adaptive MT system, as it will directly impact the suitability of the MT output in terms of the terminology, style and tone of voice required by each client. If the MT system is not adaptive, they may need to make the same corrections over and over again within the same document.
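The practical difference can be sketched as a correction memory that is updated as the post-editor works and applied to later output in the same job. The terms and the simple find-and-replace mechanism below are invented purely for illustration; real adaptive engines update model weights rather than substituting strings.

```python
# Invented sketch of the adaptive idea: remember corrections and
# reuse them on subsequent machine translation output.
correction_memory = {}

def record_correction(mt_term, approved_term):
    # Called each time the post-editor fixes a recurring term.
    correction_memory[mt_term] = approved_term

def apply_corrections(mt_output):
    # Applied to later MT output in the same job.
    for mt_term, approved in correction_memory.items():
        mt_output = mt_output.replace(mt_term, approved)
    return mt_output

record_correction("client", "customer")  # terminology fixed once
print(apply_corrections("The client portal lets each client log in."))
# -> "The customer portal lets each customer log in."
```

With a non-adaptive system, nothing plays the role of `correction_memory`: every occurrence of the unwanted term comes back unchanged and has to be fixed by hand each time.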
Quality metrics
When talking to LSPs, linguists will often come across different metrics designed to indicate the quality of machine translation output. The BLEU score is one of the most popular, and also one of the oldest, in use.
BLEU (Bilingual Evaluation Understudy) is an automated evaluation metric calculated for individual translated segments, usually sentences, by comparing them with a set of good-quality reference translations. The scores are then averaged across the whole corpus to give a global estimate of the translation’s quality.
BLEU score is always a number between 0 and 1 and it indicates how close the assessed machine translated sentences are to the reference texts. Therefore, values closer to 1 will mean more similarity between the MT output and the reference translations. The higher the score, the better the quality of the machine translation is estimated to be.
This automated evaluation method is quick, cost-effective and can easily be applied to vast amounts of data. However, it does not always fully align with human editors’ subjective perception of quality.
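The comparison described above can be sketched in a few lines of Python. This is a minimal sentence-level BLEU based on n-gram overlap with a brevity penalty; real toolkits (such as sacreBLEU or NLTK) add smoothing, multiple references and standardised tokenisation, so treat it as an illustration of the idea, not a production metric.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    # Count all n-word sequences in a token list.
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def bleu(candidate, reference, max_n=4):
    """Simplified sentence-level BLEU: modified n-gram precision
    (n = 1..4) combined by geometric mean, times a brevity penalty
    that punishes translations shorter than the reference."""
    cand, ref = candidate.split(), reference.split()
    precisions = []
    for n in range(1, max_n + 1):
        cand_ngrams, ref_ngrams = ngrams(cand, n), ngrams(ref, n)
        overlap = sum(min(c, ref_ngrams[g]) for g, c in cand_ngrams.items())
        precisions.append(overlap / max(sum(cand_ngrams.values()), 1))
    if min(precisions) == 0:
        return 0.0  # no smoothing in this sketch
    geo_mean = math.exp(sum(math.log(p) for p in precisions) / max_n)
    brevity = 1.0 if len(cand) > len(ref) else math.exp(1 - len(ref) / max(len(cand), 1))
    return brevity * geo_mean

reference = "the cat sat on the mat"
print(bleu("the cat sat on the mat", reference))  # identical -> 1.0
print(bleu("a cat is on the mat", reference))     # 0.0: no 4-gram matches
```

The second result shows why real implementations add smoothing: a single n-gram length with zero matches drives this naive score to zero even though much of the sentence overlaps with the reference.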
A useful productivity measurement is post-editing distance, which can be particularly helpful when machine translation is used by linguists as a productivity tool. It is typically expressed as a percentage between 0% and 100%, applied to entire documents. The percentage indicates what portion of the original machine translation output required some level of editing to be brought to publishable quality, as assessed by human editors. The lower the post-editing distance, the less editing, and therefore the less effort, was required.
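One simple way to compute such a figure is to compare the raw MT output with the post-edited version and measure how much of it changed. The character-level formula below, using Python's standard `difflib` similarity ratio, is a hypothetical sketch; commercial tools define their own (often word-based) post-editing distance formulas.

```python
import difflib

def post_editing_distance(mt_output, post_edited):
    """Hypothetical character-level PED: the share of the MT output
    that changed during post-editing, as a percentage."""
    ratio = difflib.SequenceMatcher(None, mt_output, post_edited).ratio()
    return round((1 - ratio) * 100, 1)

mt = "The contract enter into force on January 1."
edited = "The contract enters into force on 1 January."
print(f"{post_editing_distance(mt, edited)}%")  # some editing was needed
print(f"{post_editing_distance(mt, mt)}%")      # unchanged output -> 0.0%
```

A document left entirely untouched scores 0%, while one rewritten from scratch approaches 100%, which is why a low post-editing distance is read as a sign of useful MT output.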
If you ever have any doubts about the machine translation-related terminology that Toppan Digital Language uses, please feel free to reach out to our Project Managers or Account Managers and they will happily explain.