In 2016, Jaap van der Meer, founder of TAUS, published a provocative piece titled “The future does not need translators.” He argued that as machine translation (MT) software gets better, its output will be good enough for most applications.
Even if machine translation isn’t yet perfect, it can already help humans translate much faster. The quickest way to more efficiently build localized products is to use MT that only requires modest editing by humans.
Note: This article won’t be addressing the disruption machine translation will bring to jobs or defining MT quality. What we will be doing is taking a look at the machine translation landscape to help you decide when MT is worth considering and which engines will work for your business.
What is machine translation?
As the name implies, machine translation (MT) is a process involving some kind of algorithm to perform translation and localization automatically rather than hiring a human specialist. Some even say that we don’t really need human translators anymore, but I’d say that’s a little far-fetched. MT has pros, but of course it has cons, too.
So, here are some good things about machine translation:
- It is very fast: you don’t need to wait hours and days for a specialist to finish their work.
- It is ridiculously cheap. You can translate small and medium texts free of charge using online services like Google Translate. Pricing for longer texts is roughly $0.001 per word, which means you’ll only have to pay 10 bucks for a 10,000-word article. That’s a nice deal in my book.
- There’s no need to spend time finding a suitable candidate and negotiating with them. All you have to do is press a button and wait a while.
- MT engines support many languages, including some that are not very widespread.
However, MT engines do have their drawbacks and you should be aware of them:
- The first and biggest problem is translation quality. Unfortunately, while neural networks are becoming more “clever” and powerful, the final result might be very far from ideal, especially if you need to translate a complex text with lots of special terms. Machine translation is usually suitable for short and simple texts, or for low-visibility or low-importance content. On the one hand, it is better to have at least some translation than no translation at all. On the other hand, the translated text might convey a totally different message than the original, so you should always keep that in mind.
- Some MT engines have strict limitations and cannot translate texts beyond these. For example, Google MT currently doesn’t support texts that are longer than 5000 characters.
- If your source text contains placeholders or HTML tags, neural networks might break these elements or even translate them. For example, Google MT might add extra spaces, remove markup elements, and so on.
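One common way to reduce the placeholder problem is to swap protected elements for opaque tokens before sending text to an MT engine, then restore them afterwards. Here is a minimal sketch of that idea. The regular expression, the `__PH0__` token format, and the `mask`/`unmask` names are assumptions made for illustration; they are not part of any MT engine's API, and an engine may still alter the tokens.

```python
import re

# Illustrative pattern (an assumption, adjust to your own formats):
# HTML tags, printf-style placeholders (%s, %d, %1$s), and {name} / {{name}}.
PROTECTED = re.compile(r"<[^>]+>|%(?:\d+\$)?[sd]|\{\{?\w+\}?\}")


def mask(text):
    """Replace protected spans with opaque tokens; return the text and the spans."""
    spans = []

    def _swap(match):
        spans.append(match.group(0))
        return f"__PH{len(spans) - 1}__"

    return PROTECTED.sub(_swap, text), spans


def unmask(text, spans):
    """Put the original spans back in place of the tokens."""
    return re.sub(r"__PH(\d+)__", lambda m: spans[int(m.group(1))], text)


source = "Hello <b>{name}</b>, you have %d items"
masked, spans = mask(source)
# Stand-in for a real MT call: the tokens are assumed to pass through untouched.
translated = masked.replace("Hello", "Hola").replace("you have", "tienes").replace("items", "artículos")
restored = unmask(translated, spans)
```

If a token comes back missing or changed, it is safer to flag the segment for human review than to guess where the placeholder belongs.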
A brief history of machine translation
In the 1950s, Léon Dostert, the Georgetown University linguist who led the Georgetown-IBM experiment, predicted that fully realized MT could be achieved within just three to five years. Roughly 70 years have passed, and while we’ve seen remarkable advances, we have yet to experience a complete MT engine.
MT developed slowly, step by step, and then all at once. It has gone from terrible to usable faster than ever in the last decade thanks to neural machine translation (NMT).
The three main approaches to machine translation

| Period | Approach |
| --- | --- |
| The beginning (1950s–1980s) | Rule-based machine translation (RBMT): Also known as the “classic approach,” it uses hand-written linguistic rules and dictionaries to map grammar, syntax, and phrases from one language to another. |
| The evolution (1980s–2015) | Statistical machine translation (SMT): Learns patterns from large collections of parallel texts (existing translations) and picks the most probable translation. |
| The paradigm (2016 and beyond) | Neural machine translation (NMT): This is the most sophisticated of the three and where AI truly makes an appearance. These models generally improve as they are trained on more text. You’ll see that most machine translation services today use this technology. |
With the growing power of processors, access to increasing volumes of data, and techniques to mimic our brains through deep learning and neural networks, Ray Kurzweil, Director of Engineering at Google, predicts that by 2029 MT will be good enough to replace most human translators.
The machine translation landscape today
The machine translation software market is growing. Driven by companies’ need to localize their products and improve customer experience, the market is expected to reach a value of $7.5 billion by 2030. In this new economy, there’s no doubt that MT will be an integral part of any global product.
Why are more companies increasing investment in MT vendors and in-house MT programs? The obvious reasons are speed and cost efficiency, but here’s a more comprehensive list:
- Reduce costs by 30–70%
- Faster turnaround for high-volume translations (a translator can double their output while maintaining good quality by switching to post-editing of machine-translated texts)
- Increase the number of supported languages
- Reach zero backlogs in the localization department
- Make end users happier with higher-quality machine-translated content within customer support, user-generated content, and more
- Increase the share of raw MT projects vs. human translation or MT post-editing (MTPE)
- Help translators achieve equally high productivity and transparency on their MT usage
When it comes to machine translation, not all texts are created equal. Below are the MT engines worth considering (and when).
The best machine translation software
Spoiler: There’s no “best” MT system. MT performance depends on how similar your data (language pairs and domains) is to the data used to train the vendors’ models.
However, with a growing number of MT engines on the market, you may be wondering which are actually worth considering.
Luckily, Intento just released their “The State of Machine Translation 2022” report. It’s 63 pages long. Do we read 63-page PDFs? That’s a resounding “No way!”
Do we let the good people of Lokalise’s content team read it and then summarize it into a few bullet points? “Yes we do!” Alright. Here’s what they said that got our attention:
1. They evaluated 31 MT engines and found 16 statistically significant leaders in the market.
Diving deeper into performance for specific content domains and language pairs, they found that:
2. Most engines perform best in English to Spanish (LATAM), Portuguese (Brazil), French (European), and Chinese (Simplified).
3. Google and DeepL dominate more or less all language pairs in the survey.
But just looking at language pairs isn’t enough. The report also looks at how well each MT engine performs across nine different domains.
4. MT engines performed best for user-generated content, product information, technical manuals, documentation, live chat, customer support material, knowledge bases, and software UI.
5. In the healthcare, legal, IT, and financial domains, very few MT vendors perform well. These domains have a lower tolerance for translation mistakes, so a poorly performing MT engine might end up costing you more time and money than anticipated. If you need to use MT in these domains, consider a custom engine and use translation memories combined with heavy human post-editing.
6. Domains like colloquial texts and entertainment show even lower scores across the board. It’s obvious why – computers lack a sense of humor.
So what’s the most effective machine translation software?
Translation quality is a hot topic in localization. But when you think of the word quality, what does it mean to you?
You might say “it works or it’s effective.” Someone else might say it’s “well built.” Another might say quality = perfect. You get the picture. The truth is, quality means striking a balance between resources and results while meeting requirements. With that in mind, here are the two most effective MT solutions:
1. Google Translate has gotten a bad rap in the past – and it’s not entirely fair. While you should never fully rely on translations from Google, it does still offer a simple and affordable way to translate digital content into more than 100 languages.
2. DeepL is a popular tool that currently supports neural machine translation into 29 languages. The translations often sound more native than some other MT options. You can always go back and forth between DeepL and Google MT to see which option suits you better.
At Lokalise, we integrate with Google Translate and DeepL. Combine either of these general MT engines with translation memories (which we’ll get into in the next section) and human post-editing and you’ll get a lightning-fast, cost-efficient workflow.
Note: If machine translation isn’t an option for your business, we have a framework for selecting language service providers, a list of reliable translation companies, and tips for effective collaboration.
When should you use machine translation software and why?
Companies often spend thousands of dollars on copywriting and content creation in English (or another original language), but then neglect customer experience in foreign markets by not prioritizing translations that live up to the same expected standard.
If you want to create content that resonates with your foreign audiences, you need to ensure that translators have the skills and tools needed to make the right choice when it comes to wording, sentence structure, and overall flow. That can mean direct human translation or at least some level of post-editing for transcreation.
Here are the three main content buckets that will give you a general sense of when to use machine versus human translation:
1. Low visibility + low importance → Machine translation
Content like international documentation, customer support articles, and FAQs doesn’t require perfect style and consistency. It needs to be effective, meaning your customer can access the necessary information to solve their problem.
For most content of low visibility and importance, MT will likely suffice to get the job done. We’ll cover how you can incorporate it into your workflow below.
2. High visibility + high importance → Human translation
For high-visibility/high-priority content like website copy, marketing or advertising copy, and sales collateral, you’ll want to make sure that you pay attention to every single detail and consider your company’s specific style guide and tone of voice.
This type of localization requires translators to demonstrate creativity as the words need to be just as impactful for the target audience as they are in the source text. These “few words” could make or break a promotional campaign if not translated properly.
Note: For content in highly regulated environments like fintech and life sciences, where information, wording, and accuracy are vital, there is simply no margin for error. Use professional human review.
3. All other content → Machine translation + human translation / post-editing
For (most of) your other content, the key is to balance your budget, technical capabilities, and quality standards.
Do you have a low pass score for language quality to meet your company’s standards?
If so, you can use MT and in-house resources to get feedback from internal stakeholders.
Are you primarily translating simple text such as buttons or other product UI needs?
If yes, then MT is a cost-effective solution that can work well when combined with partial human QA.
Prioritize time in areas that build foundations, such as optimizing tooling, and areas that evaluate the results of that work, like receiving feedback.
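The three buckets above can be sketched as a small routing function. This is only an illustration of the decision logic, not a Lokalise feature: the `Workflow` enum, the `choose_workflow` name, and the `"low"`/`"high"` labels are hypothetical, and sending regulated content straight to human translation is our reading of the note on regulated environments.

```python
from enum import Enum


class Workflow(Enum):
    MACHINE = "machine translation"
    HUMAN = "human translation"
    HYBRID = "machine translation + human post-editing"


def choose_workflow(visibility, importance, regulated=False):
    """Map a content item to a workflow using the three content buckets.

    visibility and importance are "low" or "high" (hypothetical labels).
    regulated marks content such as fintech or life sciences copy.
    """
    if regulated:
        # Assumption: regulated content always gets professional human handling.
        return Workflow.HUMAN
    if visibility == "high" and importance == "high":
        return Workflow.HUMAN
    if visibility == "low" and importance == "low":
        return Workflow.MACHINE
    # Everything in between: MT plus human translation or post-editing.
    return Workflow.HYBRID


faq = choose_workflow("low", "low")
homepage = choose_workflow("high", "high")
ui_strings = choose_workflow("high", "low")
```

In practice you would tune the buckets to your own budget and quality bar. The value of writing the rules down is that every new content type gets a consistent answer.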
Integrating machine translation into your translation workflow
Cloud-based translation management systems have made it easier than ever to integrate MT into your translation workflow while ensuring quality translations.
We’re not bold enough to make the claim that your brand can do without human translators. That said, Lokalise makes it easier to manage your translation workflow by providing the necessary functionality to automate the bulk of your translation work using MT, translation memories, and glossaries. Before we get into the first steps to using MT in Lokalise, let’s define what these are.
- Translation memory (TM): A database of sentences, or segments of text, and their translations that can be automatically reused when translating similar or identical content. Everything that you (or any other team members) type in the editor, upload, or set via an API is saved automatically for future use.
- Glossary: You’re likely familiar with the term glossary—it’s a list of words, and their meanings, relating to a specific subject. Along with your style guide, a glossary is a core component of the language assets that you will need to keep terminology consistent and lower the risk of incorrect translations. With Lokalise, you can set up a glossary for your projects to define and describe certain words. You can also set whether a term is translatable or case sensitive.
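To make the translation memory idea concrete, here is a toy sketch of how a TM lookup might work: exact matches are reused as-is, and similar segments come back as fuzzy suggestions with a score. It is a simplified illustration, not how Lokalise implements its TM. The `TranslationMemory` class, the `0.75` threshold, and the use of Python’s `difflib.SequenceMatcher` for similarity are all assumptions made for the example.

```python
from difflib import SequenceMatcher


class TranslationMemory:
    """A toy in-memory TM mapping source segments to stored translations."""

    def __init__(self):
        self._segments = {}

    def save(self, source, translation):
        """Store (or overwrite) the translation for a source segment."""
        self._segments[source] = translation

    def lookup(self, source, threshold=0.75):
        """Return (translation, score) for the closest stored segment,
        or None if nothing scores at or above the threshold."""
        best = None
        for stored, translation in self._segments.items():
            score = SequenceMatcher(None, source.lower(), stored.lower()).ratio()
            if score >= threshold and (best is None or score > best[1]):
                best = (translation, score)
        return best


tm = TranslationMemory()
tm.save("Save changes", "Guardar cambios")

exact = tm.lookup("Save changes")        # identical segment, score of 1.0
fuzzy = tm.lookup("Save your changes")   # similar segment, lower score
miss = tm.lookup("Delete account")       # nothing close enough
```

A fuzzy match is only a suggestion: the stored translation belongs to a slightly different source sentence, so a human (or a post-editing step) still needs to adjust it.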
First steps to machine translation in Lokalise
To start a machine translation project in Lokalise:
1. Choose a machine translation engine.
2. Upload the file you want to translate.
3. Start translating. You will see the MT suggestions, and the MT result will be automatically inserted into the text box where you can edit it.
Note: Everything that you type in the editor or accept from MT is automatically saved into the translation memory for future use.
4. You can post-edit the machine translation suggestion, rewrite it entirely, or simply accept it.
5. Once you’re done translating a key, Lokalise will automatically save this translation option to the translation memory. Later, when you’re translating the same phrase for the same language, Lokalise will give you a handful of inline suggestions from the TM.
Pro tip: If you’re using design tools like Figma or Sketch, you can use our integration and populate your designs with MT to check how the design changes based on target language(s). This is a simple way to test the waters with design-stage localization (a powerful way to continuously release fully localized products like mobile apps, web apps, and games).
Beyond localization: MT for customer support, user-generated content, and more
Customer support
It’s been proven again and again: the benefits of supporting your customers in their own language are undeniable. Not only do 42% of Europeans say that they won’t buy products or services in other languages, but 86% of buyers state that they’re willing to pay more for a great customer experience. And there is plenty more to support these claims.
One strategy is to handle customer support requests in other languages by using machine translation tools – copying and pasting customer questions, copying and pasting agent replies – but that does little more than disrupt the agent workflow and increase average interaction handle times.
But if you’re a small team, how will you be able to handle all this? Here’s how our small support team answers every query in 1 minute, in 108 languages.
User-generated content
Customer support isn’t the only place where MT can add value to the user experience.
Airbnb collaborated with Translated (a Lokalise partner) to build a machine translation engine that automatically translated user-generated content, such as reviews, at scale. Airbnb said in their press release that MT had improved the quality of more than 99% of the listings available for the platform’s top 10 languages.
By making information available at scale to all customers, you ensure that they make informed decisions and, ultimately, are happier with their choices. What was once inaccessible to a large part of the customer base has now been made accessible thanks to localization.