A Beginner’s Guide to Python I18n

Written by Shaumik Daityari · 6 min read

Internationalization, commonly abbreviated as i18n, is the process of adapting your software to support various linguistic and cultural settings. It should not only translate instructions and messages but also take into account how user experience varies across societies. This beginner’s guide to Python i18n helps you get started with popular modules and common tasks to internationalize your application.

Internationalization is a continual process that forms an integral part of software development. If your end-users belong to a variety of cultures, you should build i18n into your software development life cycle, just as you would security and accessibility. Here are the typical considerations during the process of Python i18n:

  • Switch the locale in your application
  • Achieve Unicode compliance
  • Translate strings depending on the locale
  • Support bi-directional languages
  • Handle time zones with respect to the end-user

Let us begin working on this Python i18n tutorial.

Prerequisites for Python i18n

There are certain prerequisites that you should be aware of before preparing your application for i18n. This section is especially important if you are translating an existing application. You should also follow these guidelines if you are starting a new project that needs to comply with i18n principles.

First and foremost, make sure that no output string is hard-coded; every string should instead be returned through a special function. This special function determines any pre-processing and translation that applies to the text before display. We will look at this process shortly, when we discuss packages.

Next, you should enable Unicode in your application. Python 3 does this by default, because its str type holds Unicode text; in Python 2 you have to work with the separate unicode type explicitly. Unicode is a standard that assigns a code point to characters from virtually every writing system, along with digits, currency signs, and other symbols. It goes beyond the ASCII standard for character representation and is necessary to support different languages.

Finally, most tasks in this tutorial are solved through the Python gettext module. It is part of the standard library, so there is nothing to install. You can confirm that it is available by importing it in Python:

import gettext

If the above import throws an error, the problem lies with your Python installation rather than with a missing package, because gettext is not installed through pip.

Let us begin with our basic gettext Python tutorial.

Simple Translations With the gettext Python Module

To display the same string in different languages, wrap it in the translation function provided by the gettext module. This is the special function that we highlighted in the previous section; it returns the text that matches the current locale.

# Import gettext module
import gettext

# Set the locale directory
localedir = './locale'

# Set up your magic function
translate = gettext.translation('appname', localedir, fallback=True)
_ = translate.gettext

# Translate message
print(_("Hello World"))

We first set the locale directory, which contains all translated strings. The gettext function, bound here to the conventional name _(), looks up the given string in the translation catalog stored in the locale directory. Because fallback=True is passed, the original string is returned unchanged if no catalog is found for the current language.

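Working With POT and PO Files in Python

Once you have wrapped all output strings, you need to provide the translations too. Typically, the first step in this process is to generate a master list of all strings to translate. This master list is stored in a text file called a Portable Object Template (POT) file. Each entry records where the string appears, the original string as msgid, and an empty msgstr. Translators copy the template into one Portable Object (PO) file per language and fill in each msgstr. Here is what a typical entry in a translated PO file looks like:

#: src/main.py:12
msgid "Hello World"
msgstr "Translation in different language"

To support several languages, create one subdirectory per language in your locale directory, for example locale/es/LC_MESSAGES/, and save that language's PO file there as appname.po. The gettext module does not read PO files directly: compile each one into a binary MO file (appname.mo) with a tool such as msgfmt, and gettext will search the locale directory for the MO file that matches the requested language.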
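To make the directory layout concrete, here is a self-contained sketch that builds a tiny compiled catalog in a temporary directory and loads it with gettext. The write_mo helper is hypothetical and hand-rolled only so that the example runs without external tools; in a real project you would compile catalogs with a tool such as msgfmt. The "es" language and the sample translation are assumptions for illustration.

```python
import gettext
import os
import struct
import tempfile

def write_mo(path, catalog):
    """Write a minimal little-endian MO file from a {msgid: msgstr} dict."""
    # The entry with an empty msgid is the header; it declares the charset.
    entries = {"": "Content-Type: text/plain; charset=UTF-8\n", **catalog}
    pairs = sorted((k.encode("utf-8"), v.encode("utf-8")) for k, v in entries.items())
    count = len(pairs)
    ids_table_at = 28                        # 7 header fields of 4 bytes each
    strs_table_at = ids_table_at + 8 * count
    offset = strs_table_at + 8 * count       # string data follows both tables

    id_rows, str_rows, id_data, str_data = [], [], b"", b""
    for msgid, _unused in pairs:
        id_rows.append((len(msgid), offset + len(id_data)))
        id_data += msgid + b"\0"
    offset += len(id_data)
    for _unused, msgstr in pairs:
        str_rows.append((len(msgstr), offset + len(str_data)))
        str_data += msgstr + b"\0"

    with open(path, "wb") as handle:
        handle.write(struct.pack("<7I", 0x950412DE, 0, count,
                                 ids_table_at, strs_table_at, 0, 0))
        for length, position in id_rows + str_rows:
            handle.write(struct.pack("<2I", length, position))
        handle.write(id_data + str_data)

# Layout expected by gettext: <localedir>/<language>/LC_MESSAGES/<domain>.mo
localedir = tempfile.mkdtemp()
target = os.path.join(localedir, "es", "LC_MESSAGES")
os.makedirs(target)
write_mo(os.path.join(target, "appname.mo"), {"Hello World": "Hola Mundo"})

spanish = gettext.translation("appname", localedir, languages=["es"])
translated = spanish.gettext("Hello World")
print(translated)
```

Strings that are missing from the catalog are returned unchanged, so a partially translated catalog still produces readable output.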

Handling 8-bit Strings in Python

If you are migrating from Python 2, you may face a mismatch between byte strings (str) and Unicode strings (unicode). Here is a simple Python 2 example:

>>> unicode_string = u"Fu\u00dfb\u00e4lle"
>>> print(unicode_string)
Fußbälle
>>> type(unicode_string)
<type 'unicode'>
>>> utf8_string = unicode_string.encode("utf-8")
>>> utf8_string
'Fu\xc3\x9fb\xc3\xa4lle'
>>> type(utf8_string)
<type 'str'>

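When you mix str and unicode objects, Python 2 tries to decode the byte string as ASCII and raises a UnicodeDecodeError if it contains non-ASCII bytes. The correct way to deal with this issue is to call the .decode() method of the str object so that both values are unicode: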

>>> unicode_decoded = utf8_string.decode("utf-8")
>>> unicode_string == unicode_decoded
True
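For comparison, here is a short sketch of the same round trip in Python 3, where str always holds Unicode text and encoded data lives in the separate bytes type. The variable names are chosen for illustration only.

```python
text = "Fu\u00dfb\u00e4lle"        # str: a sequence of Unicode code points
encoded = text.encode("utf-8")      # bytes: the UTF-8 representation
decoded = encoded.decode("utf-8")   # back to str

print(type(text).__name__, len(text))        # 8 characters
print(type(encoded).__name__, len(encoded))  # 10 bytes, two per accented letter
print(decoded == text)

# Python 3 never decodes implicitly: mixing the types raises TypeError
try:
    text + encoded
except TypeError as error:
    print("cannot mix str and bytes:", error)
```

Because the two types no longer mix silently, encoding mistakes tend to surface at the point where they happen rather than when non-ASCII data first appears.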

Handle Time Zones in Python

The easiest way to handle time zones in Python is through the pytz package. If you use a web framework such as Django, it may already ship with a built-in time zone handler. To get started with pytz, install the package using a package installer.

pip install pytz

Once you have successfully installed the package, you can use it to convert date-time objects to various timezones.

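from datetime import datetime
from pytz import timezone

# Avoid the name "format", which would shadow the built-in function
time_format = "%Y-%m-%d %H:%M:%S %Z%z"

timezones = ['America/Los_Angeles', 'Europe/Madrid', 'America/Puerto_Rico']

for zone in timezones:
    # Current time expressed in the given zone
    now_time = datetime.now(timezone(zone))
    zone_time = now_time.strftime(time_format)
    print(zone, zone_time)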

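The timezone() function of the pytz package returns a time zone object for the given name; passing it to datetime.now() yields the current time in that zone. To convert an existing timezone-aware datetime object to a different zone, call its astimezone() method with the target zone. Note that pytz itself does not translate time zone names into local languages; a library such as Babel can be used for that.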
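As a sketch of converting one fixed instant into several zones, the example below uses the standard library zoneinfo module (Python 3.9 and later) instead of pytz, so it runs without extra packages as long as the system provides time zone data. The meeting time is an arbitrary value picked for illustration; the same astimezone() call also accepts a pytz time zone object.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo  # standard library since Python 3.9

TIME_FORMAT = "%Y-%m-%d %H:%M:%S %Z%z"

# A fixed, timezone-aware instant in UTC (arbitrary example value)
meeting_utc = datetime(2024, 1, 15, 12, 0, tzinfo=timezone.utc)

for zone in ["America/Los_Angeles", "Europe/Madrid", "Asia/Kolkata"]:
    # astimezone() keeps the instant and changes only its representation
    local_time = meeting_utc.astimezone(ZoneInfo(zone))
    print(zone, local_time.strftime(TIME_FORMAT))
```

Storing timestamps in UTC and converting only at display time, as done here, keeps comparisons and arithmetic independent of the user's locale.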

Pluralization With the python-i18n Package

While the gettext module is a good starting point for Python i18n, you may want to explore additional packages that support more features. python-i18n is a library that provides functionality similar to the Rails i18n library.

Use pip to install the python-i18n library:

pip install python-i18n

The simplest, though not the most efficient, way to use this package is the following:

import i18n
i18n.add_translation('String in Language 1', 'String in Language 2')
i18n.t('String in Language 1') # String in Language 2

Use the add_translation() function to store a translation, and use the t() function to look it up. To store translations permanently, keep them in YAML or JSON files. For example, a YAML file could contain:

en:
  str1: String in Language 1

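Save it as translate.en.yml; the file name supplies the translate namespace and the en locale. YAML support depends on PyYAML, which you can install with pip install python-i18n[YAML]. Next, add the directory that contains the file to the load path and look up the translation: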

i18n.load_path.append('path_to_translation')
i18n.t('translate.str1') # String in Language 1

The python-i18n package provides an easy way to implement pluralization, as explained in its documentation:

i18n.add_translation('unread_number', {
    'zero': 'You do not have any unread mail.',
    'one': 'You have a new unread mail.',
    'few': 'You only have %{count} unread mails.',
    'many': 'You have %{count} unread mails.'
})

i18n.t('unread_number', count=0) # You do not have any unread mail.
i18n.t('unread_number', count=1) # You have a new unread mail.
i18n.t('unread_number', count=2) # You only have 2 unread mails.
i18n.t('unread_number', count=15) # You have 15 unread mails.
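For comparison, the standard library offers plural handling through ngettext(). The sketch below uses gettext.NullTranslations, which applies the English-style rule of singular for exactly one and plural otherwise; a real catalog would supply the plural rules of its own language. The function name unread_message and the message wording are assumptions made for this example.

```python
import gettext

# NullTranslations has no catalog: it returns the singular form when
# the count is exactly 1 and the plural form for every other count.
translator = gettext.NullTranslations()

def unread_message(count):
    template = translator.ngettext(
        "You have %(count)d unread mail.",
        "You have %(count)d unread mails.",
        count,
    )
    return template % {"count": count}

print(unread_message(1))   # You have 1 unread mail.
print(unread_message(15))  # You have 15 unread mails.
```

Unlike the python-i18n example, this approach has no dedicated message for zero, so a count of 0 simply selects the plural form.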

Bring It All Together: Switch Between Multiple Languages

This tutorial has enabled you to handle specific translation tasks in Python. In this section, we will work to bring it all together and handle switching between multiple languages.

Ideally, you should define a handler that sets the state of certain variables. These state variables should then drive each Python i18n task that we have discussed earlier. Let us demonstrate this with an example where the time zone is changed as we switch between British English and Indian English.

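# 'en-gb' is the standard language tag for British English
LOCALES = ['en-gb', 'en-in']

TIMEZONES = {
    'en-gb': 'Europe/London',
    'en-in': 'Asia/Kolkata'
}

# Module-level state read by the rest of the application
LOCALE = 'en-gb'
TIMEZONE = TIMEZONES[LOCALE]

def change_locale(locale):
    # Without "global", the assignments below would only create locals
    global LOCALE, TIMEZONE
    if locale not in LOCALES:
        raise ValueError('Unsupported locale: ' + locale)
    LOCALE = locale
    TIMEZONE = TIMEZONES[LOCALE]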

First, define your resources for each locale in Python dictionaries. When a locale change event is triggered, call the function change_locale(), which sets the time zone that corresponds to the new locale. The function also raises an exception if the locale isn’t available in your application. You can similarly change other settings, such as the location of the translation files, within this function.
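One possible way to extend the handler to other settings is to keep the state in an object that also reloads the gettext translator on every switch. This is only a sketch: the class name LocaleState, the domain "appname", and the ./locale directory are assumptions, and it uses en-gb, the standard tag for British English.

```python
import gettext

class LocaleState:
    """Holds the active locale and the settings derived from it."""

    TIMEZONES = {"en-gb": "Europe/London", "en-in": "Asia/Kolkata"}

    def __init__(self, locale, localedir="./locale", domain="appname"):
        self.localedir = localedir
        self.domain = domain
        self.change_locale(locale)

    def change_locale(self, locale):
        if locale not in self.TIMEZONES:
            raise ValueError("Unsupported locale: " + locale)
        self.locale = locale
        self.timezone = self.TIMEZONES[locale]
        # gettext directories are usually named like en_GB, so convert the tag
        language, separator, region = locale.partition("-")
        catalog_name = language + "_" + region.upper()
        # fallback=True keeps the application working without a catalog
        self.translate = gettext.translation(
            self.domain, self.localedir, languages=[catalog_name], fallback=True
        ).gettext

state = LocaleState("en-gb")
state.change_locale("en-in")
print(state.locale, state.timezone, state.translate("Hello World"))
```

Keeping the state on an object rather than in module-level globals makes it easier to hold one locale per user or per request.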

Challenges in Python i18n

While we have seen a basic implementation of i18n in Python in this beginner’s guide, making your Python application compliant comes with its fair share of difficulties.

A challenge that you may have faced earlier in this tutorial is creating the master list of strings to translate. Doing this by hand does not scale, so you will need a tool such as Babel, whose pybabel extract command collects all strings in your application that are marked for translation and writes them to a POT file.
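To illustrate what such an extraction tool does, the sketch below walks the syntax tree of a piece of source code and collects every string passed to _(). It is a simplified illustration built on the standard ast module, not how Babel is implemented; the function name extract_messages and the sample source are hypothetical, and quotes inside messages are not escaped.

```python
import ast

def extract_messages(source, function_name="_"):
    """Return sorted (line, msgid) pairs for each function_name("...") call."""
    found = []
    for node in ast.walk(ast.parse(source)):
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)
            and node.func.id == function_name
            and node.args
            and isinstance(node.args[0], ast.Constant)
            and isinstance(node.args[0].value, str)
        ):
            found.append((node.lineno, node.args[0].value))
    return sorted(found)

sample = 'print(_("Hello World"))\nname = "not marked"\nprint(_("Goodbye"))\n'

# Print the collected strings as POT-style entries with empty translations
for line, msgid in extract_messages(sample):
    print("#: sample.py:%d" % line)
    print('msgid "%s"' % msgid)
    print('msgstr ""')
    print()
```

Only strings wrapped in the marker function are collected, which is why hard-coded output strings never reach the translators.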

Yet another concern arises when you have to switch between left-to-right languages such as English and right-to-left languages such as Arabic. Handling this switch typically requires adjusting key bindings and cursor positioning in addition to translating the text.
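A first step in handling both directions is detecting which one a string uses. The sketch below guesses the direction from the first strongly directional character, using the bidirectional categories exposed by the standard unicodedata module. The function name is_right_to_left is hypothetical, and this heuristic is a simplification rather than a full implementation of the Unicode bidirectional algorithm.

```python
import unicodedata

def is_right_to_left(text):
    """Guess the text direction from its first strongly directional character."""
    for character in text:
        direction = unicodedata.bidirectional(character)
        if direction in ("R", "AL"):  # Hebrew-style and Arabic letters
            return True
        if direction == "L":          # Latin and other left-to-right letters
            return False
    return False  # no strong character found; assume left-to-right

print(is_right_to_left("Hello World"))
# An Arabic greeting, written with escapes to keep the source ASCII-only
print(is_right_to_left("\u0645\u0631\u062d\u0628\u0627"))
```

Digits and punctuation carry no strong direction, so they are skipped until a letter decides the result.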

Translation With Lokalise in Python

While managing your Python i18n tasks completely from scratch is doable, it is advisable to use a translation management system. Lokalise allows you to manage your translations through a central dashboard. Here are the steps that you need to follow:

  • Before using the services, sign up for a free trial.
  • Log in to your account.
  • Create a new project and set a base language.
  • Upload your translation files, and edit the translations as required.
  • Lokalise provides translation services too; go to the order page to get a translation quote.

While Lokalise allows you to manage your translations through a GUI, it also has a CLI tool that you can use to automate the process. A pre-compiled library for Python is unavailable, but you can still run the CLI tool from your server. The documentation lists the complete set of commands that you can use to manage your translation project on Lokalise.

Final Thoughts on Python I18n

In this guide, we helped you get up to speed on Python i18n and explained how you can internationalize your Python application. We covered the built-in gettext Python module and explored the changes that you would need to make to your application to adhere to i18n principles. Further, we explored the python-i18n package, with emphasis on its support for pluralization. We concluded the tutorial with a discussion of some challenges that you may face during the process of Python i18n.

What package do you use for Python i18n? Do let us know in the comments below!

Shaumik is a data analyst by day, and a comic book enthusiast by night (or maybe, he's Batman?). Shaumik has been writing tutorials and creating screencasts for over five years. When not working, he's busy automating mundane daily tasks through meticulously written scripts!