Multi-language Production

25 Feb 2016

Production, ePub3, Digital Content, Multi-language

Multi-language digital content production with languages by paragraph is now a real thing.

For publishers interested in the potential of multi-language digital books (and textbooks in particular), we have created a first test/demonstration ePub3 book.

As mobile devices and the Internet penetrate the globe, LOTE (Languages Other Than English/European) digital strategies become very important in both production and content delivery.

Looking Back

We created a multi-language book in 2012, with each language in a separate section, to understand the capabilities of AZARDI in particular (the first ePub3 reading system available) and of other reading systems in general. You can download that test book here if you are interested:

Around the World in 28 Languages

The content for this is sourced from the Universal Declaration of Human Rights translation site. The interactive map on this page is particularly revealing about language diversity around the world; even at 400+ translations, they have covered only 10% of all languages.

Time moves forward. Needs change. We have now assembled the content using the new IGP:Digital Publisher multi-language parallel paragraph production tools as test cases.

The Process

Govind Satpute (he reads and writes three languages and speaks four) has been working hard developing, using and testing the interface and process with the development team.

We now have the new optional multi-language tools built right into IGP:Digital Publisher. The heart of the multi-language set-up looks like this. How does it work?

The language processing interface. The first language is processed and then paragraph stubs are generated for each additional language. Any language can be the stub source for any new language. There is no source or master language.

Start with any language in IGP:Digital Publisher. First, a single language is fully tagged and ready for release. It can be any language, and it does not need to have any language attributes at this stage.

Next the source language is processed with IDs and language attributes. It now looks like this. 

Everyone has the right to life, liberty and the security of person.
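The markup from the original example is not reproduced here, so the following is only a minimal sketch of what this step could look like. It assumes plain XHTML paragraphs and a made-up ID pattern (`p0001-en`); the function name and the pattern are illustrative assumptions, not the IGP:Digital Publisher implementation.

```python
import xml.etree.ElementTree as ET


def tag_source_language(xhtml: str, lang: str) -> str:
    """Add a lang attribute and a sequential ID to every <p> element.

    The ID pattern (paragraph number plus language code) is hypothetical.
    """
    root = ET.fromstring(xhtml)
    for number, paragraph in enumerate(root.iter("p"), start=1):
        paragraph.set("id", f"p{number:04d}-{lang}")  # hypothetical ID pattern
        paragraph.set("lang", lang)                   # HTML5 lang attribute
    return ET.tostring(root, encoding="unicode")


source = (
    "<section><p>Everyone has the right to life, liberty "
    "and the security of person.</p></section>"
)
tagged = tag_source_language(source, "en")
print(tagged)
```

The paragraph text is untouched; only the attributes needed to pair it with other languages are added.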

Now any other language can be processed in as a set of stems with HTML5 [lang] attributes and language pattern IDs.

Everyone has the right to life, liberty and the security of person.

Chinese stem

Marathi stem
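As a rough illustration of stem generation, the sketch below inserts one placeholder paragraph per new language directly after each source paragraph. The ID pattern, the placeholder text and the `add_stems` name are assumptions made for this example only.

```python
import xml.etree.ElementTree as ET


def add_stems(xhtml: str, source_lang: str, new_langs: list[str]) -> str:
    """Insert an empty stem paragraph per new language after each source paragraph."""
    root = ET.fromstring(xhtml)
    # ElementTree keeps no parent pointers, so build a child -> parent map first.
    parent_of = {child: parent for parent in root.iter() for child in parent}
    sources = [p for p in root.iter("p") if p.get("lang") == source_lang]
    for paragraph in sources:
        parent = parent_of[paragraph]
        base_id = paragraph.get("id").rsplit("-", 1)[0]  # strip the language suffix
        position = list(parent).index(paragraph) + 1
        for offset, lang in enumerate(new_langs):
            stem = ET.Element("p", {"id": f"{base_id}-{lang}", "lang": lang})
            stem.text = f"[{lang} stem]"  # placeholder until the translation is dropped in
            parent.insert(position + offset, stem)
    return ET.tostring(root, encoding="unicode")


tagged_source = (
    '<section><p id="p0001-en" lang="en">Everyone has the right to life, '
    "liberty and the security of person.</p></section>"
)
stacked = add_stems(tagged_source, "en", ["zh", "mr"])
print(stacked)
```

Because the stems are generated from whichever language is passed as `source_lang`, any language already in the stack could serve as the stub source for a new one.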

Finally, a production editor drops the "other language" content into place or, if appropriate, a translator works directly in the interface. All languages inherit the elements, blocks and styling from the source language, so there is no further styling or work to be done. That is, do the production in one language, then add any number of languages and they inherit the production tagging of the first language. This makes it very fast and easy to produce multi-language documents.

Everyone has the right to life, liberty and the security of person.


प्रत्येकास जगण्याचा, स्वातंत्र्य उपभोगण्याचा व सुरक्षित असण्याचा अधिकार आहे.

(Note: No language fonts are provided in the web page with this demo. If you can't see the languages you do not have suitable fonts on your operating system.)
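Dropping translated text into the stems can be pictured with the sketch below. It assumes the stems already carry the same class as the source paragraph and are addressed by a hypothetical ID pattern; the function also reports any stems that are still empty, which is an addition for this example rather than a documented feature.

```python
import xml.etree.ElementTree as ET

# A two-language paragraph stack; the Marathi stem is still empty.
STACK = (
    "<section>"
    '<p class="article" id="p0001-en" lang="en">Everyone has the right to life, '
    "liberty and the security of person.</p>"
    '<p class="article" id="p0001-mr" lang="mr"></p>'
    "</section>"
)


def fill_stems(xhtml: str, translations: dict[str, str]) -> tuple[str, list[str]]:
    """Put translated text into stems by ID and list the IDs that remain empty."""
    root = ET.fromstring(xhtml)
    missing = []
    for paragraph in root.iter("p"):
        paragraph_id = paragraph.get("id")
        if paragraph_id in translations:
            paragraph.text = translations[paragraph_id]
        elif not (paragraph.text or "").strip():
            missing.append(paragraph_id)
    return ET.tostring(root, encoding="unicode"), missing


filled, missing = fill_stems(
    STACK,
    {"p0001-mr": "प्रत्येकास जगण्याचा, स्वातंत्र्य उपभोगण्याचा व सुरक्षित असण्याचा अधिकार आहे."},
)
print(filled)
print(missing)
```

Only the text changes; the element and its class stay exactly as the source language defined them.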

As an example, it takes only 10-15 minutes to get a full language into the Universal Declaration of Human Rights from the source. (Of course the translations already exist!)

A screen grab of IGP:Writer with three parallel language paragraphs assembled together. Language colours are used to make it easier for production and Quality Control inspection.

In IGP:Writer we apply different colours to each language to make it easier for the production editor and Quality Control editor to see the paragraph language stack. There is no limit to the number of languages that can be included in the parallel paragraph stack.

IGP:FoundationXHTML is the powerful heart of the XHTML5 tagging of IGP:Digital Publisher. New demands like multi-language textbooks make it clear that HTML5 is the answer as digital content production challenges become more complex. To even attempt this with an XML-first or CMS system would cost a development fortune and would probably never see the light of day.

Output Processing

Of course production exists to create output formats. With IGP:Digital Publisher the content can be produced as PDFs, eBooks, Static Sites and more. Our language processors enable this multi-language parallel paragraph content to be produced in any way required for any format.

  • Multi-language interactive books (the main purpose of the system).
  • Separate language books in any format package.
  • Books with any combination of languages (not all languages have to be packaged) in any format package.
  • Content processed as separate language sections or separate language books in one package.
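The packaging options above all come down to choosing which languages from the paragraph stack go into a given output. A minimal sketch of that selection, assuming the languages are marked with lang attributes on paragraphs (the `select_languages` name and the sample IDs are hypothetical):

```python
import xml.etree.ElementTree as ET

BOOK = (
    "<section>"
    '<p id="p0001-en" lang="en">Everyone has the right to life, liberty '
    "and the security of person.</p>"
    '<p id="p0001-zh" lang="zh">Chinese stem</p>'
    '<p id="p0001-mr" lang="mr">Marathi stem</p>'
    "</section>"
)


def select_languages(xhtml: str, keep: set[str]) -> str:
    """Return a copy of the content that keeps only paragraphs in the chosen languages."""
    root = ET.fromstring(xhtml)
    for parent in list(root.iter()):      # snapshot, because children are removed below
        for child in list(parent):
            lang = child.get("lang")
            # Paragraphs without a lang attribute are shared content and are kept.
            if child.tag == "p" and lang is not None and lang not in keep:
                parent.remove(child)
    return ET.tostring(root, encoding="unicode")


single_language_book = select_languages(BOOK, {"mr"})
two_language_book = select_languages(BOOK, {"en", "mr"})
print(single_language_book)
print(two_language_book)
```

Running the same function once per language would give separate language books, while passing several codes gives a book with that combination of languages.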

Get the ePub3 Book Here

You can view and download the ePub3 package of this book here:

Living in a Multi-language World Volume 1

You can also view it in AZARDI ePub3 Online Reader here.

AZARDI Bookstore. ReadAnywhere

To read online: Select Books | Multi language | Click the cover | Click Preview Online

In any AZARDI ePub3 Reading System you can use the language selector at the start of any multi-language section to select your language preference.

Each section that has multi-language parallel paragraphs will display the Language Selector at the top of the section. Just select your language preference and get reading. Note that we have left English metadata in the title block. This is a test book!

Unfortunately, while ePub3 and the draft ePub3.1 support different languages, there is no strategy for multi-language book metadata. This means you have to choose "A" language for the metadata to be presented in Reading Systems. We will customize multi-language metadata for AZARDI, but it is unlikely the IDPF will be interested in this content delivery model any time soon as it does not affect major commercial trade publisher or US education publisher models.


We have a set of test book Volumes planned, focusing on languages by continent/region. The reason for these books is to assess production and delivery challenges for each language, and to learn to address the challenges of different combinations of language forms such as alphabets, abugida scripts, logograms and calligraphics.

We are now building more templates and automation into the system, moving ahead from the lessons learned to date. We also have independent translation interfaces on the drawing board so that translation can be done from anywhere around the globe by suitable translators and reviewers.

Look out for more information.

Posted by Govind Satpute, Richard Pipe

