New Google Cloud service allows you to translate documents while preserving the formatting

by Pisana Ferrari – cApStAn Ambassador to the Global Village

Google Cloud has announced the general availability of Document Translation, a new feature of Translation API Advanced that lets you translate documents in formats such as DOCX, PPTX, XLSX, and PDF while preserving their formatting. For PDFs, this includes support for right-to-left languages and, for native PDFs, preservation of font size, font color, font style, and hyperlinks.

“Formatting matters”, reads the Google Cloud article announcing the new service. In many cases, Google points out, the layout of a document dictates how it should be interpreted: readers navigate text and discern meaning from formatting such as bold or italicized text, and from markup for headers, paragraphs, and columns. Until now, translating a document meant separating the text from its layout attributes, with the document’s structure either lost or recreated after translation. This forced translation teams to do a lot of extra work and maintain a lot of additional code. Those steps are now unnecessary, Google claims: formatting can be retained throughout the translation process, handled directly by Translation API Advanced.

cApStAn co-founder Steve Dept shared the news on his LinkedIn blog, adding that he did not feel entirely comfortable doing so, as he still advocates complete separation of text from layout attributes as good practice in the translation of web-based materials: “let translators focus on the text, not on getting the layout right”. If a platform can export to, say, XLIFF or JSON, so that only the text is processed (layout attributes are tagged), and the translation can then be imported seamlessly back into the native platform, he says, you will still need an optical check, but you will have used your resources efficiently. Well, let’s go and test this and see, Steve adds.
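
To make the export, translate, import workflow that Steve describes more concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the JSON structure (`segments`, `id`, `text`, `layout`), the function names, and the placeholder translator are all hypothetical and do not correspond to any particular platform or to the Google Cloud API.

```python
import json

# Hypothetical platform export: each segment carries its text plus tagged
# layout attributes. This structure is an assumption made for illustration.
EXPORT = json.dumps({
    "segments": [
        {"id": "s1", "text": "Formatting matters",
         "layout": {"style": "heading", "bold": True}},
        {"id": "s2", "text": "Read the full report",
         "layout": {"style": "link", "href": "https://example.com/report"}},
    ]
})


def extract_text(export_json: str) -> dict:
    """Return only the translatable text, keyed by segment id."""
    doc = json.loads(export_json)
    return {seg["id"]: seg["text"] for seg in doc["segments"]}


def import_translation(export_json: str, translated: dict) -> str:
    """Put translated text back into the export, leaving layout untouched."""
    doc = json.loads(export_json)
    for seg in doc["segments"]:
        # Fall back to the source text if a segment was not translated.
        seg["text"] = translated.get(seg["id"], seg["text"])
    return json.dumps(doc, ensure_ascii=False)


def placeholder_translate(text: str) -> str:
    """Stand-in for a human translator or MT engine (not a real service)."""
    return "[fr] " + text


# The translator only ever sees plain text, never the layout attributes.
source_text = extract_text(EXPORT)
translated_text = {sid: placeholder_translate(t) for sid, t in source_text.items()}
result = json.loads(import_translation(EXPORT, translated_text))
```

In this sketch the layout attributes travel with the file but never pass through the translation step, which is the point of the separation. The optical check on the rendered result is still needed afterwards.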


“Google Document Translation Now Generally Available”, Renato Losio, InfoQ, November 14, 2021

“New features for translating content globally”, Sarah Weldon, Google Cloud, November 1, 2021


Photo credit Shutterstock