Preparing & Translating Supported File Types

PDF Files

Extension: .pdf
Smartling Identifier: .pdf
Resources: Not applicable

It's best to acquire the file in its original format, such as one of the supported file types (most commonly, InDesign or Microsoft Word), and translate that file. Smartling handles a PDF file by automatically converting it to a Microsoft Word Document. No integration directives are available.

Preparing PDF Files for Translation

Use the original file when possible

For the best results, Smartling recommends using the original file if it is available in a format that Smartling supports. For example, if the original document was actually a Microsoft Word or InDesign document, using the file in its original format typically produces better translation results than using a PDF version of the same document.

If you are translating infographics, user interfaces, or other highly stylized or formatted content, it’s always better to use a supported file format. Smartling supports a number of design tools for such use cases.

Sometimes the PDF documents you want to translate were not created using a supported application, or the original file is simply not available. Some PDF documents might even be "scans" or "images" of written documents for which there is no original digital file at all. This is a good time to take advantage of Smartling's support for PDF documents.

As with any other document, make sure the language the document is written in matches the source language of the Smartling project where it is uploaded. When you upload a PDF document, here is what you and your translators can expect:

Translating PDF Files

Make sure to create a Files Project to manage file translation.

Once you're ready to translate the file, create a Job. To get an idea of the layout and display of the translated file, you can download a pseudo-translated file. From here you can decide whether any adjustments to the source content are necessary.

When translations are complete, download the published translations to your local drive.

Smartling handles PDF documents by first converting them to Microsoft Word format. Translation then follows the standard flow for Word documents. As such, when the translation is completed you will get back a Word document, not a PDF.

As a convenience, we will “attach” the PDF file that you upload as a reference to the converted document, allowing translators to download and review it. The converted document will use the same file name as the PDF with a  “docx” extension appended to it. Visual context in the Smartling CAT tool will depend on the layout of the converted Word document. 

Native PDFs vs. scanned documents

A native PDF is one that is created by an application using digital source content and is ‘saved’, ‘printed’, or ‘exported’ as PDF from a software application.  A scanned document is produced by a document scanner or digital camera; it’s effectively an “image”.

The strings and formatting of native documents should be highly accurate after conversion. The formatting and layout of the converted document should be fairly similar to what you visually see in the PDF. This includes headings and titles, lists, paragraphs and even tables. As with all our standard supported file formats, text that is embedded in images will not be available for translation. Only the native text in these files is extracted when they are converted to Word documents.

Scanned documents are processed using OCR to extract the text. Layout is not retained for such documents; the extracted content is formatted simply as a series of paragraphs. Because OCR can misread characters, the extracted strings may not exactly match the content of the original PDF file.
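
If you are not sure whether a PDF is native or scanned, a rough check is to see how much text can be extracted from each page. The following is a minimal sketch of that idea, not part of Smartling: it assumes you have already extracted the text of each page with a PDF library of your choice, and the function name `looks_scanned` and the 20-character threshold are hypothetical. A scan that already carries an embedded OCR text layer would be reported as native by this heuristic.

```python
def looks_scanned(page_texts, min_chars_per_page=20):
    """Guess whether a PDF is a scan, based on the text extracted per page.

    page_texts: list of strings, one per page (extraction is up to the caller).
    min_chars_per_page: hypothetical threshold below which a page counts as
    having no usable text layer.
    """
    if not page_texts:
        raise ValueError("page_texts must contain at least one page")
    # Count pages whose extracted text is empty or nearly empty.
    sparse = sum(1 for text in page_texts if len(text.strip()) < min_chars_per_page)
    # Treat the file as a scan when most pages have no usable text.
    return sparse > len(page_texts) / 2


# Example: one page with a real text layer, two pages with nothing extractable.
pages = ["Quarterly report for the sales team, page one of three.", "", "  "]
print(looks_scanned(pages))
```

If the check suggests a scan, expect OCR processing and plan extra time to review the extracted strings before authorizing translation.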

Ensure that each side of the PDF page is within the 22-inch maximum. You may need to resize larger files, such as posters.
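
The page-size limit can be checked before upload. The sketch below assumes you already have each page's width and height in PDF points (by default, one point is 1/72 inch), for example as read from the page's media box by a PDF library. The function name `oversized_pages` is hypothetical, and the check ignores page rotation and any custom unit scaling defined in the file.

```python
POINTS_PER_INCH = 72  # default PDF user-space unit is 1/72 inch
MAX_SIDE_INCHES = 22  # limit stated in this article


def oversized_pages(page_sizes_pt, max_inches=MAX_SIDE_INCHES):
    """Return (page_number, width_in, height_in) for each page over the limit.

    page_sizes_pt: iterable of (width, height) tuples in PDF points.
    """
    too_big = []
    for number, (width_pt, height_pt) in enumerate(page_sizes_pt, start=1):
        width_in = width_pt / POINTS_PER_INCH
        height_in = height_pt / POINTS_PER_INCH
        # A page fails the check if either side exceeds the maximum.
        if width_in > max_inches or height_in > max_inches:
            too_big.append((number, width_in, height_in))
    return too_big


# Example: a US Letter page (8.5 x 11 in) and a 24 x 36 in poster.
sizes = [(612, 792), (1728, 2592)]
for number, width_in, height_in in oversized_pages(sizes):
    print(f"Page {number} is {width_in:.1f} x {height_in:.1f} in; resize before uploading")
```

In this example only the poster page is reported, since both of its sides exceed 22 inches.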

Previewing the strings and formatting

After you upload a PDF and Smartling has converted the content into a Word document, you may want to review the strings or the Word document before authorizing translation. This is an opportunity to confirm that the quality and accuracy of the strings, as well as the formatting, are suitable for translation. If you find that the content is not ready for translation, you can download the converted Word document in the source language, edit it, and re-upload it before authorizing translation. You can do this in the Smartling project or job. Alternatively, you can review the source-language strings in the Smartling strings view, but if you want to make changes you'll need to download and edit the file.

Post-translation formatting, also known as Desktop Publishing (DTP)

While DTP is not unique to PDF documents, it may be more important for them than for other file formats once translation is completed. As a best practice for PDF files, first make sure the content strings extracted from your document are accurate and complete before authorizing translation, as noted above. Don’t worry too much about formatting at that point. Once the translations are complete, make any final adjustments to formatting and layout if they are important for your document.
