This article is for anyone who is using the Global Delivery Network.
The Smartling Global Delivery Network (GDN) is a translation proxy tool that enables you to offer translated websites and web applications without the need to internationalize your site or host and manage translated content within your systems. It also allows you to modify translations and have those changes reflected on your production site within seconds. To understand how it works, let’s first look at how web browsers display websites.
How Browsers Display Websites
Simple Web Page
When you ask your browser to display a web page by entering the page address (URL), the browser opens a communication channel with the server where the website is hosted, then sends a request asking for the page specified in the URL path (e.g. ‘/products/product_A.html’). Upon receipt of the request, the server sends a response back to the browser over the same communication channel. The response contains the web page content, which the browser then displays on your computer.
Simple Web Page with GDN
With the GDN, the same steps are followed with one key difference: the communication channel passes through the GDN. The browser sends the request to the GDN, which forwards it to the web server. The server sends the response back to the GDN, which then forwards it back to the browser.
This enables the GDN to modify both the request and response as they pass through it. In particular, it enables the GDN to substitute in translations in place of the original language content sent in the response from the server. Thus, a web page that leaves the web server in English might arrive at the web browser in French.
The diagram below depicts a typical GDN setup for an English (untranslated) website and a French (translated) website:
This approach, in which the GDN sits between the browser and the web server, is called proxying. Only translated pages go through the GDN translation proxy.
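The request and response flow described above can be sketched in a few lines of Python. This is only an illustration of the proxying idea, not how the GDN is implemented: the translation table, the `origin_server` stand-in, and the naive string substitution are all assumptions made for the example.

```python
from typing import Callable, Dict

# Hypothetical translation table: source string -> French translation.
TRANSLATIONS_FR: Dict[str, str] = {
    "Welcome": "Bienvenue",
    "Our products": "Nos produits",
}


def origin_server(path: str) -> str:
    """Stand-in for the source web server; it always answers in English."""
    return "<h1>Welcome</h1><p>Our products</p>"


def translation_proxy(path: str, fetch: Callable[[str], str]) -> str:
    """Forward the request to the origin, then swap in translations on the way back."""
    body = fetch(path)  # the request passes through the proxy to the origin
    for source, target in TRANSLATIONS_FR.items():
        # Naive text replacement, for illustration only.
        body = body.replace(source, target)
    return body


if __name__ == "__main__":
    print(translation_proxy("/products/product_A.html", origin_server))
```

The page leaves `origin_server` in English and reaches the caller in French, which is the effect described in this section.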
Link Rewriting
Once a user has reached a translated site delivered by the GDN, you need to keep the user on the translated version. To do this, the GDN uses a technique called "link rewriting". Links pointing to the source site are updated automatically to point to the translated version of the site.
For example, a link to www.company.com/about might be updated to fr.company.com/about on your French site that's using the GDN. This ensures that if the user clicks the link, the request will be sent through the GDN, and they will receive the translated version of the "About" page. The GDN has default link rewriting behaviors, and these can be customized to your specific needs.
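As a rough sketch of the idea, the rewrite can be thought of as swapping the host of any link that points at the source site. The function name and the hard-coded hostnames are assumptions taken from the example above; the GDN's real rules are configurable and more involved.

```python
from urllib.parse import urlsplit, urlunsplit

SOURCE_HOST = "www.company.com"      # source site from the example above
TRANSLATED_HOST = "fr.company.com"   # French site served through the GDN


def rewrite_link(url: str) -> str:
    """Point links at the translated host; leave every other link untouched."""
    parts = urlsplit(url)
    if parts.netloc == SOURCE_HOST:
        parts = parts._replace(netloc=TRANSLATED_HOST)
    return urlunsplit(parts)


if __name__ == "__main__":
    print(rewrite_link("https://www.company.com/about"))
    print(rewrite_link("https://cdn.example.net/logo.png"))
```

Relative links such as `/about` have no host and are returned unchanged, since they already resolve against the translated site.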
Content Capture and Translation
When the GDN is processing a response from the server, its main job is to replace the source-language content with the corresponding translations. To do this, it must extract the translatable content from the source page, retrieve the associated translations for this content, and insert the translations into the page before sending it on to the browser.
For HTML content, the GDN automatically extracts certain content from the page and breaks it into separate translatable pieces called "strings", based on block-level tags within the HTML. In the example below, the two sentences will be captured as two separate strings for translation in Smartling.
<div>
This will be ingested as <span>one</span> string with
some <a href="/a.html">inline tagging</a>.
</div>
<p>
This will be a second string.
</p>
This is because the <div></div> and <p></p> tags are both considered "block-level" tags in HTML, whereas the <span></span> and <a></a> tags found in the first sentence are considered "inline" tags. Inline tags will be visible to Translators as part of the content to be translated, and the translation tool will ensure that these tags are appropriately included in the translation as well. Details on which tags are considered block versus inline, and on how to change this behavior, can be found in our article on Changing content segmentation.
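The segmentation rule can be approximated with Python's standard `html.parser` module. This is a minimal sketch under stated assumptions: `BLOCK_TAGS` contains only the two block-level tags from the example, the `Segmenter` class is hypothetical, and whitespace is collapsed for readability. It is not the GDN's actual parser.

```python
from html.parser import HTMLParser
from typing import List

BLOCK_TAGS = {"div", "p"}  # assumption: only the block-level tags used in the example


class Segmenter(HTMLParser):
    """Split HTML into strings at block-level tags, keeping inline tags in the text."""

    def __init__(self) -> None:
        super().__init__()
        self.strings: List[str] = []
        self._buffer: List[str] = []

    def _flush(self) -> None:
        # Collapse runs of whitespace and store the string if anything is left.
        text = " ".join("".join(self._buffer).split())
        if text:
            self.strings.append(text)
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag in BLOCK_TAGS:
            self._flush()  # a block-level tag starts a new string
        else:
            self._buffer.append(self.get_starttag_text())  # keep inline tags

    def handle_endtag(self, tag):
        if tag in BLOCK_TAGS:
            self._flush()  # a block-level tag ends the current string
        else:
            self._buffer.append(f"</{tag}>")

    def handle_data(self, data):
        self._buffer.append(data)


def segment(html: str) -> List[str]:
    parser = Segmenter()
    parser.feed(html)
    parser.close()
    parser._flush()  # pick up any trailing text outside block-level tags
    return parser.strings


if __name__ == "__main__":
    page = (
        '<div>\nThis will be ingested as <span>one</span> string with\n'
        'some <a href="/a.html">inline tagging</a>.\n</div>\n'
        '<p>\nThis will be a second string.\n</p>\n'
    )
    for string in segment(page):
        print(string)
```

Run against the example above, the sketch yields two strings, with the `<span>` and `<a>` tags preserved inside the first one.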
In addition to the content within block-level tags, the GDN will automatically capture separate strings for translation from the following attributes:
'title' for all tags
'alt' for <img> & <input> tags
'label' for <optgroup> tags
'content' for <meta name="keywords"...>
'content' for <meta name="description"...>
Additional attributes can be added to this list via a configuration change in your Smartling dashboard.
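To make the default attribute rules concrete, here is a small sketch that collects the attribute values listed above from a fragment of HTML. The `is_translatable` helper and `AttributeCollector` class are hypothetical names used only for this example.

```python
from html.parser import HTMLParser
from typing import Dict, List, Optional


def is_translatable(tag: str, attr: str, attrs: Dict[str, Optional[str]]) -> bool:
    """Mirror the default attribute list described above."""
    if attr == "title":
        return True
    if attr == "alt":
        return tag in ("img", "input")
    if attr == "label":
        return tag == "optgroup"
    if attr == "content" and tag == "meta":
        return attrs.get("name") in ("keywords", "description")
    return False


class AttributeCollector(HTMLParser):
    """Collect attribute values that would be captured as separate strings."""

    def __init__(self) -> None:
        super().__init__()
        self.strings: List[str] = []

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        for name, value in attrs:
            if value and is_translatable(tag, name, attr_map):
                self.strings.append(value)


def collect_attribute_strings(html: str) -> List[str]:
    collector = AttributeCollector()
    collector.feed(html)
    collector.close()
    return collector.strings


if __name__ == "__main__":
    sample = (
        '<img src="a.png" alt="Red shoe" title="Zoom">'
        '<meta name="description" content="Shoes for sale">'
        '<meta name="viewport" content="width=device-width">'
    )
    print(collect_attribute_strings(sample))
```

The `viewport` meta tag is skipped because only the `keywords` and `description` meta tags are in the default list.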
For translatable content stored in JavaScript, JSON or other formats, the GDN will extract content that is explicitly identified for translation through the use of special directives. This approach is taken so as not to break any code that is dependent on the values of this content. Additional details on processing this type of content are available in Capturing JSON Content and Capturing JavaScript Content.
Once the content is extracted, the GDN searches its translation memory for matching translations in the language being served. If a match is found, it is put into the page in place of the original source language content.
If no matching translation is found, the new content without translations is automatically sent into the Smartling translation workflow to be translated (usually by human Translators). As soon as the translation is published in the Smartling workflow, it will automatically be included in any responses that contain it, and thus will appear on the translated web page.
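The lookup-or-submit behavior can be summarized in a short sketch. The dictionary standing in for the translation memory and the `pending` set standing in for the translation workflow are assumptions for illustration; the real system is a hosted service, not an in-memory structure.

```python
from typing import Dict, List, Set


def translate_strings(
    strings: List[str],
    memory: Dict[str, str],
    pending: Set[str],
) -> List[str]:
    """Return translations where available; queue everything else for translation."""
    output = []
    for source in strings:
        if source in memory:
            output.append(memory[source])  # match found: serve the translation
        else:
            pending.add(source)            # no match: submit for translation
            output.append(source)          # serve the source text in the meantime
    return output


if __name__ == "__main__":
    memory = {"Add to cart": "Ajouter au panier"}
    pending: Set[str] = set()
    print(translate_strings(["Add to cart", "New arrivals"], memory, pending))
    # Once a translation is published, later responses pick it up automatically.
    memory["New arrivals"] = "Nouveautés"
    print(translate_strings(["Add to cart", "New arrivals"], memory, pending))
```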
Images, CSS, and other Resources
Most web pages incorporate other types of content outside of HTML, such as images, style information (CSS) and JavaScript, which are also required for the browser to display the page, but which are not included directly in the first response. Instead, the first response back to the browser usually contains some HTML content along with a list of additional content that is also needed before the page can be displayed. The locations of this content are included as URLs in the initial page response, allowing the browser to follow the same process of opening a connection to the server where the resource is located and requesting each additional resource required to display the page.
Since these additional items often do not contain translatable content, they do not need to pass through the GDN, and by default will bypass it. When they do contain translatable content (for example, in JavaScript or JSON files), the requests to retrieve them and the associated responses are routed through the GDN so that the content can be translated. The bypassing of the GDN for resources that don’t require translation is achieved by "absolutizing" the URLs for those resources. For example, a relative image URL like '/images/im.jpg', which would normally be requested from the same location as the translated page in which it's found, would be modified to 'www.company.com/images/im.jpg'. This would cause it to be requested from the source site, bypassing the GDN.
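Absolutizing is essentially URL resolution against the source site. The sketch below uses `urllib.parse.urljoin` from the Python standard library; the `https` scheme and the `absolutize` function name are assumptions added for the example.

```python
from urllib.parse import urljoin

SOURCE_ORIGIN = "https://www.company.com"  # assumption: the source site is served over HTTPS


def absolutize(resource_url: str, page_path: str = "/") -> str:
    """Resolve a resource URL against the source site so the request bypasses the proxy."""
    return urljoin(SOURCE_ORIGIN + page_path, resource_url)


if __name__ == "__main__":
    print(absolutize("/images/im.jpg"))
    print(absolutize("img/a.png", page_path="/products/product_A.html"))
    print(absolutize("https://cdn.example.net/app.css"))
```

URLs that are already absolute pass through unchanged, and page-relative URLs resolve against the path of the page on the source site.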
Thus, the initial page and all required resources can be loaded by the browser, using the GDN where needed, such that all required content is translated.
How Requests Get Sent to the GDN
There are two ways in which requests can be made to pass through the GDN: one is based on the Domain Name System (DNS), and the other is based on having an additional proxy sitting in front of the GDN. Let’s look at the DNS approach first.
In order for the browser to establish a communication channel with a web server, it must first obtain the unique address of the server. This address is known as the IP address, and it is obtained using an internet service called the Domain Name System (DNS), which associates the name of the server with its IP address. When the browser wants to open a connection to www.company.com, it first asks the DNS for the IP address of www.company.com, and then opens a connection to the server with that address. In order for the browser to open a connection with the GDN instead, the address stored in the DNS for the server name must be set to that of a GDN server.
The DNS-based approach is used when the translated websites can have a different server name from the source-language site. For example, if the French site were called fr.company.com, then the DNS would be configured to associate fr.company.com with the IP address of a GDN server, and when requests sent to this server are received by the GDN, they would then be forwarded to the appropriate source-language server, e.g., www.company.com. Having a different name for the translated server allows requests for untranslated pages to go directly to the source site, bypassing the GDN.
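The DNS-based setup can be modeled with two lookup tables: one for the DNS and one for the GDN's own mapping from translated hostnames to source sites. Everything here is hypothetical, including the IP addresses (taken from ranges reserved for documentation) and the `fr-FR` language code; it only illustrates which server each hostname reaches.

```python
from typing import Dict, Optional, Tuple

# Hypothetical DNS records for the two hostnames in the example.
DNS_RECORDS: Dict[str, str] = {
    "www.company.com": "203.0.113.10",   # source web server
    "fr.company.com": "198.51.100.20",   # a GDN server
}

# Hypothetical GDN-side mapping: translated hostname -> (source hostname, language).
GDN_SITES: Dict[str, Tuple[str, str]] = {
    "fr.company.com": ("www.company.com", "fr-FR"),
}


def resolve(hostname: str) -> str:
    """Simulate the DNS lookup the browser performs before connecting."""
    return DNS_RECORDS[hostname]


def handle_at_gdn(hostname: str) -> Optional[Tuple[str, str]]:
    """Return the source host and language for a translated hostname, if configured."""
    return GDN_SITES.get(hostname)


if __name__ == "__main__":
    for host in ("www.company.com", "fr.company.com"):
        print(host, "->", resolve(host), "| GDN mapping:", handle_at_gdn(host))
```

Because the two hostnames resolve to different addresses, requests for the source site never reach the GDN.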
When the translated websites have the same name as the source website, such as when a sub-folder is used to specify the language (e.g., http://www.company.com/fr-FR), then the DNS-based approach is not as suitable. This is because the browser will send all requests to the same server, including requests for both translated and untranslated pages. While the GDN does support such a configuration, it is usually preferable to only send requests for translated web pages to the GDN, and to send requests for untranslated web pages directly to the source site.
When the translated and untranslated pages use the same server name, an additional server needs to sit between the browser and the GDN to route requests for translated pages to the GDN. This additional server decides which requests to forward to the GDN based on some other information in the request, such as the language folder included in the URL. If a Content Delivery Network (CDN) is in use, it can often perform this task. Alternatively, many web servers can provide the same functionality. Depending on the type of server used by the source site, that server itself could route the request to the GDN, which in turn routes it back to the source site, requesting the source-language page rather than the translated page. In each of these scenarios, the server in front of the GDN is also acting as a proxy.
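The routing decision made by the server in front of the GDN can be as simple as a path-prefix check. The sketch below assumes a single language folder, `/fr-FR`, and uses the hypothetical labels `"gdn"` and `"source"` for the two upstreams; a real CDN or web server would express the same rule in its own configuration language.

```python
from typing import Tuple

LANGUAGE_FOLDERS: Tuple[str, ...] = ("/fr-FR",)  # assumption: one translated folder


def choose_upstream(path: str) -> str:
    """Send language-folder requests to the GDN and everything else to the source site."""
    for folder in LANGUAGE_FOLDERS:
        # Match the folder itself or anything beneath it, but not look-alike
        # paths such as "/fr-FRX".
        if path == folder or path.startswith(folder + "/"):
            return "gdn"
    return "source"


if __name__ == "__main__":
    for path in ("/fr-FR/about", "/about", "/fr-FR", "/fr-FRX"):
        print(path, "->", choose_upstream(path))
```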
Additional information on choosing among these options is available in Choosing a Domain or Routing Strategy and Localized Domain and Routing Options.
Avoiding Source-Language Bleed Through
While content is still being translated, the source language content will be displayed on the translated page instead of the translation, which can result in the translated page having a mixture of translated content and source-language content. A number of approaches are available to avoid source language content appearing on a live translated web page. The most common approach is to access the content on a staging server through the GDN before it goes live. This allows the new content to be discovered and submitted for translation so that a translation is available by the time the content goes live. The other approach is to use the GDN’s translation caching feature, which continues to display the last fully translated version of a page until all translations are available for the new version of the page.
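The translation caching idea can be sketched as "serve the new version only when every string has a translation; otherwise fall back to the last complete version". The `TranslatedPageCache` class below is a hypothetical illustration of that rule, not a description of the GDN's internals.

```python
from typing import Dict, List, Optional


class TranslatedPageCache:
    """Keep the last fully translated version of each page."""

    def __init__(self) -> None:
        self._last_complete: Dict[str, List[str]] = {}

    def render(
        self, path: str, strings: List[str], memory: Dict[str, str]
    ) -> Optional[List[str]]:
        if all(s in memory for s in strings):
            translated = [memory[s] for s in strings]
            self._last_complete[path] = translated  # remember the complete version
            return translated
        # Some translations are missing: fall back to the last complete version.
        # Returns None if no complete version has been seen yet.
        return self._last_complete.get(path)


if __name__ == "__main__":
    memory = {"Sale": "Soldes"}
    cache = TranslatedPageCache()
    print(cache.render("/home", ["Sale"], memory))
    print(cache.render("/home", ["Sale", "New arrivals"], memory))
    memory["New arrivals"] = "Nouveautés"
    print(cache.render("/home", ["Sale", "New arrivals"], memory))
```

In this sketch the second request still shows the earlier version of the page, and the new version appears only after the missing translation has been published.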
Customizing the Behavior of the GDN
While the default behavior of the GDN may address all your needs, there could be areas where you need to customize the behavior to match your requirements more precisely. Here are some examples of when you might want to modify the default behavior of the GDN:
- Excluding certain parts of your site from translation, for example, user-contributed content.
- Disabling link rewriting so that certain links always point to the source site.
- Translating specific content found in JavaScript or JSON files.
- Removing portions of your content from the translated versions of the site.
The GDN provides a number of different approaches to this type of customization. For more information, see Methods to Control Global Delivery Network Behavior.
Visual Context for Translators
Visual context is a powerful capability of Smartling, which enables Translators to produce higher-quality translations in the first translation step. With the GDN, visual context can be captured automatically for HTML content. For content stored in JSON and other formats, Smartling provides a variety of additional methods for capturing the related visual context for your web content.
Search Engine Optimization (SEO)
The GDN is effectively invisible to search engines, and thus it does not have any special effect on SEO. Content delivered via the GDN is fully accessible to search engines. The various translated sites served through the GDN look like separate sites to search engines; each is indexed and shown in search engine results.
However, it is still important to give careful consideration to SEO to ensure you obtain satisfactory SEO results for the translated versions of your website. In addition, it may be necessary to tune GDN behavior to meet your SEO requirements, such as choosing the appropriate server name strategy and ensuring that certain page metadata is appropriately translated.
The GDN fully supports sitemap.xml and robots.txt, which provide additional instructions to search engines on how to browse the site, including what to ignore. It is possible for your web administrator to explicitly set certain pages on the GDN-powered site to be ignored by search engines.
Furthermore, the GDN properly handles "hreflang" tags, which tell search engines that specific sites are different-language versions of the same site.
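For readers unfamiliar with hreflang, the sketch below generates the kind of `<link rel="alternate">` annotations that search engines read. It only shows what the tags look like for a given page; the site list, the `https` scheme, and the `hreflang_links` function are assumptions, and the example says nothing about how the GDN itself processes these tags.

```python
from typing import Dict, List


def hreflang_links(path: str, sites: Dict[str, str]) -> List[str]:
    """Build one alternate-language link tag per language version of a page."""
    return [
        f'<link rel="alternate" hreflang="{lang}" href="{base}{path}">'
        for lang, base in sites.items()
    ]


if __name__ == "__main__":
    sites = {
        "en": "https://www.company.com",
        "fr": "https://fr.company.com",
    }
    for tag in hreflang_links("/about", sites):
        print(tag)
```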
If your SEO practitioners have specific thoughts on how subdomains (fr.example.com), ccTLDs (example.fr), and subdirectories (www.example.com/fr) impact SEO, the good news is that any of the above can be configured for individual language sites with the GDN. The GDN also serves multilingual sites via cookie, without a unique URL. All of this is configurable in the dashboard. Usually, this decision is less about SEO and more about considerations such as which ccTLD domains you already own and which branding strategy you want to use.
Tip: For more information, read our documentation on optimizing your GDN sites for SEO.
Performance
Localized page requests are processed and delivered in approximately 40 milliseconds on average, a delay that is not noticeable to end users.
Thinking of using the Global Delivery Network in your international strategy? See Planning a Global Delivery Network Strategy.