Methods for Capturing Visual Context

JavaScript Context Capture Library

When you upload resource files to Smartling, there isn’t any visual context for the extracted strings. If your strings are for a website or web application, you can quickly add HTML context by integrating the JavaScript Context Capture Library. Not familiar with visual context or its value? Read more about visual context.

How the JS Library Works

Internet Explorer 11 and earlier versions are not supported.

Smartling’s Context Capture Library is a JavaScript library that can be embedded in a web page to automatically send HTML snapshots of the current page state to a Smartling project.

The library is triggered when a page is rendered in a browser and also as the page UI is updated from user interactions (the DOM changes). You will need to integrate and use it in an environment that is generating page requests and interactions. The library has been calibrated to minimize any impact on page performance. You can use it on your development, staging or production environment.

Your unique Organization Identifier (orgId) will be visible to anyone who inspects your page.

The content you want to contextualize with the Context Capture JavaScript Library must exist in the Smartling platform before this tool can associate context with the strings.

You can upload your content via Smartling's API, Global Delivery Network, Connectors, or manually by dragging and dropping files into a project.

Configure the JS Library

The library uses a unique Organization Identifier (orgId) to upload context for your project. The identifier is available as follows:

  • In the New Experience, from a project, go to the gear icon and select Project Details
  • In the Classic Experience, go to Project Settings > API.

Embedding the Library

JavaScript Embedding

For a Single-Page Application or any other JavaScript-based page, use the following snippet to load the Context Capture Library:

JavaScript:

    (function (w, o) {
      try {
        var h = document.getElementsByTagName('head')[0];
        var s = document.createElement('script');
        s.type = 'text/javascript';
        s.async = 1;
        s.crossOrigin = 'anonymous';
        s.src = '//d2c7xlmseob604.cloudfront.net/tracker.min.js';
        s.onload = function () {
          w.SmartlingContextTracker.init({ orgId: o });
        };
        h.insertBefore(s, h.firstChild);
      } catch (ex) {
        // Fail silently so a loader error never affects the page.
      }
    })(window, 'xbPHx3Meq7RDCMuPKxeb7w');


Here is a full HTML example:

<!DOCTYPE html>
<html lang="en">
<head>
     <meta charset="UTF-8">
     <title>Example HTML Snippet</title>
     <!-- Include script that creates SmartlingContextTracker object in global scope -->
     <script src="//d2c7xlmseob604.cloudfront.net/tracker.min.js"></script>
     <!-- Initialize the SmartlingContextTracker -->
     <script>
            SmartlingContextTracker.init({
                 orgId: 'jq9ftDO23Kn5U8GMnfAX0w',
            });
     </script>
</head>
<body>
     <p>This is some content</p>
</body>
</html>
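
Because the library can run on development, staging, or production environments, you may want to inject the loader only on selected hosts. The sketch below wraps the loader snippet from this section in a host check. The function names `shouldLoadContextCapture` and `loadContextCapture` and the host names are hypothetical; only the script URL and `SmartlingContextTracker.init` come from this article.

```javascript
// Hypothetical helper: decide whether the current host should capture context.
function shouldLoadContextCapture(hostname, allowedHosts) {
  return allowedHosts.includes(String(hostname).toLowerCase());
}

// Hypothetical wrapper around the loader snippet from this article.
// Returns true if the script tag was injected, false otherwise.
function loadContextCapture(win, orgId, allowedHosts) {
  if (!shouldLoadContextCapture(win.location.hostname, allowedHosts)) {
    return false;
  }
  var doc = win.document;
  var head = doc.getElementsByTagName('head')[0];
  var script = doc.createElement('script');
  script.async = true;
  script.crossOrigin = 'anonymous';
  script.src = '//d2c7xlmseob604.cloudfront.net/tracker.min.js';
  script.onload = function () {
    win.SmartlingContextTracker.init({ orgId: orgId });
  };
  head.insertBefore(script, head.firstChild);
  return true;
}

// Only runs in a browser; the host names below are placeholders.
if (typeof window !== 'undefined') {
  loadContextCapture(window, '[your orgId]', ['staging.example.com', 'localhost']);
}
```

Keeping the host check in a separate function makes it easy to verify outside the browser before the loader is deployed.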

HTML Embedding

For an HTML-based page, just reference the script, and initialize the SmartlingContextTracker object:

HTML:

<script type="text/javascript" src="//d2c7xlmseob604.cloudfront.net/tracker.min.js"></script>
<script>
     SmartlingContextTracker.init({
          orgId: '[your orgId]'
     });
</script>

Here is a full HTML example:

<!DOCTYPE html>
<html lang="en">
<head>
     <meta charset="UTF-8">
     <title>Example HTML Snippet</title>
     <!-- Include script that creates SmartlingContextTracker object in global scope -->
     <script src="//d2c7xlmseob604.cloudfront.net/tracker.min.js"></script>
     <!-- Initialize the SmartlingContextTracker -->
     <script>
             SmartlingContextTracker.init({
                 orgId: 'jq9ftDO23Kn5U8GMnfAX0w',
                 host: 'api.smartling.com'
             });
     </script>
</head>
<body>
     <p>This is some content</p>
</body>
</html>

Parameter Reference

Context Capture Library can be configured for a variety of use cases through the use of additional parameters, specified in the initialization function.

    SmartlingContextTracker.init({
      orgId: 'xbPHx3Meq7RDCMuPKxeb7w',
      host: 'api.smartling.com',
      mode: 'auto', // 'auto' (default) or 'manual'
      requestTimeout: 5000,
      snapshotSizeLimitBytes: 5000000,
      resourceFetchLimit: 50,
      instrumentationProcessingEnabled: true,
      overrideContextOlderThanDays: 1,
      clearInputFields: true,
      domContentFilter: (dom) => {
        const inputs = dom.querySelectorAll("input");
        inputs.forEach(input => input.value = "");
        return dom;
      },
      calculateHashsumsSignatureDisabled: false
    });

All parameters except orgId are optional.

| Parameter | Purpose | Valid Values |
| --- | --- | --- |
| orgId | Smartling project to submit context to | Smartling orgId |
| host | API host name. Defaults to api.smartling.com | Valid host name |
| mode | Operation mode. Defaults to auto | auto or manual |
| requestTimeout | Context submission request timeout. Defaults to 2000ms | Positive number |
| snapshotSizeLimitBytes | If the captured HTML page is larger than this number, submission is skipped. Defaults to 2 MB, in bytes | Maximum 40 MB, in bytes |
| instrumentationProcessingEnabled | Removes the invisible characters which may appear as squares in some browsers on Windows operating systems. If true, removes instrumentation marks and transforms HTML into WrappedHTML with <smartling:edit /> tags. Defaults to false | true or false |
| overrideContextOlderThanDays | If specified, submitted context overrides any existing context for matched strings that is older than the specified number of days | Positive number |
| clearInputFields | Whether to clear form input fields before submitting context. Defaults to false | true or false |
| domContentFilter | If specified, should be a function that returns its first parameter. The parameter passed is the cloned DOM, which can be manipulated and filtered, if needed | Function (HTMLElement) => HTMLElement |
| calculateHashsumsSignatureDisabled | If true, the JS Library stops calculating page signatures, potentially increasing traffic for similar pages. Defaults to false | true or false |
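
As an illustration of the domContentFilter parameter, the sketch below blanks form fields and any element carrying a `data-private` attribute before the snapshot is submitted. The `data-private` attribute and the function name `redactPrivateContent` are assumptions made for this example, not Smartling conventions.

```javascript
// Hypothetical filter: blanks form fields and any element marked with a
// data-private attribute (an assumed convention, not a Smartling feature).
function redactPrivateContent(dom) {
  dom.querySelectorAll('input, textarea').forEach(function (field) {
    field.value = '';
  });
  dom.querySelectorAll('[data-private]').forEach(function (element) {
    element.textContent = '';
  });
  // The filter must return the DOM it was given.
  return dom;
}

// Only runs in a browser, after tracker.min.js has loaded.
if (typeof SmartlingContextTracker !== 'undefined') {
  SmartlingContextTracker.init({
    orgId: '[your orgId]',
    domContentFilter: redactPrivateContent
  });
}
```

Because the filter receives the cloned DOM, the changes it makes should not affect what the user sees on the live page.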

Control the Library

The following methods can be used to temporarily enable/disable the library. Call them from the JS console or your code.

  • SmartlingContextTracker.disable() - Disable capture indefinitely for current browser
  • SmartlingContextTracker.enable() - Enable capture library again
  • SmartlingContextTracker.init(options) - Re-initialize the library using different options (see Parameter Reference)
  • SmartlingContextTracker.clearResourcesCache() - Remove any information about locally cached resourceIds
  • SmartlingContextTracker.config() - Display configuration information
  • SmartlingContextTracker.captureContext() - Trigger manual context capture (manual mode only)
  • SmartlingContextTracker.version() - Display version information
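
When the library is initialized with mode set to manual, your code decides when to call captureContext(). Since the loader snippet fetches the script asynchronously, the global object may not exist yet when your code runs. The sketch below guards against that; the helper name `captureContextIfReady` is hypothetical.

```javascript
// Hypothetical helper: trigger a manual capture only if the library is ready.
// Returns true when captureContext() was called, false otherwise.
function captureContextIfReady(globalObject) {
  var tracker = globalObject && globalObject.SmartlingContextTracker;
  if (!tracker || typeof tracker.captureContext !== 'function') {
    return false; // Library not loaded (yet), nothing to do.
  }
  tracker.captureContext();
  return true;
}

// Browser usage: call after a view has finished rendering.
if (typeof window !== 'undefined') {
  captureContextIfReady(window);
}
```

In a Single-Page Application, a natural place to call such a helper is after a route change has finished rendering.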

Capturing Context

Once the library is embedded on your site, context will be uploaded to your Smartling project. The package has been carefully designed to minimize impact on page load times. 

Technical Information

Performance

The Context Capture Library is designed to have the least possible performance and memory impact on the page. Its implementation takes into account page load, UI thread utilization, upload bandwidth, and DOM interaction to avoid issues with performance or browser support.

Page Load

The following steps were taken to make sure the script doesn't impact page load times:

  • Asynchronous loading: Unless the page uses an older design pattern in which scripts are loaded sequentially via <script> tags (see HTML Embedding), the Context Capture Library is loaded via a tiny snippet, which in turn loads the actual library asynchronously, without any impact on page loading time.
  • Small footprint: tracker.min.js is about 50k, smaller than most images.
  • Fetch time: The Context Capture Library script is edge-cached on the Amazon CloudFront CDN and takes under 20ms to fetch. This allows the browser to complete requests fast, without exhausting the limited number of concurrent resource requests. tracker.min.js is served to your visitors from a system of strategically positioned servers around the globe, which offers both fast loading and better availability.

The Context Capture Library will not slow your page load time, and, if it becomes unavailable for any reason, it will not impact your page.

UI Thread Utilization

The browser's most precious resource is CPU time spent in the main UI thread. All JavaScript is executed on this thread, and any delay of 50ms to 100ms will be noticeable to the user. The Context Capture Library avoids these "hangs" by leveraging user event processing in conjunction with built-in page mutation events and throttling the event handlers with the corresponding interval. All event processing is non-blocking, and control is returned from the Library to the browser event loop as fast as possible. The only CPU-intensive task in the thread is updating the document elements that have new or changed visible text. We clone the document structure and instrument the cloned elements with appropriate CSS classes, leaving the original DOM intact. The HTML of the cloned DOM tree is then sent to a web worker, a separate background thread running in the browser.
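
To picture what throttling an event handler means, here is a generic throttle helper attached to a MutationObserver. It is a minimal sketch of the general technique, not the library's actual implementation; the 500 ms interval and all names are assumptions.

```javascript
// Generic throttle: runs `handler` at most once per `intervalMs`.
// `now` is injectable so the behavior can be checked without real timers.
function throttle(handler, intervalMs, now) {
  var clock = now || Date.now;
  var lastRun = -Infinity;
  return function throttled() {
    var current = clock();
    if (current - lastRun < intervalMs) {
      return undefined; // Called too soon, skip this event.
    }
    lastRun = current;
    return handler.apply(this, arguments);
  };
}

// Browser usage: react to DOM changes at most every 500 ms (assumed value).
if (typeof MutationObserver !== 'undefined' && typeof document !== 'undefined') {
  var onDomChange = throttle(function () {
    console.log('DOM changed, a snapshot could be scheduled here');
  }, 500);
  new MutationObserver(onDomChange).observe(document.documentElement, {
    childList: true,
    subtree: true,
    characterData: true
  });
}
```

This variant runs on the leading edge and drops calls that arrive inside the interval; a production handler would likely also schedule a trailing call so the final DOM state is not missed.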

DOM and Script interaction

The script exposes a single identifier to the global scope: SmartlingContextTracker.

The script also uses the slTranslate DOM node property as an expando to track nodes that need their text content matched.

Resource extraction and upload

Almost any HTML document that represents a web page has a number of associated resources (style sheets, images, and fonts) linked to the document content. Capturing these resources along with the HTML content is essential for displaying accurate context to translators.

We do our best to capture resource snapshots by processing the uploaded HTML snapshot in the following way:

  • CSS stylesheets, images and font links are extracted from the HTML snapshot.
  • Extracted links are rewritten to point to Smartling resource storage.
  • Actual resources are captured and uploaded to Smartling resource storage.
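
The first two steps can be pictured with a small sketch. This is only an illustration of the idea, not Smartling's processing code: the storage URL, the naming scheme, and the function name `rewriteResourceLinks` are assumptions, and the regular expression only handles double-quoted href and src attributes on <link> and <img> tags.

```javascript
// Illustrative only: extract resource links from an HTML snapshot and
// rewrite them to point at a (hypothetical) resource storage location.
function rewriteResourceLinks(html, storageBase) {
  var extracted = [];
  // Matches the first double-quoted href/src attribute of <link> and <img> tags.
  var pattern = /(<(?:link|img)\b[^>]*?\s(?:href|src)=")([^"]+)(")/gi;
  var rewritten = html.replace(pattern, function (match, before, url, after) {
    extracted.push(url);
    return before + storageBase + encodeURIComponent(url) + after;
  });
  return { html: rewritten, resources: extracted };
}

var snapshot = '<link rel="stylesheet" href="/css/site.css"><img src="logo.png" alt="">';
var result = rewriteResourceLinks(snapshot, 'https://resources.example.com/');
console.log(result.resources); // the original links, to be fetched and uploaded
console.log(result.html);      // the snapshot with rewritten links
```

The list of extracted links is what the third step would then fetch and upload.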

This way we can ensure that the captured HTML context looks exactly as it did at the time it was captured, even if the original style sheets, images, or fonts are no longer available. Smartling uses several methods to fetch the resources, which cover most use cases:

  • Publicly available resources that reside on the same domain as the original page.
  • Protected resources (private network or firewalled) that reside on the same domain as the original page.
  • Publicly available resources that reside on a different domain than the original page.

Smartling will not be able to fetch protected resources that reside on a different domain than the original page, unless CORS headers are properly configured for that domain.
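
The fetch rules in this section can be summarized as a small decision function. This is a sketch for reasoning about your own resources, not part of the library: the function name, the `isProtected` and `corsConfigured` flags, and the choice to compare host names as the meaning of "same domain" are all assumptions.

```javascript
// Hypothetical summary of the fetch rules described in this section.
function canFetchResource(resourceUrl, pageUrl, options) {
  var opts = options || {};
  var page = new URL(pageUrl);
  var resource = new URL(resourceUrl, pageUrl); // resolves relative links
  if (resource.hostname === page.hostname) {
    return true; // same domain: public and protected resources are covered
  }
  if (!opts.isProtected) {
    return true; // public resource on a different domain
  }
  // Protected resource on a different domain: requires CORS headers.
  return Boolean(opts.corsConfigured);
}

var pageUrl = 'https://app.example.com/dashboard';
console.log(canFetchResource('/img/logo.png', pageUrl, { isProtected: true }));
console.log(canFetchResource('https://intranet.example.net/site.css', pageUrl, { isProtected: true }));
```

Under these assumptions, the first call reports a fetchable resource and the second does not, unless `corsConfigured` is set to true.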
