Smartling's Context Capture JavaScript Library

Most modern websites have dynamic or interactive elements in their design. These elements often control what content is shown to a user based on things like logged-in state, previous visits to the site, or how they got to a page. This makes it difficult to ensure that all translatable content for that website has good visual context for use during the translation and review process.

Smartling's Context Capture JavaScript Library allows you to capture the visual context of any dynamic website or application. It automatically triggers context captures as the visual (rendered) state of a page changes while a user browses your website.

Internet Explorer 11 and earlier versions are not supported. The library automatically detects and disables itself on mobile browsers to avoid performance issues.

Important considerations before you start

While the JS Context Library is a powerful and beneficial tool for capturing dynamic content, it's important to understand how it's designed to work and which scenarios it is best suited for. Here are some things to keep in mind:

Sensitive or personal data

Caution: We advise careful consideration when using the JS Library on live production websites that handle significant amounts of sensitive or personal data.

While the library offers cleanupInputFields, contentFilter, and domContentFilter to filter sensitive data, configuring them site-wide becomes complex when sensitive data appears throughout the site. If your site handles a lot of sensitive information, limit the library to sections and pages that are safe to capture. If fine-tuned filtering isn't feasible, other context capture tools and integrations may be a better fit. Capturing context from a staging or test environment instead of production is the recommended practice.
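One way to limit the library to safe sections is to gate initialization on the current path. The sketch below is only an illustration: the allowlist `SAFE_PATH_PREFIXES` and the helper `isSafePath` are hypothetical names, not part of the library, and the prefixes are placeholders for your own site structure. It assumes tracker.min.js has already loaded.

```javascript
// Hypothetical allowlist of path prefixes that never show personal data.
const SAFE_PATH_PREFIXES = ['/pricing', '/docs', '/blog'];

// Returns true when pathname equals a prefix or sits underneath it.
function isSafePath(pathname, prefixes) {
  return prefixes.some(function (prefix) {
    return pathname === prefix || pathname.startsWith(prefix + '/');
  });
}

// Browser-only: initialize the tracker on allowlisted pages only.
if (typeof window !== 'undefined' && window.SmartlingContextTracker &&
    isSafePath(window.location.pathname, SAFE_PATH_PREFIXES)) {
  window.SmartlingContextTracker.init({
    orgId: 'YOUR-SMARTLING-ORG-ID-HERE',
    cleanupInputFields: true // also clear form inputs before submission
  });
}
```

Matching on a prefix plus a trailing slash keeps a path such as /docsx from being treated as part of /docs.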

Heuristic algorithms and guarantees

Design philosophy: The JS Context Library employs heuristic algorithms as a core part of its design. These algorithms are optimized to efficiently capture relevant visual context from dynamic pages with minimal performance impact.

While highly effective for most scenarios, this heuristic approach means that the library prioritizes common user interactions and significant page changes. It may not capture every possible micro-state or guarantee that 100% of all possible visual context variations will be uploaded in all edge cases.

"Manual mode" (the mode parameter) and "disabling signature calculation" (the calculateHashsumsSignatureDisabled parameter) give you more control over what is captured, but both still rely on the library's heuristics on the backend. Additionally, contexts that have no matched strings will be skipped.

Heuristic algorithms do not guarantee that every possible visual context will be captured, uploaded and stored. Contexts without matched strings will be skipped.

When high control is needed: If your project requires absolute certainty that specific, predefined context snapshots are captured and uploaded without deviation, or if you need precise control over the triggering and submission process, we recommend building a custom programmatic solution on a web browser automation tool such as Selenium, Puppeteer, or Playwright and uploading the HTML snapshots via Smartling's API. This provides greater flexibility and control over the context capture process and lets you control which strings (from which jobs, files, etc.) are associated with the context.
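As a rough sketch of the automation approach, the function below collects HTML snapshots for a fixed list of URLs. It accepts any page-like object that exposes goto(url) and content(), which both Playwright and Puppeteer page objects do. The function name `captureSnapshots` is hypothetical, and the upload step is deliberately left out because it depends on Smartling's API; consult the API documentation for that part.

```javascript
// Collects serialized HTML for a list of URLs using any page-like object
// that exposes goto(url) and content(), as Playwright and Puppeteer pages do.
async function captureSnapshots(page, urls) {
  const snapshots = [];
  for (const url of urls) {
    await page.goto(url);                // navigate and wait for load
    snapshots.push({ url: url, html: await page.content() });
  }
  return snapshots;                      // upload these via Smartling's API
}
```

Because the page is driven by your own script, you decide exactly which states are captured, for example by clicking through a dialog before calling content().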

How the JS library works

Smartling's Context Capture JavaScript Library can be embedded in a web page to automatically send HTML snapshots of the current page state to a Smartling project.

The library is triggered when a page is rendered in a browser and also as the page UI is updated from user interactions (the DOM changes). You will need to integrate and use it in an environment that is generating page requests and interactions. While the library employs various optimization techniques to minimize performance impact, it's important to understand that capturing and processing DOM changes inherently requires computational resources. The library is suitable for use in development, staging, and sometimes production environments, though performance considerations should be evaluated based on your specific use case.

When the library runs in your production environment, HTML snapshots from your end users' sessions are sent to Smartling. This can generate a high volume of context in Smartling and, depending on what end users do on your site, the snapshots may include sensitive information. This content will be visible in the Context Dashboard and in the Context panel of the CAT Tool. For this reason, we recommend implementing the library in a development or staging environment, where you can control what is ingested and visible in Smartling.

The unique Smartling Org Id will be visible to anyone who inspects your page.

The content you want to contextualize with the Context Capture JavaScript Library must exist in the Smartling platform before context can be associated with the strings using this tool.

You can submit content via Smartling's API, Global Delivery Network, and connectors or upload manually by dragging and dropping files into a project.

Configure the JS library

The library uses your unique Smartling Organization Identifier (orgId) to upload context for your project. To locate your Smartling Organization Identifier (orgId), go to Account Settings > API.

Embedding the library

JavaScript embedding

This is the preferred method of embedding the library. It works asynchronously and therefore does not block page loading.

Use the following snippet to load the Context Capture JavaScript Library.

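(function (w, o) {
  try {
    var h = document.getElementsByTagName('head')[0];
    // Create a script tag that loads the library without blocking the page
    var s = document.createElement('script');
    s.type = 'text/javascript';
    s.async = 1;
    s.crossOrigin = 'anonymous';
    s.src = '//d2c7xlmseob604.cloudfront.net/tracker.min.js';
    // Initialize the tracker once the library has finished loading
    s.onload = function () {
      w.SmartlingContextTracker.init({ orgId: o });
    };
    h.insertBefore(s, h.firstChild);
  } catch (ex) {
    // Ignore errors so that a loading failure never breaks the host page
  }
})(window, 'YOUR-SMARTLING-ORG-ID-HERE')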

Here is a full HTML example:

<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <title>Example JS Snippet</title>
    <!-- Initialize the SmartlingContextTracker -->
    <script>
        (function (w, o) {
            try {
                var h = document.getElementsByTagName("head")[0];
                var s = document.createElement("script");
                s.type = "text/javascript";
                s.async = 1;
                s.crossOrigin = "anonymous";
                s.src = "//d2c7xlmseob604.cloudfront.net/tracker.min.js";
                s.onload = function () {
                    w.SmartlingContextTracker.init({ orgId: o });
                };
                h.insertBefore(s, h.firstChild);
            } catch (ex) {
            }
        })(window, "YOUR-SMARTLING-ORG-ID-HERE")
    </script>
</head>
<body>
<p>This is a sample of content</p>
</body>
</html>

HTML embedding

For an HTML-based page, just reference the script, and initialize the SmartlingContextTracker object:

<script type="text/javascript" src="//d2c7xlmseob604.cloudfront.net/tracker.min.js"></script>
<script>
    SmartlingContextTracker.init({
        orgId: 'YOUR-SMARTLING-ORG-ID-HERE'
    });
</script>

Here is a full HTML example:

<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <title>Example HTML Snippet</title>
    <!-- Include script that creates SmartlingContextTracker object in global scope -->
    <script src="//d2c7xlmseob604.cloudfront.net/tracker.min.js"></script>
    <!-- Initialize the SmartlingContextTracker -->
    <script>
        SmartlingContextTracker.init({
            orgId: 'YOUR-SMARTLING-ORG-ID-HERE'
        });
    </script>
</head>
<body>
    <p>This is a sample of content</p>
</body>
</html>

Parameter reference

The Context Capture JavaScript Library can be configured for a variety of use cases through the use of additional parameters, specified in the initialization function.

All parameters except orgId are optional.

Parameter	Purpose	Valid Values
`orgId`	`REQUIRED` Smartling project to submit context to	Your Smartling orgID, found under Smartling Dashboard > Account Settings > API
`host`	`EXPERT` API host name	valid host name, default is api.smartling.com
`mode`	Operation mode toggles between "Auto Mode" and "Manual Mode"	`auto` or `manual`, default is `auto`
`requestTimeout`	Context submission request timeout	values `0 - 60,000ms`, default is `2000ms`
`onInit`	Callback function called when library initialization completes	Function `(enabled: boolean) => void`
`snapshotSizeLimitBytes`	If captured HTML page is larger than this number, submission will be skipped	max `10 MB` in bytes, default is `2 MB` in bytes
`instrumentationProcessingEnabled`	`EXPERT` Removes the invisible characters which may appear as squares in some browsers on Windows operating systems. If `true`, removes instrumentation marks and transforms HTML into WrappedHTML with `<smartling:edit />` tags	`true` or `false`, default is `false`
`overrideContextOlderThanDays`	`EXPERT` Override existing context older than specified days	positive number (recommended: 14-30 days)
`cleanupInputFields`	Whether to clear form input fields before submitting context	`true` or `false`, default is `false`
`contentFilter`	Custom function to filter content before submission	Function `(request, onSuccess, onError) => void`
`domContentFilter`	Custom function to filter DOM content before submission	Function `(HTMLElement) => HTMLElement`
`calculateHashsumsSignatureDisabled`	`EXPERT` The library calculates page content signatures to avoid duplicate submissions of identical content. If `true`, JS Library stops calculating page signatures, potentially increasing traffic for similar pages	`true` or `false`, default is `false`
`resourceFetchLimit`	`EXPERT` The number of missing resources to upload after the context is submitted. It might be useful when the resources (images, CSS, fonts) are not accessible from the public Internet or authorization is required.	values `1 - 100`, default is `1`
`traverseElementsLimit`	`EXPERT` The context submission is skipped if number of elements to process on the page is greater than this value.	values `0 - 50,000`, default is `20,000`
`childElementsLimit`	`EXPERT` The context submission is skipped if any DOM element has more children than this value.	values `0 - 10,000`, default is `2,000`
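The example below shows one possible initialization that combines several of these parameters. The helper `checkTrackerOptions` is hypothetical: it only compares values against the ranges listed in the table above, and the library may apply its own validation differently. It assumes tracker.min.js has already loaded.

```javascript
// Documented numeric ranges from the parameter table: [min, max].
const NUMERIC_RANGES = {
  requestTimeout: [0, 60000],
  resourceFetchLimit: [1, 100],
  traverseElementsLimit: [0, 50000],
  childElementsLimit: [0, 10000]
};

// Hypothetical helper: returns a list of problems found in the options.
function checkTrackerOptions(options) {
  const problems = [];
  if (typeof options.orgId !== 'string' || options.orgId === '') {
    problems.push('orgId is required');
  }
  if (options.mode !== undefined && options.mode !== 'auto' && options.mode !== 'manual') {
    problems.push('mode must be "auto" or "manual"');
  }
  Object.keys(NUMERIC_RANGES).forEach(function (name) {
    const value = options[name];
    const range = NUMERIC_RANGES[name];
    if (value !== undefined &&
        (typeof value !== 'number' || value < range[0] || value > range[1])) {
      problems.push(name + ' must be between ' + range[0] + ' and ' + range[1]);
    }
  });
  return problems;
}

const trackerOptions = {
  orgId: 'YOUR-SMARTLING-ORG-ID-HERE',
  requestTimeout: 5000,             // milliseconds
  cleanupInputFields: true,         // clear form inputs before submission
  overrideContextOlderThanDays: 14, // replace context older than two weeks
  onInit: function (enabled) {
    console.log('Context capture enabled:', enabled);
  }
};

// Browser-only: initialize when the options pass the range checks.
if (typeof window !== 'undefined' && window.SmartlingContextTracker &&
    checkTrackerOptions(trackerOptions).length === 0) {
  window.SmartlingContextTracker.init(trackerOptions);
}
```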

Context override behavior

The overrideContextOlderThanDays parameter allows you to automatically update stale context in your project. When set, any existing context older than the specified number of days will be replaced with newly captured context for matched strings.

Recommended values: 14-30 days are typically sufficient to keep contexts up to date. Using small values (1-3 days) is discouraged as it often masks underlying problems rather than solving them.

Modes

The Context Capture JavaScript Library offers two primary operation modes: auto (default) and manual. Understanding the difference will help you choose the best approach for your specific needs.

Auto Mode (Default)

In auto mode, the library automatically detects changes in the Document Object Model (DOM) and user interactions. It employs heuristic algorithms to decide when to capture and submit a visual context snapshot. This mode is designed to be efficient and capture relevant context with minimal configuration.

Manual Mode

When mode is set to manual during initialization, the library will not automatically capture context. Instead, you gain explicit control over when a snapshot is taken. To trigger a context capture in this mode, you must call the SmartlingContextTracker.captureContext() method.

When to use manual mode:

Manual mode is particularly useful in scenarios where:

Precise control is needed: You require absolute certainty that context is captured at very specific moments or after particular user interactions that the auto mode's heuristics might not prioritize.
Complex single-page applications (SPAs): In highly dynamic SPAs, auto mode might generate too many snapshots or miss crucial states. Manual mode allows you to integrate context capture directly into your application's state management or routing logic.
Avoiding over-capture: If auto mode is capturing context too frequently for your needs, leading to excessive data submission, manual mode allows you to limit captures to only the most important instances.
Integration with automated testing: You can use manual mode in conjunction with automated testing tools to capture context at specific points in your test scripts.

It's important to note that even in manual mode, the library still relies on its internal heuristics to determine if the captured HTML has translatable content before submitting it to Smartling. Contexts without matched strings will be skipped.
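A minimal sketch of manual mode in a single-page application might look like the following. The helper `captureAfterRender` and the 500 ms settle delay are assumptions made for illustration; only the mode parameter and captureContext() come from the library. The sketch also assumes tracker.min.js has already loaded.

```javascript
// Calls captureContext() after a short delay so the UI can finish rendering.
// `tracker` is the SmartlingContextTracker global; `delayMs` is an assumed
// settle time that you would tune for your application.
function captureAfterRender(tracker, delayMs) {
  return new Promise(function (resolve) {
    setTimeout(function () {
      tracker.captureContext();
      resolve();
    }, delayMs);
  });
}

if (typeof window !== 'undefined' && window.SmartlingContextTracker) {
  window.SmartlingContextTracker.init({
    orgId: 'YOUR-SMARTLING-ORG-ID-HERE',
    mode: 'manual'
  });
  // Example trigger: capture after back/forward navigation in an SPA.
  window.addEventListener('popstate', function () {
    captureAfterRender(window.SmartlingContextTracker, 500);
  });
}
```

In a real application you would more likely call captureAfterRender from your router's navigation hook or from the handler that opens a dialog, so that each capture corresponds to a known UI state.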

Content filters

While both contentFilter and domContentFilter allow you to modify content before it's submitted to Smartling, they operate at different stages and on different types of data. Understanding these distinctions will help you choose the right filter for your needs.

domContentFilter

Operates on: A clone of the live Document Object Model (DOM). This is a copy of the webpage's structure and content before it's converted into an HTML string.
Manipulation: Allows direct changes to this DOM clone. You can add, remove, or modify DOM elements and their attributes.
Return mechanism: You should return the modified DOM clone (HTMLElement object). If you return a falsy value (such as false or null), the context capture for that specific snapshot might be skipped.
Primary use: Ideal for structural changes or filtering based on the DOM structure itself. For instance:
- Removing irrelevant layout elements (e.g., headers, footers, sidebars, ad banners) that are not relevant for translation context.
- Hiding or removing dynamic elements like pop-ups (chatbots, cookie banners, GDPR modals), loading spinners, or temporary notifications.
- Isolating specific content sections, for example, if a page contains multiple languages and only one is needed for context.
- Enriching the captured content by adding contextual metadata (e.g., special attributes for translators) to DOM elements.
- Normalizing or cleaning up the DOM structure (e.g., fixing minor structural issues) before it's serialized.

Example:

SmartlingContextTracker.init({
    orgId: 'YOUR-SMARTLING-ORG-ID-HERE',
    domContentFilter: function(clonedDom) {
        // Find an element with the ID 'sensitive-data-section' in the cloned DOM
        const sensitiveSection = clonedDom.querySelector('#sensitive-data-section');
        if (sensitiveSection) {
            // Remove the element from the cloned DOM
            sensitiveSection.parentNode.removeChild(sensitiveSection);
        }
        // Return the modified cloned DOM
        return clonedDom;
    }
});

contentFilter

Operates on: The serialized HTML string. This is the text representation of the page's content after the DOM has been processed and converted to an HTML string. It also receives the page URL.
Manipulation: Allows modifications to this HTML string, typically using string manipulation methods (e.g., replace(), regular expressions). It cannot directly interact with or traverse DOM elements because it's working with a string.
Return mechanism: This is an asynchronous function. To proceed with sending the content, you must call the onSuccess(request) callback, passing the (potentially modified) request object.
Primary use: Best for text-based modifications on the final HTML output. This is useful for:
- Removing or hiding sensitive information from the HTML string (e.g., replacing email addresses, user IDs, or other private data with placeholders).
- Making global text replacements across the entire HTML string.

Example:

SmartlingContextTracker.init({
    orgId: 'YOUR-SMARTLING-ORG-ID-HERE',
    contentFilter: function(request, onSuccess, onError) {
        // Replace all occurrences of "secret-code" with "[REDACTED]" in the HTML string
        request.html = request.html.replace(/secret-code/g, '[REDACTED]');

        // Proceed with submitting the modified HTML
        onSuccess(request); 
    }
});

Limitations: The contentFilter function cannot rely on other functions; all filtering logic must be contained inside it. The function is executed in a browser web worker, so the limitations of that environment apply (e.g., DOM processing is not available).
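To illustrate these limitations, here is a sketch of a filter that keeps its pattern and logic entirely inside the function body and uses only string operations. The email pattern is a simplified example, the variable name `filterOptions` is hypothetical, and passing the caught error to onError is an assumption, since the arguments that onError expects are not described here.

```javascript
const filterOptions = {
  orgId: 'YOUR-SMARTLING-ORG-ID-HERE',
  contentFilter: function (request, onSuccess, onError) {
    try {
      // Pattern is declared inside the function, so nothing external is used.
      const emailPattern = /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g;
      request.html = request.html.replace(emailPattern, '[EMAIL]');
      onSuccess(request);
    } catch (err) {
      onError(err); // assumption: onError accepts the error
    }
  }
};

// Browser-only: pass the options to the tracker once it is available.
if (typeof window !== 'undefined' && window.SmartlingContextTracker) {
  window.SmartlingContextTracker.init(filterOptions);
}
```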

Control the library

The following methods control the library at runtime. Call them from the browser's JavaScript console or from your code.

SmartlingContextTracker.disable() - Disable capture indefinitely for current browser
SmartlingContextTracker.enable() - Enable capture library again
SmartlingContextTracker.init(options) - Re-initialize the library using different options, see parameter reference
SmartlingContextTracker.clearResourcesCache() - Remove any information about locally cached resourceIds
SmartlingContextTracker.config() - Display configuration information
SmartlingContextTracker.captureContext() - Trigger manual context capture (manual mode only)
SmartlingContextTracker.version() - Display version information
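As an example of calling these methods from your own code, the sketch below switches capture on or off based on a URL query parameter. The parameter name `contextCapture` and the helper `applyCaptureSwitch` are hypothetical; only disable() and enable() come from the library.

```javascript
// Hypothetical switch: ?contextCapture=off disables capture for this browser,
// ?contextCapture=on re-enables it. Returns the action that was taken.
function applyCaptureSwitch(tracker, search) {
  const value = new URLSearchParams(search).get('contextCapture');
  if (value === 'off') {
    tracker.disable();
    return 'disabled';
  }
  if (value === 'on') {
    tracker.enable();
    return 'enabled';
  }
  return 'unchanged';
}

// Browser-only: apply the switch using the current page's query string.
if (typeof window !== 'undefined' && window.SmartlingContextTracker) {
  applyCaptureSwitch(window.SmartlingContextTracker, window.location.search);
}
```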

Limit source domains for context

In some scenarios, you may want to limit the domains where the JS Context Library can function and generate context snapshots. For example, you may want to generate context from a UAT (user acceptance testing) environment but not a development environment.

You can restrict the domains used to generate context by navigating to your project settings and entering the Context page (Settings > Contexts).

Within that page, you can change the mode of the JS Context Library. The two options are:

All domains except banned ones
In this mode, the library captures context on all domains unless you specify domains for which it will not generate context.
Only specific domains
This mode turns off the JS Context Library until you specify some allowed domains. Those allowed domains are the only domains where the library will capture context.

Note that changes to these settings can take up to 30 minutes to propagate and be applied.
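The difference between the two modes can be modeled in a few lines. This is a client-side illustration only, not Smartling's implementation: the mode names and the helper `isCaptureAllowed` are invented for the sketch, and exact hostname matching is an assumption, since how subdomains are matched is not described here.

```javascript
// Client-side model of the two project settings. `mode` is either
// 'all-except-banned' or 'only-specific' (names invented for this sketch).
function isCaptureAllowed(hostname, settings) {
  const listed = settings.domains.indexOf(hostname) !== -1;
  // only-specific: capture only on listed domains.
  // all-except-banned: capture everywhere except listed domains.
  return settings.mode === 'only-specific' ? listed : !listed;
}
```

Note that in this model an empty list under 'only-specific' allows nothing, which mirrors the behavior described above: the library stays off until you add allowed domains.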

Technical information

Performance

The Context Capture JavaScript Library uses advanced optimization strategies to minimize its performance footprint while maintaining robust functionality. Through careful engineering of page load behavior, UI thread utilization, upload bandwidth management, and DOM interactions, the library aims to balance comprehensive context capture with minimal impact on user experience.

While the library implements advanced techniques including heuristic algorithms and intelligent throttling to manage resource consumption efficiently, it's important to note that any additional JavaScript functionality will introduce some level of overhead. Our approach focuses on making this overhead as negligible as possible through strategic optimization.

Page load

To minimize the library's impact on page load performance, we've implemented several key optimizations:

Asynchronous loading - The library loads via a lightweight initialization snippet that fetches the main library asynchronously, ensuring it doesn't block critical rendering paths. This approach significantly reduces the impact on initial page load metrics, though legacy implementations using sequential <script> tags may experience different loading characteristics.
Optimized payload - At ~25KB, tracker.min.js maintains a compact footprint comparable to a typical image asset, though this still represents additional bandwidth that should be considered in performance-critical applications.
Global CDN distribution - Using CloudFront's edge network typically enables sub-20ms fetch times in optimal conditions, helping preserve browser connection limits. However, actual performance may vary based on geographic location, network conditions, and CDN cache status.

Our architecture ensures that if the library becomes unavailable, your page functionality remains unaffected. While we've optimized extensively to minimize load time impact, actual performance characteristics will depend on various factors including page complexity, user device capabilities, and network conditions.

UI thread management

Given that JavaScript execution on the main UI thread directly impacts user experience, with delays of 50-100ms becoming noticeable, the library implements several strategies to minimize thread blocking:

Event throttling - Mutation observers and event handlers are carefully throttled to prevent excessive processing
Non-blocking operations - Event handlers return control to the browser quickly, though some processing overhead is unavoidable
Web worker offloading - CPU-intensive HTML serialization tasks are delegated to background threads when possible
Efficient DOM cloning - While DOM cloning and instrumentation do consume resources, we maintain separation from the live DOM to prevent interference

Despite these optimizations, the library's operation—including DOM traversal, cloning, and change detection—does require computational resources that scale with page complexity.
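For readers unfamiliar with event throttling, the generic sketch below shows the idea: a burst of DOM change notifications is collapsed into a single handler run. It is an illustration of the general technique, not the library's own code; the function name `throttle` and the 200 ms interval are assumptions.

```javascript
// Generic trailing throttle: runs `fn` at most once per `waitMs`, using the
// most recent arguments. Illustrative only; not the library's own code.
function throttle(fn, waitMs) {
  let timer = null;
  let lastArgs = null;
  return function () {
    lastArgs = arguments;
    if (timer !== null) return; // a run is already scheduled
    timer = setTimeout(function () {
      timer = null;
      fn.apply(null, lastArgs);
    }, waitMs);
  };
}

// Browser-only: react to DOM changes at most once every 200 ms.
if (typeof MutationObserver !== 'undefined' && typeof document !== 'undefined') {
  const onDomChange = throttle(function () {
    console.log('DOM changed; a capture could be evaluated here');
  }, 200);
  new MutationObserver(onDomChange).observe(document.body, {
    childList: true,
    subtree: true
  });
}
```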

DOM and script interaction

The script exposes the following identifiers to the global scope: SmartlingContextTracker.

The script also uses the slTranslate DOM node property as a custom property to track nodes that need text content matched.

Resource extraction and upload

Almost any HTML document used to represent a web page will have a number of associated resources (style sheets, images and fonts) linked to the document content. Capturing these resources along with HTML content is essential for displaying accurate context to translators.

The library captures resource snapshots by processing the uploaded HTML snapshot in the following way:

CSS stylesheets, images and font links are extracted from the HTML snapshot.
Extracted links are rewritten to point to Smartling resource storage.
Actual resources are captured and uploaded to Smartling resource storage.
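The extract-and-rewrite steps above can be sketched as follows. This is a simplified illustration, not Smartling's implementation: it uses a regular expression for brevity where a real implementation would parse the HTML, it only handles link and img tags (fonts referenced from CSS are not covered), and both `rewriteResourceLinks` and the `toStored` callback are hypothetical names.

```javascript
// Finds href/src URLs on <link> and <img> tags in an HTML string and rewrites
// each one through `toStored`, which maps an original URL to its stored copy.
function rewriteResourceLinks(html, toStored) {
  const found = [];
  const pattern = /(<(?:link|img)\b[^>]*?\b(?:href|src)=)(["'])(.*?)\2/gi;
  const rewritten = html.replace(pattern, function (match, prefix, quote, url) {
    found.push(url); // remember the original URL so it can be fetched
    return prefix + quote + toStored(url) + quote;
  });
  return { html: rewritten, resources: found };
}
```

Ordinary page links such as anchor tags are left untouched, because they are navigation targets rather than resources needed to render the snapshot.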

This way we can ensure that the captured HTML context will look exactly as it looked at the time it was captured, even if the original style sheets, images, or fonts are no longer available. Smartling uses several methods to fetch the resources, which cover most use cases:

Publicly available resources that reside on the same domain as the original page.
Protected resources (private network or firewalled) that reside on the same domain as the original page.
Publicly available resources that reside on a different domain than the original page.

Smartling will not be able to fetch protected resources that reside on a different domain than the original page, unless CORS headers are properly configured for that domain.
