Extension | .xml |
Smartling Identifier | xml |
Example File | |
Resources | XML Standards |
Smartling supports XML 1.0 files by processing text within specified tags and attributes. You must specify the tags and attributes you want translated using the translate_paths directive.
Visual Context
Visual context is not automatically provided for XML files. To provide visual context for your XML file, you can upload a context file manually or integrate with Smartling’s JavaScript Library, the Context Capture Google Chrome Extension, or an API.
An alternative method of providing visual context is to allow translators to view the raw XML file as context. Large files can be split into chunks of strings or kilobytes by your Smartling Representative.
To enable raw file image context for a project:
- Go to Settings > Contexts
- Under Use file content as visual context, switch XML on
Keys-Variants
Key and Variant metadata must be enabled and configured using the translate_paths and variants_enabled directives.
Here is an example of using the source_key_paths directive to capture keys from the XML file:
Directive: <!-- smartling.source_key_paths= <path> -->
Example: <!-- smartling.source_key_paths= body/{trans-unit.id} -->
File:
<?xml version="1.0" encoding="utf-8" standalone="no"?>
<!-- smartling.source_key_paths = body/{trans-unit.id} -->
<body>
  <trans-unit extradata="field-_yoast_wpseo_metadesc" resname="field-_yoast_wpseo_metadesc-0" restype="string" datatype="html" id="field-_yoast_wpseo_metadesc-0">
    <source>This is a source string</source>
    <target>This is the translation</target>
  </trans-unit>
</body>
This takes the value of the id attribute of the trans-unit element and uses it as the key.
Specifying Paths
Some directives require you to specify a path or set of paths to keys or strings in the file. A path is a slash-separated string which uses an XPath-like syntax (although not all features of XPath are supported). The node separator is always / (slash).
To specify an attribute, use dot notation: node.attribute.
To specify paths based on an attribute value, use the syntax /node[@attribute="value"].
For the translate_paths directive, ending the path with a trailing / (slash) will also translate all child nodes.
Wildcards are not supported in path definitions.
Example XML file
<data>
  <string name="home-button">Smartling Hotels</string>
  <string name="back-button">Back</string>
  <localize name="navigation">
    <string>Browse Hotels</string>
    <string>About Us</string>
    <string>Site Map</string>
  </localize>
  <localize name="description">
    <string>An excellent budget hotel in New York City</string>
  </localize>
  <images>
    <img src="/img/0156849.png" title="Bedroom - Basic Suite"/>
    <img src="/img/0156849.png" title="Bathroom - Basic Suite"/>
  </images>
</data>
- data/localize/string will match “Browse Hotels”, “About Us”, “Site Map” and “An excellent budget hotel in New York City”.
- data/images/img.title will match “Bedroom - Basic Suite” and “Bathroom - Basic Suite”.
- data/localize[@name="description"]/string will match “An excellent budget hotel in New York City”.
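To make the matching rules above concrete, here is a minimal Python sketch that resolves the three example paths against the example XML file using the standard library's ElementTree. The helper name match_path is hypothetical, and the sketch only approximates the behavior described in this section: it assumes the first path segment names the root element, that dot notation appears only on the last segment, and that attribute values in predicates contain no slashes. It is not Smartling's parser.

```python
import xml.etree.ElementTree as ET

SAMPLE = """<data>
  <string name="home-button">Smartling Hotels</string>
  <string name="back-button">Back</string>
  <localize name="navigation">
    <string>Browse Hotels</string>
    <string>About Us</string>
    <string>Site Map</string>
  </localize>
  <localize name="description">
    <string>An excellent budget hotel in New York City</string>
  </localize>
  <images>
    <img src="/img/0156849.png" title="Bedroom - Basic Suite"/>
    <img src="/img/0156849.png" title="Bathroom - Basic Suite"/>
  </images>
</data>"""


def match_path(root, path):
    """Hypothetical helper: return the values a Smartling-style path selects."""
    segments = path.strip("/").split("/")
    attribute = None
    last = segments[-1]
    # Dot notation (node.attribute) on the last segment selects an attribute.
    if "." in last and not last.endswith("]"):
        segments[-1], attribute = last.rsplit(".", 1)
    # Assumption: the first segment names the root element.
    if segments[0] != root.tag:
        return []
    if len(segments) == 1:
        nodes = [root]
    else:
        # ElementTree understands the [@attribute="value"] predicate as written.
        nodes = root.findall("/".join(segments[1:]))
    if attribute is None:
        return [node.text for node in nodes]
    return [node.get(attribute) for node in nodes if node.get(attribute) is not None]


root = ET.fromstring(SAMPLE)
for path in (
    "data/localize/string",
    "data/images/img.title",
    'data/localize[@name="description"]/string',
):
    print(path, "->", match_path(root, path))
```

The attribute predicate narrows the match to a single localize node, which is why the third path returns only the description string.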
HTML-like Files
Some XML files closely resemble HTML files and are more effectively translated by parsing them as HTML files. Smartling allows you to specify HTML as the file type when uploading an XML file in the dashboard to cope with this type of XML file. If uploading via API, give HTML as the Smartling identifier for the file.
Managing Untranslated Strings
If you use the File API to download Custom XML files from Smartling, you can set the parameter includeOriginalStrings=false to return an empty string when no translation is available. By default, the parameter is set to includeOriginalStrings=true, so Smartling returns the original source string when no translation is available.
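The fallback rule can be summarized in a few lines of Python. This is only a model of the behavior described above, not a call to the File API: the function name delivered_string and its arguments are hypothetical, and a missing translation is represented as None.

```python
def delivered_string(source, translation, include_original_strings=True):
    """Model which text a download returns for one string.

    `translation` is None when no translation is available (an assumption
    made for this sketch).
    """
    if translation is not None:
        return translation
    # No translation: fall back to the source text or to an empty string.
    return source if include_original_strings else ""


print(repr(delivered_string("Back", None)))
print(repr(delivered_string("Back", None, include_original_strings=False)))
```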
XML Characters
The following XML characters are always escaped. You can control this by using the entity_escaping directive.
XML provides a base level of escaping, which this directive does not affect. This directive can only be used to control second-level escaping. There are two levels of escaping for XML: the first level refers to native XML, where all base characters are escaped to generate valid XML. The second level is a functional one, which can be controlled using this directive. See example below.
Character (character name) | Escape sequence |
< (less-than) | &lt; |
> (greater-than) | &gt; |
& (ampersand) | &amp; |
' (apostrophe or single quote) | &apos; |
" (double-quote) | &quot; |
Directives
File directives are supported, both inline and via our API. Directives are specified in comments within the files, in the following format:
Inline File Format
<!-- smartling.[directive_name] = [value] -->
API Parameter
smartling.[directive_name] = [value]
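As a small illustration of the two formats, the Python sketch below builds both forms from a directive name and value. The helper names are hypothetical, and returning the API form as a name/value pair is an assumption about how you would pass it along with an upload request; consult the API documentation for the actual request shape.

```python
def inline_directive(name, value):
    """Format a directive as an XML comment to place inside the file."""
    return f"<!-- smartling.{name} = {value} -->"


def api_parameter(name, value):
    """Format a directive as a (parameter name, value) pair for an API request."""
    return (f"smartling.{name}", str(value))


print(inline_directive("variants_enabled", "true"))
print(api_parameter("variants_enabled", "true"))
```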
Here are examples of supported directives for XML:
Path To Translatable Strings
Directive | translate_paths |
Value | Comma-separated paths to source tags |
Smartling Translate Supported | Yes |
Description | Points to the tags or attributes that contain translatable content. |
Examples |
Example of tag pointers: <!-- smartling.translate_paths = root/node/subnode, root/node2 -->
Example of attribute pointers: <!-- smartling.translate_paths = root/node.name, root/node2.name -->
Example of mixed pointers: <!-- smartling.translate_paths = root/node/subnode, root/node2.name --> |
Control Content For Translation
Directive | sltrans |
Values | translate | notranslate (case-insensitive) |
Smartling Translate Supported | Yes |
Description | Use this directive to enable or disable processing of translation strings in the file. You must turn translation back on once you're done with notranslate. Any content with this tag will not appear in the Smartling dashboard, but will appear in your translated file in your original source language. |
Examples |
<!-- smartling.sltrans = notranslate --> Strings below this directive will be captured as strings, but excluded from translation.
<!-- smartling.sltrans = translate --> Strings below this directive will be translated. |
Change String Parsing
Directive | string_format_paths |
Values |
Currently, supported formats are:
|
Description |
Specifies the format of strings for the specified paths, and can enable HTML inside another file format. It is important to consider the side effects of HTML parsing, including losing keys, and precedence rules for other formats. |
Examples |
<!-- smartling.string_format = html -->
Smartling parses values of all nodes as HTML. |
Use Case |
Use this directive when your document has disparate kinds of strings that are used in different contexts. For example, a single XML file might contain application UI strings with standard string formatting and placeholders, together with one or a few keys that are giant HTML documents (such as Terms of Service or a Privacy Policy). |
Enable Variants to Make Strings Unique
Directive | variants_enabled |
Values | true|TRUE|on|ON OR false|FALSE|off|OFF |
Smartling Translate Supported | Yes |
Description | When enabled, Smartling will make strings unique using variant metadata. Must be used with the translate_paths directive to specify keys, which provides the information needed to generate variant metadata. If you have previously uploaded a file with variants turned off, and re-upload the file with variants on, Smartling will capture all content as new strings. You can configure SmartMatch to automatically match the existing translations. |
Examples | <!-- smartling.variants_enabled = true --> |
Escape Base Characters
Directive | entity_escaping |
Values |
(case-insensitive) |
Smartling Translate Supported | Yes |
Description | Controls whether base characters ( > < & " ) are escaped into entities when delivering translations. This can be set for the whole file via API, or by placing the directive at the start of the file. The directive can also be placed inline to control the behavior of specific strings. |
Examples |
By default, with the "auto" setting, Smartling treats a string as HTML when it detects a tag such as <hr>. When the translated XML file is downloaded with HTML nested inside it, the translated string has its base entities escaped, along with the default XML entities.
Using <!-- smartling.entity_escaping = false --> allows some entities to remain unescaped in the translated string when the translated file is downloaded. |
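As a rough illustration of what escaping base characters into entities means, the following sketch uses Python's standard xml.sax.saxutils helpers. It shows plain XML escaping only. It is not Smartling's implementation and does not model the "auto" HTML detection. The sample string and the EXTRA and REVERSE mappings are made up for the example.

```python
from xml.sax.saxutils import escape, unescape

# escape() always handles &, < and >; quotes must be requested explicitly.
EXTRA = {"'": "&apos;", '"': "&quot;"}
REVERSE = {entity: char for char, entity in EXTRA.items()}

raw = "Tom & Jerry's <\"show\">"

escaped = escape(raw, EXTRA)
print(escaped)  # Tom &amp; Jerry&apos;s &lt;&quot;show&quot;&gt;

# unescape() reverses the mapping, restoring the original text.
restored = unescape(escaped, REVERSE)
print(restored == raw)  # True
```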
Escape Characters the Same as Source
Directive | entity_escaping_strategy |
Values | propagate | none |
Smartling Translate Supported | Yes |
Description |
Used to retain entity escaping for all non-base entities. For example, normally we turn &copy; into ©, but with this directive the translation automatically uses the escaping from the source. For each entity character, we check whether it was escaped in the source and try to match (propagate) it in the target. The default is none, which is the current behavior and recognizes HTML4 entities only. If HTML5 entities are required as well, you must also use the entity_escaping_type=html5 directive.
This does not affect source content at all, so using it will not result in new strings. Numerical entities are not considered at all with this directive, and are treated normally. |
Examples |
<!-- smartling.entity_escaping_strategy = propagate -->
If the same character is both escaped and unescaped in the same string, propagate will return the characters in the translation escaped in the same order as they were in the source. However, if the translation contains a different number of those characters because the translation process removed or added some, and the escaping is inconsistent among them, propagate will escape all entities for that character. propagate only affects non-base entities: all named entities except &amp;, &quot;, &lt;, &gt;. Base entities continue to be controlled by HTML detection and the entity_escaping directive. |
Unescape HTML5
Directive | entity_escaping_type |
Values | html4 (default)|html5 (case-insensitive) |
Smartling Translate Supported | Yes |
Description | By default, all html4 entities are unescaped, except the basic set: < > & ". When this directive is set to html5, all html5 entities will be unescaped as well. If you set this directive to html5, you must also use the entity_escaping_strategy=propagate directive. |
Examples | <!-- smartling.entity_escaping_type = html5 --> |
Force Inline Tags
Directive | force_inline_for_tags |
Values | A comma-separated HTML tag list |
Smartling Translate Supported | Yes |
Description | This parameter forces the HTML parser to treat the listed tags as inline. The difference between block and inline tags is that block tags are used to split HTML into strings, whereas inline tags are included in strings. |
Examples |
<!-- smartling.force_inline_for_tags = external_link,reference --> Any <external_link> or <reference> tags will be parsed as inline tags. Smartling will not create separate strings for content in these tags. |
Translator Instructions
Directive | instruction_paths |
Values | Comma-separated paths to translation instructions. |
Description | Points to where translator instructions are located in the file. |
Examples |
<!-- smartling.instruction_paths = data/unit/instruction, data/sub-level/unit/instruction, data/item/instruction --> |
Variant Strategy (API only)
Directive | variants_strategy |
Values | context_match | repetition_indexed (case-insensitive) |
Smartling Translate Supported | Yes |
Description |
This directive can only be used as an API parameter. It overrides the variants_enabled directive.
context_match: enables ICE variants functionality.
repetition_indexed: enables the "string indexes as variants for repeated strings" functionality that is the default behavior for business docs. |
Examples |
<!-- smartling.force_block_for_tags=br --> Enables string separation by <br> tags |
Standard Placeholder Format
Directive | placeholder_format |
Values | NONE; C; IOS; PYTHON; JAVA; YAML; QT; RESX |
Smartling Translate Supported | Yes |
Description | Used to specify a standard placeholder format. |
Examples |
<!-- smartling.placeholder_format = IOS -->
Specifies iOS-style placeholders for the file. |
Custom Placeholder Format
Directive | placeholder_format_custom |
Values | 1) Custom Java regular expression. 2) NONE - disables any current custom placeholders |
Smartling Translate Supported | Yes |
Description | Specifies a custom placeholder format. Any text in your file matching the regular expression you provide will be captured as a placeholder. |
Examples |
<!-- smartling.placeholder_format_custom = REGEX -->
<!-- smartling.placeholder_format_custom=\{([^}]+)\} --> Any characters surrounded by curly brackets, e.g., {first name}, will be treated as a placeholder. |
See Placeholders in Resource Files for more on placeholders.
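To check what a custom placeholder expression captures before uploading, you can try it locally. The sketch below runs the curly-bracket pattern from the example with Python's re module. Smartling expects a Java regular expression, so this assumes the pattern behaves the same in both engines, which holds for this simple expression. The sample sentence is made up.

```python
import re

# The same pattern as in the directive example above.
PLACEHOLDER = re.compile(r"\{([^}]+)\}")

text = "Hello {first name}, you have {count} new messages"

# group(0) is the full placeholder, group(1) the text inside the brackets.
placeholders = [match.group(0) for match in PLACEHOLDER.finditer(text)]
names = [match.group(1) for match in PLACEHOLDER.finditer(text)]

print(placeholders)  # ['{first name}', '{count}']
print(names)         # ['first name', 'count']
```

Note that empty brackets ({}) are not matched, because the pattern requires at least one character between them.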
Pseudo Translation
Directive | pseudo_inflation |
Values | Integer from 0 to 100 |
Description | Sets the percentage by which original strings are inflated when downloading pseudo translations. If this directive is not set, pseudo translations are 30 percent longer than the original strings. |
Examples |
<!-- smartling.pseudo_inflation = 80 --> Downloaded pseudo translations will increase the length of original strings by 80 percent. |
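The arithmetic behind the percentage can be sketched as follows. The function name inflated_length is hypothetical, and rounding up to a whole character is an assumption. How Smartling rounds, or which characters it adds, is not specified here.

```python
import math


def inflated_length(original_length, inflation_percent=30):
    """Estimate the length of a pseudo translation.

    The default of 30 mirrors the documented behavior when the directive
    is not set. Rounding up is an assumption made for this sketch.
    """
    if not 0 <= inflation_percent <= 100:
        raise ValueError("pseudo_inflation accepts values from 0 to 100")
    return math.ceil(original_length * (100 + inflation_percent) / 100)


print(inflated_length(20, 80))  # 36
print(inflated_length(20))      # 26
```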
Trim Whitespace
Directive | whitespace_trim |
Values | on|yes|true or off|no|false or leading|trailing. The default value is on. |
Smartling Translate Supported | Yes |
Description |
Whitespace is any character or series of characters that represents horizontal or vertical space in typography. When rendered, a whitespace character is not a visible mark, but it does occupy space on a page. Although whitespace is necessary within a string (typically to separate words), unnecessary whitespace can be found at the start of a string (leading) and at the end of a string (trailing).
This directive enables or disables whitespace trim management for ingested strings. Whitespace is optionally trimmed from content and then re-inserted on download, so that translators do not have to manage the extra spaces. However, content owners may want to retain surrounding whitespace. By default, leading and trailing whitespace is trimmed. You can disable trimming, or trim only leading or only trailing whitespace. The directive can only be used as an API request parameter. |
Examples |
<!-- smartling.whitespace_trim=on --> Smartling will trim leading and trailing whitespaces (default)
<!-- smartling.whitespace_trim=off --> Smartling will not trim leading or trailing whitespaces
<!-- smartling.whitespace_trim=leading --> Smartling will trim only leading whitespaces
<!-- smartling.whitespace_trim=trailing --> Smartling will trim only trailing whitespaces |
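The trim-then-reinsert behavior described for this directive can be modeled in a few lines of Python. This is a sketch of the idea only: the function names are hypothetical, only the four canonical values (on, off, leading, trailing) are handled, and Python's notion of whitespace in str.lstrip and str.rstrip may differ from Smartling's.

```python
def split_whitespace(text, mode="on"):
    """Split text into (leading, core, trailing) according to the trim mode."""
    leading = trailing = ""
    core = text
    if mode in ("on", "leading"):
        stripped = core.lstrip()
        leading = core[: len(core) - len(stripped)]
        core = stripped
    if mode in ("on", "trailing"):
        stripped = core.rstrip()
        trailing = core[len(stripped):]
        core = stripped
    return leading, core, trailing


def reinsert(leading, translated_core, trailing):
    """Put the trimmed whitespace back around the translated text on download."""
    return leading + translated_core + trailing


leading, core, trailing = split_whitespace("  Hello world \n", mode="on")
print(repr(core))                                         # 'Hello world'
print(repr(reinsert(leading, "Bonjour le monde", trailing)))  # '  Bonjour le monde \n'
```

With mode="off" the translator sees the string exactly as it appears in the file, including the surrounding spaces.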