Extension | .csv |
Smartling Identifier | CSV |
Example File | Example CSV with common directives |
Resources | Comma-Separated Values (CSV) files RFC |
CSV is a delimited plain text file that uses a comma to separate values. It is supported by almost all spreadsheets and database management programs, including Google Sheets, Apple Numbers, LibreOffice Calc, and Apache OpenOffice Calc.
Converting Files to CSV
The CSV must be encoded as Unicode (UTF-8) to preserve any special characters in your content. Most programs save files as UTF-8 by default, but the most common spreadsheet program, Microsoft Excel, has some restrictions.
To create a CSV from Excel, choose Save As and select CSV UTF-8 from the file format dropdown.
Screenshot of Microsoft Excel for Mac. Version 16.65.
For information on how to open a CSV successfully in Microsoft Excel, see How to open a CSV file safely in Excel.
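If you generate CSV files from a script rather than a spreadsheet program, the encoding can be set explicitly. The following is a minimal sketch using Python's standard csv module; the function name rows_to_utf8_csv and the sample rows are illustrative only and are not part of Smartling.

```python
import csv
import io


def rows_to_utf8_csv(rows):
    """Serialize rows to CSV text and encode the result as UTF-8 bytes."""
    buffer = io.StringIO(newline="")
    csv.writer(buffer).writerows(rows)
    return buffer.getvalue().encode("utf-8")


# Sample content with special characters (illustrative data).
data = rows_to_utf8_csv([["greeting", "Grüße, Welt"], ["name", "José"]])

# Decoding with the same encoding restores the special characters.
restored = list(csv.reader(io.StringIO(data.decode("utf-8"), newline="")))
```

Writing the bytes in data to a file ending in .csv gives a UTF-8 encoded CSV. Values that contain a comma are quoted automatically by the csv module.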
Content Parsing
Default
If you upload a CSV file without any custom Smartling directives, Smartling will:
- Ingest each cell as a string for translation
- Order the strings by row number, regardless of column order. For example, if the CSV has content in cells A1, C2 and B3, the strings in Smartling are displayed in that order.
- Ingest only the content on the first sheet, excluding the tab name, if the file was created in a non-native program (such as Microsoft Excel)
- Capture HTML and parse it with basic HTML parsing
- Not capture formulas or formatting
- Not capture string key and variant data
- Not create placeholders in the strings. No default regular expression is applied to capture common placeholder formats.
- Not capture any string instructions
- Deliver translations either as one file per language or with all languages in one file
- For the latter, each translation is listed in the cells under the source string, with no language indicator.
For more information, see Content Parsing.
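The row-by-row string order described above can be illustrated with a short sketch. It assumes that empty cells are skipped and only handles columns A to Z; the function name cells_in_ingestion_order is hypothetical and this is not Smartling's actual parser.

```python
import csv
import io


def cells_in_ingestion_order(csv_text):
    """Return non-empty cells row by row, left to right within each row."""
    ordered = []
    for row_number, row in enumerate(csv.reader(io.StringIO(csv_text)), start=1):
        for column_index, value in enumerate(row):
            if value.strip():
                # Spreadsheet-style label; this simple mapping covers A-Z only.
                column_letter = chr(ord("A") + column_index)
                ordered.append((f"{column_letter}{row_number}", value))
    return ordered


# Content in cells A1, C2 and B3, as in the example above.
sample = "first,,\n,,second\n,third,\n"
order = cells_in_ingestion_order(sample)
```

Here order lists the cells as A1, C2, B3, because rows are read top to bottom before columns are considered.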
HTML
If a string contains HTML, Smartling automatically parses the strings in the file with basic HTML parsing, even without any HTML directive. Note that this basic parsing does not break the CSV value containing HTML into smaller strings. The value is ingested as one string.
If this is not desirable, and you want the value to be broken down into smaller strings in Smartling based on the HTML formatting, you can use the directive:
- string_format_paths with the value of HTML:*
This will benefit both your translation resources and your translation memory leverage.
Tip: Strings can also be parsed as Markdown if the string_format_paths directive is set with the value of markdown.
Note: When string_format_paths is set to HTML or Markdown, keys and instructions are not ingested.
If you have given each CSV string a key, note that the additional smaller strings created by HTML parsing will not have keys, although they all share the same variant. In most cases, such as translation Jobs, this is acceptable. It is not acceptable if you are importing translations from a file, because keys are critical for alignment.
Example string on CSV
<p>This <br>is</br> another example source string</p><h1>This is a header <i>maybe</i>?<p>One more for the <strong>road</strong></p>
DEFAULT PARSING | HTML PARSING |
1 string with variant only. Keys must be applied before translations can be imported. | 5 strings with variant only. Keys must be applied before translations can be imported. |
<p> This <br> is</br> another example source string</p> <h1> This is a header <i> maybe</i> ?<p> One more for the <strong> road</strong> </p> | |
Plurals
Smartling currently does not support plural strings in CSV files.
Preparing CSV for Translation
Directives are commands embedded in the file that tell Smartling how to handle its various elements. For example, directives can specify which column contains your content for translation, where to find translator instructions, where any keys or variants are, and whether to produce multilingual output so that all translations are in one file. See examples of these directives below.
When applying directives to files, it's always best to use a text editor to ensure the directives are written in plain text (without formatting). Examples of text editors include SublimeText, TextEdit, TextWrangler, and Notepad++.
Specifying Paths
Some directives require you to specify a path or set of paths to keys or strings in the file. A path in CSV files is simply a column number, such as 1 (column A), 2 (column B) etc.
When declaring a path for a key or string instruction, the key or instruction will be applied to the next translatable string to the right, so you will need to organize your files so that keys and instructions are to the left of translatable strings in each row.
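The left-to-right rule can be modeled with a small sketch. It assumes 1-based column numbers as used in the directives; attach_metadata and its parameters are hypothetical names for illustration and do not represent Smartling's implementation.

```python
def attach_metadata(row, paths, key_paths=(), instruction_paths=()):
    """Pair each translatable cell with the key and instructions to its left.

    Column numbers are 1-based, matching the column numbers used in directives.
    """
    strings = []
    pending_key, pending_instructions = None, []
    for column, value in enumerate(row, start=1):
        if column in key_paths:
            pending_key = value
        elif column in instruction_paths:
            pending_instructions.append(value)
        elif column in paths:
            strings.append(
                {"text": value, "key": pending_key, "instructions": pending_instructions}
            )
            # Metadata is consumed by the first translatable string to its right.
            pending_key, pending_instructions = None, []
    return strings


# Column 1 holds the key, columns 2 and 3 hold instructions, column 4 the string.
row = ["home.title", "Page heading", "Keep it short", "Welcome back"]
result = attach_metadata(row, paths={4}, key_paths={1}, instruction_paths={2, 3})
```

If the key column were placed to the right of the translatable column, pending_key would still be empty when the string is reached, which is why keys and instructions must sit to the left.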
Keys
A key is a unique identifier for a string. By default, keys are not generated automatically in CSV files. To apply a key to a string, a custom key path must be set. Use any column before the source path column to define a key for each string.
Set using the source_key_paths directive.
The value is the column number, e.g.: 1 for column A or 2 for column B.
String Instructions
String instructions help linguists understand the content that they are translating by providing additional helpful information about the content and how it should be treated.
Set using the string_instructions_paths directive.
You cannot set both string_instructions_paths and string_format_paths for HTML at the same time; if you want to use HTML parsing, you will need to add instructions to strings via the Smartling dashboard.
Placeholder Format
If you do not specify a custom placeholder format, even if you specify other directives, Smartling will not convert the following subtext of a string into a placeholder for that string:
- {x}
- {{x}}
- ${x}
- %x%
- %%x%%
- ##x##
- __x__
To successfully convert any subtext of a string into a placeholder, the placeholder_format_custom directive must be used. The value depends on the format of the subtext, or placeholder. For more information and example placeholder formats with matching values, see Placeholders in Resource Files.
Other Information
Values may be written with or without quotation marks. For example:
value1, "Value 2"
To use a double quotation mark inside a quoted value, escape it by doubling it:
"She said ""hello"" to me."
This corresponds to the string: She said "hello" to me.
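The quoting rule above matches the default behavior of Python's standard csv module, which can be used to check how a value will be read or to produce correctly escaped values. This is a sketch with illustrative helper names (parse_line, quote_value), not a Smartling tool.

```python
import csv
import io


def parse_line(line):
    """Parse one CSV record using the default comma and double-quote rules."""
    return next(csv.reader(io.StringIO(line)))


def quote_value(value):
    """Return the value enclosed in quotes, with inner quotes doubled."""
    buffer = io.StringIO()
    csv.writer(buffer, quoting=csv.QUOTE_ALL, lineterminator="\n").writerow([value])
    return buffer.getvalue().rstrip("\n")


plain_and_quoted = parse_line('value1,"Value 2"')
escaped = parse_line('"She said ""hello"" to me."')
quoted = quote_value('She said "hello" to me.')
```

Parsing the escaped record yields the single value She said "hello" to me. and quote_value produces the doubled-quote form shown above.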
For download options and how to open a CSV in Excel, see Translated CSV Files.
Directives
File directives are supported, both inline and via our API. Directives are specified in comments within the files, in the following format:
Inline File Format
# smartling.[directive_name] = [value] or [path]
API Parameter
smartling.[directive_name] = [value]
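The two formats differ only in the leading comment marker. As a rough sketch, the same directive can be rendered either way from a name and a value; the helper names below are hypothetical, and how the parameters are actually sent to the API is outside the scope of this example.

```python
def inline_directive(name, value):
    """Format a directive as an inline comment line for the source file."""
    return f"# smartling.{name} = {value}"


def api_parameter(name, value):
    """Format the same directive as a (name, value) request parameter pair."""
    return (f"smartling.{name}", str(value))


# Illustrative directive values taken from the examples in this article.
directives = {"source_key_paths": 1, "paths": 2, "first_row_header": "TRUE"}

header_lines = [inline_directive(name, value) for name, value in directives.items()]
request_params = dict(api_parameter(name, value) for name, value in directives.items())
```

The entries in header_lines are plain-text lines that can be added to the CSV file in a text editor, while request_params holds the equivalent name and value pairs.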
Here are examples of supported directives for CSV:
Character limits
Directive | character_limit_paths |
Values | column number - e.g.: 1 |
Description |
Applies a character limit to the translation, which will be visible in the string details in the dashboard and the CAT Tool. The directive should point to the column that contains the character limit for each string. Each character limit is applied only to the first translatable column after it, so you must place the character limit column to the left of (or before) the source string. You can apply a character limit for each string by inserting a character limit number alongside each string cell. To remove a character limit that was previously applied to a CSV file, set the number in the character limit column to "none". Simply removing the limit number from the column will not remove the limit. |
Example | # smartling.character_limit_paths = 1 |
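Before uploading or after downloading, it can be useful to check text against the limits in the file. The sketch below assumes the limit is in column 1 and the source string in column 2, and counts characters with Python's len, which may not match how Smartling counts them. The function over_limit and the sample data are illustrative.

```python
def over_limit(rows, translations, limit_column=1, source_column=2):
    """Return (source, translation, limit) for translations that are too long."""
    problems = []
    for row in rows:
        limit_text = row[limit_column - 1].strip()
        source = row[source_column - 1]
        # In this sketch, "none" or an empty cell means no limit applies.
        if not limit_text or limit_text.lower() == "none":
            continue
        limit = int(limit_text)
        translation = translations.get(source, "")
        if len(translation) > limit:
            problems.append((source, translation, limit))
    return problems


rows = [["10", "Save"], ["none", "Cancel changes"], ["5", "Delete"]]
translations = {
    "Save": "Speichern",
    "Cancel changes": "Änderungen verwerfen",
    "Delete": "Löschen",
}
problems = over_limit(rows, translations)
```

Only the translation of Delete is reported here, because its 7 characters exceed the limit of 5, while the row marked none is skipped.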
String Instructions
Directive | string_instructions_paths |
Values | Comma-separated list of columns. |
Description |
Specifies which columns contain string instructions. This directive must be used together with smartling.paths to specify translatable strings. Each string instruction is applied to the next translatable string, so you must place your instruction column to the left of (or before) the source string. You may have more than one instruction column per translatable string. Note: When string_format_paths is set to HTML or Markdown, keys and instructions are not ingested. |
Example |
Smartling will capture the content in the files as follows. Column 1 will be captured as key metadata, Columns 2 and 3 will be string instructions. Column 4 contains the translatable strings. # smartling.string_instructions_paths=2 |
Tip: Character limits and string instructions must precede source content.
Source Content
Directive | paths |
Values | The values of all columns to be captured as strings. |
Description | Defines the column numbers with values to be captured as translatable strings. For multilingual translations import, it defines a column and locale. |
Format |
For uploading original file: For multi-language imports: |
Example |
# smartling.paths=2,3 Specifies that columns 2 and 3 of the uploaded CSV file should be ingested as translatable strings. # smartling.paths=2/es-ES,3/fr-FR When importing translations, specifies that column 2 contains Spanish-SPAIN translations and column 3 contains French-FRANCE translations. |
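The two value shapes shown in the examples (plain column numbers, and column/locale pairs for imports) can be told apart with a few lines of code. This is a sketch of how such a value might be interpreted; parse_paths is a hypothetical helper and not part of any Smartling library.

```python
def parse_paths(value):
    """Parse a smartling.paths value into (column, locale) pairs.

    The locale is None when the item is a plain column number.
    """
    parsed = []
    for item in value.split(","):
        column, _, locale = item.strip().partition("/")
        parsed.append((int(column), locale or None))
    return parsed


upload_paths = parse_paths("2,3")
import_paths = parse_paths("2/es-ES,3/fr-FR")
```

For the upload form, every locale is None. For the import form, each column is paired with the locale of the translations it contains.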
String Keys
Directive |
source_key_paths |
Values |
A comma-separated list of paths used to create “keys” for strings on translate_paths. The key will be a space-separated string of all the keys leading to the source string. For example: “string”, “group1 string”. |
Description |
Used to define the schema for capturing a key for each source string. Keys are required if you want to import pre-existing translations from a file with the same structure. Creating or updating variants for previously uploaded strings causes new strings to be created that will not have translations. The SmartMatch feature can be configured to automatically apply the existing translations, or translators can use the 100% match from the translation memory to manually apply the translation. Specify the full path to the value, then indicate which part of the path should be used as the key using {} notation. |
Example |
# smartling.source_key_paths = 1 Smartling will capture data from column 1 as keys. Each key will be applied to the next translatable string after it, so keys need to be placed to the left of translatable strings in each row for this directive to work. |
Standard Placeholder Format
Directive |
placeholder_format |
Values | NONE; C; IOS; PYTHON; JAVA; YAML; QT; RESX |
Smartling Translate Supported | Yes |
Description | Used to specify a standard placeholder format. |
Example | # smartling.placeholder_format = IOS Specifies iOS-style placeholders for the file. |
Custom Placeholder Format
Directive | placeholder_format_custom |
Values | 1) Custom Java regular expression. 2) NONE - disables any current custom placeholders |
Smartling Translate Supported | Yes |
Description | Specifies a custom placeholder format. Any text in your file matching the regular expression you provide will be captured as a placeholder. |
Example |
# smartling.placeholder_format_custom = REGEX # smartling.placeholder_format_custom=\{([^}]+)\} Any characters surrounded by curly brackets, e.g., {first name}, will be treated as a placeholder. |
See Placeholders in Resource Files for more on placeholders.
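A custom expression can be tried against sample strings before it is added to a file. The directive expects a Java regular expression; the curly-bracket pattern from the example above uses only syntax that behaves the same way in Python's re module, so it is tested here with Python as a convenience. The helper name find_placeholders is illustrative.

```python
import re

# The pattern from the example: any characters surrounded by curly brackets.
PLACEHOLDER = re.compile(r"\{([^}]+)\}")


def find_placeholders(text):
    """Return every substring that the pattern would capture as a placeholder."""
    return [match.group(0) for match in PLACEHOLDER.finditer(text)]


found = find_placeholders("Hello {first name}, you have {count} messages")
none_found = find_placeholders("Empty braces {} are not matched")
```

Patterns that rely on Java-specific syntax should be verified with a Java regular expression tester instead.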
First Row Is a Header
Directive | first_row_header |
Values | true / TRUE or false / FALSE (default) |
Smartling Translate Supported | Yes |
Description | If TRUE, the first non-empty string in a CSV file will be treated as a header and excluded from translation. |
Example | # smartling.first_row_header=TRUE |
Unescape HTML5
Directive |
entity_escaping_type |
Values |
html4 (default)|html5 (case-insensitive) |
Smartling Translate Supported | Yes |
Description |
By default, all html4 entities are unescaped, except the basic set: < > & ". When this directive is set to html5, all html5 entities will be unescaped as well. If you choose to set this directive to html5, you must also use the entity_escaping_strategy=propagate directive. |
Example |
# smartling.entity_escaping_type=HTML5 |
Escape Characters the Same as Source
Directive | entity_escaping_strategy |
Values | propagate | none |
Smartling Translate Supported | Yes |
Description |
Used to retain entity escaping for all non-base entities. For example, normally we turn &copy; into ©, but with this directive the translation will automatically use the escaping from the source. For each entity character, we check whether it was escaped in the source and try to match (propagate) it in the target. The default is none, which is the current behavior and recognizes HTML4 entities only. If HTML5 entities are required as well, you must also use the entity_escaping_type=html5 directive.
This directive can be placed inline, in the API or in a template (consult your Customer Success Manager about configuring directive templates). This does not affect source content at all - so using it will not result in new strings. Numerical entities are not considered at all with this directive, and are treated normally. |
Example |
#smartling.entity_escaping_strategy=propagate |
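The idea of propagating escaping from source to target can be sketched as follows. This is only a model of the behavior described above, limited to three named entities chosen for illustration; the function propagate_escaping is hypothetical and Smartling's actual logic may differ.

```python
import html


def propagate_escaping(source, translation, entities=("&copy;", "&reg;", "&trade;")):
    """Re-escape characters in the translation that were escaped in the source."""
    for entity in entities:
        character = html.unescape(entity)  # for example "&copy;" becomes "©"
        if entity in source:
            translation = translation.replace(character, entity)
    return translation


# The source used the escaped form, so the target is matched to it.
escaped_target = propagate_escaping("&copy; Acme", "© Acme, tous droits réservés")

# The source used the literal character, so the target is left unchanged.
literal_target = propagate_escaping("© Acme", "© Acme, tous droits réservés")
```

Numerical entities are deliberately left out of this sketch, in line with the note that they are not considered by the directive.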
Pseudo Translation
Directive | pseudo_inflation |
Values | integer - Accepted values are 0 - 100 |
Description |
Sets the percentage by which original strings are inflated when downloading pseudo translations. If this directive is not set, pseudo translations are 30 percent longer than the original strings.
Example |
# smartling.pseudo_inflation = 80 Downloaded pseudo translations will increase the length of original strings by 80 percent. |
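To estimate how long a pseudo translation will be for a given setting, the percentage can be applied to the source length. The sketch below assumes the result is rounded up to a whole character, which is an assumption and not documented behavior; pseudo_length is an illustrative name.

```python
def pseudo_length(source, inflation_percent=30):
    """Estimate the length of a pseudo translation for a given inflation value."""
    if not 0 <= inflation_percent <= 100:
        raise ValueError("inflation_percent must be between 0 and 100")
    # Integer arithmetic that rounds up, avoiding floating-point surprises.
    return (len(source) * (100 + inflation_percent) + 99) // 100


default_length = pseudo_length("Save changes")        # 12 characters at 30 percent
inflated_length = pseudo_length("Save changes", 80)   # 12 characters at 80 percent
```

A 12-character string is estimated at 16 characters with the default setting and 22 characters with an inflation of 80.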
Parse as HTML or Markdown
Directive |
string_format_paths |
Values | HTML | markdown |
Description |
When set to HTML, the strings in the file are parsed as HTML. When set to markdown, the strings in the file are parsed as markdown. Note: When string_format_paths is set to HTML or Markdown, keys and instructions are not ingested. |
Example |
# smartling.string_format_paths=HTML:* |
Alternative Character as Field Separator
Directive | field_separator |
Values | String of characters. The default is the comma ",". |
Smartling Translate Supported | Yes |
Description | Defines the sequence of characters that separate values in a record line. |
Example |
# smartling.field_separator=, Fields are separated with a , character. |
Alternative Character as String Encloser
Directive | string_encloser |
Values |
String of characters. The default is the double quotation marks, e.g.: "example string". |
Smartling Translate Supported | Yes |
Description | Defines the sequence of characters that may enclose values. To use the character sequence inside values, you should escape it by repeating it twice (default is ""). |
Example |
# smartling.string_encloser=* String literals are enclosed in * characters |
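The effect of a custom separator and encloser can be previewed locally with Python's csv module. Note one limitation of this sketch: Python only accepts a single character for each setting, whereas the directives accept a string of characters. The function parse_with is an illustrative helper.

```python
import csv
import io


def parse_with(text, field_separator=",", string_encloser='"'):
    """Parse CSV text with a custom single-character separator and encloser."""
    reader = csv.reader(
        io.StringIO(text), delimiter=field_separator, quotechar=string_encloser
    )
    return list(reader)


# Semicolon as the separator and * as the encloser.
rows = parse_with("*Hello; world*;*second*\n", field_separator=";", string_encloser="*")

# The encloser is escaped inside a value by repeating it twice.
escaped_rows = parse_with("*5 ** 2*;x\n", field_separator=";", string_encloser="*")
```

The separator inside the enclosed value is kept as part of the string, and the doubled ** is read as a single * character.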
Remove Directives from Translated Files
Directive | strip_instructions_on_download |
Values | true / TRUE or false / FALSE (default) |
Smartling Translate Supported | Yes |
Description | Defines whether all Smartling directives in the source file should be removed from translated files when downloaded. |
Example | # smartling.strip_instructions_on_download=TRUE |
Translation Locale Identifier (map)
Directive | locales_map |
Values | Alternative labels for Smartling locales in JSON format. |
Smartling Translate Supported | Yes |
Description | Defines how languages are labeled in downloaded CSV files. The default label is the Smartling locale code, such as “fr-FR”, but you may wish to choose a different label, such as “French” to make the file easier to read or to match the labels used in your application. |
Example |
# smartling.locales_map={"es-ES":"Spanish","de-DE":"German"} Downloaded translations will be labeled as Spanish for es-ES and German for de-DE. |
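Because the value is JSON, generating it with a JSON library avoids quoting mistakes when there are many locales. A minimal sketch, where locales_map_directive is an illustrative helper name:

```python
import json


def locales_map_directive(labels):
    """Build the inline locales_map directive from a locale-to-label mapping."""
    # Compact separators keep the directive on one short line;
    # ensure_ascii=False keeps labels such as "Español" readable.
    value = json.dumps(labels, separators=(",", ":"), ensure_ascii=False)
    return f"# smartling.locales_map={value}"


directive = locales_map_directive({"es-ES": "Spanish", "de-DE": "German"})
```

The resulting line is identical to the directive in the example above.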
UTF-8 Byte Order Mark (BOM)
Directive | add_utf8_bom |
Values | true / TRUE or false / FALSE (default) |
Smartling Translate Supported | Yes |
Description |
Determines whether to force the addition of a UTF-8 Byte Order Mark (BOM) to the output file when downloading translations. If set to FALSE (default), output files will only include a BOM if the original file did. If set to TRUE, a UTF-8 BOM will be added to the output file, even if none existed in the original file. Note: This applies to UTF-8 only. For UTF-16, BOM is always used. |
Example | # smartling.add_utf8_bom=TRUE |
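To check whether a source file already starts with a BOM, or to reason about what the download will contain, the rule in the description can be expressed in a few lines. This is a model of the documented rule only, with hypothetical function names.

```python
import codecs


def has_utf8_bom(data):
    """Return True if the bytes start with the UTF-8 Byte Order Mark."""
    return data.startswith(codecs.BOM_UTF8)


def output_bytes(translated_text, source_bytes, add_utf8_bom=False):
    """Apply the documented rule: add a BOM if forced or if the source had one."""
    body = translated_text.encode("utf-8")
    if add_utf8_bom or has_utf8_bom(source_bytes):
        return codecs.BOM_UTF8 + body
    return body


source_without_bom = "Hello".encode("utf-8")
source_with_bom = codecs.BOM_UTF8 + source_without_bom

kept_plain = output_bytes("Hallo", source_without_bom)
forced_bom = output_bytes("Hallo", source_without_bom, add_utf8_bom=True)
inherited_bom = output_bytes("Hallo", source_with_bom)
```

Python's utf-8-sig codec can be used when reading such files so that the BOM is not returned as part of the first cell.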
Include Source in Translated Files
Directive | output_original_row |
Values | true / TRUE or false / FALSE (default) |
Smartling Translate Supported | Yes |
Description | Defines if the original source strings should be included in the translated file when downloading multiple languages. |
Examples | # smartling.output_original_row=TRUE |
Translations in Rows
Directive | translation_language_path |
Values | Column Number - for example: 4 |
Smartling Translate Supported | Yes |
Description |
Allows you to include all translations in one file, where the translations are in rows, by defining the column to record the language for each row. Output will display a language code for each column, e.g. de, en, es, etc. This column should exist in the original file as an empty column. If using this directive, you should also include this column in the #smartling.paths to ensure it is not overwritten by translations. |
Example |
# smartling.translation_language_path = 4 When the translated file is downloaded, column 4 will record the language for each row. |
Translations in Columns
Directive | translations_in_columns |
Values | true / TRUE or false / FALSE (default) |
Smartling Translate Supported | Yes |
Description | Allows you to include all translations in one file, a locale per column. |
Example |
# smartling.translations_in_columns=TRUE |
Trim Whitespace
Directive |
whitespace_trim |
Values |
on|yes|true or off|no|false or leading|trailing The default value is on. |
Smartling Translate Supported | Yes |
Description |
A whitespace is any character or series of characters that represent horizontal or vertical space in typography. When rendered, a whitespace character is not a visible mark, but does occupy an area or space on a page. Although whitespaces are necessary within a string (typically to separate words), unnecessary whitespaces can be found at the start of a string (leading) and at the end of a string (trailing). With this directive, you can trim whitespaces, as it enables or disables whitespace trim management for the ingested strings. Whitespace is optionally trimmed from content then re-inserted on download for convenience so that translators do not have to manage the extra spaces. However, content owners may want to retain surrounding whitespace so that translators can By default, the leading and trailing whitespaces are trimmed. You can choose to disable trimming or specify trimming for leading or trailing whitespaces. The directive can only be used as the API request parameter. |
Example |
#smartling.whitespace_trim=on Smartling will trim leading and trailing whitespaces (default) #smartling.whitespace_trim=off Smartling will not trim leading or trailing whitespaces #smartling.whitespace_trim=leading Smartling will trim only leading whitespaces #smartling.whitespace_trim=trailing Smartling will trim only trailing whitespaces |
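The trim-then-restore behavior described above can be sketched as two small functions. This models the description only, treating whitespace as whatever Python's str.strip removes; trim_for_translation and restore are hypothetical names.

```python
def trim_for_translation(text, mode="on"):
    """Split text into (leading, core, trailing) according to the trim mode."""
    mode = mode.lower()
    trim_leading = mode in {"on", "yes", "true", "leading"}
    trim_trailing = mode in {"on", "yes", "true", "trailing"}
    if not (trim_leading or trim_trailing) and mode not in {"off", "no", "false"}:
        raise ValueError(f"unsupported whitespace_trim value: {mode}")
    leading = trailing = ""
    core = text
    if trim_leading:
        stripped = core.lstrip()
        leading = core[: len(core) - len(stripped)]
        core = stripped
    if trim_trailing:
        stripped = core.rstrip()
        trailing = core[len(stripped):]
        core = stripped
    return leading, core, trailing


def restore(leading, translation, trailing):
    """Re-insert the trimmed whitespace around the translated text."""
    return leading + translation + trailing


both = trim_for_translation("  Hello \n")
only_leading = trim_for_translation("  Hello \n", mode="leading")
untouched = trim_for_translation("  Hello \n", mode="off")
round_trip = restore(both[0], "Hallo", both[2])
```

With the default mode the translator sees only Hello, and the surrounding whitespace is put back around the translation afterwards.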
Need a template to start off with? Download this example source file. This template is designed to help you get started with a typical use case, but it's likely that you will need to adjust the file and add or remove directives to align with your needs.
Steps to Translating CSV Files
- Create a Files Project for file translation management.
- Once you're ready to translate the file, create a Job. All content in the file will be ingested for translation.
- By default, no Visual Context will be displayed to the Translator from within the CAT Tool. If you would like to provide Translators with Visual Context, upload an image of the content's message.
- If you haven't provided string instructions from within the file using the directive above, you can also provide instructions from the dashboard to provide context.
- By attaching a JPG or PDF document for reference, the Translators can download the attachment in the CAT Tool.
- If you haven't applied character limits to the content from within the file using the directives above, applying character limits to strings in the dashboard can help ensure translations are kept to a certain length.
- When translations are complete, download the published translations to your local drive.
Translated CSV Files
If you are using a program that opens files as Unicode (UTF-8) by default, open the file as you normally would. In most cases, Microsoft Excel will be your default program for spreadsheet files. Simply double-clicking your downloaded CSV could open the translated file in Excel with multiple corrupt characters. This is because Excel does not read CSV files as Unicode by default.
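The corrupt characters are what UTF-8 bytes look like when they are read with a legacy single-byte encoding. The sketch below reproduces the effect using Windows-1252 as an example; which legacy encoding a given Excel installation uses depends on the system, so cp1252 is an assumption here.

```python
def misread_as_cp1252(text):
    """Show how UTF-8 text appears when decoded as Windows-1252."""
    # errors="replace" covers the few byte values that cp1252 leaves undefined.
    return text.encode("utf-8").decode("cp1252", errors="replace")


corrupted = misread_as_cp1252("café")
unaffected = misread_as_cp1252("plain ASCII")
```

Text that contains only ASCII characters is unaffected, which is why the problem often goes unnoticed until a translated file is opened.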
How to open a CSV file safely in Excel
After you have downloaded translations from Smartling:
- Open a new blank workbook in Excel (separately)
- In Excel, go to the Data tab, click From Text. Choose the translated CSV file from your local drive and click Get Data
- Depending on your version of Excel, there is a series of steps to follow in the Import Wizard. Ensure Delimited is selected
- In the File Origin dropdown, scroll down and choose Unicode (UTF-8) > Next
- Ensure the Delimiters are set to Comma > Next
- Ensure the Column data format is Text > Finish
- Choose where you want the data > OK
- The CSV should have imported successfully. If you find corrupt characters in the file that are not visible in the Smartling Dashboard, refer to the Microsoft Excel documentation.