If GPT (OpenAI) or Google Gemini (Vertex AI) is configured as your translation provider, Smartling allows you to augment your translation prompt with contextual information extracted from your linguistic assets. Using Retrieval-Augmented Generation (RAG), Smartling automatically injects customized translation data into your translation prompt, which helps the LLM understand your translation preferences and produce a more tailored translation output.
Benefits of prompt tooling with RAG
To achieve customized LLM translations that correctly reflect your brand voice and terminology, it is important to provide examples of desirable translations to the LLM. Compared to “zero-shot prompts” (i.e. prompts that don’t include any examples), translation prompts that include examples of good translations as contextual information (“few-shot prompts”) typically yield better results.
Historically, to achieve few-shot prompting, any contextual information (such as examples of desirable translations or preferred terminology) had to be included manually in the translation prompt itself.
By using RAG technology (Retrieval-Augmented Generation), Smartling allows you to inject contextual information into your LLM translation prompt in a fully automated way.
How it works
RAG (Retrieval-Augmented Generation) enhances the output produced by LLMs by connecting them to external knowledge bases. This added data source helps the LLM provide more customized and contextually accurate responses.
In Smartling, RAG is used to send information about your translation preferences to your LLM translation provider. By providing fully customized translation data and examples, RAG increases the likelihood of achieving a translation output that adheres to your organization's preferences and brand voice.
Glossary terms, example translations from your translation memory, as well as locale-specific Automated Style Guide rules, are automatically identified by Smartling and do not need to be included manually in the translation prompt. The relevant translation data from your linguistic assets is sent to the LLM along with your translation prompt.
Note: Prompt tooling with RAG only takes into account the linguistic assets which are part of the Linguistic Package associated with the project that is used for LLM translation.
Based on the final "few-shot" output prompt, consisting of your translation instructions, as well as the glossary terms, translation examples and Automated Style Guide rules injected by Smartling, the LLM then provides a translation that is typically more consistent with your brand tone and voice.
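The assembly of the final few-shot prompt described above can be pictured with a minimal sketch. Smartling does not document its internal prompt format, so the `PromptAssets` container, the `build_few_shot_prompt` function, and the section labels used here are all hypothetical and serve only to illustrate how instructions and injected assets might be combined.

```python
from dataclasses import dataclass, field


@dataclass
class PromptAssets:
    """Hypothetical container for the assets retrieved for one source string."""
    glossary_terms: dict = field(default_factory=dict)  # source term -> target term
    tm_examples: list = field(default_factory=list)     # (source, translation) pairs
    style_rules: list = field(default_factory=list)     # plain-text locale rules


def build_few_shot_prompt(instructions: str, source_text: str, assets: PromptAssets) -> str:
    """Combine the user's instructions with the retrieved assets into one prompt."""
    parts = [instructions.strip()]
    if assets.glossary_terms:
        lines = [f'- "{src}" -> "{tgt}"' for src, tgt in assets.glossary_terms.items()]
        parts.append("Glossary terms:\n" + "\n".join(lines))
    if assets.tm_examples:
        lines = [f"Source: {src}\nTranslation: {tgt}" for src, tgt in assets.tm_examples]
        parts.append("Example translations:\n" + "\n\n".join(lines))
    if assets.style_rules:
        parts.append("Style rules:\n" + "\n".join(f"- {rule}" for rule in assets.style_rules))
    # The text to translate always comes last, after any injected context.
    parts.append(f"Text to translate:\n{source_text}")
    return "\n\n".join(parts)
```

With an empty `PromptAssets()`, the function returns only the instructions and the source text, which corresponds to a zero-shot prompt. Each populated asset type adds one block of context, turning the same instructions into a few-shot prompt.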
Supported providers and workflows
Prompt tooling with RAG is only available for translations with GPT (OpenAI) or Google Gemini (Vertex AI).
Other providers, including GPT (Azure) and Google Translation LLM, currently don't support prompt tooling with RAG.
Note: Since this feature requires access to the assets configured in your project's Linguistic Package, it only takes effect in a translation workflow and for MT suggestions in the CAT Tool.
Prompt tooling with RAG is not supported for translations with Smartling's MT API, nor for Smartling's instant MT integrations (such as Smartling Translate, site-wide MT with the Global Delivery Network, the Zendesk Support Plugin, or dynamic translation with the ServiceNow Connector).
Prerequisites
Reliable linguistic assets
To achieve good results, your translation memories and glossaries should be up-to-date and of good quality. Since these assets are used as a reference, they should correctly reflect your preferred translation style, tone and terminology for each project where LLM translation will be used.
- Translation memory (TM): When used as an asset reference, your TM can be one of the most impactful factors leading to a higher output quality.
  - Any TM that is referenced for LLM translations should contain high-quality human translations that are reflective of the content type that will be translated with the LLM.
- Glossary: When used for prompt tooling with RAG, a well-maintained glossary helps preserve important brand terminology in LLM translations.
  - Your glossary should only contain relevant brand terminology (no common terms).
  - Ensure that your glossary terms use correct capitalization. Terms that are not supposed to be capitalized in LLM translations should not be capitalized in the glossary.
  - Avoid polysemous or homonymous terms (i.e. terms that could have more than one meaning).
  - Enabling AI-enhanced glossary term insertion for your translation workflow can further increase the probability of your glossary terms being inserted correctly.
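Some of the glossary recommendations above can be checked before the glossary is used as a reference. The following is a minimal sketch, assuming the glossary has been exported as a list of `(source_term, target_term)` pairs. The `lint_glossary` function and its warning texts are hypothetical and not part of Smartling. It only flags source terms that carry more than one translation (a possible sign of ambiguity) and terms that appear with inconsistent capitalization. It cannot judge whether a term is too common or whether its capitalization is actually intended.

```python
from collections import defaultdict


def lint_glossary(entries):
    """Return warning strings for a list of (source_term, target_term) pairs."""
    targets_by_term = defaultdict(set)  # lowercased source term -> distinct targets
    spellings = defaultdict(set)        # lowercased source term -> spellings seen
    for source, target in entries:
        key = source.lower()
        targets_by_term[key].add(target)
        spellings[key].add(source)

    warnings = []
    for term in sorted(targets_by_term):
        if len(targets_by_term[term]) > 1:
            warnings.append(f"'{term}' has several translations and may be ambiguous")
        if len(spellings[term]) > 1:
            forms = ", ".join(sorted(spellings[term]))
            warnings.append(f"'{term}' appears with inconsistent capitalization: {forms}")
    return warnings
```

An empty result does not mean that the glossary is well maintained. It only means that neither of these two mechanical checks found an issue.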
Enabling prompt tooling with RAG
Prompt tooling with RAG is configured directly in the LLM Profile (i.e. in the Translation Profile where an LLM is used as the translation provider).
- Follow the regular process of setting up a Large Language Model as your translation provider in Smartling.
- Either create an LLM Profile using your own Provider Credentials, or customize the LLM Profile created for you by the Smartling team.
- On the Configuration Details screen, enter your provider details and optionally adjust any available token details and translation parameters to your preferences.
- Click Next to configure your translation prompt.
- Under Assets References, select the linguistic assets that you would like Smartling to inject into your translation prompt, by ticking the desired checkbox(es).
  - Translation Memory Examples: Matching example translations from your translation memory are sent to the LLM to provide context around your translation preferences, and what is deemed a "good" translation.
  - Glossary Terms: If glossary terms are detected in the source text, they will be sent to the LLM, along with the corresponding glossary term in the target locale or "Do Not Translate" (DNT) annotations.
  - Automated Style Guides: Smartling automatically applies locale-specific style rules to capture linguistic conventions and nuances for each target locale.
- Write a translation prompt that reflects your overall translation preferences (e.g. the intended audience). The translation prompt does not need to be written as a few-shot prompt. This means that it is not necessary to include examples of good translations or glossary terms.
  However, you may still want to add negative examples (i.e. examples of how not to translate your content) to your prompt, as these are not injected by RAG.
- Test the translation prompt.
  - Select the source and target locale you would like to get a test translation for.
  - From the dropdown menu, select which Linguistic Package should be applied to test the prompt augmentation.
    - The Linguistic Package selected in the LLM Profile is only relevant for testing the translation prompt. When prompt tooling with RAG is used in a translation workflow or in the CAT Tool, only the linguistic assets included in the project's Linguistic Package will be applied.
- Set up a translation workflow where the LLM translation profile is used as the translation provider, or use the LLM translation profile to provide MT suggestions in the CAT Tool.
- Authorize content for translation with the relevant translation workflow. The linguistic assets which are included in the project's Linguistic Package will automatically be used to augment the translation prompt.
Adding Assets References to an existing translation prompt
Existing translation prompts that do not currently use prompt tooling with RAG remain unaltered. To inject your linguistic assets into an existing LLM translation prompt, edit your LLM Profile:
- Access the LLM Profile where an LLM is configured as the translation provider.
- Click on the profile name to edit the prompt details.
- Tick the checkboxes for your desired Assets References.
- Click “Save” to confirm.
Tip: If your existing prompt already contains specific translation memory examples or glossary terms, they should be removed from the prompt to avoid sending conflicting information to the LLM, and to reduce the length and complexity of the prompt.
You may still want to include general translation examples to provide an indication of the desired overall style and tone, or examples of unwanted translations to indicate which type of translations should be avoided.
Assets References for prompt tooling with RAG
Based on the checkboxes ticked in your LLM Profile, one or more of the following assets will be used to augment your translation prompt.
Translation Memory Examples
For each string, Smartling queries your translation memory to determine if any translation memory matches are available. Up to 10 available matches will be sent to the LLM as examples of your preferred tone of voice, terminology, formality register, and industry-specific language.
The LLM may use the TM matches as a reference to apply your company's stylistic preferences. Please note that the TM matches are used as examples only; they are not inserted directly in place of the LLM translation.
Info: Only the translation memories included in the project's Leverage Configuration will be taken into consideration.
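The retrieval step can be sketched as follows. Smartling does not describe how it scores TM matches, so this example assumes a simple character-level similarity from Python's standard `difflib` module. The `select_tm_examples` function and the `min_score` threshold are hypothetical. Only the cap of 10 examples comes from the description above.

```python
from difflib import SequenceMatcher

MAX_EXAMPLES = 10  # the article states that up to 10 matches are sent to the LLM


def select_tm_examples(source_text, tm_entries, min_score=0.6, limit=MAX_EXAMPLES):
    """Return up to `limit` (source, target) pairs most similar to source_text.

    `tm_entries` is an iterable of (source, target) pairs. The similarity
    measure and `min_score` are assumptions made for illustration only.
    """
    scored = []
    for tm_source, tm_target in tm_entries:
        score = SequenceMatcher(None, source_text.lower(), tm_source.lower()).ratio()
        if score >= min_score:
            scored.append((score, tm_source, tm_target))
    # Best matches first; Python's sort is stable, so ties keep TM order.
    scored.sort(key=lambda item: item[0], reverse=True)
    return [(src, tgt) for _, src, tgt in scored[:limit]]
```

The selected pairs are passed to the LLM as examples only. In line with the note above, nothing in this sketch substitutes a TM match for the translation itself.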
Glossary Terms
If any glossary terms are detected in the source text, Smartling will send both the source term and the term in the target language to the LLM, alongside the text to translate, so they can be applied in the LLM translation output.
Info: Only the glossary/glossaries associated with the project's Linguistic Package will be taken into consideration.
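Glossary term detection can be pictured with a minimal sketch. The actual matching logic used by Smartling is not documented here, so the whole-word, case-insensitive matching below is an assumption, and `find_glossary_terms` is a hypothetical helper. In this sketch, a target of `None` represents a "Do Not Translate" (DNT) term.

```python
import re


def find_glossary_terms(source_text, glossary):
    """Return the glossary entries whose source term occurs in source_text.

    `glossary` maps a source term to its target term, or to None for
    "Do Not Translate" (DNT) terms. Matching is whole-word and
    case-insensitive, which is an assumption made for this sketch.
    """
    hits = {}
    for term, target in glossary.items():
        # Lookarounds prevent matches inside longer words ("cat" in "categorize").
        pattern = r"(?<!\w)" + re.escape(term) + r"(?!\w)"
        if re.search(pattern, source_text, flags=re.IGNORECASE):
            hits[term] = "DNT" if target is None else target
    return hits
```

Only the detected entries would be sent along with the text to translate, which keeps the prompt limited to terminology that is relevant for the current string.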
Automated Style Guides
An Automated Style Guide is an automatically generated collection of locale style rules. These are linguistic conventions that capture locale-specific nuances, such as the correct punctuation rules or date format, for each target locale.
If the checkbox for Automated Style Guides under “Assets References” is ticked, Smartling applies automatically generated locale style rules for all target locales that the LLM Profile is used to translate into. Automated Style Guide rules include information on the following locale-specific conventions:
- Abbreviations and acronyms: Refers to any use of shortened words or letters that stand for something.
  For example: ASAP, Dr., NASA.
- Addresses, phone numbers and personal information: Refers to any use of phone numbers (e.g., +1 123-456-7890, 123 456 7890, 123-456-789) and addresses (e.g., 51 street, 51 st, 51 ln) in the context of a translatable string, as well as names, public figures and mentions of specific people.
  This rule does not apply to a collection of numbers outside of the context of a translatable string.
- Capitalization: Refers to the use of unusual capitalization.
  For example: "WeLcOmE." or "WELCOME.", instead of "Welcome."
- Currencies: Refers to any use of currency.
  For example: USD or $.
- Dates: Refers to the use of words or abbreviations in the translatable text that indicate a date, day of week, month or year.
  For example: "Jan 5", "2025", "5/8/2025", "5.8.2025".
- Formality: If a style guide using the Smartling template has been set up, this rule checks for the desired formality register: formal or informal.
  - In your style guide, the preferred formality register is specified under "Use of the second person pronoun ('you') should be: Formal or informal".
  - For languages where this is not applicable, a single formality register is used.
- Numbers: Refers to any use of numeric values (excluding dates, currency, time, units).
  For example: "123", "1,000".
- Punctuation: Defines the use of punctuation.
  - Period: Defines the use of full stops.
    For example: "This is a sentence."
  - Comma: Defines the use of commas.
    For example: "apples, oranges, and bananas"
  - Question marks and exclamation marks: Defines the use of "!" or "?".
    For example: "What’s happening?"
  - Quotes: Defines the use of all types of quotes (single, double, curly, etc).
    For example: "Hello", ‘Hello’, «Bonjour»
  - Other: Defines the use of any other punctuation marks (dashes –, colons, semicolons, parentheses, brackets, etc.).
- Symbols: Refers to any use of symbols.
  For example: @, #, %, &, *, ^, +, =, -, _, |, /, ~, <, >, $.
- Time: Defines the use of time formats.
  For example: 12:30, 18:15, 06:15, 12h, 12 hours, 09:15:23 am.
- Units of measurement: Refers to any use of units of measurement.
  For example: 20 cm, 20in, 20 inches, 20°C.
In addition to linguistic conventions for each target locale, Smartling’s expert team can implement specific style rules to capture your organization’s translation preferences. This is typically done if specific issues have occurred in your LLM translations that may need to be addressed by providing information about your company's industry and domain, your audience and preferred tone, etc.
Expected results
The Large Language Model will provide a translation output based on your prompt, as well as the injected assets. On average, the use of prompt tooling with RAG is expected to produce a more customized translation result that requires fewer edits.
While the quality of the translation output is dependent on the LLM, as well as the quality of your linguistic assets and translation prompt, a significant reduction in the overall TER (Translation Edit Rate) is expected when prompt tooling with RAG is enabled.
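To make the TER metric more concrete: TER is commonly defined as the number of edits needed to turn the machine output into the reference translation, divided by the number of words in the reference. The sketch below is a simplified approximation that counts only word-level insertions, deletions and substitutions. The full TER metric also counts block shifts, and this is not necessarily how Smartling calculates the figure it reports. The function names are hypothetical.

```python
def word_edit_distance(hypothesis, reference):
    """Word-level Levenshtein distance between two whitespace-tokenized strings."""
    hyp, ref = hypothesis.split(), reference.split()
    prev = list(range(len(ref) + 1))  # distance from an empty hypothesis
    for i, hyp_word in enumerate(hyp, start=1):
        curr = [i]
        for j, ref_word in enumerate(ref, start=1):
            cost = 0 if hyp_word == ref_word else 1
            curr.append(min(prev[j] + 1,          # delete a hypothesis word
                            curr[j - 1] + 1,      # insert a reference word
                            prev[j - 1] + cost))  # keep or substitute
        prev = curr
    return prev[-1]


def simplified_ter(llm_output, post_edited):
    """Edits per reference word; lower values mean fewer edits were needed."""
    ref_len = len(post_edited.split())
    if ref_len == 0:
        raise ValueError("the post-edited reference must contain at least one word")
    return word_edit_distance(llm_output, post_edited) / ref_len
```

Comparing the average value over the same content with and without prompt tooling with RAG gives a rough indication of how much post-editing effort the injected assets save.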