GPT can be used as a translation provider in Smartling.
To get started, you first need to obtain an API key for GPT from either OpenAI or Azure. Then you will store these credentials in Smartling.
Your provider credentials can then be used to create a Translation Profile, which is where you can configure your translation prompt and adjust various parameters for further customization.
Once the LLM Profile is created, GPT can be used as part of a translation workflow, for translation suggestions in the CAT Tool, or with any of Smartling’s instant MT integrations.
Supported models
All GPT models. You will need to bring your own API key.
Supported languages
GPT does not have a fixed list of supported languages; it will attempt to generate a translation for any language you prompt it with. High-resource languages are well supported, while low-resource languages may result in poor-quality translations.
Prerequisites
An API key for GPT from OpenAI or Azure is required.
Limitations
Token limits
GPT models use tokenizers, which break text down into units called tokens. Token usage counts both input and output tokens: input tokens are everything fed into the model, including the prompt and the source string, while output tokens are the tokens returned by the model.
There is no universal tokenization model, which means that the text could be broken down based on words, characters or character sequences. Therefore, the number of translated words in Smartling will likely differ from the token usage.
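As a toy illustration of why token counts and word counts diverge, the sketch below splits each word into fixed-size character chunks, mimicking how a subword tokenizer can produce more tokens than words. This is not GPT's actual tokenizer (real GPT models use byte-pair encoding, and `chunk_size` here is purely illustrative):

```python
def toy_tokenize(text: str, chunk_size: int = 4) -> list[str]:
    """Naive stand-in for a subword tokenizer: split each word into
    fixed-size character chunks. Real GPT tokenizers use byte-pair
    encoding, so actual counts will differ."""
    tokens = []
    for word in text.split():
        tokens.extend(word[i:i + chunk_size] for i in range(0, len(word), chunk_size))
    return tokens

source = "Internationalization requires careful tokenization"
print(f"{len(source.split())} words -> {len(toy_tokenize(source))} tokens")
```

Even this crude scheme turns 4 words into 12 tokens; a real BPE tokenizer produces a different but similarly non-obvious count, which is why Smartling's translated-word count will not match your token usage.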
Your GPT instance comes with a monthly token limit. If you exceed this limit, an error flag is shown in Smartling, and your GPT profile will stop completing translations.
The larger the source content size and the higher the number of target languages, the higher the risk of reaching the token limit will be.
Tip: To test tokenization, test your prompt in OpenAI's tokenizer.
Rate limits
If GPT is used as a translation provider, three types of rate limits restrict the number of translations that can be processed:
- Requests per minute
- Requests per day (only within the first 48 hours)
- Tokens per minute
Info: If a token or rate limit is exceeded, an error will appear in Smartling on the Translation Profiles page. Smartling will retry using the Translation Profile until a translation can be produced successfully. If the overall or monthly token limit has been reached, the Translation Profile may stop generating translations.
More information about rate limits can be found in OpenAI and Azure documentation.
Setting up GPT as a translation provider
Step 1: Set up GPT (OpenAI) or GPT (Azure)
First, make sure you have obtained an API key from OpenAI or Azure, which provides access to a GPT model.
Step 2: Add provider credentials in Smartling
You will need to store your GPT credentials (API key) in Smartling in order to use a GPT model as a translation provider.
- From the top navigation of your Smartling dashboard, access the AI Hub.
- Navigate to the Credentials section.
- Follow the instructions outlined here to store your provider credentials in Smartling.
Step 3: Use the provider credentials in a Translation Profile
Once the provider credential has been saved and tested successfully, it can be used to create a Translation Profile. A Translation Profile allows you to configure your translation prompt, as well as additional preferences to further customize the translation output.
- From the top navigation of your Smartling dashboard, access the AI Hub.
- Navigate to the Translation Profiles section.
- Click Create Profile and select MT Profile or LLM Profile (RAG):
- Select LLM Profile (RAG) if your provider key was obtained from OpenAI. Select GPT (OpenAI) as your LLM provider and follow our instructions on How to create an LLM Profile.
- Select MT Profile if your provider key was obtained from Azure. Select GPT (Azure) as your MT provider and follow our instructions on How to create a new MT Profile.
Info: The exact setup process varies based on the provider used for the API key. For further details regarding the configuration and prompt details for each provider, please see Setting up GPT (Azure) or Setting up GPT (OpenAI).
Setting up GPT (Azure)
Please follow steps 1-3 as described above to begin the process of creating an MT Profile for GPT (Azure).
Configuration Details
- For GPT (Azure), the selected model is specified in the deployment. It cannot be modified in the MT Profile.
- GPT model - max supported tokens: An optional parameter where you can specify the maximum number of tokens your model can support per request. This is useful if you have a custom model or if the model name is not obvious from the deployment ID. If no value is provided, the default is 4K.
- To test tokenization, test your prompt in OpenAI's tokenizer.
Translation parameters (optional)
To further fine-tune the translation output, you can customize the following optional parameters.
If you do not specify any values for these parameters in your MT Profile, your model's default values will be used.
Tip: For more information, see Translation Parameters for LLM Translations.
Translation prompt
Unlike with traditional MT providers, entering an API key is not sufficient: you also need to set up a translation prompt. A customized translation prompt allows you to tailor the translation output to your requirements.
We recommend using the word "translate" in the prompt to instruct GPT to provide a translation.
Additionally, the following two placeholders need to be used in the prompt for translation with GPT (Azure): {sourceLanguage} and {targetLanguage}.
- {sourceLanguage} : refers to the source locale as specified in your Smartling project
- {targetLanguage} : refers to all target locale(s) as specified in your Smartling project
Example prompt: Translate from {sourceLanguage} to {targetLanguage}. The tone should be formal and friendly.
A previous version of Smartling's GPT integration required the placeholder {sourceContent}. This is no longer needed. If your translation prompt still contains the placeholder {sourceContent}, we recommend removing it. Please ensure that the updated prompt follows a clear sentence structure once the placeholder has been removed.
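Conceptually, Smartling fills the {sourceLanguage} and {targetLanguage} placeholders once per target locale before sending the prompt to the model. The sketch below shows that substitution with Python's `str.format`; the locale codes are illustrative, not tied to any particular project:

```python
prompt_template = (
    "Translate from {sourceLanguage} to {targetLanguage}. "
    "The tone should be formal and friendly."
)

def render_prompt(template: str, source_locale: str, target_locale: str) -> str:
    """Fill the Smartling placeholders for a single target locale."""
    return template.format(sourceLanguage=source_locale, targetLanguage=target_locale)

# One rendered prompt per target locale in the project.
for target in ["de-DE", "fr-FR", "ja-JP"]:
    print(render_prompt(prompt_template, "en-US", target))
```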
Testing the prompt
Once all values have been entered, the GPT translation prompt can be tested to ensure that it produces the desired outcome.
Select a source and target language and click Test Integration.
Note: There is a 250-character limit on the source text field. The translation text field cannot be edited.
Save the MT Profile
Once the MT Profile has been saved, it can now be used to translate your content in the Smartling platform, or with one of Smartling's instant MT integrations.
Setting up GPT (OpenAI)
To use GPT (OpenAI) as your translation provider, you will need to create an LLM Profile as described here.
Configuration details
- LLM Provider: GPT (OpenAI) will be selected automatically.
- LLM Profile Name: Enter a profile name that is easily identifiable by your team. We recommend choosing a name that indicates the selected provider, as well as any additional specifications.
- Provider Credentials: From the dropdown menu, select the provider credential you created earlier.
- GPT model: Select the GPT model you would like to use (for example, gpt-3.5-turbo).
Token details (optional)
- Max supported tokens of the GPT model: An optional parameter where you can specify the maximum number of tokens your model can support per request. This is useful if you have a custom model or if the limit is not obvious from the model name. If no value is provided, the default is 4K. To test tokenization, test your prompt in OpenAI's tokenizer.
- Max supported output tokens of the GPT model: An optional parameter where you can specify the maximum number of output tokens your GPT model supports per request.
Parameter details (optional)
To further fine-tune the translation output, you can customize the following optional parameters.
If you do not specify any values for these parameters in your LLM Profile, your model's default values will be used.
Tip: For more information, see Translation Parameters for LLM Translations.
OpenAI-Organization header value
For users who belong to multiple organizations, you can pass a header to specify which organization is used for an API request. Usage from these API requests will count against the specified organization's subscription quota. For more information, read OpenAI's documentation.
Translation prompt
Unlike with traditional MT providers, you need to set up a translation prompt to instruct the LLM on how to translate your content. Using a customized translation prompt allows you to provide a tailored translation output based on your requirements.
Please follow our instructions on how to Configure your translation prompt.
Prompt tooling with RAG (Retrieval-Augmented Generation)
Prompt tooling with RAG (Retrieval-Augmented Generation) allows you to augment your translation prompt with contextual information extracted from your linguistic assets. With RAG, your translation prompt is automatically injected with highly customized translation data, allowing the LLM to better understand your translation preferences and produce a more tailored translation output.
Note: Prompt tooling with RAG is currently only available if your GPT provider key was obtained from OpenAI.
If you are using GPT (OpenAI), follow our instructions for selecting the Assets References that should be used for prompt tooling with RAG.
Testing the prompt
Once all values have been entered, the GPT translation prompt can be tested to ensure that it produces the desired outcome. Please follow our instructions for testing the prompt.
Save the LLM Profile
Once the LLM Profile has been saved, it can now be used to translate your content in the Smartling platform, or with one of Smartling's instant MT integrations.
Considerations for translation with GPT
Compared to traditional machine translation providers, Large Language Models like GPT offer more flexibility; however, they also present a number of challenges that should be considered.
Fallback translation provider
When GPT is used on the Translation step of a workflow in Smartling, we strongly recommend setting up an alternate MT profile and/or a fallback method. The alternative MT provider or method will be used if translation with GPT fails.
If GPT's services return a 500 or 503 error code, Smartling will retry sending your content for translation 3 times.
For 429 errors, Smartling stops retrying immediately and uses the fallback translation provider, if one is configured. If no fallback is configured, Smartling retries with exponential backoff (increasing the wait between retries, up to 4 hours). Depending on how many tokens need to be translated, this could take the Job days to complete.
If the content still cannot be translated successfully, the fallback translation method will be used instead.
If no fallback translation method is set up, Smartling will keep retrying translation with GPT for 7 days.
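The retry and fallback behavior described above can be sketched as follows. This shows only the error-handling shape; Smartling's actual internals are not public, and `translate`, `fallback`, and the exception's `status` attribute are stand-ins:

```python
import time

TRANSIENT = {500, 503}   # retried a fixed number of times
RATE_LIMITED = 429       # fallback immediately, or exponential backoff

def translate_with_fallback(translate, fallback=None, transient_retries=3,
                            max_backoff=4 * 3600, sleep=time.sleep):
    """Sketch of the retry/fallback policy: `translate` returns the
    translation or raises an exception carrying a `status` attribute."""
    backoff = 1
    transient_left = transient_retries
    while True:
        try:
            return translate()
        except Exception as err:
            status = getattr(err, "status", None)
            if status in TRANSIENT and transient_left > 0:
                transient_left -= 1          # retry 500/503 a few times
                continue
            if status == RATE_LIMITED and fallback is not None:
                return fallback()            # 429: switch to fallback at once
            if status == RATE_LIMITED:
                sleep(backoff)               # no fallback: back off and retry
                backoff = min(backoff * 2, max_backoff)
                continue
            raise
```

The key design point is that rate-limit errors skip further retries when a fallback exists, since hammering a rate-limited endpoint only delays the Job.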
AI-enhanced glossary term insertion
AI-Enhanced Glossary Term Insertion is supported for translations generated by GPT when used in a translation workflow. Glossary Term Insertion is not supported when GPT is used in an MT integration, such as MT in the CAT Tool, MT API, GDN, or Smartling Translate.
Prompts
Prompts are clear instructions to the LLM on what you want it to do. You can include as much direction and contextual information as necessary, such as formality, tone and style preferences, just be aware of your model's token limits.
GPT doesn't know how to say "I don't know" in response to a prompt. As a result, it won't always perform the action instructed in the prompt and may instead hallucinate. One way to prevent this is to include instructions on what it should do if it is unsure.
Furthermore, results generated from prompts can be inconsistent and take longer to generate, depending on the prompt.
Tags and placeholders
At times, GPT may handle HTML tags and placeholders incorrectly. A post-editing step may be required to ensure the correct placement of HTML tags and placeholders.
Hallucinations
GPT models can hallucinate, which means that additional material which wasn’t requested is added to the translation output. A higher temperature value can generally lead to more hallucinations.
Quality
GPT does not translate lower-resource languages very well. Unlike custom-trained MT engines, GPT cannot be fine-tuned with linguistic assets, and custom-trained MT models generally produce better translation quality. If you are translating with GPT, we recommend including a post-translation edit and review step in the translation workflow.
Tip: For more best practices for translating with LLMs, visit our Smartling Community.