GPT can be used as a translation provider in Smartling. Similar to traditional machine translation providers, GPT needs to be set up as an MT Profile in your Smartling account. During this process, you will need to set up a fully flexible translation prompt. If desired, you can also customize a number of additional parameters to achieve the desired translation output and reduce post-processing.
The MT Profile for GPT can then be used as part of a translation workflow or with any of Smartling's account-wide machine translation options.
Supported models
All GPT models. Bring your own API key.
Supported Languages
GPT does not have a list of supported languages. It will attempt to generate a translation for any language you prompt it to. All high-resource languages are supported, while low-resource languages could result in poor quality translations.
Prerequisites
An API key for GPT from OpenAI or Azure is required. API keys for GPT are not provided by Smartling.
Limitations
Token limits
GPT models use tokenizers, which break text down into units called tokens. Token usage counts both input and output tokens. Input tokens are everything fed into the model (including the prompt and the source string); output tokens are what the model returns.
There is no universal tokenization model, which means that the text could be broken down based on words, characters or character sequences. Therefore, the number of translated words in Smartling will likely differ from the token usage.
Your GPT instance comes with a monthly token limit. If you exceed this limit, an error flag will be shown in Smartling, and your GPT profile will cease to complete translations.
The larger the source content and the higher the number of target languages, the greater the risk of reaching the token limit.
Tip: To test tokenization, test your prompt in OpenAI's tokenizer.
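For a rough sense of how content size and language count drive token usage, the sketch below uses OpenAI's published rule of thumb of roughly 4 characters per token for English text. The helper names and the per-job accounting are illustrative assumptions, not Smartling's actual metering; use OpenAI's tokenizer for accurate counts.

```python
def estimate_tokens(text: str) -> int:
    """Rough token estimate using OpenAI's ~4 characters-per-token
    rule of thumb for English. Real counts vary by tokenizer."""
    return max(1, len(text) // 4)

def estimate_job_tokens(source_strings, target_language_count, prompt):
    """Illustrative estimate of input tokens for a job: each source
    string is sent once per target language, along with the prompt."""
    prompt_tokens = estimate_tokens(prompt)
    per_language = sum(estimate_tokens(s) + prompt_tokens for s in source_strings)
    return per_language * target_language_count

# Example: 2 short strings translated into 3 target languages.
total = estimate_job_tokens(
    ["Add to cart", "Your order has shipped."],
    target_language_count=3,
    prompt="Translate from {sourceLanguage} to {targetLanguage}.",
)
```

Note how the estimate scales linearly with the number of target languages, which is why multi-language jobs reach the limit sooner.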
Rate limits
If GPT is used as a translation provider, three types of rate limits restrict the number of translations that can be processed:
- Requests per minute
- Requests per day (only within the first 48 hours)
- Tokens per minute
If any of these rate limits are exceeded, translations can no longer be completed with GPT and an error message will be shown.
More information about rate limits can be found in the OpenAI and Azure documentation.
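The requests-per-minute limit can be sketched as a sliding 60-second window. This is a generic client-side throttling pattern to help avoid 429 errors, not Smartling's implementation; the class name and limit value are illustrative.

```python
import time
from collections import deque

class RequestsPerMinuteLimiter:
    """Minimal sliding-window limiter: allows at most `rpm` requests
    in any 60-second window (illustrative sketch only)."""

    def __init__(self, rpm: int):
        self.rpm = rpm
        self.timestamps = deque()  # send times of recent requests

    def allow(self, now=None) -> bool:
        now = time.monotonic() if now is None else now
        # Drop requests that have fallen out of the 60-second window.
        while self.timestamps and now - self.timestamps[0] >= 60:
            self.timestamps.popleft()
        if len(self.timestamps) < self.rpm:
            self.timestamps.append(now)
            return True
        return False
```

A tokens-per-minute limit works the same way, except each entry is weighted by the token count of the request instead of counting as one.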
Setting up GPT as a translation provider
- Go to Account Settings > Machine Translation Settings.
- Navigate to Profiles and click Create MT Profile.
- Select GPT (Azure) or GPT (OpenAI) based on the provider you are using for the API key.
- Insert a name for the new translation profile, which will appear in your list of MT Profiles in Smartling.
- Do one of the following:
- For GPT (OpenAI), enter the API key from OpenAI.
- 'OpenAI-Organization' header value: if you belong to multiple organizations, you can pass this header to specify which organization is used for an API request. Usage from these requests will count against the specified organization's subscription quota. For more information, read OpenAI's documentation.
- For GPT (Azure), enter your Azure GPT access token, the name of your Azure OpenAI resource, and the deployment ID you chose when you deployed the model.
- Optionally, you can provide the Azure API version to use for this operation.
- Select the GPT model which you would like to use.
- Example: gpt-3.5-turbo
- GPT model - max supported tokens: the maximum number of tokens your model can support per request.
- This optional parameter is useful if you have a custom model or the model name is not obvious from the deployment ID. If no value is provided, the default is automatically set to 4K.
- To test tokenization, test your prompt in OpenAI's tokenizer.
- Set the translation parameters
- Write a translation prompt
- Test the integration
- Click Save.
Translation parameters
To further fine-tune the translation output, you can customize the following optional parameters.
If you do not specify any values for these parameters in your MT Profile, Smartling's default values will be used.
Temperature
The temperature determines the creativity of the output. The lower the temperature, the more literal GPT's translation will be. A higher temperature can be used to achieve a more creative translation output; however, the lack of repeatability makes it difficult to reproduce the same translation. A high temperature value also increases the probability that the GPT model will stray from the given context.
By default, the temperature is set to "0", to lower the risk of hallucinations.
Top P
Similar to autocomplete in other systems, GPT determines which token is most likely to follow the previous tokens in a sentence. Top P (nucleus sampling) restricts the model's choices to the smallest set of tokens whose cumulative probability reaches the value P.
The default value is set to "1", to ensure that more probable words are used to complete the translations.
Presence penalty
This parameter penalizes tokens that have already appeared in the text, encouraging the model to include a more diverse range of tokens in the generated output. A high Presence Penalty makes the model more likely to generate tokens that have not yet been included in the output, and it can be used to prevent topic repetition.
The default value is set to "0", to allow for the translations to adhere to the source content.
Frequency penalty
This parameter penalizes tokens based on how many times they have already appeared in the text. The more frequently a token has appeared, the more it will be penalized. Therefore, the frequency penalty parameter can be used to avoid word repetitions and to encourage the use of synonyms.
The default value is set to "0", to allow for the translations to adhere to the source content, even if it contains word repetitions.
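The four parameters above correspond to real field names in the OpenAI Chat Completions API (temperature, top_p, presence_penalty, frequency_penalty). The sketch below shows Smartling's documented defaults expressed as such a request body; the request Smartling actually sends is internal, so this is illustrative only.

```python
# Smartling's documented defaults for the optional translation
# parameters, expressed as OpenAI Chat Completions API fields.
# (Illustrative only; the request Smartling sends is internal.)
default_params = {
    "temperature": 0,        # deterministic output, lower hallucination risk
    "top_p": 1,              # consider the full probability mass
    "presence_penalty": 0,   # do not penalize tokens already present
    "frequency_penalty": 0,  # do not penalize repeated tokens
}

request_body = {
    "model": "gpt-3.5-turbo",
    "messages": [
        {"role": "user", "content": "Translate from English to German. Hello!"}
    ],
    **default_params,
}
```

Setting all four to their defaults keeps the output as close to the source content as possible, which is usually what you want for translation.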
Translation prompt
Unlike with traditional MT providers, entering an API key is not sufficient; you also need to set up a translation prompt. A customized translation prompt allows you to tailor the translation output to your requirements.
We recommend using the word "translate" in the prompt to instruct GPT to provide a translation.
Additionally, the following two placeholders need to be used in the prompt: {sourceLanguage} and {targetLanguage}.
- {sourceLanguage} : refers to the source locale as specified in your Smartling project
- {targetLanguage} : refers to all target locale(s) as specified in your Smartling project
Example prompt: Translate from {sourceLanguage} to {targetLanguage}. The tone should be formal and friendly.
A previous version of Smartling's GPT integration required the use of the placeholder {sourceContent}. This is no longer needed. If your translation prompt still contains the placeholder {sourceContent}, we recommend removing it. Please ensure that the updated prompt follows a clear sentence structure once the placeholder has been removed.
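The placeholder substitution described above can be pictured as a simple template fill. The helper below is a hypothetical illustration; Smartling performs this substitution internally before sending the prompt to GPT.

```python
def build_prompt(template: str, source_language: str, target_language: str) -> str:
    """Fill the required Smartling placeholders into a translation prompt.
    (Illustrative helper; Smartling does this substitution internally.)"""
    return (template
            .replace("{sourceLanguage}", source_language)
            .replace("{targetLanguage}", target_language))

prompt = build_prompt(
    "Translate from {sourceLanguage} to {targetLanguage}. The tone should be formal and friendly.",
    "en-US",
    "de-DE",
)
```

Because {targetLanguage} is filled in per locale, a single prompt serves every target language in your project.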
Testing the integration
Once all values have been entered, the GPT translation prompt can be tested by selecting a source and target language and clicking Test Integration.
Note: There is a 250-character limit on the source text field. The translation text field cannot be edited.
Considerations
Compared to traditional machine translation providers, Large Language Models like GPT offer more flexibility; however, they also present a number of challenges, which should be considered.
Fallback translation provider
When GPT is used on the Translation step of a workflow in Smartling, it is strongly recommended to set up an alternate MT profile and/or a fallback method. The alternative MT provider or method will be used if translation with GPT fails.
If GPT's services return a 500 or 503 error code, Smartling will retry sending your content for translation 3 times.
For 429 errors, Smartling stops retrying immediately and switches to the fallback translation provider, if one is configured. If no fallback is provided, Smartling retries with exponential backoff (increasing the wait between attempts, up to 4 hours). Depending on how many tokens need to be translated, this could take the Job days to complete.
If the content still cannot be successfully translated, the fallback translation method will be used instead.
If no fallback translation method is set up, Smartling will keep retrying to send content for translation with GPT for 7 days.
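Exponential backoff with a 4-hour cap can be sketched as a delay schedule. The base delay and growth factor below are illustrative assumptions; Smartling's actual retry intervals are internal, and only the 4-hour cap is documented above.

```python
def backoff_delays(base_seconds=60, factor=2, cap_seconds=4 * 60 * 60, attempts=10):
    """Exponential backoff schedule capped at 4 hours.
    base_seconds and factor are illustrative assumptions;
    only the 4-hour cap comes from the documentation above."""
    delay = base_seconds
    schedule = []
    for _ in range(attempts):
        schedule.append(min(delay, cap_seconds))
        delay *= factor
    return schedule
```

With these assumed values the wait doubles each attempt until it hits the 4-hour ceiling, which is why a large Job can take days to finish when every request is rate-limited.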
Glossary term insertion
The insertion of glossary terms into translations provided by GPT is currently not supported.
Hallucinations
GPT models can hallucinate, which means that additional material which wasn’t requested is added to the translation output. A higher temperature value can generally lead to more hallucinations.
Tags and placeholders
At times, HTML tags and placeholders may be handled incorrectly by GPT. A post-editing step may be required to ensure the correct placement of HTML tags and placeholders.
Prompts
Prompts are clear instructions to the LLM on what you want it to do. You can include as much direction and contextual information as necessary, such as formality, tone, and style preferences; just be aware of your model's token limits.
GPT doesn't know how to say "I don't know" to a given prompt. As a result, it won't always perform the action instructed in the prompt, but may instead hallucinate. One way to prevent this is to include instructions on what it should do if it is unsure.
Furthermore, results generated from prompts can be inconsistent and take longer to generate, depending on the prompt.
Quality
GPT does not translate low-resource languages very well. Unlike custom-trained MT engines, the translation quality cannot be fine-tuned with linguistic assets, and custom-trained MT models generally produce better translations overall. If you are translating with GPT, we recommend including a post-translation edit and review step in the translation workflow.