You have the option to use Google's Vertex AI as a translation provider in Smartling. Google's Vertex AI platform allows you to translate using the latest Gemini models.
To translate using Vertex AI, you will first need to obtain an API key in the form of JSON credentials from Google. You will then store this credential in Smartling and link it to an MT Profile. Within the profile, you can configure your translation prompt and set various parameters. Once the MT profile is created, Vertex AI can be used as part of a translation workflow or with any of Smartling's machine translation integrations.
Requirements
You will need to bring your own API key in the form of JSON credentials for Google Vertex AI. Smartling does not provide API keys for this service.
Supported models
Smartling supports translation using Gemini models version 2.0 and above via the Vertex AI API.
Supported languages
Please refer to the Vertex AI documentation for the full list of languages supported by Gemini models.
Limitations
Token limits
LLM models use tokenizers to break text into units called tokens. Token usage includes both input and output tokens. Input tokens refer to everything fed into the model, including the prompt and the source string. Output tokens refer to the number of tokens returned by the model.
There is no universal tokenization method. Text can be broken down by words, characters, or character sequences, depending on the model. Therefore, the number of translated words shown in Smartling will likely differ from the token usage.
Each model has a maximum token limit that applies to both the input prompt and the generated response. Please see the Vertex AI documentation for specific token limits. Vertex AI also enforces quotas and usage limits. If you are using a Gemini model, Google uses a Dynamic Shared Quota (DSQ) system.
If a token limit is exceeded, an error will appear in Smartling on the MT profiles page and the MT profile will stop generating translations.
In addition to the length of the translation prompt sent with each request, larger source content and a greater number of target languages increase the risk of reaching the token limit. You can estimate the token usage and billable character count for a prompt using the Vertex AI API; see the Vertex AI documentation for details, and the sketch below for a minimal example.
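As an illustration, the Vertex AI Python SDK exposes a count_tokens method that returns both the token count and the billable character count for a prompt. A minimal sketch, assuming the vertexai package is installed; the project ID, location, and model name are placeholders for your own values:

```python
import vertexai
from vertexai.generative_models import GenerativeModel

# Placeholders: substitute your own project ID, location, and model name.
vertexai.init(project="my-project-id-123456", location="us-central1")
model = GenerativeModel("gemini-2.0-flash-lite-001")

prompt = "Translate from en-US to de-DE. The tone should be formal and friendly."
response = model.count_tokens(prompt)
print(response.total_tokens)               # input tokens this prompt will consume
print(response.total_billable_characters)  # characters Google bills for
```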
Rate limits
Rate limits control the number of requests or the volume of traffic allowed within a specific time period. Translation providers like Google Vertex AI define rate limits for their models. You must adhere to these limits to maintain successful translations.
If a rate limit is exceeded, the MT profile will stop translating, and an error message will appear in Smartling on the MT profiles page.
Please refer to the Vertex AI documentation for more detailed information on rate limits.
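Smartling manages request pacing for you, but if you also call the Vertex AI API directly, a simple client-side pacing pattern can help you stay within a requests-per-minute quota. A minimal sketch; the quota value below is an assumption, so check the actual limits for your model:

```python
import time

MAX_REQUESTS_PER_MINUTE = 60  # assumption: check your model's actual quota
MIN_INTERVAL = 60.0 / MAX_REQUESTS_PER_MINUTE

_last_call = 0.0

def paced(call, *args, **kwargs):
    """Invoke `call` no more often than MAX_REQUESTS_PER_MINUTE allows."""
    global _last_call
    wait = MIN_INTERVAL - (time.monotonic() - _last_call)
    if wait > 0:
        time.sleep(wait)
    _last_call = time.monotonic()
    return call(*args, **kwargs)
```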
Setting up Google Gemini (Vertex AI) as a translation provider
Step 1: Setup in Google
First, enable the Vertex AI API in your Google Cloud project and ensure that your preferred model is available. You will also need to obtain an API key from Google in the form of JSON service account credentials, which provides access to the model.
Step 2: Add provider credentials in Smartling
To use a Gemini model as a translation provider in Smartling, you first need to store your Vertex AI credentials in Smartling.
- From Account Settings > Machine Translation Settings, access the Provider Credentials tab.
- Click Add Provider Credential.
This opens a modal where you can save your provider key. Enter the following information:
- MT or LLM Provider (required): From the dropdown, select Google Gemini (Vertex-AI).
- Credential Name (required): Enter a name for your provider credential to help identify it. Ideally, this name should reflect which team owns the credential, which provider is used, and any other information that will help you identify this provider key. For example: Gemini Marketing Model
- Credential Description (optional)
- Project ID (required): A unique identifier for your project in Google. It is typically displayed below the project name in the Google Cloud Console. For example, a project named "My Project" might show: Project ID: my-project-id-123456
- Location (required): The location of your data (i.e., the Vertex AI datacenter location).
- JSON credentials (required): Copy and paste the entire JSON credentials file into this field, including the brackets. To generate a JSON credentials file (a quick validity check is sketched after these steps):
- In the Google Cloud Console, navigate to Service Accounts > select the relevant project > select the account.
- Go to the Keys tab > click Add key > Create new key.
- Choose JSON and click Create. A file will be downloaded; this is your credentials file. Copy and paste its contents into the Credentials field.
- Test Credential: Once you have provided all required information, click "Test Credential" to check that the provider credential is fully functional.
- A success message will be displayed if the credential is working correctly. You can now proceed to associate this credential with an MT Profile.
- If there is an issue, an error message will be displayed. Check that a valid provider credential was obtained and that the JSON credentials and other details were entered correctly.
- Click Save to create the credential.
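Before pasting the JSON credentials into Smartling, you can sanity-check the downloaded file. A minimal sketch; the filename is hypothetical, and the key list covers the core fields Google includes in a service account key:

```python
import json

# Core fields present in a Google service account key file.
REQUIRED_KEYS = {"type", "project_id", "private_key_id", "private_key", "client_email"}

with open("my-service-account-key.json") as f:  # hypothetical filename
    creds = json.load(f)

missing = REQUIRED_KEYS - creds.keys()
if missing:
    raise ValueError(f"Credentials file is missing fields: {sorted(missing)}")

print("Project ID:", creds["project_id"])  # the value for Smartling's Project ID field
```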
Step 3: Use the provider credentials in an MT Profile
Once the provider credential has been saved and tested successfully, it can now be used to create an MT Profile. An MT Profile allows you to configure additional preferences to further customize the translation output.
- From Account Settings > Machine Translation Settings, navigate to the Profiles tab.
- Click Create MT Profile.
- Enter the required information:
- MT or LLM Provider: From the drop-down menu, select Google Gemini (Vertex-AI)
- MT Profile Name: Choose a name to identify the profile. We recommend including the provider name and any other relevant identifiers. For example: Gemini 2.0 Flash Marketing
- Provider Credentials: From the drop-down menu, select the credential you created in the previous step.
- Gemini model: Enter the exact name of the model including the version (e.g., gemini-2.0-flash-lite-001)
- Max supported tokens of the Gemini model: The maximum number of tokens your Gemini model supports per request. If this field is left blank, the default value for the model will be used.
- Max supported output tokens of the Gemini model: The maximum number of output tokens your Gemini model supports per request. Refer to the documentation of the model selected for this profile. Note that not all models support this parameter.
- 'Temperature' parameter: The temperature controls the creativity of the output. Lower values result in more direct translations, while higher values increase the creativity of the translation. However, higher temperatures reduce repeatability, making it harder to reproduce the same output. A high temperature may also increase the likelihood of the model straying from the context or hallucinating. If left blank, the default value of 1.0 will be used. Please refer to the documentation for more details.
- 'Top-P' parameter: Top-P (nucleus sampling) changes how the model selects tokens for output: tokens are drawn from the smallest set of most likely candidates whose cumulative probability exceeds the Top-P value. The range is 0.0 to 1.0. If left blank, the default value for the model will be used. Please refer to the documentation for more details.
- 'Top-K' parameter: Top-K changes how the model selects tokens for output by limiting selection to the K most probable tokens. It takes an integer value; refer to the documentation of your model for the supported range. If left blank, the default value for the model will be used.
- 'Presence penalty' parameter: This setting ranges from -2.0 to 2.0. Positive values penalize tokens that have already appeared, encouraging more diverse outputs by reducing repetition. A higher presence penalty increases the likelihood of generating new content and prevents topic repetition. If left blank, the default value for the model will be used.
- 'Frequency penalty' parameter: This parameter penalizes tokens based on how often they have already appeared in the generated text. The more frequently a token appears, the more it is penalized. This helps reduce word repetition and can encourage the use of synonyms. The value ranges from -2.0 to 2.0. Positive values discourage repeated words. If left blank, the default value for the model will be used.
- Translation prompt: Using a customized translation prompt allows you to tailor the translation output to your requirements. We recommend using the word "translate" in the prompt to instruct Gemini to provide a translation. Additionally, the following two placeholders must be used in the prompt: {sourceLanguage} and {targetLanguage}.
- {sourceLanguage}: refers to the source locale as specified in your Smartling project
- {targetLanguage}: refers to all target locale(s) as specified in your Smartling project
Example prompt: Translate from {sourceLanguage} to {targetLanguage}. The tone should be formal and friendly.
- Test integration: Click "Test integration" to check if the MT Profile is fully functional. If an error message is displayed, ensure that all information has been entered correctly. (A sketch of how the prompt placeholders and parameters map to a direct Vertex AI request follows these steps.)
- Click Save to create the MT Profile.
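For reference, the profile parameters above correspond to standard Vertex AI generation settings. A minimal sketch of a direct Vertex AI request, assuming the vertexai package is installed and initialized as in the earlier token-counting example; all values are illustrative:

```python
from vertexai.generative_models import GenerationConfig, GenerativeModel

model = GenerativeModel("gemini-2.0-flash-lite-001")  # match the profile's model name

# Illustrative values; leave parameters unset to use the model defaults.
config = GenerationConfig(
    temperature=0.3,         # lower = more literal, more repeatable translations
    top_p=0.95,              # nucleus sampling threshold
    top_k=40,                # integer: sample from the 40 most likely tokens
    max_output_tokens=2048,  # cap on output tokens per response
)

prompt_template = ("Translate from {sourceLanguage} to {targetLanguage}. "
                   "The tone should be formal and friendly.")
prompt = prompt_template.format(sourceLanguage="en-US", targetLanguage="de-DE")

response = model.generate_content([prompt, "Hello, world!"],
                                  generation_config=config)
print(response.text)
```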
Step 4: Use the MT Profile in an MT workflow or MT integration
The MT Profile can now be used to translate your content, either within the Smartling platform (using an MT workflow or MT suggestions in the CAT Tool), or through one of Smartling's MT integrations to display machine translations directly where needed.
- MT Profiles can be used as a translation provider in a machine translation workflow.
Machine translation workflows allow you to translate any content in the Smartling platform, using your preferred provider and configurations. For more information, see Smartling's documentation on machine translation workflows.
- MT Profiles can be used to translate content with Smartling's MT API or one of Smartling's MT integrations, which provide Machine Translations directly where they should be displayed - without the need to upload the content into the Smartling platform first.
To select the desired MT Profile for each of these integrations, please navigate to your Account Settings > Machine Translation Settings > Settings.
- MT Profiles can be used to provide translation suggestions in the CAT Tool.
To select the desired MT Profile for translation suggestions in the CAT Tool, please navigate to your Account Settings > Machine Translation Settings > Settings > CAT Tool.
Considerations
Compared to traditional machine translation providers, Large Language Models (LLMs) like Google Gemini provide more flexibility. However, they also come with a number of challenges that should be considered.
Fallback translation provider
When using Vertex AI in the Translation step of a workflow in Smartling, it is strongly recommended to configure an alternate MT profile and/or a fallback method. This backup will be used if translation with Vertex AI fails.
If Vertex AI returns a 500 or 503 error, Smartling retries the request up to three times. If the content still cannot be translated, the fallback translation method is used instead.
For 429 (rate limit) errors, Smartling stops retrying immediately and uses the fallback translation provider, if one is configured. If no fallback is configured, Smartling retries with exponential backoff (increasing the wait between attempts, up to 4 hours) for up to 7 days. Depending on how many tokens need to be translated, this can add days to a job's completion time.
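The retry behavior described above follows the standard exponential-backoff pattern. Below is a minimal sketch of that pattern, not Smartling's internal implementation; the error type and function arguments are hypothetical:

```python
import random
import time

class RateLimitError(Exception):
    """Hypothetical stand-in for a 429 response."""

def translate_with_backoff(translate, fallback=None, deadline_s=7 * 24 * 60 * 60):
    """On a rate-limit error, use the fallback at once if one is configured;
    otherwise retry with exponential backoff (wait capped at 4 hours)."""
    delay, start = 1.0, time.monotonic()
    while True:
        try:
            return translate()
        except RateLimitError:
            if fallback is not None:
                return fallback()                     # immediate fallback on 429
            if time.monotonic() - start >= deadline_s:
                raise                                 # give up after ~7 days
            time.sleep(delay + random.uniform(0, 1))  # jitter spreads out retries
            delay = min(delay * 2, 4 * 60 * 60)       # cap the wait at 4 hours
```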
Glossary term insertion
AI-Enhanced Glossary Term Insertion is supported for translations generated by Google Gemini (Vertex AI) when used in a translation workflow. Glossary Term Insertion is not supported when Gemini is used for an MT integration, such as MT in the CAT Tool, MT API, GDN, or Smartling Translate.
Hallucinations
LLMs can hallucinate, meaning they can generate content that was not requested. Higher temperature settings can increase the likelihood of hallucinations.
Tags and placeholders
LLMs sometimes handle HTML tags and placeholders incorrectly. A post-editing step may be required to ensure that tags and placeholders appear in the correct positions; a simple automated check is sketched below.
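A lightweight check can flag strings where tags or placeholders were dropped or altered, so post-editors know where to look. A minimal sketch; the regular expression is a simplification that covers HTML-style tags and {curly} placeholders only:

```python
import re

# Matches HTML-style tags and {placeholder} tokens; a simplification.
TOKEN_RE = re.compile(r"</?\w+[^>]*>|\{[^}]+\}")

def tokens_preserved(source: str, translation: str) -> bool:
    """True if the translation kept exactly the source's tags and placeholders."""
    return sorted(TOKEN_RE.findall(source)) == sorted(TOKEN_RE.findall(translation))

source = "Click <b>{buttonName}</b> to continue."
translation = "Klicken Sie auf {buttonName}, um fortzufahren."  # <b> tags lost
print(tokens_preserved(source, translation))  # False -> needs post-editing
```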
Prompts
Prompts are instructions that guide the LLM on how to translate. You can include as much direction and context as needed, such as tone, formality, or brand style. However, always keep token limits in mind.
Because LLMs cannot reliably indicate uncertainty, they may hallucinate instead of failing gracefully. To reduce this risk, include instructions in the prompt for situations where the model is unsure; for example, instruct it to return the source text unchanged rather than guess.
Prompt results may also vary in quality and response time, depending on the complexity of the instructions.
Quality
LLMs generally perform poorly when translating low-resource languages. They cannot be fine-tuned using linguistic assets like custom-trained MT engines can. Overall, custom-trained MT models tend to produce higher-quality translations.
If you’re using an LLM for translation, we strongly recommend including a human editing or review step in the workflow.