Best Practices for Multilingual Model Training: Single vs. Multi-Model for Translation

Hello everyone,

I’m working on a translation project involving documents up to 100 pages long, in 17 different languages, and I'm looking for the best approach to achieve high-quality translations in this multilingual context.

Single model vs. multi-model approach
- Is it better to use a single multilingual model or to train separate models for each source language?
- If I go with a single model, is it possible to progressively add each new language by retraining the model multiple times without losing the ability to translate into previously trained languages?
- Lastly, if I’m using the same source language, can I train the model to translate into multiple target languages without needing a dedicated model for each source-target combination?
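
To make the last question concrete, here is a minimal sketch of what I have in mind for one source language and several target languages: each training pair carries a target-language tag at the start of the source text, so one model can learn to pick the output language. The `<2xx>` tag format and the `build_tagged_examples` helper are my own assumptions for illustration, not something a specific library requires.

```python
from typing import Dict, List


def build_tagged_examples(source_text: str, translations: Dict[str, str]) -> List[Dict[str, str]]:
    """Turn one source sentence and its translations into tagged training pairs.

    The "<2xx>" prefix is an assumed convention that tells the model which
    target language to produce; it is not mandated by any particular library.
    """
    examples = []
    # Sort the language codes so the output order is deterministic.
    for target_lang in sorted(translations):
        examples.append(
            {
                "source": f"<2{target_lang}> {source_text}",
                "target": translations[target_lang],
            }
        )
    return examples


pairs = build_tagged_examples(
    "The contract starts on Monday.",
    {"fr": "Le contrat commence lundi.", "es": "El contrato comienza el lunes."},
)
for pair in pairs:
    print(pair["source"], "->", pair["target"])
```

With this layout, the same source sentence appears once per target language, and no separate model per source-target combination is needed at the data level. Whether quality holds up across all 17 languages is exactly what I'm unsure about.
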
Model training
I’m planning to train the model on Databricks, following the article "Fine-Tuning Large Language Models" and using Hugging Face’s translation script, run_translation.py. Would this approach produce high-quality translations across a wide range of languages?
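
For reference, this is roughly how I would prepare the training data for that script. As far as I understand, run_translation.py accepts JSON Lines files in which each line holds a `translation` object keyed by language code, and it takes one source/target pair per run through `--source_lang` and `--target_lang`. The helper names and the sample sentences below are hypothetical.

```python
import json
from pathlib import Path
from typing import Iterable, Tuple


def to_record(source_lang: str, source_text: str, target_lang: str, target_text: str) -> dict:
    """Build one record in the {"translation": {lang: text, ...}} layout."""
    return {"translation": {source_lang: source_text, target_lang: target_text}}


def write_jsonl(path: Path, pairs: Iterable[Tuple[str, str]], source_lang: str, target_lang: str) -> int:
    """Write (source_text, target_text) pairs as one JSON object per line.

    Returns the number of records written.
    """
    count = 0
    with path.open("w", encoding="utf-8") as handle:
        for source_text, target_text in pairs:
            record = to_record(source_lang, source_text, target_lang, target_text)
            # ensure_ascii=False keeps accented characters readable in the file.
            handle.write(json.dumps(record, ensure_ascii=False) + "\n")
            count += 1
    return count


# Show what a single line of the training file would look like.
sample = to_record("en", "Good morning.", "fr", "Bonjour.")
print(json.dumps(sample, ensure_ascii=False))
```

If the script really is limited to one language pair per run, covering 17 languages this way would mean one file and one run per pair, which is part of why I'm asking about the single-model option.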
Using Databricks functions for common languages
Databricks offers a built-in translation function (ai_translate), but it currently only supports translations between French, English, and Spanish. If one of these languages matches my translation requirements, would it make sense to prioritize this solution? Is it potentially more effective than tools like DeepL, which haven’t fully met my client’s expectations?
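
One option I'm considering is a hybrid setup that routes each language pair either to ai_translate or to a custom model. A minimal sketch, assuming the supported set is the three languages I listed above (please correct me if that is wrong) and that the function is called as `ai_translate(content, to_lang)`; the backend names and the `body` column are hypothetical.

```python
from typing import Set

# Assumption: languages I believe ai_translate supports, as stated above.
AI_TRANSLATE_LANGS: Set[str] = {"en", "fr", "es"}


def choose_backend(source_lang: str, target_lang: str) -> str:
    """Pick the built-in function when both languages are covered, else a custom model."""
    src, tgt = source_lang.lower(), target_lang.lower()
    if src != tgt and src in AI_TRANSLATE_LANGS and tgt in AI_TRANSLATE_LANGS:
        return "ai_translate"
    return "custom_model"


def ai_translate_expr(column: str, target_lang: str) -> str:
    """Build the SQL expression for the built-in function (column name is hypothetical)."""
    return f"ai_translate({column}, '{target_lang.lower()}')"


for src, tgt in [("en", "fr"), ("en", "de"), ("es", "en")]:
    print(src, "->", tgt, ":", choose_backend(src, tgt))
print(ai_translate_expr("body", "es"))
```

This only decides where a request goes; it says nothing about which backend gives better quality, which is the part I'd like advice on.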

Thanks in advance for any advice and insights on the best approach to take!
