Introducing TranslateGemma: Open Translation Models for 55 Languages with Unmatched Efficiency and Quality
Today, we’re unveiling TranslateGemma, a new suite of open translation models built on Gemma 3 and available in 4B, 12B, and 27B parameter sizes. This marks a major leap forward in open-source translation technology, enabling high-quality communication across 55 languages, regardless of location or device. By distilling the expertise of our most advanced large models into compact, high-performance open models, we’ve created a family of systems where efficiency doesn’t come at the cost of quality.

The results are impressive: the 12B TranslateGemma model outperforms the Gemma 3 27B baseline in translation accuracy, as measured by MetricX on the WMT24++ benchmark, despite using fewer than half as many parameters. For developers, this means top-tier translation quality with significantly reduced computational demands: faster inference, lower latency, and higher throughput, all without sacrificing accuracy. The 4B model pushes this efficiency further, matching the performance of the larger 12B baseline and making it well suited to mobile and edge deployment.

We evaluated TranslateGemma on the WMT24++ dataset, which spans 55 languages from diverse language families, covering high-, mid-, and low-resource languages. Across all of them, TranslateGemma significantly reduced error rates compared to the base Gemma 3 models, delivering superior quality with greater efficiency.

How was this level of intelligence packed into smaller models? It comes from a specialized two-stage fine-tuning process that transfers the “intuition” of our Gemini models into an open architecture. First, Supervised Fine-Tuning (SFT) trained the base Gemma 3 models on a diverse dataset of parallel texts. This dataset combines human-translated content with high-quality synthetic translations generated by state-of-the-art Gemini models, ensuring broad language coverage and strong fidelity, even in low-resource languages. Next, we applied a novel Reinforcement Learning (RL) phase. Using an ensemble of advanced reward models, including MetricX-QE and AutoMQM, we guided the models to produce more contextually accurate, natural-sounding translations, refining both fluency and precision (a simplified sketch of such a reward ensemble appears below).

TranslateGemma has been rigorously trained and evaluated on 55 core language pairs, delivering reliable, high-quality performance across major global languages such as Spanish, French, Chinese, and Hindi, as well as less commonly supported ones. Beyond these, we extended training to nearly 500 additional language pairs. While full evaluation metrics for this expanded set are still pending, we’ve included the complete list in our technical report to encourage community exploration and further research.

Importantly, TranslateGemma retains the strong multimodal capabilities of Gemma 3. Testing on the Vistra image translation benchmark shows that improvements in text translation also carry over to translating text within images, even without dedicated multimodal fine-tuning.

Designed for real-world use, TranslateGemma runs efficiently across a wide range of environments, from cloud servers to mobile devices. With three model sizes, it offers the flexibility to match diverse deployment needs.

TranslateGemma is now available for developers and researchers to use, adapt, and build upon; the inference sketch at the end of this post shows one way to get started. We’re excited to see how the community leverages these models to break down language barriers and promote deeper cross-cultural understanding. Try TranslateGemma today and help shape the future of open, accessible, and high-quality translation.
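To make the reward-ensemble idea concrete, here is a minimal sketch of how several quality signals could be folded into a single RL reward. The `metricx_qe_score` and `automqm_score` functions are hypothetical stand-ins for the real MetricX-QE and AutoMQM reward models, and the weights are purely illustrative; this is not the actual training code.

```python
# Minimal sketch of an ensemble reward for an RL phase, assuming each reward
# model exposes an error score where lower is better (as MetricX-QE does).
# The scorers below are hypothetical stand-ins, not the real MetricX-QE/AutoMQM.
from typing import Callable, Sequence


def ensemble_reward(
    source: str,
    candidate: str,
    scorers: Sequence[Callable[[str, str], float]],
    weights: Sequence[float],
) -> float:
    """Negated weighted sum of error scores, so a higher reward means a better translation."""
    assert len(scorers) == len(weights), "one weight per scorer"
    total_error = sum(w * score(source, candidate) for score, w in zip(scorers, weights))
    return -total_error


# Toy placeholders standing in for real quality-estimation models.
def metricx_qe_score(source: str, candidate: str) -> float:
    return 3.2  # pretend error score (lower is better)


def automqm_score(source: str, candidate: str) -> float:
    return 1.5  # pretend weighted MQM error count (lower is better)


reward = ensemble_reward(
    source="Bonjour le monde",
    candidate="Hello world",
    scorers=[metricx_qe_score, automqm_score],
    weights=[0.5, 0.5],
)
print(f"ensemble reward: {reward:.2f}")
```

In practice, the RL objective would compute such a reward for each sampled translation and update the policy accordingly; the weighting and normalization of the individual signals are design choices described in the technical report.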

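For developers who want to try the models, here is a minimal inference sketch using the Hugging Face transformers chat pipeline. The model id and the prompt wording are assumptions for illustration only; consult the official model cards for the published checkpoints and the recommended translation prompt format.

```python
# Minimal inference sketch with the Hugging Face transformers chat pipeline.
# The model id below is hypothetical; substitute the id from the official model card.
from transformers import pipeline

translator = pipeline(
    "text-generation",
    model="google/translategemma-4b-it",  # hypothetical checkpoint name
    device_map="auto",
)

messages = [
    {
        "role": "user",
        "content": (
            "Translate the following text from English to Spanish:\n"
            "Open models make high-quality translation accessible to everyone."
        ),
    }
]

result = translator(messages, max_new_tokens=128)
# The pipeline returns the full chat history; the last message is the model's reply.
print(result[0]["generated_text"][-1]["content"])
```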