Scaling Model and Data for Multilingual Machine Translation with Open Large Language Models
Yuzhe Shang, Pengzhi Gao, Wei Liu, Jian Luan, Jinsong Su
Abstract
Open large language models (LLMs) have demonstrated improving multilingual capabilities in recent years. In this paper, we present a study of open LLMs for multilingual machine translation (MT) across a range of languages, and investigate the effects of model scaling and data scaling when adapting open LLMs to multilingual MT through continual pretraining and instruction finetuning. Based on the Gemma3 model family,...
Results & Benchmarks
Benchmark data is not yet available for this paper.
Hardware Requirements
- Expect multi-day setup and compute time for a meaningful reproduction, based on current guidance.
Best Implementation
No maintained implementation has been confirmed for this paper yet.
Use the Implementation Status and Reproduction Path sections below for the current action plan.
Reproduction Path
Follow this baseline workflow to decide if this paper is worth immediate prototyping.
1. Use the paper-linked Hugging Face release as the starting artifact, then reconstruct training and evaluation settings from the paper.
2. Use the paper and benchmark evidence to scope a baseline reproduction plan.
3. Track assumptions and missing details in an experiment log before coding.
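The experiment log in the last step can be as simple as a small structured record. The following is a minimal sketch using only the Python standard library. The class names, field names, and the two example entries are illustrative assumptions and do not come from the paper or any official repository. The entries draw only on what this page states (the Gemma3 model family and the multi-day compute estimate).

```python
import json
from dataclasses import dataclass, field, asdict
from datetime import date


@dataclass
class Assumption:
    """One working assumption or missing detail (hypothetical schema)."""
    topic: str                    # area the note concerns, e.g. "hardware"
    note: str                     # what is assumed or still unknown
    source: str = "unspecified"   # where the information came from, if anywhere
    resolved: bool = False        # flip to True once confirmed


@dataclass
class ExperimentLog:
    """Collects assumptions for one paper before any training code is written."""
    paper: str
    created: str = field(default_factory=lambda: date.today().isoformat())
    entries: list = field(default_factory=list)

    def add(self, topic, note, source="unspecified"):
        """Record a new assumption and return it so it can be updated later."""
        entry = Assumption(topic, note, source)
        self.entries.append(entry)
        return entry

    def open_items(self):
        """Return the assumptions that have not been resolved yet."""
        return [e for e in self.entries if not e.resolved]

    def to_json(self):
        """Serialize the whole log, including nested entries, to JSON."""
        return json.dumps(asdict(self), indent=2)


log = ExperimentLog(
    paper="Scaling Model and Data for Multilingual Machine Translation "
          "with Open Large Language Models"
)
# Example entries; the wording is illustrative, the facts are from this page.
log.add("base model", "Gemma3 model family; exact sizes to be confirmed",
        source="abstract")
log.add("hardware", "multi-day setup/compute expected; GPU count unknown")

for item in log.open_items():
    print(f"[open] {item.topic}: {item.note}")
```

Keeping the log as plain data makes it easy to commit alongside the reproduction code and to compare against the paper as details are confirmed.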
Framework baselines
- Hugging Face Transformers training guide: modern transformer training baseline.
- PyTorch nn.Transformer docs: reference transformer building block implementation.
Additional Implementations
Official
No additional official repositories detected.
Community
- NickDee96/ASR-TTS-paper-daily (Confidence: low)
ASR-TTS Paper Daily automatically queries arXiv (with a Papers with Code fallback) to curate, categorize, and publish a daily-updated list of the latest research in speech and language technology, including ASR, TTS, machine translation, small language models, data augmentation, and synthetic generation, with direct links to papers and code.
Stars: 3, Forks: 1, Last push: Apr 2026, License: Apache-2.0
- xiaomi-research/gemmax (Confidence: low)
Gemma-based Multilingual Machine Translation Models
Stars: 40, Forks: 5, Last push: Feb 2026, License: Apache-2.0