Papers
arxiv:2606.21413

CAT-Translate: Building Compact Open-Source Models for Japanese-English Translation

Published on Jun 19
Authors:

Abstract

Specialized Japanese-English translation models outperform multilingual counterparts on real-world benchmarks despite strong WMT performance.

Nowadays, large multilingual translation models demonstrate impressive translation capabilities in the machine translation benchmarks. This raises a practical question to the developers: is it worth developing translation models specialized for a particular language pair if you only need to support that language pair? To give an anecdotal answer to this question, we develop a family of small language models (0.8B, 1.4B, 3.3B, and 7B parameters) specialized for Japanese-English bidirectional translation. We employ a two-stage supervised fine-tuning approach followed by Multi-Objective GRPO (Ichihara et al. 2025) to train models on synthetically generated parallel corpora. We evaluate our models on WMT and real-world translation benchmarks across business, legal, medical, financial, and patent domains. While multilingual models achieve strong performance on WMT benchmarks, our compact models outperform them on real-world benchmarks, suggesting the practical utility of developing specialized translation models even in the era of large multilingual models.

Community

Sign up or log in to comment

Get this paper in your agent:

hf papers read 2606.21413
Don't have the latest CLI?
curl -LsSf https://hf.co/cli/install.sh | bash

Models citing this paper 4

Datasets citing this paper 1

Spaces citing this paper 1

Collections including this paper 0

No Collection including this paper

Add this paper to a collection to link it from this page.