Introduction

English-to-Catalan translation model based on OpenNMT. This is the same model that we use in production at https://www.softcatala.org/traductor/.

Usage

pip3 install ctranslate2 pyonmttok huggingface_hub

Simple translation using Python:

import ctranslate2
import pyonmttok
from huggingface_hub import snapshot_download

# Download the model files from the Hugging Face Hub
model_dir = snapshot_download(repo_id="softcatala/translate-eng-cat", revision="main")

# Tokenize the input with the SentencePiece model shipped in the repository
tokenizer = pyonmttok.Tokenizer(mode="none", sp_model_path=model_dir + "/sp_m.model")
tokenized = tokenizer.tokenize("Hello world!")

# Translate the tokens, then detokenize the best hypothesis
translator = ctranslate2.Translator(model_dir)
translated = translator.translate_batch([tokenized[0]])
print(tokenizer.detokenize(translated[0][0]['tokens']))

Output:

Hola món!
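
To translate several sentences in one call, the steps above can be wrapped in a small helper. The following is a minimal sketch and not part of the model repository: `translate_sentences` is a hypothetical name, and it assumes that `tokenizer` and `translator` are the objects created in the example above.

```python
def translate_sentences(sentences, tokenizer, translator):
    """Translate a list of sentences in a single batch.

    Hypothetical helper: `tokenizer` is assumed to be a pyonmttok.Tokenizer
    and `translator` a ctranslate2.Translator, both built as shown above.
    """
    # tokenize() returns a (tokens, features) pair; only the tokens are needed
    batch = [tokenizer.tokenize(sentence)[0] for sentence in sentences]
    results = translator.translate_batch(batch)
    # Keep the first (best) hypothesis of each result, as in the example above
    return [tokenizer.detokenize(result[0]['tokens']) for result in results]

# Usage, once `tokenizer` and `translator` exist:
# for line in translate_sentences(["Hello world!", "Good morning."], tokenizer, translator):
#     print(line)
```

Sending all sentences through one `translate_batch` call lets CTranslate2 process them together, rather than invoking the translator once per sentence.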

Benchmarks

| Test set | BLEU |
|---|---|
| Test dataset (from train/dev/test) | 46.9 |
| Flores200 dataset | 43.8 |

Additional information
