opennmt-eng-cat / README.md
metadata
language:
  - ca
  - en
tags:
  - translation
library_name: opennmt
license: mit
metrics:
  - bleu

Introduction

English - Catalan translation models based on OpenNMT.

Usage

Install the dependencies:

pip3 install ctranslate2 pyonmttok

Simple translation using Python:


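import ctranslate2

# Load the converted model from the "ctranslate2/" directory.
translator = ctranslate2.Translator("ctranslate2/")
# The input must already be split into SentencePiece tokens.
translator.translate_batch([["▁Hello", "▁world", "!"]])
# Result (the exact format may vary with the CTranslate2 version):
# [[{'tokens': ['▁Hola', '▁món', '!']}]]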

Simple tokenization & translation using Python:


import ctranslate2
import pyonmttok

# Load the SentencePiece model from "tokenizer/sp_m.model".
tokenizer = pyonmttok.Tokenizer(mode="none", sp_model_path="tokenizer/sp_m.model")
tokenized = tokenizer.tokenize("Hello world!")  # returns (tokens, features)

translator = ctranslate2.Translator("ctranslate2/")
translated = translator.translate_batch([tokenized[0]])
print(tokenizer.detokenize(translated[0][0]['tokens']))
# Output: Hola món!
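
Translating several sentences in one batch:

The tokenize, translate and detokenize steps above can be wrapped in a small helper. This is only a sketch: `translate_sentences` is a hypothetical name, not part of CTranslate2 or pyonmttok. It assumes a tokenizer and translator created as in the example above, and the same result indexing (`result[0]['tokens']`) as that example, which may need adjusting for your CTranslate2 version.

```python
from typing import List, Sequence


def translate_sentences(sentences: Sequence[str], tokenizer, translator) -> List[str]:
    """Tokenize, translate and detokenize several sentences in one batch.

    Hypothetical helper: `tokenizer` is assumed to behave like the
    pyonmttok.Tokenizer above and `translator` like the
    ctranslate2.Translator above.
    """
    # tokenize() returns (tokens, features); only the tokens are needed.
    batch = [tokenizer.tokenize(sentence)[0] for sentence in sentences]
    if not batch:
        return []
    # A single translate_batch call handles the whole batch.
    results = translator.translate_batch(batch)
    # Same indexing as the example above: first hypothesis, then its tokens.
    return [tokenizer.detokenize(result[0]["tokens"]) for result in results]
```

With the `tokenizer` and `translator` objects from the example above, `translate_sentences(["Hello world!"], tokenizer, translator)` would be expected to return a one-element list holding the detokenized translation.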

Benchmarks

testset BLEU
Tatoeba-test.zho.eng 45.2