---
language:
- ru
- zh
- en
tags:
- translation
- text2text-generation
- t5
license: apache-2.0
datasets:
- ccmatrix
metrics:
- sacrebleu
widget:
- example_title: translate zh-ru
  text: >
    translate to ru: 开发的目的是为用户提供个人同步翻译。
- example_title: translate ru-en
  text: >
    translate to en: Цель разработки — предоставить пользователям личного синхронного переводчика.
- example_title: translate en-ru
  text: >
    translate to ru: The purpose of the development is to provide users with a personal synchronized interpreter.
- example_title: translate en-zh
  text: >
    translate to zh: The purpose of the development is to provide users with a personal synchronized interpreter.
- example_title: translate zh-en
  text: >
    translate to en: 开发的目的是为用户提供个人同步解释器。
- example_title: translate ru-zh
  text: >
    translate to zh: Цель разработки — предоставить пользователям личного синхронного переводчика.
model-index:
- name: utrobinmv/t5_translate_en_ru_zh_base_200
  results:
  - task:
      type: translation
      name: Translation en-ru
    dataset:
      name: ntrex_en-ru
      type: ntrex
      config: ntrex en-ru
      split: test
    metrics:
    - type: sacrebleu
      value: 28.575940911021487
      name: bleu
      verified: false
    - type: chrf
      value: 54.27996346886896
      name: chrf
      verified: false
    - type: ter
      value: 62.494863914873584
      name: ter
      verified: false
    - type: meteor
      value: 0.5174833677740809
      name: meteor
      verified: false
    - type: rouge
      value: 0.1908317951570274
      name: ROUGE-1
      verified: false
    - type: rouge
      value: 0.065555552204933
      name: ROUGE-2
      verified: false
    - type: rouge
      value: 0.1895542893295215
      name: ROUGE-L
      verified: false
    - type: rouge
      value: 0.1893813749889601
      name: ROUGE-LSUM
      verified: false
    - type: bertscore
      value: 0.8554933660030365
      name: bertscore_f1
      verified: false
    - type: bertscore
      value: 0.8578473615646363
      name: bertscore_precision
      verified: false
    - type: bertscore
      value: 0.8534188346862793
      name: bertscore_recall
      verified: false
    source:
      name: NTREX dataset Benchmark
      url: https://huggingface.co/spaces/utrobinmv/TREX_benchmark_en_ru_zh
- name: utrobinmv/t5_translate_en_ru_zh_base_200
  results:
  - task:
      type: translation
      name: Translation ru-en
    dataset:
      name: ntrex_ru-en
      type: ntrex
      config: ntrex ru-en
      split: test
    metrics:
    - type: sacrebleu
      value: 28.575940911021487
      name: bleu
      verified: false
    - type: chrf
      value: 54.27996346886896
      name: chrf
      verified: false
    - type: ter
      value: 62.494863914873584
      name: ter
      verified: false
    - type: meteor
      value: 0.5174833677740809
      name: meteor
      verified: false
    - type: rouge
      value: 0.1908317951570274
      name: ROUGE-1
      verified: false
    - type: rouge
      value: 0.065555552204933
      name: ROUGE-2
      verified: false
    - type: rouge
      value: 0.1895542893295215
      name: ROUGE-L
      verified: false
    - type: rouge
      value: 0.1893813749889601
      name: ROUGE-LSUM
      verified: false
    - type: bertscore
      value: 0.8554933660030365
      name: bertscore_f1
      verified: false
    - type: bertscore
      value: 0.8578473615646363
      name: bertscore_precision
      verified: false
    - type: bertscore
      value: 0.8534188346862793
      name: bertscore_recall
      verified: false
    source:
      name: NTREX dataset Benchmark
      url: https://huggingface.co/spaces/utrobinmv/TREX_benchmark_en_ru_zh
---

# T5 English, Russian and Chinese multilingual machine translation

This model is a conventional T5 transformer trained in multitask mode and fine-tuned for machine translation between the pairs ru-zh, zh-ru, en-zh, zh-en, en-ru, and ru-en.

The model can translate directly between any pair of Russian, Chinese, and English. To translate into a target language, prepend the prefix `translate to <lang>: ` to the input, where `<lang>` is the target language identifier (`ru`, `zh`, or `en`). The source language does not need to be specified, and the source text may be multilingual.
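The prefix convention described above can be wrapped in a small helper. The following is a minimal sketch, not part of the model or of the `transformers` API: `build_prompt` and `SUPPORTED_TARGETS` are hypothetical names introduced here for illustration, and the code only builds the input string without loading the model.

```python
# Hypothetical helper: these names are illustrative, not part of the model's API.
SUPPORTED_TARGETS = ("ru", "zh", "en")


def build_prompt(text: str, target_lang: str) -> str:
    """Prepend the 'translate to <lang>: ' prefix that selects the target language."""
    if target_lang not in SUPPORTED_TARGETS:
        raise ValueError(f"unsupported target language: {target_lang!r}")
    return f"translate to {target_lang}: {text}"


# The source language is never passed: only the target is encoded in the prefix.
prompts = [build_prompt("Hello, world!", lang) for lang in ("ru", "zh")]
print(prompts)
# ['translate to ru: Hello, world!', 'translate to zh: Hello, world!']
```

The returned strings can then be passed to the tokenizer in place of a manually concatenated prefix and source text.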
Example: translate Russian to Chinese

```python
from transformers import T5ForConditionalGeneration, T5Tokenizer

# Checkpoint described by this model card
model_name = 'utrobinmv/t5_translate_en_ru_zh_base_200'
model = T5ForConditionalGeneration.from_pretrained(model_name)
tokenizer = T5Tokenizer.from_pretrained(model_name)

# The prefix selects the target language; the source language is not specified
prefix = 'translate to zh: '
src_text = prefix + "Цель разработки — предоставить пользователям личного синхронного переводчика."

# Translate Russian to Chinese
inputs = tokenizer(src_text, return_tensors="pt")
generated_tokens = model.generate(**inputs)

# batch_decode returns a list with one string per input sequence
result = tokenizer.batch_decode(generated_tokens, skip_special_tokens=True)
print(result[0])
# 开发的目的是为用户提供个人同步翻译。
```

Example: translate Chinese to Russian

```python
from transformers import T5ForConditionalGeneration, T5Tokenizer

# Checkpoint described by this model card
model_name = 'utrobinmv/t5_translate_en_ru_zh_base_200'
model = T5ForConditionalGeneration.from_pretrained(model_name)
tokenizer = T5Tokenizer.from_pretrained(model_name)

# The prefix selects the target language; the source language is not specified
prefix = 'translate to ru: '
src_text = prefix + "开发的目的是为用户提供个人同步翻译。"

# Translate Chinese to Russian
inputs = tokenizer(src_text, return_tensors="pt")
generated_tokens = model.generate(**inputs)

# batch_decode returns a list with one string per input sequence
result = tokenizer.batch_decode(generated_tokens, skip_special_tokens=True)
print(result[0])
# Цель разработки - предоставить пользователям персональный синхронный перевод.
```

## Languages covered

Russian (ru_RU), Chinese (zh_CN), English (en_US)