gsarti/opus-mt-tc-base-en-hi

	---
	language:
	- en
	- hi
	- multilingual
	license: cc-by-4.0
	tags:
	- translation
	- opus-mt-tc
	model-index:
	- name: opus-mt-tc-base-en-hi
	  results:
	  - task:
	      type: translation
	      name: Translation eng-hin
	    dataset:
	      name: tatoeba-test-v2021-08-07
	      type: tatoeba_mt
	      args: eng-hin
	    metrics:
	    - type: bleu
	      value: 22.2
	      name: BLEU
	---

	# Opus Tatoeba English-Hindi

	This model was obtained by running the script [convert_marian_to_pytorch.py](https://github.com/huggingface/transformers/blob/master/src/transformers/models/marian/convert_marian_to_pytorch.py). The original models were trained by [Jörg Tiedemann](https://blogs.helsinki.fi/tiedeman/) using the [MarianNMT](https://marian-nmt.github.io/) library. See all available `MarianMTModel` models on the profile of the [Helsinki NLP](https://huggingface.co/Helsinki-NLP) group.
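
Because the checkpoint is a converted Marian model, it should load through the `MarianMTModel` and `MarianTokenizer` classes in `transformers`. The following is a minimal sketch, not an official usage recipe: the repository id `gsarti/opus-mt-tc-base-en-hi` is assumed from the page header, the helper name `translate` and the `max_new_tokens` default are illustrative, and the `sentencepiece` and `torch` packages are assumed to be installed.

```python
# Assumed repository id, taken from the page header of this model card.
MODEL_NAME = "gsarti/opus-mt-tc-base-en-hi"


def translate(texts, model_name=MODEL_NAME, max_new_tokens=128):
    """Translate a list of English sentences to Hindi (illustrative helper)."""
    # Imported lazily so that importing this module does not require transformers.
    from transformers import MarianMTModel, MarianTokenizer

    tokenizer = MarianTokenizer.from_pretrained(model_name)
    model = MarianMTModel.from_pretrained(model_name)

    # Pad to the longest sentence so the batch is a rectangular tensor.
    batch = tokenizer(texts, return_tensors="pt", padding=True, truncation=True)
    generated = model.generate(**batch, max_new_tokens=max_new_tokens)
    return tokenizer.batch_decode(generated, skip_special_tokens=True)


if __name__ == "__main__":
    # Downloads the model weights on first use.
    for line in translate(["How are you?", "The weather is nice today."]):
        print(line)
```

The first call downloads the weights, so for repeated use it may be worth loading the tokenizer and model once and reusing them instead of reloading inside the helper.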

	* dataset: opus+bt
	* model: transformer-align
	* source language(s): eng
	* target language(s): hin
	* pre-processing: normalization + SentencePiece (spm32k,spm32k)
	* download: [opus+bt-2021-04-10.zip](https://object.pouta.csc.fi/Tatoeba-MT-models/eng-hin/opus+bt-2021-04-10.zip)
	* test set translations: [opus+bt-2021-04-10.test.txt](https://object.pouta.csc.fi/Tatoeba-MT-models/eng-hin/opus+bt-2021-04-10.test.txt)
	* test set scores: [opus+bt-2021-04-10.eval.txt](https://object.pouta.csc.fi/Tatoeba-MT-models/eng-hin/opus+bt-2021-04-10.eval.txt)
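
The three links above share one naming pattern: a base URL, the language pair, and the release name followed by a file suffix. The sketch below only reproduces that pattern for the release listed here. The helper name `release_urls` is hypothetical, and whether other language pairs or releases follow the same layout is an assumption that should be checked before relying on it.

```python
BASE_URL = "https://object.pouta.csc.fi/Tatoeba-MT-models"


def release_urls(lang_pair: str, release: str) -> dict:
    """Build the download links for one release (hypothetical helper)."""
    prefix = f"{BASE_URL}/{lang_pair}/{release}"
    return {
        "model": f"{prefix}.zip",               # original Marian model archive
        "test_translations": f"{prefix}.test.txt",
        "test_scores": f"{prefix}.eval.txt",
    }


if __name__ == "__main__":
    for name, url in release_urls("eng-hin", "opus+bt-2021-04-10").items():
        print(f"{name}: {url}")
```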

	## Benchmarks

	| testset | BLEU | chr-F | #sent | #words | BP |
	|---------|------|-------|-------|--------|----|
	| newsdev2014.eng-hin | 13.9 | 0.421 | 520 | 9538 | 1.000 |
	| newstest2014-hien.eng-hin | 17.4 | 0.442 | 2507 | 60878 | 0.989 |
	| Tatoeba-test.eng-hin | 22.2 | 0.485 | 5000 | 32904 | 1.000 |
	| tico19-test.eng-hin | 30.6 | 0.539 | 2100 | 62738 | 0.988 |
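
The BP column most likely reports BLEU's brevity penalty, which is 1.000 when the system output is at least as long as the reference and drops below 1 when it is shorter. As a reading aid, here is a small sketch of the standard corpus-level definition. The function name and the token counts in the example are hypothetical and are not taken from the evaluation files above.

```python
import math


def brevity_penalty(hyp_len: int, ref_len: int) -> float:
    """Standard BLEU brevity penalty for corpus-level token counts."""
    if hyp_len <= 0:
        return 0.0  # empty output: the penalty is maximal
    if hyp_len >= ref_len:
        return 1.0  # output is not shorter than the reference: no penalty
    return math.exp(1.0 - ref_len / hyp_len)


if __name__ == "__main__":
    # Hypothetical lengths, chosen only to illustrate the two regimes.
    print(brevity_penalty(1000, 1000))
    print(round(brevity_penalty(990, 1000), 3))
```

Under this definition, values such as 0.989 and 0.988 in the table would indicate output that is only slightly shorter than the reference.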