	---
	license: mit
	language:
	- en
	- ar
	- ca
	- cy
	- de
	- et
	- fa
	- id
	- ja
	- lv
	- mn
	- sl
	- sv
	- ta
	- tr
	- zh
	metrics:
	- bleu
	pipeline_tag: translation
	datasets:
	- facebook/covost2
	---
	# nllb-200-distilled-600M_covost2_en-to-15

	This is [nllb-200-distilled-600M](https://huggingface.co/facebook/nllb-200-distilled-600M), a distilled [NLLB](https://arxiv.org/abs/2207.04672) model, multilingually fine-tuned on the text data of CoVoST2 to translate from English into 15 target languages (En -> 15).

	It is part of the paper [Pushing the Limits of Zero-shot End-to-end Speech Translation](https://arxiv.org/abs/2402.10422). The fine-tuning process is detailed in Appendix D of the paper.

	## Usage

	```python
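	import torch
	from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

	model_id = "johntsi/nllb-200-distilled-600M_covost2_en-to-15"
	# Use the GPU when one is available, otherwise fall back to the CPU
	device = "cuda" if torch.cuda.is_available() else "cpu"

	# The source language is English (eng_Latn)
	tokenizer = AutoTokenizer.from_pretrained(model_id, src_lang="eng_Latn")
	model = AutoModelForSeq2SeqLM.from_pretrained(model_id).to(device)
	model.eval()

	text = "Translate this text to German."
	inputs = tokenizer(text, return_tensors="pt").to(device)
	outputs = model.generate(
	    **inputs,
	    num_beams=5,
	    # Select the target language by forcing its language code as the first decoder token.
	    # convert_tokens_to_ids also works on transformers versions without lang_code_to_id.
	    forced_bos_token_id=tokenizer.convert_tokens_to_ids("deu_Latn"),
	)
	translated_text = tokenizer.decode(outputs[0], skip_special_tokens=True)
	print(translated_text)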
	```
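To translate a batch of sentences into any of the 15 targets, the snippet above can be wrapped in a small helper. The following is a minimal sketch, not part of the original release: `TARGET_LANG_CODES` and `translate` are hypothetical names, and the mapping from the two-letter codes in this card to FLORES-200 codes is an assumption (in particular `arb_Arab`, `pes_Arab`, `lvs_Latn`, `khk_Cyrl` and `zho_Hans` are the closest NLLB variants, and may not be the exact codes used during fine-tuning).

```python
# ASSUMPTION: mapping from the language tags in this card to NLLB (FLORES-200)
# codes. The variants chosen here are a best guess, not confirmed by the authors.
TARGET_LANG_CODES = {
    "ar": "arb_Arab", "ca": "cat_Latn", "cy": "cym_Latn", "de": "deu_Latn",
    "et": "est_Latn", "fa": "pes_Arab", "id": "ind_Latn", "ja": "jpn_Jpan",
    "lv": "lvs_Latn", "mn": "khk_Cyrl", "sl": "slv_Latn", "sv": "swe_Latn",
    "ta": "tam_Taml", "tr": "tur_Latn", "zh": "zho_Hans",
}


def translate(texts, target, tokenizer, model, num_beams=5):
    """Translate a list of English sentences into one target language.

    `tokenizer` and `model` are the objects loaded in the usage snippet;
    `target` is a key of TARGET_LANG_CODES (hypothetical helper).
    """
    lang_code = TARGET_LANG_CODES[target]
    # Pad so that sentences of different lengths fit in one batch
    inputs = tokenizer(texts, return_tensors="pt", padding=True).to(model.device)
    outputs = model.generate(
        **inputs,
        num_beams=num_beams,
        forced_bos_token_id=tokenizer.convert_tokens_to_ids(lang_code),
    )
    return tokenizer.batch_decode(outputs, skip_special_tokens=True)
```

With `tokenizer` and `model` loaded as in the usage snippet, `translate(["Good morning.", "See you tomorrow."], "ca", tokenizer, model)` should return a list with one Catalan string per input sentence.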

	## Results

	#### BLEU scores on the CoVoST2 test set

	| Model | Ar | Ca | Cy | De | Et | Fa | Id | Ja | Lv | Mn | Sl | Sv | Ta | Tr | Zh | Average |
	|:------------------------:|:------:|:------:|:------:|:------:|:------:|:------:|:------:|:------:|:------:|:------:|:------:|:------:|:------:|:------:|:------:|:-------:|
	| [nllb-200-distilled-600M](https://huggingface.co/facebook/nllb-200-distilled-600M) (original) | 20.0 | 39.0 | 26.3 | 35.5 | 23.4 | 15.7 | 39.6 | 21.8 | 14.8 | 10.4 | 30.3 | 41.1 | 20.2 | 21.1 | 34.8 | 26.3 |
	| [nllb-200-distilled-600M_covost2_en-to-15](https://huggingface.co/johntsi/nllb-200-distilled-600M_covost2_en-to-15) | 28.5 | 46.3 | 35.5 | 37.1 | 31.5 | 29.2 | 45.2 | 38.4 | 29.1 | 22.0 | 37.7 | 45.4 | 29.9 | 23.0 | 46.7 | 35.0 |
	| [nllb-200-distilled-1.3B](https://huggingface.co/facebook/nllb-200-distilled-1.3B) (original) | 23.3 | 43.5 | 33.5 | 37.9 | 27.9 | 16.6 | 41.9 | 23.0 | 20.0 | 13.1 | 35.1 | 43.8 | 21.7 | 23.8 | 37.5 | 29.5 |
	| [nllb-200-distilled-1.3B_covost2_en-to-15](https://huggingface.co/johntsi/nllb-200-distilled-1.3B_covost2_en-to-15) | 29.9 | 47.8 | 35.6 | 38.8 | 32.7 | 29.9 | 46.4 | 39.5 | 29.9 | 21.7 | 39.3 | 46.8 | 31.0 | 24.4 | 48.2 | 36.1 |
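As a quick reading aid for the table, the per-language effect of fine-tuning the 600M model can be recomputed from its two rows. This is only a sketch over the numbers reported above; the variable names are illustrative.

```python
# BLEU scores copied from the two 600M rows of the table above
LANGS = ["Ar", "Ca", "Cy", "De", "Et", "Fa", "Id", "Ja",
         "Lv", "Mn", "Sl", "Sv", "Ta", "Tr", "Zh"]
ORIGINAL = [20.0, 39.0, 26.3, 35.5, 23.4, 15.7, 39.6, 21.8,
            14.8, 10.4, 30.3, 41.1, 20.2, 21.1, 34.8]
FINE_TUNED = [28.5, 46.3, 35.5, 37.1, 31.5, 29.2, 45.2, 38.4,
              29.1, 22.0, 37.7, 45.4, 29.9, 23.0, 46.7]


def mean(scores):
    """Unweighted average over languages, as in the 'Average' column."""
    return sum(scores) / len(scores)


# BLEU gain of the fine-tuned model over the original, per target language
gains = {
    lang: round(ft - base, 1)
    for lang, base, ft in zip(LANGS, ORIGINAL, FINE_TUNED)
}

print(f"average: {mean(ORIGINAL):.1f} -> {mean(FINE_TUNED):.1f}")
for lang, gain in sorted(gains.items(), key=lambda item: item[1], reverse=True):
    print(f"{lang}: +{gain}")
```

The averages reproduce the table's last column (26.3 and 35.0). The largest gain is on Ja (+16.6) and the smallest on De (+1.6).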

	## Citation

	If you find these models useful for your research, please cite our paper :)

	```
	@inproceedings{tsiamas-etal-2024-pushing,
	    title = {{Pushing the Limits of Zero-shot End-to-End Speech Translation}},
	    author = "Tsiamas, Ioannis and
	      G{\'a}llego, Gerard and
	      Fonollosa, Jos{\'e} and
	      Costa-juss{\`a}, Marta",
	    editor = "Ku, Lun-Wei and
	      Martins, Andre and
	      Srikumar, Vivek",
	    booktitle = "Findings of the Association for Computational Linguistics ACL 2024",
	    month = aug,
	    year = "2024",
	    address = "Bangkok, Thailand and virtual meeting",
	    publisher = "Association for Computational Linguistics",
	    url = "https://aclanthology.org/2024.findings-acl.847",
	    pages = "14245--14267",
	}
	```