
	---
	metrics:
	- sacrebleu
	language:
	- en
	- th
	---

	# NLLB 600M TH-EN finetuned
	This model is finetuned from [facebook/nllb-200-distilled-600M](https://huggingface.co/facebook/nllb-200-distilled-600M) on the SCB-1M and OPUS datasets for Thai-to-English translation.
	The finetuning script is on [GitHub](https://github.com/wtarit/th-en-machine-translation/tree/main/NLLB).
	View full finetuning logs on [wandb](https://wandb.ai/wtarit/NLLB%20TH-EN%20Machine%20Translation/runs/5ma65zoy).

	## Usage
	```Python
	from transformers import AutoTokenizer, AutoModelForSeq2SeqLM, pipeline
	import torch

	MODEL_NAME = "wtarit/nllb-600M-th-en"

	model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_NAME)
	tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
	device = 0 if torch.cuda.is_available() else "cpu"

	translation_pipeline = pipeline(
	    "translation",
	    model=model,
	    tokenizer=tokenizer,
	    src_lang="tha_Thai",  # NLLB code for Thai
	    tgt_lang="eng_Latn",  # NLLB code for English
	    max_length=400,
	    device=device,
	)

	# Run translation pipeline
	result = translation_pipeline("สวัสดี เราคือโมเดลแปลภาษา")
	print(result[0]['translation_text'])
	```
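
For more control over batching and generation settings, the model can also be called without the pipeline helper. The following is a minimal sketch, not part of the original finetuning code: the `chunks` and `translate` helpers and the default `batch_size` are illustrative assumptions, while the language codes and `max_length=400` mirror the pipeline example above.

```python
from typing import Iterator, List

MODEL_NAME = "wtarit/nllb-600M-th-en"


def chunks(items: List[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of `items` with at most `size` elements."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def translate(texts: List[str], batch_size: int = 8) -> List[str]:
    """Translate Thai sentences to English (hypothetical helper)."""
    # Heavy imports are kept local so `chunks` stays usable on its own.
    import torch
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME, src_lang="tha_Thai")
    model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_NAME)
    device = "cuda" if torch.cuda.is_available() else "cpu"
    model.to(device).eval()

    # NLLB selects the output language via the first generated token.
    eng_token_id = tokenizer.convert_tokens_to_ids("eng_Latn")

    outputs: List[str] = []
    for batch in chunks(texts, batch_size):
        inputs = tokenizer(batch, return_tensors="pt", padding=True).to(device)
        with torch.no_grad():
            generated = model.generate(
                **inputs, forced_bos_token_id=eng_token_id, max_length=400
            )
        outputs.extend(tokenizer.batch_decode(generated, skip_special_tokens=True))
    return outputs
```

Calling `translate(["สวัสดี เราคือโมเดลแปลภาษา"])` returns a list with one English string per input sentence. Processing the input in small batches keeps memory use bounded when translating many sentences at once.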

	## Score
	BLEU score (computed with [sacrebleu](https://huggingface.co/spaces/evaluate-metric/sacrebleu)): 27.37 on IWSLT 2015.
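
A corpus-level BLEU score of this kind can be computed with the `sacrebleu` Python package. The card does not state the exact evaluation setup (test split, tokenization, decoding settings), so this sketch assumes default sacrebleu settings and one reference per sentence. `corpus_bleu_score` is a hypothetical helper, and the result may not reproduce the number above.

```python
from typing import List


def corpus_bleu_score(hypotheses: List[str], references: List[str]) -> float:
    """Return corpus-level BLEU (0-100) for one reference per hypothesis."""
    if len(hypotheses) != len(references):
        raise ValueError("hypotheses and references must have the same length")
    import sacrebleu  # local import: only needed when scoring

    # sacrebleu expects a list of reference streams, hence the extra list.
    return sacrebleu.corpus_bleu(hypotheses, [references]).score
```

Here `hypotheses` would hold the model's English translations and `references` the corresponding human translations, in the same order.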