wtarit committed on
Commit 226f02c
1 Parent(s): 24ebea1

Create README.md

Files changed (1)
  1. README.md +41 -0
README.md ADDED
@@ -0,0 +1,41 @@
+ ---
+ metrics:
+ - sacrebleu
+ language:
+ - en
+ - th
+ ---
+
+ # NLLB 600M TH-EN finetuned
+ This model is finetuned from [facebook/nllb-200-distilled-600M](https://huggingface.co/facebook/nllb-200-distilled-600M) on the SCB-1M and OPUS datasets.
+ The finetuning script is on [GitHub](https://github.com/wtarit/th-en-machine-translation/tree/main/NLLB).
+ The full finetuning logs are available on [wandb](https://wandb.ai/wtarit/NLLB%20TH-EN%20Machine%20Translation/runs/5ma65zoy).
+
+ ## Usage
+ ```python
+ from transformers import AutoTokenizer, AutoModelForSeq2SeqLM, pipeline
+ import torch
+
+ MODEL_NAME = "wtarit/nllb-600M-th-en"
+
+ model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_NAME)
+ tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
+ device = 0 if torch.cuda.is_available() else "cpu"
+
+ translation_pipeline = pipeline(
+     "translation",
+     model=model,
+     tokenizer=tokenizer,
+     src_lang="tha_Thai",
+     tgt_lang="eng_Latn",
+     max_length=400,
+     device=device
+ )
+
+ # Translate a Thai sentence ("Hello, we are a translation model") into English
+ result = translation_pipeline("สวัสดี เราคือโมเดลแปลภาษา")
+ print(result[0]['translation_text'])
+ ```
+
+ ## Score
+ BLEU score (computed with [sacrebleu](https://huggingface.co/spaces/evaluate-metric/sacrebleu)): 27.37 on IWSLT 2015.
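
The Usage snippet in the README translates a single sentence. For several sentences, one possible approach is to feed the pipeline small chunks and collect the `translation_text` fields. The sketch below is not part of the commit: `translate_in_batches` and its `batch_size` parameter are hypothetical names, and it assumes the callable it receives returns one `{"translation_text": ...}` dict per input sentence when given a list of strings, matching the shape indexed by `result[0]['translation_text']` in the README.

```python
from typing import Callable, Dict, Iterable, List


def translate_in_batches(
    translate: Callable[[List[str]], List[Dict[str, str]]],
    sentences: Iterable[str],
    batch_size: int = 8,
) -> List[str]:
    """Translate sentences chunk by chunk and return plain strings.

    `translate` is assumed to behave like the README's translation_pipeline
    when called with a list of strings (hypothetical helper, not from the repo).
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")

    items = list(sentences)
    translations: List[str] = []
    for start in range(0, len(items), batch_size):
        chunk = items[start:start + batch_size]
        # Assumption: one output dict per input sentence, in the same order.
        outputs = translate(chunk)
        translations.extend(out["translation_text"] for out in outputs)
    return translations
```

With the pipeline from the README loaded, a call might look like `translate_in_batches(translation_pipeline, ["สวัสดี", "เราคือโมเดลแปลภาษา"], batch_size=2)`. Passing the pipeline in as an argument keeps the helper independent of how the model was loaded.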