johntsi committed on
Commit 8897321
1 Parent(s): 7917e90

Create README.md

Files changed (1)
  1. README.md +58 -0
README.md ADDED
@@ -0,0 +1,58 @@
+ ---
+ license: mit
+ language:
+ - en
+ - ar
+ - ca
+ - de
+ - et
+ - fa
+ - id
+ - ja
+ - lv
+ - mn
+ - sl
+ - sv
+ - ta
+ - tr
+ - zh
+ metrics:
+ - bleu
+ pipeline_tag: translation
+ datasets:
+ - facebook/covost2
+ ---
+ # nllb-200-distilled-600M_covost2_en-to-15
+
+ This is a multilingual fine-tune of [NLLB](https://arxiv.org/abs/2207.04672), starting from [nllb-200-distilled-600M](https://huggingface.co/facebook/nllb-200-distilled-600M) and trained on the text data of CoVoST2 for translation from English into 15 target languages (En -> 15).
+
+ ## Usage
+
+ ```python
+ from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
+
+ tokenizer = AutoTokenizer.from_pretrained("johntsi/nllb-200-distilled-600M_covost2_en-to-15")
+ model = AutoModelForSeq2SeqLM.from_pretrained("johntsi/nllb-200-distilled-600M_covost2_en-to-15")
+
+ model.eval()
+ model.to("cuda")  # assumes a CUDA GPU is available
+
+ text = "Translate this text to German."
+ inputs = tokenizer(text, return_tensors="pt").to("cuda")
+ outputs = model.generate(
+     **inputs,
+     num_beams=5,
+     forced_bos_token_id=tokenizer.convert_tokens_to_ids("deu_Latn"),  # target language: German
+ )
+ translated_text = tokenizer.decode(outputs[0], skip_special_tokens=True)
+ print(translated_text)
+ ```
+
+ ## Results: BLEU scores on CoVoST2 test (text part)
+
+ | Model | Ar | Ca | Cy | De | Et | Fa | Id | Ja | Lv | Mn | Sl | Sv | Ta | Tr | Zh | Average |
+ |:------------------------:|:------:|:------:|:------:|:------:|:------:|:------:|:------:|:------:|:------:|:------:|:------:|:------:|:------:|:------:|:------:|:-------:|
+ | nllb-200-distilled-600M (original) | 20.0 | 39.0 | 26.3 | 35.5 | 23.4 | 15.7 | 39.6 | 21.8 | 14.8 | 10.4 | 30.3 | 41.1 | 20.2 | 21.1 | 34.8 | 26.3 |
+ | nllb-200-distilled-600M_covost2_en-to-15 | 28.5 | 46.3 | 35.5 | 37.1 | 31.5 | 29.2 | 45.2 | 38.4 | 29.1 | 22.0 | 37.7 | 45.4 | 29.9 | 23.0 | 46.7 | 35.0 |
+ | nllb-200-distilled-1.3B (original) | 23.3 | 43.5 | 33.5 | 37.9 | 27.9 | 16.6 | 41.9 | 23.0 | 20.0 | 13.1 | 35.1 | 43.8 | 21.7 | 23.8 | 37.5 | 29.5 |
+ | nllb-200-distilled-1.3B_covost2_en-to-15 | 29.9 | 47.8 | 35.6 | 38.8 | 32.7 | 29.9 | 46.4 | 39.5 | 29.9 | 21.7 | 39.3 | 46.8 | 31.0 | 24.4 | 48.2 | 36.1 |
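
As a quick consistency check on the results table, the sketch below recomputes the Average column and the per-language BLEU gain from fine-tuning. The scores are copied from the table; the names `LANGS`, `SCORES`, `average`, and `gains`, and the short row labels, are illustrative and not part of the README.

```python
# BLEU scores copied from the results table (CoVoST2 test, text part, En -> 15).
LANGS = ["Ar", "Ca", "Cy", "De", "Et", "Fa", "Id", "Ja",
         "Lv", "Mn", "Sl", "Sv", "Ta", "Tr", "Zh"]

# Row labels are shortened here for readability (an assumption of this sketch).
SCORES = {
    "600M original":   [20.0, 39.0, 26.3, 35.5, 23.4, 15.7, 39.6, 21.8,
                        14.8, 10.4, 30.3, 41.1, 20.2, 21.1, 34.8],
    "600M fine-tuned": [28.5, 46.3, 35.5, 37.1, 31.5, 29.2, 45.2, 38.4,
                        29.1, 22.0, 37.7, 45.4, 29.9, 23.0, 46.7],
    "1.3B original":   [23.3, 43.5, 33.5, 37.9, 27.9, 16.6, 41.9, 23.0,
                        20.0, 13.1, 35.1, 43.8, 21.7, 23.8, 37.5],
    "1.3B fine-tuned": [29.9, 47.8, 35.6, 38.8, 32.7, 29.9, 46.4, 39.5,
                        29.9, 21.7, 39.3, 46.8, 31.0, 24.4, 48.2],
}


def average(scores):
    """Mean BLEU over all target languages, rounded to one decimal."""
    return round(sum(scores) / len(scores), 1)


def gains(base, tuned):
    """Per-language BLEU difference (fine-tuned minus original)."""
    return {lang: round(t - b, 1) for lang, b, t in zip(LANGS, base, tuned)}


# Recompute the Average column for every row.
for name, scores in SCORES.items():
    print(f"{name}: average BLEU = {average(scores)}")

# Per-language gain for the 600M model, largest first.
gain_600m = gains(SCORES["600M original"], SCORES["600M fine-tuned"])
for lang, delta in sorted(gain_600m.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{lang}: +{delta}")
```

The recomputed averages match the Average column. For the 600M model, the largest gain in the table is on Japanese (+16.6 BLEU) and the smallest on German (+1.6 BLEU).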