LLaMAX committed on
Commit 05472ed
1 Parent(s): e5b8673

Update README.md

Files changed (1):
  1. README.md +57 -53
README.md CHANGED
---
tags:
- Multilingual
---

### Model Sources
- **Paper**: LLaMAX: Scaling Linguistic Horizons of LLM by Enhancing Translation Capabilities Beyond 100 Languages
- **Link**: https://arxiv.org/pdf/2407.05975
- **Repository**: https://github.com/CONE-MT/LLaMAX/

### Model Description

🔥 LLaMAX-7B-X-CSQA is a multilingual commonsense reasoning model, obtained by fully fine-tuning the powerful multilingual model [LLaMAX-7B](https://huggingface.co/LLaMAX/LLaMAX-7B) on five English commonsense reasoning datasets: X-CSQA, ARC-Easy, ARC-Challenge, OpenBookQA, and QASC.

🔥 Compared with fine-tuning Llama-2 in the same setting, LLaMAX-7B-X-CSQA improves the average accuracy on the X-CSQA benchmark by 4.2% (from 50.9 to 55.1).
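
The exact prompt template used for fine-tuning is not spelled out here; the sketch below is only an assumption based on the query format in the Model Usage example further down, showing how a multiple-choice item could be flattened into a single prompt (exact spacing and separators may differ from the training format):

```python
# Hypothetical helper (not from the original repo): flattens one multiple-choice
# item into a "question \n Options: ... \n Answer:" prompt, mirroring the
# Model Usage example below.
def build_prompt(question: str, options: list[str]) -> str:
    labels = ["A", "B", "C", "D", "E"]
    joined = " \t ".join(f"{label}. {option}" for label, option in zip(labels, options))
    return f"{question} \n Options: {joined} \n Answer:"

print(build_prompt(
    "What is someone operating a vehicle likely to be accused of after becoming inebriated?",
    ["punish", "arrest", "automobile accidents", "talking nonsense", "drunk driving"],
))
```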

### Experiments

| X-CSQA (accuracy, %) | Avg. | Sw | Ur | Hi | Ar | Vi | Ja | Pl | Zh | Nl | Ru | It | De | Pt | Fr | Es | En |
|----------------------|------|------|------|------|------|------|------|------|------|------|------|------|------|------|------|------|------|
| Llama2-7B-X-CSQA | 50.9 | 23.2 | 24.7 | 32.9 | 32.4 | 51.0 | 50.0 | 51.5 | 55.6 | 56.9 | 55.8 | 58.8 | 59.9 | 60.4 | 61.8 | 61.9 | 78.1 |
| LLaMAX-7B-X-CSQA | 55.1 | 43.5 | 39.0 | 44.1 | 45.1 | 54.0 | 49.9 | 54.6 | 58.2 | 58.9 | 57.1 | 59.1 | 59.0 | 60.9 | 61.6 | 62.7 | 74.0 |
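
The Avg. column is the plain mean over the 16 per-language accuracies, which is where the 4.2% gap quoted above comes from; a quick check:

```python
# Reproduce the Avg. column and the gap from the per-language accuracies above.
llama2 = [23.2, 24.7, 32.9, 32.4, 51.0, 50.0, 51.5, 55.6, 56.9, 55.8, 58.8, 59.9, 60.4, 61.8, 61.9, 78.1]
llamax = [43.5, 39.0, 44.1, 45.1, 54.0, 49.9, 54.6, 58.2, 58.9, 57.1, 59.1, 59.0, 60.9, 61.6, 62.7, 74.0]

print(round(sum(llama2) / len(llama2), 1))                                 # 50.9
print(round(sum(llamax) / len(llamax), 1))                                 # 55.1
print(round(sum(llamax) / len(llamax) - sum(llama2) / len(llama2), 1))     # 4.2
```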

### Model Usage

Code Example:
```python
from transformers import AutoTokenizer, LlamaForCausalLM

# Point the placeholders at the converted model weights and tokenizer
# (a local path or a Hugging Face repo id).
model = LlamaForCausalLM.from_pretrained(PATH_TO_CONVERTED_WEIGHTS)
tokenizer = AutoTokenizer.from_pretrained(PATH_TO_CONVERTED_TOKENIZER)

query = (
    "What is someone operating a vehicle likely to be accused of after becoming inebriated? \n "
    "Options: A.punish \t B. arrest \t C. automobile accidents \t D. talking nonsense \t E.drunk driving \n "
    "Answer:"
)
inputs = tokenizer(query, return_tensors="pt")

# The expected answer is a single option letter, so only a few new tokens are needed
# (the original max_length=30 would be shorter than the prompt itself).
generate_ids = model.generate(inputs.input_ids, max_new_tokens=30)
tokenizer.batch_decode(generate_ids, skip_special_tokens=True, clean_up_tokenization_spaces=False)[0]
# => E
```
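
The decoded output contains the prompt followed by the model's completion; a small convenience helper (not part of the original example) can pull out the predicted option letter:

```python
import re

def extract_choice(decoded: str, prompt: str) -> str:
    """Return the first option letter (A-E) in the completion.

    Assumes the decoded text begins with the prompt, which is the usual
    behaviour for causal LM generation with skip_special_tokens=True.
    """
    completion = decoded[len(prompt):]
    match = re.search(r"\b([A-E])\b", completion)
    return match.group(1) if match else ""

# e.g. extract_choice(tokenizer.batch_decode(generate_ids, skip_special_tokens=True)[0], query)
# => "E"
```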

### Citation

If our model helps your work, please cite this paper:

```bibtex
@misc{lu2024llamaxscalinglinguistichorizons,
  title={LLaMAX: Scaling Linguistic Horizons of LLM by Enhancing Translation Capabilities Beyond 100 Languages},
  author={Yinquan Lu and Wenhao Zhu and Lei Li and Yu Qiao and Fei Yuan},
  year={2024},
  eprint={2407.05975},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2407.05975},
}
```