- **Repository**: https://github.com/CONE-MT/LLaMAX/

### Model Description

LLaMAX3-8B is a multilingual base language model, developed through continued pre-training on Llama3, and it supports over 100 languages.

LLaMAX3-8B can serve as a base model for downstream multilingual tasks, but it has no instruction-following capability.

We further fine-tuned LLaMAX3-8B on the Alpaca dataset to enhance its instruction-following capabilities; the resulting model is available at https://huggingface.co/LLaMAX/LLaMAX3-8B-Alpaca.
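Because it is a base model, LLaMAX3-8B is queried with plain text continuation rather than instructions. Below is a minimal sketch using the standard `transformers` API; the repository id follows this model card, while the prompt text and generation length are illustrative assumptions:

```python
from transformers import AutoTokenizer, LlamaForCausalLM

# Load the base model and tokenizer from the Hub (repo id of this model card).
model = LlamaForCausalLM.from_pretrained("LLaMAX/LLaMAX3-8B")
tokenizer = AutoTokenizer.from_pretrained("LLaMAX/LLaMAX3-8B")

# A base model continues text; it will not reliably follow instructions.
text = "Machine translation systems bridge languages by"
inputs = tokenizer(text, return_tensors="pt")

# max_new_tokens=20 is an arbitrary illustrative bound on the continuation.
generate_ids = model.generate(inputs.input_ids, max_new_tokens=20)
print(tokenizer.batch_decode(generate_ids, skip_special_tokens=True)[0])
```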
### 🔥 Effortless Multilingual Translation with a Simple Prompt

LLaMAX supports translation between more than 100 languages, surpassing the performance of similarly scaled LLMs.

```python
def Prompt_template(query, src_language, trg_language):
    # Alpaca-style instruction prompt used for translation.
    instruction = f'Translate the following sentences from {src_language} to {trg_language}.'
    prompt = (
        'Below is an instruction that describes a task, paired with an input that provides further context. '
        'Write a response that appropriately completes the request.\n'
        f'### Instruction:\n{instruction}\n'
        f'### Input:\n{query}\n### Response:'
    )
    return prompt
```
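For example, the template renders a query into the following layout (output shown as comments; the sample query is the one used in the snippet below):

```python
print(Prompt_template("你好，今天是个好日子", 'Chinese', 'English'))
# Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.
# ### Instruction:
# Translate the following sentences from Chinese to English.
# ### Input:
# 你好，今天是个好日子
# ### Response:
```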

And then run the following code to execute the translation:

```python
from transformers import AutoTokenizer, LlamaForCausalLM

# Replace the placeholders with your local checkpoint paths or Hub ids.
model = LlamaForCausalLM.from_pretrained(PATH_TO_CONVERTED_WEIGHTS)
tokenizer = AutoTokenizer.from_pretrained(PATH_TO_CONVERTED_TOKENIZER)

query = "你好，今天是个好日子"
prompt = Prompt_template(query, 'Chinese', 'English')
inputs = tokenizer(prompt, return_tensors="pt")

# max_new_tokens bounds the generated continuation; max_length would also
# count the prompt tokens and could cut the response off early.
generate_ids = model.generate(inputs.input_ids, max_new_tokens=30)
# The decoded string contains the prompt followed by the model's response.
tokenizer.batch_decode(generate_ids, skip_special_tokens=True, clean_up_tokenization_spaces=False)[0]
# => "Hello, today is a good day"
```
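On a GPU machine, the same snippet runs after moving the model and the tokenized inputs to the device first; this is standard `transformers`/PyTorch usage, not anything LLaMAX-specific:

```python
import torch

# Pick a device and move the already-loaded model onto it.
device = "cuda" if torch.cuda.is_available() else "cpu"
model = model.to(device)

# BatchEncoding.to() moves all input tensors to the same device as the model.
inputs = tokenizer(prompt, return_tensors="pt").to(device)
generate_ids = model.generate(inputs.input_ids, max_new_tokens=30)
```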

### 🔥 Effective Base Model for Multilingual Task

LLaMAX preserves its efficacy in general tasks while improving performance on multilingual tasks.
We fine-tuned LLaMAX using only the English training set of each downstream task, yet the resulting models also show significant improvements in non-English languages. We provide the fine-tuned LLaMAX models for the following three tasks:

- **Math Reasoning**: https://huggingface.co/LLaMAX/LLaMAX2-7B-MetaMath