Commit db25599 by VietnamAIHub (1 parent: d43c535)

update

Files changed:
- README.md +81 -3
- special_tokens_map.json +23 -0
README.md
CHANGED
@@ -1,3 +1,81 @@
# Llama-30b with LoRA Adapters

This repository contains a Llama-30b model fine-tuned with QLoRA (Quantized Low-Rank Adaptation) adapters. The adapters are plug-and-play: applied on top of the base LLaMA weights, they help the model perform well on many Vietnamese NLP tasks.

## Model Overview

Llama-30b is a large language model that generates coherent text and can be applied to a wide range of natural language processing tasks, including text generation and sentiment analysis. Fine-tuning with LoRA adapters is intended to improve performance on low-resource tasks and to help the model generalize better.

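To give a rough sense of why adapters are cheap to train, the sketch below compares the number of trainable parameters in one full weight matrix with the number in a LoRA update of the same shape, where the update is the product of two low-rank matrices. The hidden size of 6656 and the rank of 64 are assumptions chosen for illustration; this commit does not state the adapter rank or which layers were adapted.

```python
def lora_param_counts(d_out: int, d_in: int, rank: int) -> tuple[int, int]:
    """Return (full, lora) trainable-parameter counts for one weight matrix.

    full: every entry of the d_out x d_in matrix is trained.
    lora: only B (d_out x rank) and A (rank x d_in) are trained,
          and the weight update is their product B @ A.
    """
    full = d_out * d_in
    lora = rank * (d_in + d_out)
    return full, lora


# Assumed values for illustration only (not taken from this repository).
hidden_size = 6656
rank = 64

full, lora = lora_param_counts(hidden_size, hidden_size, rank)
print(f"full matrix: {full:,} parameters")
print(f"LoRA update: {lora:,} parameters ({lora / full:.2%} of full)")
```

The ratio shrinks as the matrix grows, which is one reason low-rank adapters are a common choice for models of this size.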
## Dataset and Fine-Tuning

The LLaMA model was fine-tuned on over 200K instructions drawn from several sources to improve its ability to understand and generate text across different tasks. The instruction dataset combines the following sources:

- Alpaca 52K
- LIMA 1K
- Dolly 15K
- VietHealth
- WikiHow
- GPT4All
- VietQuAD

## Loading the Model

To load the fine-tuned Llama-30b model with LoRA adapters, use the code snippet below:

```python
import torch
from transformers import AutoModelForCausalLM, LlamaTokenizer

model_name = "VietnamAIHub/Vietnamese_SFT_llama_30B_v1"
cache_dir = "/save_weight_path"  # directory where the downloaded weights are cached

# Load the fine-tuned model weights in bfloat16.
m = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    device_map="auto",  # requires `accelerate`; places layers on the available device(s)
    cache_dir=cache_dir,
)

# Load the matching tokenizer.
tok = LlamaTokenizer.from_pretrained(model_name, cache_dir=cache_dir)
tok.bos_token_id = 1

# The special tokens map defines no pad token, so fall back to EOS for padding.
pad_token_id = tok.pad_token_id if tok.pad_token_id is not None else tok.eos_token_id

generation_config = dict(
    temperature=0.2,
    top_k=20,
    top_p=0.9,
    do_sample=True,
    num_beams=1,
    repetition_penalty=1.2,
    max_new_tokens=400,
)

# Vietnamese instruction: "How to study a subject really well"
prompt = "Cách để học tập về một môn học thật tốt"
message = f"Below is an instruction that describes a task. Write a response that appropriately completes the request.\n\n### prompt:\n{prompt}\n\n### response:\n"

# Tokenize the full prompt and move the tensors to the model's device.
inputs = tok(message, return_tensors="pt").to(m.device)
generation_output = m.generate(
    input_ids=inputs["input_ids"],
    attention_mask=inputs["attention_mask"],
    eos_token_id=tok.eos_token_id,
    pad_token_id=pad_token_id,
    **generation_config,
)

# Decode the first sequence and keep only the text after the response marker.
output = tok.decode(generation_output[0], skip_special_tokens=True)
response = output.split("### response:")[1].strip()
print(response)
```
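
The prompt construction and response extraction in the snippet above can be factored into two small helpers, which makes them easy to check without loading the model. This is a minimal sketch: the names `PROMPT_TEMPLATE`, `build_prompt`, and `extract_response` are hypothetical and not part of this repository, and only the template text itself comes from the snippet.

```python
# Template text copied from the loading snippet; the helper names are illustrative.
PROMPT_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### prompt:\n{prompt}\n\n### response:\n"
)
RESPONSE_MARKER = "### response:"


def build_prompt(instruction: str) -> str:
    """Wrap a user instruction in the instruction-following template."""
    return PROMPT_TEMPLATE.format(prompt=instruction)


def extract_response(decoded: str) -> str:
    """Return the text after the first response marker.

    If the marker is missing, the whole decoded string is returned
    instead of raising an IndexError.
    """
    _, marker, tail = decoded.partition(RESPONSE_MARKER)
    return tail.strip() if marker else decoded.strip()


message = build_prompt("Cách để học tập về một môn học thật tốt")
print(message)
```

Unlike `output.split("### response:")[1]`, `extract_response` keeps everything after the first marker and tolerates output in which the marker does not appear.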
## Conclusion

Llama-30b with LoRA adapters is a versatile language model that can be used for a wide range of NLP tasks in Vietnamese. We hope researchers and developers find it useful, and we encourage them to experiment with it in their own projects.

For questions, feedback, or contributions, please contact the maintainer of this repository, TranNhiem 🙌: [Linkedin](https://www.linkedin.com/in/tran-nhiem-ab1851125/) [Twitter](https://twitter.com/TranRick2) [Facebook](https://www.facebook.com/jean.tran.336). Happy fine-tuning and experimenting with the Llama-30b model!

special_tokens_map.json
ADDED
@@ -0,0 +1,23 @@
{
  "bos_token": {
    "content": "<s>",
    "lstrip": false,
    "normalized": true,
    "rstrip": false,
    "single_word": false
  },
  "eos_token": {
    "content": "</s>",
    "lstrip": false,
    "normalized": true,
    "rstrip": false,
    "single_word": false
  },
  "unk_token": {
    "content": "<unk>",
    "lstrip": false,
    "normalized": true,
    "rstrip": false,
    "single_word": false
  }
}
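
The map above defines `bos_token`, `eos_token`, and `unk_token` but no `pad_token`, so a tokenizer built from it would report no pad token unless one is configured elsewhere (for example in a tokenizer config file). The sketch below parses the same JSON with the standard library and picks the EOS token as a padding fallback; the variable and function names are illustrative, and the EOS fallback is a common convention rather than something this commit specifies.

```python
import json

# Contents of special_tokens_map.json as added in this commit.
SPECIAL_TOKENS_MAP = """
{
  "bos_token": {"content": "<s>", "lstrip": false, "normalized": true, "rstrip": false, "single_word": false},
  "eos_token": {"content": "</s>", "lstrip": false, "normalized": true, "rstrip": false, "single_word": false},
  "unk_token": {"content": "<unk>", "lstrip": false, "normalized": true, "rstrip": false, "single_word": false}
}
"""


def token_contents(raw_json: str) -> dict[str, str]:
    """Map each special-token name to its literal string."""
    return {name: spec["content"] for name, spec in json.loads(raw_json).items()}


tokens = token_contents(SPECIAL_TOKENS_MAP)
has_pad = "pad_token" in tokens

# Assumed fallback: reuse the EOS token for padding when no pad token exists.
pad_fallback = tokens.get("pad_token", tokens["eos_token"])
print(tokens, has_pad, pad_fallback)
```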