VietnamAIHub committed on
Commit db25599 · 1 Parent(s): d43c535
Files changed (2)
  1. README.md +81 -3
  2. special_tokens_map.json +23 -0
README.md CHANGED
@@ -1,3 +1,81 @@
- ---
- license: cc-by-4.0
- ---
+ # Llama-30b with LoRA Adapters
+
+ This repository contains a Llama-30b model fine-tuned with QLoRA (Quantized Low-Rank Adaptation) adapters. The adapter is a plug-and-play component that enables the LLaMa model to perform well on many Vietnamese NLP tasks.
+
+ ## Model Overview
+
+ The Llama-30b model is a large language model capable of generating coherent text and can be used for a wide variety of natural language processing tasks, including text generation, sentiment analysis, and more. By using LoRA adapters, the model achieves better performance on low-resource tasks and demonstrates improved generalization.
+
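+ For readers unfamiliar with the technique, the sketch below shows how low-rank adapters are typically attached to a causal LM with the `peft` library. It is a minimal illustration, not the training setup used for this checkpoint: the base-model name, rank, and target modules are assumptions.
+
+ ```python
+ from peft import LoraConfig, get_peft_model
+ from transformers import AutoModelForCausalLM
+
+ # Assumed base checkpoint; this repository may have started from a different one.
+ base = AutoModelForCausalLM.from_pretrained("huggyllama/llama-30b")
+
+ config = LoraConfig(
+     r=16,                                 # illustrative rank, not the trained value
+     lora_alpha=32,
+     target_modules=["q_proj", "v_proj"],  # common choice for LLaMa attention layers
+     lora_dropout=0.05,
+     task_type="CAUSAL_LM",
+ )
+ model = get_peft_model(base, config)
+ model.print_trainable_parameters()  # only the small low-rank matrices are trainable
+ ```
+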
+ ## Dataset and Fine-Tuning
+
+ The LLaMa model was fine-tuned on over 200K instructions from various sources to improve its ability to understand and generate text for different tasks. The instruction dataset comprises data from the following sources (a sketch of the prompt format applied to these examples follows the list):
+
+ - Alpaca 52K
+ - LIMA 1K
+ - Dolly 15K
+ - VietHealth
+ - WikiHow
+ - GPT4ALL
+ - VietQuAD
+
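+ As a rough illustration, the helper below renders one instruction pair with the same prompt template that the loading example further down uses at inference time. It is a sketch: the field names `instruction` and `response` are assumptions about the dataset schema, not confirmed by this repository.
+
+ ```python
+ def format_example(instruction: str, response: str) -> str:
+     """Render one training example with the repo's instruction template."""
+     return (
+         "Below is an instruction that describes a task. "
+         "Write a response that appropriately completes the request.\n\n"
+         f"### prompt:\n{instruction}\n\n### response:\n{response}"
+     )
+
+ # Hypothetical instruction pair: "Summarize the following passage" / "The passage is about..."
+ print(format_example("Tóm tắt đoạn văn sau", "Đoạn văn nói về..."))
+ ```
+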
+ ## Loading the Model
+
+ To load the fine-tuned Llama-30b model with LoRA adapters, use the code snippet below:
+
+ ```python
+ import torch
+ from transformers import AutoModelForCausalLM, LlamaTokenizer
+
+ device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
+
+ model_name = "VietnamAIHub/Vietnamese_SFT_llama_30B_v1"
+ cache_dir = "/save_weight_path"
+
+ ## Load the LLaMa weights (base model with the adapter weights merged in)
+ m = AutoModelForCausalLM.from_pretrained(
+     model_name,
+     torch_dtype=torch.bfloat16,
+     device_map={"": 0},
+     cache_dir=cache_dir,
+ )
+
+ ## Load the tokenizer and pin the BOS token id expected by LLaMa
+ tok = LlamaTokenizer.from_pretrained(model_name, cache_dir=cache_dir)
+ tok.bos_token_id = 1
+
+ generation_config = dict(
+     temperature=0.2,
+     top_k=20,
+     top_p=0.9,
+     do_sample=True,
+     num_beams=1,
+     repetition_penalty=1.2,
+     max_new_tokens=400,
+     early_stopping=True,
+ )
+
+ ## Wrap the user prompt in the instruction template used during fine-tuning
+ prompt = "Cách để học tập về một môn học thật tốt"  # "How to study a subject well"
+ message = (
+     "Below is an instruction that describes a task. "
+     "Write a response that appropriately completes the request.\n\n"
+     f"### prompt:\n{prompt}\n\n### response:\n"
+ )
+
+ inputs = tok(message, return_tensors="pt")
+ generation_output = m.generate(
+     input_ids=inputs["input_ids"].to(device),
+     attention_mask=inputs["attention_mask"].to(device),
+     eos_token_id=tok.eos_token_id,
+     pad_token_id=tok.pad_token_id,
+     **generation_config,
+ )
+
+ ## Decode and strip everything before the model's answer
+ output = tok.decode(generation_output[0], skip_special_tokens=True)
+ response = output.split("### response:")[1].strip()
+ print(response)
+ ```
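+
+ The comments in the snippet above refer to merging adapter weights into the base model. If you instead have the base model and a separate LoRA adapter, a common way to produce a unified checkpoint is sketched below with the `peft` library; the base-model and adapter paths are placeholders, not locations confirmed by this repository.
+
+ ```python
+ import torch
+ from peft import PeftModel
+ from transformers import AutoModelForCausalLM
+
+ # Placeholder repo names; substitute the actual base and adapter locations.
+ base = AutoModelForCausalLM.from_pretrained(
+     "huggyllama/llama-30b",
+     torch_dtype=torch.bfloat16,
+     device_map={"": 0},
+ )
+ model = PeftModel.from_pretrained(base, "path/to/lora_adapter")  # attach LoRA weights
+ model = model.merge_and_unload()  # fold the adapter deltas into the base weights
+ model.save_pretrained("/save_weight_path/merged")  # save the unified model
+ ```
+
+ Merging removes the adapter indirection, so inference afterwards needs only `transformers`.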
+
+ ## Conclusion
+
+ The Llama-30b model with LoRA adapters is a versatile language model that can be used for a wide range of Vietnamese NLP tasks. We hope researchers and developers find it useful and encourage them to experiment with it in their own projects.
+
+ For questions, feedback, or contributions, please contact the repository maintainer, TranNhiem 🙌: [Linkedin](https://www.linkedin.com/in/tran-nhiem-ab1851125/) · [Twitter](https://twitter.com/TranRick2) · [Facebook](https://www.facebook.com/jean.tran.336). Happy fine-tuning and experimenting with the Llama-30b model!
special_tokens_map.json ADDED
@@ -0,0 +1,23 @@
+ {
+   "bos_token": {
+     "content": "<s>",
+     "lstrip": false,
+     "normalized": true,
+     "rstrip": false,
+     "single_word": false
+   },
+   "eos_token": {
+     "content": "</s>",
+     "lstrip": false,
+     "normalized": true,
+     "rstrip": false,
+     "single_word": false
+   },
+   "unk_token": {
+     "content": "<unk>",
+     "lstrip": false,
+     "normalized": true,
+     "rstrip": false,
+     "single_word": false
+   }
+ }