---
language:
- en
- ko
pipeline_tag: text-generation
inference: false
tags:
- facebook
- meta
- pytorch
- llama
- llama-2
- llama-2-chat
license: apache-2.0
library_name: peft
---

# komt-Llama-2-13b-hf-lora
This model was fine-tuned from Llama-2-13b-hf using PEFT LoRA.

The komt-Llama-2-13b-hf-lora model was developed using a multi-task instruction technique aimed at improving Korean language performance. For more details, please refer to the GitHub repository at https://github.com/davidkim205/komt and to https://huggingface.co/davidkim205/komt-Llama-2-13b-hf.

## Model Details

* **Model Developers**: davidkim (Changyeon Kim)
* **Repository**: https://github.com/davidkim205/komt
* **LoRA target modules**: q_proj, o_proj, v_proj, gate_proj, down_proj, k_proj, up_proj (see the configuration sketch below)
* **Model Size**: 120 MB (LoRA adapter weights only)
* **Model Architecture**: komt-Llama-2-13b is an auto-regressive language model that uses an optimized transformer architecture. The tuned versions use supervised fine-tuning on multi-task instruction data.
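
The adapter's settings can be expressed as a `peft.LoraConfig`. The sketch below is illustrative only: the target modules are taken from the list above, but the rank, alpha, and dropout values are assumptions, not the values used in training (those are recorded in the adapter's adapter_config.json).
```
from peft import LoraConfig

# Illustrative LoRA configuration matching the target modules listed above.
# r, lora_alpha, and lora_dropout are assumed values; the actual training
# settings are stored in this adapter's adapter_config.json.
lora_config = LoraConfig(
    task_type="CAUSAL_LM",
    target_modules=["q_proj", "o_proj", "v_proj", "gate_proj",
                    "down_proj", "k_proj", "up_proj"],
    r=16,               # assumed rank
    lora_alpha=32,      # assumed scaling factor
    lora_dropout=0.05,  # assumed dropout
)
```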
## Dataset
Korean multi-task instruction dataset

## Prompt Template
```
### instruction: {prompt}

### Response:
```
Examples:
```
### instruction: 자동차 종합(정기)검사 의무기간은 얼마인가요?

### Response:

```
response:
```
### instruction: 자동차 종합(정기)검사 의무기간은 얼마인가요?

### Response: 자동차 종합(정기)검사 의무기간은 2년입니다. 이 기간 동안 검사를 받지 않으면 과태료가 부과됩니다. 자동차 종합(정기)검사 의무기간은 2013년 12월 31일부터 시행되었습니다
```
(In English, the example instruction asks "How long is the mandatory interval for the comprehensive (periodic) vehicle inspection?", and the response answers that it is two years, that a fine is imposed if the inspection is missed, and that the requirement has been in effect since December 31, 2013.)
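
For programmatic use, the template can be filled in with a small helper. This is a minimal sketch; `build_prompt` is a hypothetical name, not part of the komt codebase:
```
def build_prompt(instruction: str) -> str:
    # Wrap a user instruction in the komt prompt template shown above.
    return f"### instruction: {instruction}\n\n### Response:"

# Reproduces the example prompt above.
print(build_prompt("자동차 종합(정기)검사 의무기간은 얼마인가요?"))
```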

## Usage

Clone the repository from GitHub and install the dependencies as follows:
```
git clone https://github.com/davidkim205/komt
cd komt
pip install -r lora/requirements_lora.txt
```
* Requirements: Python >= 3.8, a Linux distribution (Ubuntu, macOS, etc.), and CUDA > 10.0.
  See https://github.com/TimDettmers/bitsandbytes#tldr
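
Before running the example below, it can help to confirm that PyTorch sees a CUDA GPU and that bitsandbytes imports cleanly, since 4-bit loading depends on both. A minimal check, assuming a recent bitsandbytes release:
```
import torch

# The 4-bit quantized loading used below requires a CUDA-enabled PyTorch build.
assert torch.cuda.is_available(), "A CUDA GPU is required for load_in_4bit"

# Importing bitsandbytes fails here if the installed wheel does not match the CUDA setup.
import bitsandbytes
print("torch", torch.__version__, "| bitsandbytes", bitsandbytes.__version__)
```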

```
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
from transformers import StoppingCriteria, StoppingCriteriaList
from transformers import TextStreamer, GenerationConfig
from peft import PeftModel, PeftConfig


class LocalStoppingCriteria(StoppingCriteria):
    """Stop generation as soon as the decoded output ends with a stop word.

    Note: recent transformers releases ship a built-in StopStringCriteria
    that covers the same use case.
    """

    def __init__(self, tokenizer, stop_words=None):
        super().__init__()
        stop_words = stop_words or []
        stops = [tokenizer(stop_word, return_tensors='pt', add_special_tokens=False)['input_ids'].squeeze()
                 for stop_word in stop_words]
        print('stop_words', stop_words)
        print('stop_words_ids', stops)
        self.stop_words = stop_words
        self.stops = [stop.cuda() for stop in stops]
        self.tokenizer = tokenizer

    def _compare_token(self, input_ids):
        # Token-level comparison against the encoded stop words.
        for stop in self.stops:
            if len(stop.size()) != 1:
                continue
            stop_len = len(stop)
            if torch.all((stop == input_ids[0][-stop_len:])).item():
                return True
        return False

    def _compare_decode(self, input_ids):
        # String-level comparison: decode everything generated so far.
        input_str = self.tokenizer.decode(input_ids[0])
        return any(input_str.endswith(stop_word) for stop_word in self.stop_words)

    def __call__(self, input_ids: torch.LongTensor, scores: torch.FloatTensor, **kwargs):
        return self._compare_decode(input_ids)


#
# config
peft_model_name = 'davidkim205/komt-Llama-2-7b-chat-hf-lora'
model_name = 'davidkim205/komt-Llama-2-7b-chat-hf'
instruction_prefix = "### instruction: "
input_prefix = "### input: "
answer_prefix = "### Response: "
endoftext = "<|end|>"
stop_words = [endoftext, '<s>', '###']
generation_config = GenerationConfig(
    temperature=0.9,
    top_p=0.7,
    top_k=100,
    max_new_tokens=2048,
    early_stopping=True,
    do_sample=True,
)

#
# create model: load the base model 4-bit quantized, then attach the LoRA adapter
config = PeftConfig.from_pretrained(peft_model_name)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(model_name, quantization_config=bnb_config,
                                             device_map="auto")
model = PeftModel.from_pretrained(model, peft_model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)
stopping_criteria = StoppingCriteriaList([LocalStoppingCriteria(tokenizer=tokenizer, stop_words=stop_words)])
streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)
model.eval()

#
# generate
prompt = "### instruction: 자동차 종합(정기)검사 의무기간은 얼마인가요?\n\n### Response:"
gened = model.generate(
    **tokenizer(
        prompt,
        return_tensors='pt',
        return_token_type_ids=False
    ).to('cuda'),
    generation_config=generation_config,
    eos_token_id=model.config.eos_token_id,
    stopping_criteria=stopping_criteria,
    streamer=streamer
)
output_text = tokenizer.decode(gened[0], skip_special_tokens=True)

print('--------------------')
print(output_text)
```
response:
```
nlp는 자연어 처리의 약자로, 자연어를 사용하여 인간과 컴퓨터 간의 상호 작용을 다루는 분야입니다. 컴퓨터와 인간이 서로 상호 작용하는 데 사용되는 언어와 기술을 포함하며, 컴퓨터는 인간의 언어를 처리하고 분석하여 인간의 작업을 돕거나 작업을 자동화하는 데 사용됩니다. 따라서 컴퓨터가 컴퓨터에서 작업하는 데 사용되는 컴퓨터 프로그램이나 프로그램과 비슷하게 인간도 자신의 작업에 사용되는 컴퓨터 프로그램과 비슷한 방식으로 작업할 수 있습니다.
```
(In English: "NLP stands for natural language processing, the field concerned with interaction between humans and computers through natural language. It encompasses the languages and technologies used for that interaction; computers process and analyze human language to assist with or automate human tasks. Accordingly, much as a computer works with the programs used for its tasks, humans can likewise work in a similar way with the computer programs used for theirs.")
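
To serve the model without a runtime peft dependency, the LoRA weights can be folded into the base model with peft's `merge_and_unload()`. This is a minimal sketch: the base model is loaded in half precision rather than 4-bit, since merging quantized weights may not be supported depending on the peft version, and the output directory name is hypothetical:
```
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_name = 'davidkim205/komt-Llama-2-7b-chat-hf'          # base checkpoint from the example above
adapter_name = 'davidkim205/komt-Llama-2-7b-chat-hf-lora'  # LoRA adapter

# Load the base model unquantized so the LoRA deltas can be merged in place.
base = AutoModelForCausalLM.from_pretrained(base_name, torch_dtype=torch.float16,
                                            device_map="auto")
model = PeftModel.from_pretrained(base, adapter_name)

# Fold the adapter into the base weights and drop the peft wrappers.
merged = model.merge_and_unload()
merged.save_pretrained("komt-merged")      # hypothetical output directory
AutoTokenizer.from_pretrained(base_name).save_pretrained("komt-merged")
```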
## Hardware and Software
- NVIDIA driver: 535.54.03
- CUDA version: 12.2

## Training procedure