---
license: apache-2.0
language:
- ko
pipeline_tag: text-generation
---

### How to use the GPTQ model

https://github.com/jongmin-oh/korean-LLM-quantize

```bash
mkdir ./templates ./utils
wget -P ./templates https://raw.githubusercontent.com/jongmin-oh/korean-LLM-quantize/main/templates/kullm.json
wget -P ./utils https://raw.githubusercontent.com/jongmin-oh/korean-LLM-quantize/main/utils/prompter.py
```
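
`prompter.py` together with the `kullm.json` template implements simple template-based prompt formatting. The following is a minimal self-contained sketch of the idea only — the template wording and field names here are illustrative stand-ins, not the actual contents of `kullm.json`:

```python
# Hypothetical stand-in for utils/prompter.py + templates/kullm.json:
# fill a template with the instruction/input, then strip everything
# up to the response marker from the model's output.
TEMPLATE = (
    "Below is an instruction paired with an input.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Input:\n{input}\n\n"
    "### Response:\n"
)
RESPONSE_MARKER = "### Response:"

def generate_prompt(instruction: str, input_text: str = "") -> str:
    """Build the full prompt string fed to the pipeline."""
    return TEMPLATE.format(instruction=instruction, input=input_text)

def get_response(generated_text: str) -> str:
    """The pipeline returns prompt + completion; keep only the completion."""
    return generated_text.split(RESPONSE_MARKER, 1)[-1].strip()
```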

### Install packages

```bash
pip install torch==2.0.1 auto-gptq==0.4.2
```

- If you are in a hurry, you can test right away by running the example code below (uses about 19 GB of GPU memory).
- As of 2023-08-23, Hugging Face officially supports GPTQ.

```python
import torch
from transformers import pipeline
from auto_gptq import AutoGPTQForCausalLM

from utils.prompter import Prompter

MODEL = "j5ng/kullm-5.8b-GPTQ-8bit"

# Load the 8-bit GPTQ-quantized KULLM model onto the first GPU.
model = AutoGPTQForCausalLM.from_quantized(MODEL, device="cuda:0", use_triton=False)

# The tokenizer is loaded from the same repo by name.
pipe = pipeline("text-generation", model=model, tokenizer=MODEL)

prompter = Prompter("kullm")

def infer(instruction="", input_text=""):
    prompt = prompter.generate_prompt(instruction, input_text)
    output = pipe(
        prompt,
        max_length=512,
        temperature=0.2,
        repetition_penalty=3.0,
        num_beams=5,           # beam search (greedy unless do_sample=True)
        eos_token_id=2,
    )
    s = output[0]["generated_text"]
    result = prompter.get_response(s)
    return result

instruction = """
손흥민(한국 한자: 孫興慜, 1992년 7월 8일 ~ )은 대한민국의 축구 선수로 현재 잉글랜드 프리미어리그 토트넘 홋스퍼에서 윙어로 활약하고 있다.
또한 대한민국 축구 국가대표팀의 주장이자 2018년 아시안 게임 금메달리스트이며 영국에서는 애칭인 "쏘니"(Sonny)로 불린다.
아시아 선수로서는 역대 최초로 프리미어리그 공식 베스트 일레븐과 아시아 선수 최초의 프리미어리그 득점왕은 물론 FIFA 푸스카스상까지 휩쓸었고 2022년에는 축구 선수로는 최초로 체육훈장 청룡장 수훈자가 되었다.
손흥민은 현재 리그 100호를 넣어서 화제가 되고 있다.
"""

# Ask: "What is Son Heung-min's nickname?"
result = infer(instruction=instruction, input_text="손흥민의 애칭은 뭐야?")
print(result)  # 손흥민의 애칭은 "쏘니"입니다. ("Son Heung-min's nickname is Sonny.")
```
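
One thing to watch in the call above: `max_length=512` in transformers counts the prompt tokens and the generated tokens together, so a long instruction like the one above leaves correspondingly less room for the answer (use `max_new_tokens` instead if you want a fixed generation budget). A trivial helper to reason about this (not part of the repo):

```python
def generation_budget(prompt_tokens: int, max_length: int = 512) -> int:
    """Tokens left for generation when max_length counts prompt + output."""
    return max(max_length - prompt_tokens, 0)

print(generation_budget(300))  # 212 tokens left for the answer
```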

### Reference

- [EleutherAI/polyglot](https://huggingface.co/EleutherAI/polyglot-ko-12.8b)
- [Korea University/kullm](https://huggingface.co/nlpai-lab/kullm-polyglot-12.8b-v2)
- [GPTQ](https://github.com/IST-DASLab/gptq)