davidkim205 committed
Commit ef36a03 • 1 Parent(s): 65f628e

Create README.md

Files changed (1): README.md (+58 -0, new file)
---
language:
- en
- ko
pipeline_tag: text-generation

---
# komt : korean multi task instruction tuning model
![multi task instruction tuning.jpg](https://github.com/davidkim205/komt/assets/16680469/c7f6ade7-247e-4b62-a94f-47e19abea68e)

Following the success of ChatGPT, numerous large language models have emerged that try to match its capabilities.
For Korean, however, many of these models still struggle to answer accurately or to generate fluent Korean text.
This study addresses these challenges by introducing a multi-task instruction tuning technique that converts supervised datasets from various tasks into instruction-style training data for large language models (LLMs).

## Model Details

* **Model Developers**: davidkim (Changyeon Kim)
* **Repository**: https://github.com/davidkim205/komt
* **Quantization methods**: q4_0, q4_1, q5_0, q5_1, q2_k, q3_k, q3_k_m, q3_k_l, q4_k, q4_k_s, q4_k_m, q5_k, q5_k_s, q5_k_m, q8_0 (a quantization sketch follows this list)
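
Each type above corresponds to one GGUF file. As a minimal sketch of how such a file can be produced (assuming a local llama.cpp checkout and an f16 GGUF already converted from the original weights; the paths are illustrative, not the exact ones used for this repository), llama.cpp's quantize tool takes an input file, an output file, and a type:

```
# Illustrative paths; adjust to wherever the f16 GGUF of komt-mistral-7b-v1 is stored.
cd llama.cpp
make -j
./quantize ./models/komt-mistral-7b-v1/ggml-model-f16.gguf \
           ./models/komt-mistral-7b-v1/ggml-model-q4_1.gguf q4_1
```

Lower-bit types use less memory at some cost in output quality; the usage example below runs the q4_1 file, but any listed type can be substituted.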

## Training
Refer to https://github.com/davidkim205/komt for training details.

## Usage
Run the quantized model with llama.cpp. The prompt below asks, in Korean, "What are the effects of ginseng?":
```
cd llama.cpp
make -j && ./main -m ./models/komt-mistral-7b-v1/ggml-model-q4_1.gguf -p "[INST]인삼은 어떤 효과가 있는가요? [/INST]"
```
Output:
```
[INST]인삼은 어떤 효과가 있는가요? [/INST]인삼에는 약 600개의 물질이 포함되어 있습니다. 그 중에서도 많은 연구들을 통해 효능을 확인한 것으로 알려진 3가지는 아래와 같습니다.
1. 혈압 조절: 각종 실험에서 인삼이 혈압을 조절하는데 효과를 나타냈으며, 특히 중국의 한 연구자들은 인삼을 사용해 40%의 혈압 감소를 보였습니다.
2. 위장 개선: 인삼은 흉터, 통증 등으로 고통받는 위장 질환을 일부나마 개선할 수 있는데, 이는 각종 실험들에서 확인된 것입니다.
3. 면역 강화: 인삼은 면역체계를 강화시키는데 효과가 있으며, 국내에서도 2014년부터는 식약처의 의약용품 수출증명제에 대한 최종적인 평가로 사용되고 있습니다.
위와 같은 효능을 갖춘 인삼은 많이 사용하는 건강식품의 원료로도 활용됩니다. [end of text]
```
In English, the model answers that ginseng contains roughly 600 compounds and lists three commonly cited effects: blood pressure regulation, relief of stomach and digestive complaints, and strengthening of the immune system.
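
Beyond the basic invocation, `./main` accepts the standard llama.cpp generation flags; the sketch below is illustrative (the token limit, temperature, and GPU-offload values are assumptions, not settings used by the author, and `-ngl` requires a GPU-enabled build):

```
# Illustrative flags: up to 512 new tokens, temperature 0.7, 35 layers offloaded to the GPU.
./main -m ./models/komt-mistral-7b-v1/ggml-model-q4_1.gguf \
       -n 512 --temp 0.7 -ngl 35 \
       -p "[INST]인삼은 어떤 효과가 있는가요? [/INST]"
```

Whatever flags are used, the user prompt stays wrapped in `[INST] ... [/INST]`, as in the examples above.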
## Evaluation
For objective model evaluation, we initially used EleutherAI's lm-evaluation-harness but obtained unsatisfactory results. Consequently, we evaluated the models with ChatGPT as a judge, following the approach described in [Self-Alignment with Instruction Backtranslation](https://arxiv.org/pdf/2308.06259.pdf) and [Three Ways of Using Large Language Models to Evaluate Chat](https://arxiv.org/pdf/2308.06502.pdf).
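
In practice, ChatGPT-as-judge scoring means sending each question and model answer to the OpenAI chat completions API with a grading instruction and parsing a 0~5 rating from the reply. The sketch below only illustrates that pattern; QUESTION and ANSWER are placeholders, and the actual evaluation questions and rubric behind the table below are not reproduced here:

```
# Illustrative only: ask gpt-3.5-turbo to grade one answer on a 0~5 scale.
# QUESTION and ANSWER are placeholders, not the real evaluation prompts.
curl -s https://api.openai.com/v1/chat/completions \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-3.5-turbo",
    "messages": [
      {"role": "user",
       "content": "Rate the following Korean answer to the question on a scale of 0 to 5. Reply with the number only.\nQuestion: QUESTION\nAnswer: ANSWER"}
    ]
  }'
```

The score column below appears to be the sum of such per-question ratings, with average the per-question mean and percentage the score relative to the maximum possible.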

| model                                    | score   | average (0~5) | percentage |
| ---------------------------------------- | ------- | ------------- | ---------- |
| gpt-3.5-turbo (closed)                   | 147     | 3.97          | 79.45%     |
| naver Cue (closed)                       | 140     | 3.78          | 75.67%     |
| clova X (closed)                         | 136     | 3.67          | 73.51%     |
| WizardLM-13B-V1.2 (open)                 | 96      | 2.59          | 51.89%     |
| Llama-2-7b-chat-hf (open)                | 67      | 1.81          | 36.21%     |
| Llama-2-13b-chat-hf (open)               | 73      | 1.91          | 38.37%     |
| nlpai-lab/kullm-polyglot-12.8b-v2 (open) | 70      | 1.89          | 37.83%     |
| kfkas/Llama-2-ko-7b-Chat (open)          | 96      | 2.59          | 51.89%     |
| beomi/KoAlpaca-Polyglot-12.8B (open)     | 100     | 2.70          | 54.05%     |
| **komt-llama2-7b-v1 (open)(ours)**       | **117** | **3.16**      | **63.24%** |
| **komt-llama2-13b-v1 (open)(ours)**      | **129** | **3.48**      | **69.72%** |
| **komt-llama-30b-v1 (open)(ours)**       | **129** | **3.16**      | **63.24%** |
| **komt-mistral-7b-v1 (open)(ours)**      | **131** | **3.54**      | **70.81%** |