Commit 857362e by Trofish · 1 Parent(s): 2b01250

Create README.md

Files changed (1)
  1. README.md +121 -0
README.md ADDED
@@ -0,0 +1,121 @@
+ 2023 Sungkyunkwan University Summer Intensive Industry-Academia Collaboration Project: VAIV
+ ## A GPT-based natural (Friendly) and ethical (Harmless) everyday conversational chatbot model
+
+ # Project Goals
+ Build a natural and ethical Korean everyday-conversation chatbot model based on GPT-NeoX
+ - Self-Instruct: data augmentation using GPT-4
+ - RLHF (Reinforcement Learning from Human Feedback): reinforcement learning that reflects human preferences
+ - DeepSpeed: a new memory optimization technology for large-scale distributed deep learning
+
+ # Development Work
+ Task 1: Build datasets for each stage of reinforcement learning
+ Task 2: Fine-tune the SFT model (https://huggingface.co/Trofish/KULLM-SFT-v2)
+ Task 3: Implement Reward models ver1, 2, 3
+ Task 4: Build the final model with RLHF and DeepSpeedChat (https://huggingface.co/Trofish/KULLM-RLHF)
+
+ # Task1. Building datasets for each stage of reinforcement learning
+ ![image](https://github.com/VAIV-2023/VAIV2023/assets/79634774/a4988abd-c6fd-4fc2-8e53-9a02240e2275)
+ ![image](https://github.com/VAIV-2023/VAIV2023/assets/79634774/dae49a1e-a834-463c-9f95-34cf254fdaeb)
+ ## Considerations when selecting datasets
+ - **We combined datasets that improve everyday conversation and the handling of hate speech with general-task datasets, which keep the chatbot's performance on general tasks from degrading during training**
+
+ - **National Institute of Korean Language everyday conversation dataset:** contains natural responses to everyday conversation, follows spelling rules well, is free of slang, ungrammatical sentences, and initial-consonant abbreviations, and covers diverse conversations by topic
+
+ - **AI Hub hate speech dataset:** contains diverse hate speech by category, such as hatred, discrimination, sexual content, violence, and crime
+
+ - **General task datasets**
+   - Evol-Instruct dataset: contains complex, logical prompts and answers across diverse fields
+   - Self-Instruct dataset: data augmented from high-quality seed data written by humans
+   - RLHF Korean translation dataset: the dataset released by DeepSpeedChat, translated into Korean
+
+ # Task2. SFT model fine-tuning
+ ## Baseline Model
+ - [Uses **"KULLM"**, a Korean LLM developed by the Korea University NLP & AI Lab and the HIAI Research Institute](https://github.com/nlpai-lab/KULLM)
+
+ ## Datasets
+ ![image](https://github.com/VAIV-2023/VAIV2023/assets/79634774/085610db-3714-43c3-855b-58baad2f4e8b)
+
+ ## SFT Model Finetuning
+ ![image](https://github.com/VAIV-2023/VAIV2023/assets/79634774/0f5e36fa-20a8-43f9-bd03-5f8224d5e9d0)
+ * Model training used an A100 40GB GPU provided by Google Colab
+
+ ## SFT Model Evaluation
+ ![image](https://github.com/VAIV-2023/VAIV2023/assets/79634774/9fe9e5aa-6dc7-4c7b-8529-45e0a75db9c6)
+ ![image](https://github.com/VAIV-2023/VAIV2023/assets/79634774/a994a960-db7c-4e75-a11a-d7755d372722)
+ * G-Eval: https://arxiv.org/abs/2303.16634
+
+ ## Final SFT Model
+ - https://huggingface.co/Trofish/KULLM-SFT-v2
+
+ # Task3-1. Reward Model ver1 implementation
+ ## Baseline Model
+ - Uses **Polyglot-Ko**, a large-scale Korean language model developed by EleutherAI
+ - Experimented with the 1.3b and 5.8b models separately
+ ## Datasets
+ ![image](https://github.com/VAIV-2023/RLHF-Korean-Friendly-LLM/assets/79634774/0082da9b-b0b8-4089-8647-cffa5ce724fb)
+ - InstructGPT's dataset construction method
+   - For the Reward model training dataset, we used prompts from SFT training (1,500 prompts, everyday conversation : hate speech = 2:1) and new prompts (1,000 prompts from the DeepSpeedChat translated dataset)
+   - The SFT model generates K responses per prompt, and their ranking is labeled
+ - Dataset labeling
+   - InstructGPT relied on human labelers, but we used GPT-4 and G-Eval for consistent evaluation and to save time
+   - Of the two responses generated by the SFT model, the one with the higher total G-Eval score was selected as the chosen response
+   - The G-Eval evaluation prompt differed by dataset type
+   - ![image](https://github.com/VAIV-2023/RLHF-Korean-Friendly-LLM/assets/79634774/7d7117d0-02e9-42dd-8ce3-5244cf726bf8)
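
The labeling rule above (the response with the higher total G-Eval score becomes the chosen one) can be sketched as follows. This is a minimal illustration, not the project's labeling script: the criterion names, the score values, and the `pick_chosen` helper are hypothetical, and the tie-breaking rule is an assumption because the README does not say how ties were handled.

```python
from typing import Dict, List, Tuple

# Hypothetical G-Eval output: one score per criterion for each response.
Scores = Dict[str, float]


def pick_chosen(responses: List[str], scores: List[Scores]) -> Tuple[str, str]:
    """Return (chosen, rejected) for exactly two responses by total score."""
    if len(responses) != 2 or len(scores) != 2:
        raise ValueError("expected exactly two responses and two score dicts")
    totals = [sum(s.values()) for s in scores]
    # Assumption: ties go to the first response.
    chosen_idx = 0 if totals[0] >= totals[1] else 1
    return responses[chosen_idx], responses[1 - chosen_idx]


# Illustrative scores only; the criterion names are made up for this sketch.
pair = pick_chosen(
    ["response A", "response B"],
    [
        {"naturalness": 3, "coherence": 4},
        {"naturalness": 5, "coherence": 4},
    ],
)
print(pair)
```

Each resulting pair can then be stored as one (prompt, chosen, rejected) record for reward-model training.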
+ ## Reward v1 Model Finetuning
+ - ![image](https://github.com/VAIV-2023/RLHF-Korean-Friendly-LLM/assets/79634774/da4d9b15-ec91-44bb-84d9-f28aeffd16ad)
+ - According to the InstructGPT paper, a Reward model's performance drops sharply when it overfits, so the number of epochs was set to 1
+ - Other hyper-parameters such as batch size and learning rate are reported to have little effect on performance
+ - Total training time: 4 minutes on a Colab A100 40GB
+
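For readers unfamiliar with reward-model training, the sketch below shows the pairwise ranking loss described in the InstructGPT paper, `-log(sigmoid(r_chosen - r_rejected))`. The README does not show its training code, so treating this as the loss used here is an assumption; the function name and the scalar inputs are illustrative only.

```python
import math


def pairwise_loss(r_chosen: float, r_rejected: float) -> float:
    """-log(sigmoid(r_chosen - r_rejected)), computed in a numerically stable way."""
    diff = r_chosen - r_rejected
    if diff >= 0:
        # log(1 + exp(-diff)) is safe because exp(-diff) <= 1 here.
        return math.log1p(math.exp(-diff))
    # For negative diff, rewrite as -diff + log(1 + exp(diff)) to avoid overflow.
    return -diff + math.log1p(math.exp(diff))


# The loss is small when the chosen response scores higher, large otherwise.
good = pairwise_loss(2.0, 0.0)
bad = pairwise_loss(0.0, 2.0)
print(good, bad)
```

Because the loss depends only on the score difference, the reward model learns a relative ordering of responses rather than an absolute quality scale.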
+ ## Reward v1 Model Evaluation
+ - ![image](https://github.com/VAIV-2023/RLHF-Korean-Friendly-LLM/assets/79634774/f4af0b7d-af47-4881-8adf-d14be43c0eb1)
+ - Reward Model Template
+   - **"아래는 작업을 설명하는 명령어입니다. 요청을 적절히 완료하는 응답을 작성하세요. \n\n ### 명령어:\n{prompt}\n\n ### 응답:\n"**
+   - English gloss of the template: "Below is an instruction that describes a task. Write a response that appropriately completes the request. ### Instruction: {prompt} ### Response:"
+
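A minimal sketch of how the template above might be filled in before a prompt/response pair is scored. The `build_input` helper and the example strings are hypothetical, and appending the response directly after the template is an assumption; the README only gives the template string itself.

```python
# Template string copied from the README (kept in Korean, as the model expects it).
TEMPLATE = (
    "아래는 작업을 설명하는 명령어입니다. 요청을 적절히 완료하는 응답을 작성하세요. "
    "\n\n ### 명령어:\n{prompt}\n\n ### 응답:\n"
)


def build_input(prompt: str, response: str = "") -> str:
    """Fill the template with a prompt and append the response to be scored."""
    return TEMPLATE.format(prompt=prompt) + response


# Hypothetical example pair: "How is the weather today?" / "It is sunny and warm."
text = build_input("오늘 날씨 어때?", "맑고 따뜻해요.")
print(text)
```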
+ # Task3-2. Reward Model ver2, 3 implementation
+ ## RewardModel ver1 Issues
+ - The implemented Reward model performed poorly (accuracy 0.65)
+ - When this Reward model was used for Step 3 training, the model treated inputs that were not hate speech as hate speech and answered accordingly
+
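The README does not define how the 0.65 accuracy was computed. A common convention for reward models, assumed here, is the fraction of held-out pairs in which the chosen response receives a higher score than the rejected one. The sketch below uses made-up scores and a hypothetical `pairwise_accuracy` helper.

```python
from typing import Iterable, Tuple


def pairwise_accuracy(pairs: Iterable[Tuple[float, float]]) -> float:
    """Fraction of (chosen_score, rejected_score) pairs ranked correctly."""
    pairs = list(pairs)
    if not pairs:
        raise ValueError("no pairs to evaluate")
    correct = sum(1 for chosen, rejected in pairs if chosen > rejected)
    return correct / len(pairs)


# Illustrative scores only: two of the four pairs are ranked correctly.
acc = pairwise_accuracy([(1.2, 0.3), (0.1, 0.4), (2.0, 1.5), (0.0, 0.9)])
print(acc)
```

Under this definition, random scoring gives about 0.5, which is why 0.65 is treated as weak.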
+ ## Issue resolution (Reward Model ver2, 3)
+ - ![image](https://github.com/VAIV-2023/RLHF-Korean-Friendly-LLM/assets/79634774/99c7fd6c-448e-4780-9573-0ef51b8e3183)
+ - Added Evol-Instruct data to improve the evaluation of general-task answers
+ - When the SFT model generated both answers, the chosen and rejected answers differed too little for the Reward model to learn, so answers were generated with two models **(ChatGPT, SFT)** instead
+ - Training with hate speech data (Ver2) led to abnormal answers after Step 3 training, so the hate speech data was removed for training (Ver3)
+ - RM-ver1 used GPT-4 for chosen/rejected labeling; for ver2 and ver3, resource constraints meant that humans labeled only part of the data
+   - Everyday conversation and hate speech datasets
+     - Neither ChatGPT nor the SFT model consistently produced high-quality answers, so humans labeled these directly
+   - RLHF Korean translation and Evol-Instruct datasets
+     - ChatGPT consistently produced high-quality answers, so ChatGPT's answer was labeled chosen and the SFT model's answer rejected
+ ## Reward Model ver2,3 Evaluation
+ ![image](https://github.com/VAIV-2023/RLHF-Korean-Friendly-LLM/assets/79634774/7889398a-86dc-4b03-8300-64b772d49887)
+
+ # Task4. Building the final model with RLHF and DeepSpeedChat
+ - Uses DeepSpeedChat, which applies DeepSpeed (Microsoft's new memory optimization technology for large-scale distributed deep learning) to the RLHF process
+ - Using a Reward model trained on human preferences together with reinforcement learning, the SFT model is aligned with human preferences to produce a natural (FRIENDLY) and ethical (HARMLESS) chatbot
+
+ ## Baseline Models
+ - Actor Model: KULLM-SFT-V2
+ - Reward Model: Polyglot-Ko-Reward-V3
+
+ ## Training Options
+ ![image](https://github.com/VAIV-2023/VAIV2023/assets/79634774/ae2cdfe5-7552-4009-a99a-244e79d945dc)
+
+ ## RLHF Training
+ ![image](https://github.com/VAIV-2023/VAIV2023/assets/79634774/3d4dbf68-5222-4f6a-a6d0-87ea176c5211)
+ - Training results confirm that the reward, which measures the quality of the SFT model's answers, increases (the model generates answers that humans prefer more)
+
+ ## RLHF Model Evaluation
+ ![image](https://github.com/VAIV-2023/VAIV2023/assets/79634774/2b58ed3a-7ed5-4e60-ba4b-c9b291b1fdff)
+ ![image](https://github.com/VAIV-2023/VAIV2023/assets/79634774/75b2a1ee-d7c0-4ba9-ab2f-727abab644e9)
+
+ ## Final RLHF Model
+ - https://huggingface.co/Trofish/KULLM-RLHF
+
+
+ # Contributors 🙌
+ - 박성완 (Sungkyunkwan University, Department of Software, entering class of 2020, waniboyy@gmail.com)
+ - 송현빈 (Sungkyunkwan University, Department of Software, entering class of 2020, shbin0519@gmail.com)
+ - 허유민 (Sungkyunkwan University, Department of Software, entering class of 2021, ymheo1123@gmail.com)
+ - 홍여원 (Sungkyunkwan University, Department of Software, entering class of 2020, ryeowon13@gmail.com)
+