zhouliang committed
Commit 33502f3 • 1 Parent(s): 629e9cc
Update README.md
README.md CHANGED
@@ -18,6 +18,21 @@ Here are the steps to make this model:
3. Prepare the limarp+pippa dataset, clean it into alpaca format, then use [goliath-120b](https://huggingface.co/alpindale/goliath-120b), which is good at role-playing, to score each question-answer pair and keep only the high-quality ones, about 30k samples.
4. Run SFT on the base model obtained in step 2 using the data from step 3, fine-tuning for 6 epochs with r=16 and alpha=32.

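Steps 3 and 4 above can be sketched roughly as follows. This is a minimal illustration, not the author's pipeline: `filter_by_score`, `toy_judge`, and the `min_score` threshold are hypothetical names and values, and the toy judge only stands in for a call to goliath-120b. The `lora_scaling` helper assumes that `r` and `alpha` refer to the LoRA rank and alpha, in which case the low-rank update is conventionally scaled by `alpha / r`.

```python
from typing import Callable

# Hypothetical record type: one alpaca-style question-answer pair.
Pair = dict[str, str]


def filter_by_score(
    pairs: list[Pair],
    score_pair: Callable[[Pair], float],
    min_score: float,
) -> list[Pair]:
    """Keep only the pairs whose judge score reaches min_score."""
    return [pair for pair in pairs if score_pair(pair) >= min_score]


def lora_scaling(r: int, alpha: int) -> float:
    """Conventional LoRA scaling factor for the low-rank update: alpha / r."""
    return alpha / r


def toy_judge(pair: Pair) -> float:
    """Placeholder judge (assumption): longer outputs score higher.

    A real setup would prompt a judge model such as goliath-120b instead.
    """
    return min(10.0, len(pair["output"]) / 10)


if __name__ == "__main__":
    data = [
        {"instruction": "Greet the traveler.", "input": "", "output": "Hi."},
        {
            "instruction": "Greet the traveler.",
            "input": "",
            "output": "Welcome, traveler! Sit by the fire and tell me "
                      "where the road has taken you.",
        },
    ]
    kept = filter_by_score(data, toy_judge, min_score=5.0)
    print(len(kept), "of", len(data), "pairs kept")
    print("LoRA scaling for r=16, alpha=32:", lora_scaling(16, 32))
```

Passing the judge in as a callable keeps the filtering logic independent of how the scores are produced, so the placeholder can be swapped for a real model call without touching the rest.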
*Effect*:
Skilled at role-playing and highly accepting of NSFW content, while pure-love dialogue appears from time to time. For example:
```#3
@@ -44,6 +59,21 @@ Support me [here](https://ko-fi.com/mikolisa) :)
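3. Prepare the limarp+pippa dataset, clean it uniformly into alpaca format, then use [goliath-120b](https://huggingface.co/alpindale/goliath-120b), which is fairly good at role-playing, to score each question-answer pair and keep the roughly 30k high-quality samples.
4. Run SFT on the base model obtained in step 2 using the data from step 3, fine-tuning for 6 epochs with r=16 and alpha=32.
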
*Effect*
Skilled at role-playing and highly accepting of NSFW content, while pure-love dialogue appears from time to time. For example:
```#3
3. Prepare the limarp+pippa dataset, clean it into alpaca format, then use [goliath-120b](https://huggingface.co/alpindale/goliath-120b), which is good at role-playing, to score each question-answer pair and keep only the high-quality ones, about 30k samples.
4. Run SFT on the base model obtained in step 2 using the data from step 3, fine-tuning for 6 epochs with r=16 and alpha=32.

+*Format*
+
+alpaca
+```[
+  {
+    "instruction": "user instruction (required)",
+    "input": "user input (optional)",
+    "output": "model response (required)",
+    "history": [
+      ["user instruction in the first round (optional)", "model response in the first round (optional)"],
+      ["user instruction in the second round (optional)", "model response in the second round (optional)"]
+    ]
+  }
+]```
+
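To make the alpaca format above concrete, here is a small sketch that checks the required fields and flattens `history` plus the current exchange into ordered (user, model) turns. The helper names `validate_record` and `to_turns` are hypothetical, the sample record is invented for illustration, and joining `instruction` and `input` with a newline is an assumption, not something the README specifies.

```python
import json

REQUIRED_KEYS = ("instruction", "output")


def validate_record(record: dict) -> None:
    """Raise ValueError if a record breaks the format described above."""
    for key in REQUIRED_KEYS:
        if not isinstance(record.get(key), str) or not record[key]:
            raise ValueError(f"missing or empty required field: {key!r}")
    if not isinstance(record.get("input", ""), str):
        raise ValueError("'input' must be a string when present")
    for turn in record.get("history", []):
        if len(turn) != 2:
            raise ValueError("each history entry must be [instruction, response]")


def to_turns(record: dict) -> list[tuple[str, str]]:
    """Flatten history plus the current exchange into (user, model) turns."""
    validate_record(record)
    turns = [(user, model) for user, model in record.get("history", [])]
    user_text = record["instruction"]
    if record.get("input"):
        # Assumption: the optional input is appended on a new line.
        user_text += "\n" + record["input"]
    turns.append((user_text, record["output"]))
    return turns


# Invented sample data in the same shape as the format block.
sample = json.loads("""
[
  {
    "instruction": "Stay in character as the innkeeper and answer.",
    "input": "Do you have a room for the night?",
    "output": "Aye, one left upstairs.",
    "history": [["Good evening.", "Evening, traveler. What'll it be?"]]
  }
]
""")

for record in sample:
    for user, model in to_turns(record):
        print("USER:", user)
        print("MODEL:", model)
```

Earlier rounds come first and the current `instruction`/`output` pair comes last, which matches the ordering the `history` field implies.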
*Effect*:
Skilled at role-playing and highly accepting of NSFW content, while pure-love dialogue appears from time to time. For example:
```#3
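3. Prepare the limarp+pippa dataset, clean it uniformly into alpaca format, then use [goliath-120b](https://huggingface.co/alpindale/goliath-120b), which is fairly good at role-playing, to score each question-answer pair and keep the roughly 30k high-quality samples.
4. Run SFT on the base model obtained in step 2 using the data from step 3, fine-tuning for 6 epochs with r=16 and alpha=32.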

+*Format*
+
+alpaca
+```[
+  {
+    "instruction": "user instruction (required)",
+    "input": "user input (optional)",
+    "output": "model response (required)",
+    "history": [
+      ["user instruction in the first round (optional)", "model response in the first round (optional)"],
+      ["user instruction in the second round (optional)", "model response in the second round (optional)"]
+    ]
+  }
+]```
+
*Effect*
Skilled at role-playing and highly accepting of NSFW content, while pure-love dialogue appears from time to time. For example:
```#3