lemonilia committed on
Commit 28322a9
1 Parent(s): a5dec36

Update README.md

Files changed (1)
  1. README.md +12 -20
README.md CHANGED
@@ -5,9 +5,7 @@ license: apache-2.0
  # LimaRP-Mistral-7B-v0.1 (Alpaca, 8-bit LoRA adapter)
 
  This is a version of LimaRP for [Mistral-7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1) with
- about 1800 training samples _up to_ 4k tokens length. A 2-pass training procedure has been employed. The first pass includes
- finetuning on about 6800 stories within 4k tokens length and the second pass is LimaRP with changes introducing more effective
- control on response length.
+ about 1900 training samples _up to_ 9k tokens in length.
 
  For more details about LimaRP, see the model page for the [previously released v2 version for Llama-2](https://huggingface.co/lemonilia/limarp-llama2-v2).
  Most details written there apply for this version as well. Generally speaking, LimaRP is a longform-oriented, novel-style
@@ -16,15 +14,9 @@ IRC/Discord-style RP (aka "Markdown format") is not supported yet. The model doe
  only manually picked and slightly edited RP conversations with persona and scenario data.
 
  ## Known issues
- - Due to software limitations, finetuning didn't take advantage yet of the Sliding Window Attention (SWA) which would have allowed
- to use longer conversations in the training data and a more accurate behavior with Scenario information. Thus, this version of LimaRP
- should be considered preliminary and will be updated in the future.
  - Despite performing a few finetuning attempts, including one that followed almost the same procedure as in previous releases,
  Mistral-7B-v0.1 appears to have strange repetition issues.
  - Even though benchmarks tell a different story, in practice the model doesn't feel smarter during roleplay than Llama-2-13B.
- - Although the second finetuning pass (the primary driver for model outputs) included in general relatively high-quality data,
- the first finetuning pass, added in an attempt to improve creativity, comprised almost completely quality-unchecked data which
- may occasionally bring undesirable grammatical issues to the model's outputs.
 
  ## Prompt format
  Same as before. It uses the [extended Alpaca format](https://github.com/tatsu-lab/stanford_alpaca),
@@ -75,16 +67,16 @@ User: {utterance}
  Character: {utterance}
  ```
 
- This has an immediately noticeable effect on bot responses. The available lengths are:
- `tiny`, `short`, `medium`, `long`, `huge`, `humongous`, `extreme`, `unlimited`. **The
- recommended starting length is `medium`**. Keep in mind that the AI may ramble
- or impersonate the user with very long messages.
+ This has an immediately noticeable effect on bot responses. The lengths used during training are:
+ `micro`, `tiny`, `short`, `medium`, `long`, `massive`, `huge`, `enormous`, `humongous`, `unlimited`.
+ **The recommended starting length is `medium`**. Keep in mind that the AI can ramble or impersonate
+ the user with very long messages.
 
  The length control effect is reproducible, but the messages will not necessarily follow
  lengths very precisely, rather follow certain ranges on average, as seen in this table
  with data from tests made with one reply at the beginning of the conversation:
 
- ![lengths](https://files.catbox.moe/dy39bt.png)
+ ![lengths](https://i.imgur.com/2WXGgaV.png)
 
  Response length control appears to work well also deep into the conversation. **By omitting
  the modifier, the model will choose the most appropriate response length** (although it might
@@ -120,10 +112,10 @@ training process closer to a full finetune. It's suggested to merge the adapter
  the base Mistral-7B-v0.1 model.
 
  ### Training hyperparameters
- - learning_rate: 0.0001
+ - learning_rate: 0.0005
  - lr_scheduler_type: cosine
- - num_epochs: 2 (1 for the first pass)
- - sequence_len: 4096
+ - num_epochs: 2
+ - sequence_len: 9000
  - lora_r: 256
  - lora_alpha: 16
  - lora_dropout: 0.05
@@ -134,11 +126,11 @@ the base Mistral-7B-v0.1 model.
  - load_in_8bit: True
  - adapter: lora
  - micro_batch_size: 2
- - gradient_accumulation_steps: 1
- - warmup_steps: 40
+ - gradient_accumulation_steps: 32
+ - warmup_steps: 2
  - optimizer: adamw_torch
 
  For the second pass, the `lora_model_dir` option was used to continue finetuning on the LoRA
  adapter obtained from the first pass.
 
- Using 4 GPUs, the effective global batch size would have been 8.
+ Using 4 GPUs, the effective global batch size would have been 128.
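
The diff above describes a response-length modifier that can be appended to the extended Alpaca prompt. The snippet below is a minimal sketch of that usage, assuming the `(length = ...)` modifier is appended to the `### Response:` header as documented for earlier LimaRP releases; the persona, scenario, and instruction wording are illustrative placeholders, not text from this model card.

```python
# Minimal sketch of the extended Alpaca prompt with a response-length modifier.
# The "(length = ...)" placement follows earlier LimaRP releases; the persona,
# scenario, and instruction wording below are placeholders, not this model card's text.

LENGTHS = ["micro", "tiny", "short", "medium", "long",
           "massive", "huge", "enormous", "humongous", "unlimited"]

def build_prompt(persona: str, scenario: str, user_utterance: str,
                 length: str = "medium") -> str:
    """Assemble a single prompt turn. Pass length="" to omit the modifier
    and let the model choose its own response length."""
    response_header = "### Response:"
    if length:
        assert length in LENGTHS, f"unknown length modifier: {length}"
        response_header += f" (length = {length})"
    return (
        "### Instruction:\n"
        f"Character's Persona: {persona}\n\n"
        f"Scenario: {scenario}\n\n"
        "Play the role of Character in this roleplay with User.\n\n"
        "### Input:\n"
        f"User: {user_utterance}\n\n"
        f"{response_header}\n"
        "Character:"
    )

print(build_prompt("a terse village blacksmith", "a quiet forge at dusk",
                   "Could you repair this blade by morning?"))
```

As noted above, leaving the modifier out entirely lets the model pick the response length on its own.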
 
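The training-details hunks suggest merging the 8-bit LoRA adapter into the base Mistral-7B-v0.1 model. Below is a minimal sketch of that merge with the `peft` library; the adapter directory is a placeholder for a local copy of this repository's adapter files, and the float16 dtype is an assumption rather than something stated in the model card.

```python
# Sketch: merge the LimaRP LoRA adapter into the base Mistral-7B-v0.1 weights.
# ADAPTER_DIR is a placeholder for a local copy of this repository's adapter;
# dtype and output paths are assumptions, not taken from the model card.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

BASE_MODEL = "mistralai/Mistral-7B-v0.1"
ADAPTER_DIR = "./limarp-mistral-7b-adapter"  # placeholder path

base = AutoModelForCausalLM.from_pretrained(BASE_MODEL, torch_dtype=torch.float16)
tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)

# Attach the LoRA adapter, then fold its weights into the base model.
model = PeftModel.from_pretrained(base, ADAPTER_DIR)
merged = model.merge_and_unload()

# Save a standalone merged checkpoint for regular inference backends.
merged.save_pretrained("./limarp-mistral-7b-merged")
tokenizer.save_pretrained("./limarp-mistral-7b-merged")
```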