---
# For reference on model card metadata, see the spec: https://github.com/huggingface/hub-docs/blob/main/modelcard.md?plain=1
# Doc / guide: https://huggingface.co/docs/hub/model-cards
{{ card_data }}
---

This repository contains the unquantized merge of the [limarp-llama2 7B LoRA](https://huggingface.co/lemonilia/limarp-llama2) in GGML format.
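
If you prefer a llama.cpp-based backend, a GGML file like the one in this repository can be loaded with, for example, [llama-cpp-python](https://github.com/abetlen/llama-cpp-python). The sketch below is illustrative only; the filename is a placeholder for whichever file you actually download from this repository.

```python
# Minimal sketch: loading the GGML merge with llama-cpp-python.
# The filename below is a placeholder, not the actual file name in this repo.
from llama_cpp import Llama

llm = Llama(
    model_path="limarp-llama2-7b.ggmlv3.bin",  # placeholder filename
    n_ctx=4096,  # the original card suggests contexts of up to 4096 tokens
)
```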

Below are the contents of the original model card:

# Model Card for LIMARP-Llama2

LIMARP-Llama2 is an experimental [Llama2](https://huggingface.co/meta-llama) finetune narrowly focused on novel-style roleplay chatting.

## Model Details

### Model Description

This is an experimental attempt at creating an RP-oriented fine-tune using a manually curated, high-quality dataset of human-generated conversations. The main rationale for this comes from the observations of [Zhou et al.](https://arxiv.org/abs/2305.11206), who suggested that just 1000-2000 carefully curated training examples may yield high-quality output for assistant-type chatbots. This is in contrast with the commonly employed strategy where a very large number of training examples (tens of thousands to millions) of widely varying quality are used.

For LIMARP a similar approach was used, with the difference that the conversational data is almost entirely human-generated. Every training example is manually compiled and selected to comply with subjective quality parameters, with virtually no chance for OpenAI-style alignment responses to come up.

## Uses
<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->

The model is intended to approximate the experience of 1-on-1 roleplay as observed on many Internet forums dedicated to roleplaying. It _must_ be used with a specific format similar to that of this template:

```
<<SYSTEM>>
Character's Persona: a natural language description in simple present form of Character, without newlines. AI character information would go here.

User's Persona: a natural language description in simple present form of User, without newlines. Intended to provide information about the human.

Scenario: a natural language description of what is supposed to happen in the story, without newlines. You can be descriptive.

Play the role of Character. You must engage in a roleplaying chat with User below this line. Do not write dialogues and narration for User. Character should respond with messages of medium length.

<<AIBOT>>
Character: The AI-driven character wrote its narration in third person form and simple past. "This is not too complicated." He said.

<<HUMAN>>
User: The character assigned to the human also wrote narration in third person and simple past form. "You're completely right!" User agreed. "It's not complicated at all, and it's similar to the style used in books and novels."

User noticed that double newlines could be used as well. They did not affect the results as long as the correct instruction-mode sequences were used.

<<AIBOT>>
Character: [...]

<<HUMAN>>
User: [...]
```

Here, `<<SYSTEM>>`, `<<AIBOT>>` and `<<HUMAN>>` are special instruct-mode sequences.

It's possible to make the model automatically generate random character information and a scenario by adding just `<<SYSTEM>>` and the character name in text completion mode in `text-generation-webui`, as done here (click to enlarge). The format generally closely matches that of the training data:

![example](https://files.catbox.moe/5ntmcj.png)

Here is an example SillyTavern character card following the intended format: https://files.catbox.moe/r20w0r.png (download and import into SillyTavern)

And here is a sample of how the model is intended to behave with proper chat and prompt formatting: https://files.catbox.moe/egfd90.png

### More detailed notes on prompt format and other settings
- **The model has been tested mainly using Oobabooga's `text-generation-webui` as a backend.**
- **For somewhat improved compatibility with KoboldAI, this version of the model has been trained _without_ BOS or EOS tokens. They should be disabled in `text-generation-webui`.**
- Preferably respect the spacing and newlines shown above. This might not be possible yet with some front-ends.
- Replace `Character` and `User` in the above template with your desired names.
- The model expects the characters to use third-person narration in simple past and to enclose dialogue within standard quotation marks `" "`.
- Do not use newlines in Persona and Scenario. Use natural language.
- The last line in `<<SYSTEM>>` does not need to be written exactly as depicted, but should mention that `Character` and `User` will engage in roleplay and specify the length of `Character`'s messages.
- The message lengths used during training are: short, average, long, huge, humongous. However, there might not have been enough training examples for each length for this instruction to have a significant impact.
- Suggested text generation settings (a sketch applying them follows this list):
  - Temperature ~0.70
  - Tail-Free Sampling 0.85
  - Repetition penalty 1.05~1.10 (lower values preferred for stability)
  - Not used: Top-P (disabled/set to 1.0), Top-K (disabled/set to 0), Typical P (disabled/set to 1.0)
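
For instance, assuming the llama-cpp-python backend from the earlier sketch, these settings might map onto a completion call roughly as follows. This is an assumption for illustration only: whether a given sampler (such as tail-free sampling via `tfs_z`) is exposed, and under what name, depends on your backend and its version.

```python
# Rough, hypothetical mapping of the suggested sampling settings.
# `llm` and `prompt` come from the earlier sketches.
output = llm(
    prompt,
    max_tokens=300,                    # arbitrary example value
    temperature=0.70,
    tfs_z=0.85,                        # Tail-Free Sampling, if the backend exposes it
    repeat_penalty=1.05,
    top_p=1.0,                         # disabled
    top_k=0,                           # disabled
    stop=["<<HUMAN>>", "<<SYSTEM>>"],  # stop before the model writes the user's turn
)
print(output["choices"][0]["text"])
```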

### Out-of-Scope Use

<!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->

The model has not been tested for:

- IRC-style chat
- Markdown-style roleplay (asterisks for actions)
- Storywriting
- Usage without the suggested prompt format (without it, the model will output short and uninteresting responses)

Furthermore, the model is neither intended nor expected to provide factual and accurate information on any subject.
92
+
93
+ ## Bias, Risks, and Limitations
94
+
95
+ <!-- This section is meant to convey both technical and sociotechnical limitations. -->
96
+
97
+ The model will show biases similar to those observed in niche roleplaying forums on the Internet, besides those exhibited by the base model.
98
+
99
+ ### Recommendations
100
+
101
+ <!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->
102
+
103
+ The model may easily output disturbing and socially inappropriate content and therefore should not be used by minors or within environments where a general audience is expected.
104
+

## How to Get Started with the Model

Download and load with `text-generation-webui` as a back-end application. It's suggested to start the webui via the command line. Assuming you have copied the LoRA files under a subdirectory called `lora/limarp-llama2`, you would use something like this:

```
python server.py --api --verbose --model Llama-7B --lora limarp-llama2
```

Then, preferably use [SillyTavern](https://github.com/SillyTavern/SillyTavern) as a front-end with the following settings:

![SillyTavern settings](https://i.imgur.com/gDPC8gx.png)

**Important! Disable "Add BOS token"**. It is also recommended to enable "Ban EOS Token" and "Skip Special Tokens" (the model does not use them).

![Disabled BOS and EOS](https://i.imgur.com/9nlmV0q.png)

To take advantage of this model's larger context length, unlock the context size and set it to any length up to 4096 tokens, depending on your VRAM constraints.

![Unlock context size](https://files.catbox.moe/5vgpjt.png)

## Training Details

### Training Data

<!-- This should link to a Data Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->

The training data comprises **1005** manually edited roleplaying conversation threads from various Internet RP forums, totaling about 11 megabytes of text.

Character and Scenario information was filled in for every thread, mainly with the help of `gpt-4`, but otherwise the conversations in the dataset are almost entirely human-generated, except for a handful of messages. Character names in the RP stories have been isolated and replaced with standard placeholder strings. Usernames, out-of-character (OOC) messages and personal information have not been intentionally included.

### Training Procedure

<!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->

[QLoRA](https://arxiv.org/abs/2305.14314) by Dettmers et al. was used to finetune this model on a single consumer GPU.
141
+
142
+ #### Training Hyperparameters
143
+
144
+ The most important settings for QLoRA were as follows:
145
+
146
+ - --dataset-format input-output
147
+ - --train_on_source True
148
+ - --learning_rate 0.00006
149
+ - --lr_scheduler_type cosine
150
+ - --lora_r 32
151
+ - --max_steps -1
152
+ - --num_train_epochs 2
153
+ - --bf16 True
154
+ - --bits 4
155
+ - --per_device_train_batch_size 1
156
+ - --gradient_accumulation_steps 1
157
+ - --optim paged_adamw_32bit

An effective batch size of 1 was found to yield the lowest loss curves during fine-tuning.

It was also found that using `--train_on_source False` with the entire training example used as the output yields similar results.
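
The exact training script is not included here, but the hyperparameters above correspond to a fairly standard QLoRA setup. The sketch below mirrors them using `transformers`, `peft` and `bitsandbytes`; anything not listed in the card (the base model identifier, `lora_alpha`, `target_modules`, the output directory, and the dataset handling, which is omitted) is an assumption for illustration only, not the author's actual configuration.

```python
# Illustrative QLoRA configuration mirroring the listed hyperparameters.
# lora_alpha, target_modules and other unlisted values are placeholder assumptions.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig, TrainingArguments
from peft import LoraConfig, get_peft_model

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # --bits 4
    bnb_4bit_compute_dtype=torch.bfloat16,  # --bf16 True
    bnb_4bit_quant_type="nf4",
)

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",             # assumed base model identifier
    quantization_config=bnb_config,
    device_map="auto",
)

lora_config = LoraConfig(
    r=32,                                   # --lora_r 32
    lora_alpha=16,                          # assumption; not stated in the card
    target_modules=["q_proj", "v_proj"],    # assumption; not stated in the card
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)

training_args = TrainingArguments(
    output_dir="limarp-qlora",              # placeholder
    learning_rate=6e-5,                     # --learning_rate 0.00006
    lr_scheduler_type="cosine",
    num_train_epochs=2,                     # --num_train_epochs 2
    per_device_train_batch_size=1,
    gradient_accumulation_steps=1,
    bf16=True,
    optim="paged_adamw_32bit",
)
```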

<!-- ## Evaluation -->

<!-- This section describes the evaluation protocols and provides the results. -->

## Environmental Impact

<!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->

Finetuning this model requires about 1 kWh of electricity for 2 epochs.