---
license: apache-2.0
---

# LimaRP-Llama2-7B-v3 (Alpaca, experimental, 4-bit LoRA adapter)

This is an experimental version of LimaRP using a somewhat updated dataset (1800 training samples)
and a two-pass training procedure. The first pass consists of unsupervised tuning on 2800 stories
within 4k tokens, and the second pass is LimaRP itself.

For more details about LimaRP, see the model page for the [previously released version](https://huggingface.co/lemonilia/limarp-llama2-v2).
Most of the details written there also apply to this version.

## Prompt used
Same as before. It uses the Alpaca format, with `### Input:` immediately preceding user inputs and `### Response:`
immediately preceding model outputs.

```
### Instruction:
Character's Persona: {bot character description}

User's Persona: {user character description}

Scenario: {what happens in the story}

Play the role of Character. You must engage in a roleplaying chat with User below this line. Do not write dialogues and narration for User. Character should respond with messages of medium length.

### Input:
User: {utterance}

### Response:
Character: {utterance}
```

### Other notes
- Replace all the text in curly braces (curly braces included) with your own text.
- `User` and `Character` should be replaced with appropriate names.
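
To make the format more concrete, here is a small Python sketch of how a prompt in this layout could be assembled. The helper names, the example call at the bottom, and the trailing open `### Response:` turn are purely illustrative and are not part of the released files.

```python
# Illustrative helpers for assembling a prompt in the format shown above.
# Everything here (function names, example strings) is hypothetical, not taken from this repository.

def build_system_block(char_name: str, user_name: str,
                       char_persona: str, user_persona: str,
                       scenario: str, length: str = "medium") -> str:
    """Build the ### Instruction: block that opens the prompt."""
    return (
        "### Instruction:\n"
        f"{char_name}'s Persona: {char_persona}\n\n"
        f"{user_name}'s Persona: {user_persona}\n\n"
        f"Scenario: {scenario}\n\n"
        f"Play the role of {char_name}. You must engage in a roleplaying chat with "
        f"{user_name} below this line. Do not write dialogues and narration for "
        f"{user_name}. {char_name} should respond with messages of {length} length.\n"
    )


def add_turn(prompt: str, name: str, utterance: str, is_user: bool) -> str:
    """Append one chat turn: ### Input: for the user, ### Response: for the model."""
    header = "### Input:" if is_user else "### Response:"
    return prompt + f"\n{header}\n{name}: {utterance}\n"


prompt = build_system_block("Character", "User",
                            "{bot character description}",
                            "{user character description}",
                            "{what happens in the story}")
prompt = add_turn(prompt, "User", "{utterance}", is_user=True)
prompt += "\n### Response:\nCharacter:"  # leave the model's turn open for generation
```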

## Training Hyperparameters
[Axolotl](https://github.com/OpenAccess-AI-Collective/axolotl) was used for training.
The model has been trained as a 4-bit LoRA adapter. The adapter is relatively large because a LoRA rank
of 256 was used. It is suggested to merge it into the base Llama2-7B model (a possible way to do this is sketched at the end of this section).

- learning_rate: 0.0002
- lr_scheduler_type: constant
- lora_r: 256
- lora_alpha: 16
- lora_dropout: 0.1
- lora_target_linear: True
- num_epochs: 1
- bf16: True
- tf32: True
- load_in_4bit: True
- adapter: qlora
- micro_batch_size: 2
- gradient_accumulation_steps: 1
- optimizer: adamw_torch

For the multi-stage training, the `lora_model_dir` option was used to load and train the
previously created adapter.
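
The suggested merge into the base model can be done, for example, with the Hugging Face `peft` library. The snippet below is only a rough sketch under assumptions not stated in this repository: the base model ID and the local adapter/output paths are placeholders.

```python
# Rough sketch: folding this LoRA adapter into the Llama2-7B base weights with PEFT.
# The model ID and the paths below are placeholders, not values taken from this repository.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "meta-llama/Llama-2-7b-hf"           # assumed base model
adapter_dir = "./limarp-llama2-7b-v3-adapter"  # local copy of this adapter (placeholder path)
output_dir = "./limarp-llama2-7b-v3-merged"    # where the merged weights will be saved

base = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype="auto")
tokenizer = AutoTokenizer.from_pretrained(base_id)

# Attach the LoRA adapter, then fold its weights into the base model.
model = PeftModel.from_pretrained(base, adapter_dir)
merged = model.merge_and_unload()

merged.save_pretrained(output_dir)
tokenizer.save_pretrained(output_dir)
```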