Fizzarolli committed
Commit dab3f7b
1 Parent(s): b1a191b

Update README.md

Files changed (1)
  1. README.md +6 -7
README.md CHANGED
@@ -9,31 +9,30 @@ tags:
  # Teleut 7b RP
  [cute boygirlthing pending]
 
- A roleplay-focused LoRA finetune of Teleut 7b. Methodology and hyperparams inspired by [SorcererLM](https://huggingface.co/rAIfle/SorcererLM-8x22b-bf16).
+ A roleplay-focused LoRA finetune of Teleut 7b. Methodology and hyperparams inspired by [SorcererLM](https://huggingface.co/rAIfle/SorcererLM-8x22b-bf16) and [Slush](https://huggingface.co/crestf411/Q2.5-32B-Slush).
 
  ## Dataset
  The worst mix of data you've ever seen. Like, seriously, you do not want to see the things that went into this model. It's bad.
 
  ## Recommended Settings
- Chat template: ChatML
+ Chat template: ChatML
  Recommended samplers (not the be-all-end-all, try some on your own!):
  - Temp 1.03 / TopK 200 / MinP 0.05 / TopA 0.2
  - Temp 1.03 / TFS 0.75 / TopA 0.3
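
To try the ChatML template and the first sampler preset outside of a chat frontend, here is a minimal sketch using Hugging Face Transformers. The repo id is a placeholder assumption, TopA and TFS are not exposed by `transformers` (backends such as text-generation-webui or KoboldCpp support them), and `min_p` requires a recent Transformers release:

```python
# Illustrative sketch only; the repo id below is an assumption, not confirmed by this card.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "allura-org/Teleut-7b-RP"  # placeholder: substitute the actual repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# ChatML is applied via the tokenizer's built-in chat template.
messages = [
    {"role": "system", "content": "You are Seraphina, a wandering knight."},
    {"role": "user", "content": "The tavern door creaks open..."},
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# First recommended preset. TopA is not available in transformers,
# so only Temp / TopK / MinP are set here.
outputs = model.generate(
    inputs,
    max_new_tokens=256,
    do_sample=True,
    temperature=1.03,
    top_k=200,
    min_p=0.05,
)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```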
 
  ## Hyperparams
- General:
+ ### General
  - Epochs = 2
  - LR = 6e-5
  - LR Scheduler = Cosine
  - Optimizer = Paged AdamW 8bit
  - Effective batch size = 12
-
- LoRA:
+ ### LoRA
  - Rank = 16
  - Alpha = 32
  - Dropout = 0.25 (Inspiration: [Slush](https://huggingface.co/crestf411/Q2.5-32B-Slush))
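
For illustration, here is one way the hyperparams above could map onto a `peft` + `transformers` setup. This is a hedged sketch, not the actual training script: the per-device batch / gradient-accumulation split is an assumed factorization of the effective batch size of 12, and `output_dir` is a placeholder:

```python
# Sketch: maps the card's hyperparams onto peft + transformers.
# The batch split (2 per device x 6 accumulation = 12 effective) is one
# possible realization, not the authors' confirmed setup.
from peft import LoraConfig
from transformers import TrainingArguments

lora_config = LoraConfig(
    r=16,             # Rank = 16
    lora_alpha=32,    # Alpha = 32
    lora_dropout=0.25,
    task_type="CAUSAL_LM",
)

training_args = TrainingArguments(
    output_dir="teleut-7b-rp-lora",  # placeholder path
    num_train_epochs=2,              # Epochs = 2
    learning_rate=6e-5,              # LR = 6e-5
    lr_scheduler_type="cosine",      # LR Scheduler = Cosine
    optim="paged_adamw_8bit",        # Paged AdamW 8bit (needs bitsandbytes)
    per_device_train_batch_size=2,   # assumed split:
    gradient_accumulation_steps=6,   # 2 x 6 = 12 effective batch size
)
```

Note that with Rank = 16 and Alpha = 32, the LoRA scaling factor (alpha / rank) works out to 2.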
 
  ## Credits
- Thanks to the people who created the data. I would credit you, but that would be cheating ;)
- Thanks to all Allura members, especially Toasty, for testing and emotional support ilya /platonic
+ Humongous thanks to the people who created the data. I would credit you all, but that would be cheating ;)
+ Big thanks to all Allura members, especially Toasty, for testing and emotional support ilya /platonic
  NO thanks to Infermatic. They suck at hosting models