---
license: apache-2.0
datasets:
- adamo1139/rawrr_v2
- adamo1139/AEZAKMI_v3-3
- unalignment/toxic-dpo-v0.1
tags:
- uncensored
---
## Model Description

<b>*Probably the most uncensored Yi-34B tune I've published so far*</b>

Yi-34B 200K base model fine-tuned on the RAWrr v2 dataset via DPO, then fine-tuned on the AEZAKMI v3-3 dataset via SFT, then DPO-tuned on unalignment/toxic-dpo-v0.1. Total GPU compute time was roughly 40-50 hours. It's like airoboros/capybara but with less gptslop, no refusals, and less of the stock language typical of RLHF'd OpenAI models. Say goodbye to "It's important to remember"!
Prompt format is standard ChatML. Don't expect it to be good at instruction following, math, or riddles, or to be crazy smart. My end goal with AEZAKMI is to create a cozy free chatbot. The base model used for fine-tuning was the 200K-context Yi-34B-Llama model shared by larryvrh.
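
Below is a minimal sketch of what one of the DPO stages could look like with trl's `DPOTrainer`. This is not the actual training script: the checkpoint path, hyperparameters, and exact trl argument names (which have shifted between versions) are all assumptions.

```python
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import DPOConfig, DPOTrainer

# Hypothetical path to the checkpoint being preference-tuned.
checkpoint = "path/to/previous-stage-checkpoint"
model = AutoModelForCausalLM.from_pretrained(checkpoint)
tokenizer = AutoTokenizer.from_pretrained(checkpoint)

# rawrr_v2 is a preference dataset; prompt/chosen/rejected columns assumed.
dataset = load_dataset("adamo1139/rawrr_v2", split="train")

# Illustrative hyperparameters only, not the values used for this model.
args = DPOConfig(output_dir="dpo-stage", beta=0.1, per_device_train_batch_size=1)

trainer = DPOTrainer(model=model, args=args, train_dataset=dataset, tokenizer=tokenizer)
trainer.train()
```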

Training was done with max_position_embeddings set to 4096; this was reverted back to 200K after applying the LoRA.
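
A minimal sketch of that context-length revert, assuming it just means editing `max_position_embeddings` in the merged model's `config.json` (the path is a placeholder, and 200K is assumed to mean 200000):

```python
import json

# Placeholder path to the merged model directory.
config_path = "path/to/merged-model/config.json"

with open(config_path) as f:
    config = json.load(f)

# Training ran with a 4096 window; restore the base model's 200K context.
config["max_position_embeddings"] = 200000

with open(config_path, "w") as f:
    json.dump(config, f, indent=2)
```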


## Prompt Format

I recommend using ChatML format, as this was used during fine-tuning. \
Here's the prompt format you should use. You can set a different system message; the model seems to respect that fine, so it wasn't overfitted to a single one.
Both the `A chat.` and `A chat with uncensored assistant.` system prompts work fine and are pretty refusal-free.



```
<|im_start|>system
A chat with uncensored assistant.<|im_end|>
<|im_start|>user
{prompt}<|im_end|>
<|im_start|>assistant
```

Example:

```
<|im_start|>system
A chat.<|im_end|>
<|im_start|>user
I want an acrostic poem based on the word AEZAKMI and make it all about GTA:SA<|im_end|>
<|im_start|>assistant
A voracious appetite for danger,
Eager to explore every corner,
Zealously tackling missions with vigor,
Amidst San Andreas' vast expanse,
Keenly skilled in the art of evasion,
Mastery achieved through trial and error,
In this digital realm of chaos and thrill,<|im_end|>
```
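
For completeness, here's a hedged sketch of running this prompt format with `transformers`. The model path is a placeholder, and the `<|im_end|>` stop token and bfloat16 loading are assumptions:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "path/to/this/model"  # placeholder: point at this repo or a local copy
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# Build the ChatML prompt exactly as shown above.
prompt = (
    "<|im_start|>system\n"
    "A chat.<|im_end|>\n"
    "<|im_start|>user\n"
    "I want an acrostic poem based on the word AEZAKMI and make it all about GTA:SA<|im_end|>\n"
    "<|im_start|>assistant\n"
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(
    **inputs,
    max_new_tokens=256,
    do_sample=True,
    temperature=0.4,  # within the 0.3-0.5 range recommended in the Notes below
    eos_token_id=tokenizer.convert_tokens_to_ids("<|im_end|>"),
)
print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```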
## Notes

Temperature around 0.3-0.5 seems to work well; at 1.2 the model is somewhat unstable, which is often undesirable.

## Intended uses & limitations

It's a chat model, not a base completion-only one.
Use is limited by the apache-2.0 license. Since the no_robots dataset was used for making rawrr_v1, you probably shouldn't use this model for commercial activities.

## Known Issues

It likes to talk about stocks a lot; sometimes it feels like being on WSB, which is certainly a plus for some use cases. This one doesn't seem slopped to me, so I think I'll stick with it for a while.


### Credits
- Thanks to mlabonne, Daniel Han, and Michael Han for providing the open-source code that was used for fine-tuning.
- Thanks to jondurbin and the team behind the Capybara dataset for the airoboros/toxic-dpo/capybara datasets.
- Thanks to HF for open-sourcing the no_robots dataset.
- Thanks to Sentdex for providing the WSB dataset.