brucethemoose
committed on
Commit
•
43b201d
1
Parent(s):
3c89600
Create README.md
README.md
ADDED
@@ -0,0 +1,61 @@
1 |
+
---
|
2 |
+
license: other
|
3 |
+
license_name: yi-license
|
4 |
+
license_link: https://huggingface.co/01-ai/Yi-34B-200K/blob/main/LICENSE
|
5 |
+
datasets:
|
6 |
+
- ai2_arc
|
7 |
+
- unalignment/spicy-3.1
|
8 |
+
- codeparrot/apps
|
9 |
+
- facebook/belebele
|
10 |
+
- boolq
|
11 |
+
- jondurbin/cinematika-v0.1
|
12 |
+
- drop
|
13 |
+
- lmsys/lmsys-chat-1m
|
14 |
+
- TIGER-Lab/MathInstruct
|
15 |
+
- cais/mmlu
|
16 |
+
- Muennighoff/natural-instructions
|
17 |
+
- openbookqa
|
18 |
+
- piqa
|
19 |
+
- Vezora/Tested-22k-Python-Alpaca
|
20 |
+
- cakiki/rosetta-code
|
21 |
+
- Open-Orca/SlimOrca
|
22 |
+
- spider
|
23 |
+
- squad_v2
|
24 |
+
- migtissera/Synthia-v1.3
|
25 |
+
- datasets/winogrande
|
26 |
+
- nvidia/HelpSteer
|
27 |
+
- Intel/orca_dpo_pairs
|
28 |
+
- unalignment/toxic-dpo-v0.1
|
29 |
+
- jondurbin/truthy-dpo-v0.1
|
30 |
+
- allenai/ultrafeedback_binarized_cleaned
|
31 |
+
- Squish42/bluemoon-fandom-1-1-rp-cleaned
|
32 |
+
- LDJnr/Capybara
|
33 |
+
- JULIELab/EmoBank
|
34 |
+
- kingbri/PIPPA-shareGPT
|
35 |
+
---
|
36 |
+
|
37 |
+
# A bagel, with everything
|
38 |
+
|
39 |
+
![bagel](bagel.png)
|
40 |
+
|
41 |
+
A fiction-oriented 6bpw exl2 quantization of https://huggingface.co/jondurbin/bagel-dpo-34b-v0.2
|
42 |
+
|
43 |
+
Quantized on 300K tokens drawn from two Vicuna-format chats, a sci-fi story, and a fiction story, all at a long context. This should yield better storywriting performance than a quantization made with exl2's default calibration data.
|
44 |
+
|
45 |
+
|
46 |
+
***
|
47 |
+
## Running
|
48 |
+
Since this is a Yi model, try running a lower temperature with ~0.05 MinP, a little repetition penalty, maybe Mirostat with a low tau, and no other samplers. Yi tends to run "hot" by default.
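
For readers unfamiliar with MinP, here is a minimal sketch of what the filter does: it keeps only tokens whose probability is at least `min_p` times the probability of the most likely token, then renormalizes. This is an illustration in plain Python, not the code exllamav2 or any UI actually runs. The function name `min_p_filter` is hypothetical, and applying temperature before the filter is an assumption, since the ordering differs between backends.

```python
import math


def min_p_filter(logits, min_p=0.05, temperature=1.0):
    """Return renormalized token probabilities after min-p filtering.

    Hypothetical helper for illustration only. Temperature is applied
    before the filter here, which is an assumption about sampler order.
    """
    # Temperature-scaled, numerically stable softmax.
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Drop every token below min_p times the top token's probability.
    threshold = min_p * max(probs)
    kept = [p if p >= threshold else 0.0 for p in probs]

    # Renormalize the survivors so they sum to 1 again.
    kept_total = sum(kept)
    return [p / kept_total for p in kept]


if __name__ == "__main__":
    # The third token is far less likely than the first, so it is removed.
    print(min_p_filter([5.0, 4.0, 0.0], min_p=0.05, temperature=0.8))
```

Because the cutoff scales with the top token's probability, the filter prunes aggressively when the model is confident and loosely when it is not, which is one reason it pairs well with a model that runs "hot".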
|
49 |
+
|
50 |
+
24GB GPUs can run Yi-34B-200K models at **45K-75K context** with exllamav2 and performant UIs like [exui](https://github.com/turboderp/exui). I go into more detail in this [post](https://old.reddit.com/r/LocalLLaMA/comments/1896igc/how_i_run_34b_models_at_75k_context_on_24gb_fast/).
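
As a rough sanity check on those context figures, the sketch below estimates how many tokens of KV cache fit next to the quantized weights. Everything numeric in it is an assumption rather than something stated in this README: roughly 34.4B parameters, 60 layers, 8 key/value heads, and a head dimension of 128 for Yi-34B, plus a one-byte-per-element cache. It also ignores activations, the 6-bit head, and framework overhead, so treat the result as an upper bound, not a measurement.

```python
def kv_bytes_per_token(layers, kv_heads, head_dim, bytes_per_elem):
    """Bytes of KV cache per token: one K and one V vector per layer."""
    return 2 * layers * kv_heads * head_dim * bytes_per_elem


def weight_bytes(params, bits_per_weight):
    """Approximate size of the quantized weights in bytes."""
    return params * bits_per_weight / 8


def max_context_tokens(vram_bytes, params, bits_per_weight,
                       layers, kv_heads, head_dim, bytes_per_elem,
                       overhead_bytes=0):
    """Upper bound on context length that fits in the remaining VRAM."""
    free = vram_bytes - weight_bytes(params, bits_per_weight) - overhead_bytes
    if free <= 0:
        return 0
    per_token = kv_bytes_per_token(layers, kv_heads, head_dim, bytes_per_elem)
    return int(free // per_token)


if __name__ == "__main__":
    # Assumed Yi-34B shape and an assumed 8-bit cache (1 byte per element).
    tokens = max_context_tokens(
        vram_bytes=24 * 1024**3,
        params=34.4e9,
        bits_per_weight=4.0,
        layers=60, kv_heads=8, head_dim=128,
        bytes_per_elem=1,
    )
    print(f"Estimated upper bound: {tokens} tokens")
```

Lowering `bits_per_weight` or the cache element size raises the bound, which is the general trade-off behind fitting long contexts on a 24GB card.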
|
51 |
+
|
52 |
+
***
|
53 |
+
## Commands
|
54 |
+
First pass:
|
55 |
+
```
|
56 |
+
python convert.py --in_dir /home/alpha/FastModels/jondurbin_bagel-dpo-34b-v0.2 -o /home/alpha/FastModels/scratch -om /home/alpha/FastModels/bagelmeas.json --cal_dataset /home/alpha/Documents/stories.parquet -ml 32768 -mr 7 -ss 4096 -b 4.0 -hb 6 -nr
|
57 |
+
```
|
58 |
+
Second pass:
|
59 |
+
```
|
60 |
+
python convert.py --in_dir /home/alpha/FastModels/jondurbin_bagel-dpo-34b-v0.2 -o /home/alpha/FastModels/scratch -m /home/alpha/FastModels/bagelmeas.json --cal_dataset /home/alpha/Documents/stories.parquet -l 12288 -r 25 -ml 32768 -mr 9 -ss 4096 -b 4.0 -hb 6 -cf /home/alpha/FastModels/jondurbin_bagel-dpo-34b-v0.2-exl2-4bpw-fiction -nr
|
61 |
+
```
|