Merry committed commit 04872d9 · 1 parent: 84b6353

Update README.md

Files changed (1): README.md (+9 −7)
```diff
@@ -10,23 +10,25 @@ datasets:
 - EleutherAI/the_pile_deduplicated
 ---
 
-Converted and quantized from EleutherAI's Pythia Deduped checkpoints.
+### This repository contains quantized conversions of EleutherAI's Pythia Deduped checkpoints.
 
 Notes:
 - Converted with ggerganov/ggml's gpt-neox conversion script, and tested with KoboldCpp.
-- I can't promise that this will work, especially with other frontends. I've had problems when generating words like "Alice" or "Hakurei" / "Gensokyo". Could be related to the ggml implementation of GPT-NeoX having a "hacked" tokenizer [(source)](https://github.com/ggerganov/ggml/tree/master/examples/gpt-neox#notes).
+- I can't promise that this will work, especially with other frontends. ~~I've had problems when generating words like "Alice" or "Hakurei" / "Gensokyo". Could be related to the ggml implementation of GPT-NeoX having a "hacked" tokenizer [(source)](https://github.com/ggerganov/ggml/tree/master/examples/gpt-neox#notes).~~ **This seems to have been improved with KoboldCpp v1.25.1 and my ggmlv3 conversions.**
 
 Versions:
 
-**2023-04-20:** *q4_3. Used [commit 05f3079](https://github.com/ggerganov/ggml/tree/05f307971862b83df12fada0c42ee027ba5a82b5/examples/stablelm).*
+**2023-04-20:** *q4_3. Used [commit 05f3079](https://github.com/ggerganov/ggml/tree/05f307971862b83df12fada0c42ee027ba5a82b5/examples/stablelm)*
 
-**2023-04-30:** *q5_0, q5_1, and q8_0, up to 2.8B. I can't upload all conversions of 6.9B and 12B due to my internet connection. Used [commit 5dd92f4](https://github.com/ggerganov/ggml/tree/5dd92f421ee44f18b8fde0afbf5ca8fc7bf93841/examples/stablelm).*
+**2023-04-30:** *q5_0, q5_1, and q8_0, up to 2.8B. I can't upload all conversions of 6.9B and 12B due to my internet connection. Used [commit 5dd92f4](https://github.com/ggerganov/ggml/tree/5dd92f421ee44f18b8fde0afbf5ca8fc7bf93841/examples/stablelm)*
 
-**2023-05-06:** *q4_0 and q4_2, up to 2.8B. Used [commit ff6e03c](https://github.com/ggerganov/ggml/tree/ff6e03cbcd9bf6e9fa41d49f2495c042efae4dc6/examples/stablelm).*
+**2023-05-06:** *q4_0 and q4_2, up to 2.8B. Used [commit ff6e03c](https://github.com/ggerganov/ggml/tree/ff6e03cbcd9bf6e9fa41d49f2495c042efae4dc6/examples/stablelm)*
 
-**2023-05-15:** **RECOMMENDED** - *New quantization format. q4_0 and q5_1, up to 2.8B. Used [commit 010203f](https://github.com/ggerganov/ggml/tree/010203f94a85df5c86b773dc5acb698c8e7b1e7b/examples/gpt-neox).*
+**2023-05-15:** *New quantization format (ggmlv2). q4_0 and q5_1, up to 2.8B. Used [commit 010203f](https://github.com/ggerganov/ggml/tree/010203f94a85df5c86b773dc5acb698c8e7b1e7b/examples/gpt-neox)*
 
-They're separated by date and commit so it's easier to keep track of any breaking changes.
+**2023-05-25 (RECOMMENDED):** *New quantization format (ggmlv3). q4_0 and q5_1, up to 2.8B. Used [commit 73ad593](https://github.com/ggerganov/ggml/tree/73ad593cf84f864f0fcfd3a196253575c70d66a2/examples/gpt-neox)*
+
+They're separated by date and commit so it's easier to keep track of any breaking changes.
 
 # RAM USAGE (on KoboldCpp w/ OpenBLAS)
 Model | Initial RAM | After generation
```
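
The version notes in the diff name several ggml quantization formats (q4_0, q5_1, q8_0). For a rough sense of what those formats mean for file size, here is a back-of-the-envelope sketch. It is not taken from this README: the per-block byte counts are my assumptions based on ggml's 32-weight block layouts (fp16 scale plus packed quants), and the helper `est_size_gib` is hypothetical — verify against the pinned ggml commits before relying on the numbers.

```python
# Assumed bytes per 32-weight block in ggml's quantization layouts:
#   q4_0: 2-byte fp16 scale + 16 bytes of 4-bit quants            = 18
#   q5_1: 2-byte scale + 2-byte min + 4 bytes high bits + 16 bytes = 24
#   q8_0: 2-byte fp16 scale + 32 bytes of 8-bit quants             = 34
BLOCK_BYTES = {"q4_0": 18, "q5_1": 24, "q8_0": 34}
WEIGHTS_PER_BLOCK = 32

def est_size_gib(n_params: float, fmt: str) -> float:
    """Rough quantized size in GiB, ignoring any fp16/fp32 tensors."""
    n_blocks = n_params / WEIGHTS_PER_BLOCK
    return n_blocks * BLOCK_BYTES[fmt] / 2**30

# Estimate for the largest model these conversions cover (2.8B params).
for fmt in ("q4_0", "q5_1", "q8_0"):
    print(f"pythia-2.8b-deduped @ {fmt}: ~{est_size_gib(2.8e9, fmt):.2f} GiB")
```

This also shows why the author stops at 2.8B for most formats: the 6.9B and 12B checkpoints scale these sizes up by roughly 2.5× and 4.3×, which is a lot to upload on a slow connection.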