---
language:
- en
tags:
- ggml
- causal-lm
- pythia
license: apache-2.0
datasets:
- EleutherAI/the_pile_deduplicated
---

# Pythia Deduped Series GGML
### This repository contains quantized conversions of EleutherAI's Pythia Deduped checkpoints.
*For use with frontends that support GGML quantized GPT-NeoX models, such as KoboldCpp and Oobabooga (with the CTransformers loader).*

*Last updated on 2023-05-25.*

For other versions of the models, see here:
- [GGMLv1 q4_3](https://huggingface.co/Crataco/Pythia-Deduped-Series-GGML/tree/a695a4c30c01ed9a41200c01f85d47c819fc93dd/2023-04-20) (70M to 12B)
- [GGMLv1 q5_0 / q5_1 / q8_0](https://huggingface.co/Crataco/Pythia-Deduped-Series-GGML/tree/a695a4c30c01ed9a41200c01f85d47c819fc93dd/2023-04-30) (70M to 2.8B)
- [GGMLv1 q4_0 / q4_2](https://huggingface.co/Crataco/Pythia-Deduped-Series-GGML/tree/a695a4c30c01ed9a41200c01f85d47c819fc93dd/2023-05-06) (70M to 2.8B)
- [GGMLv2 q4_0 / q5_1](https://huggingface.co/Crataco/Pythia-Deduped-Series-GGML/tree/a695a4c30c01ed9a41200c01f85d47c819fc93dd/2023-05-15) (70M to 2.8B)
- [GGMLv3 q4_0 / q5_1](https://huggingface.co/Crataco/Pythia-Deduped-Series-GGML/tree/main) (70M to 2.8B)

**Description:**
- These quantizations exist because the LLaMA series has no sizes below 7B, whereas older model families were commonly available with as few as ~125M parameters. As a result, LLaMA is uncomfortable to run on hardware with less than 4 GB of RAM, even with 2-bit quantization.

# RAM USAGE
Model | RAM usage
:--:|:--:
Unloaded | 41.3 MiB
|
ggmlv3-pythia-70m-deduped-q4_0.bin | 95.5 MiB
ggmlv3-pythia-160m-deduped-q4_0.bin | 201.1 MiB
ggmlv3-pythia-410m-deduped-q4_0.bin | 415.1 MiB
ggmlv3-pythia-1b-deduped-q4_0.bin | 762.2 MiB
ggmlv3-pythia-1.4b-deduped-q4_0.bin | 1.0 GiB
ggmlv3-pythia-2.8b-deduped-q4_0.bin | 1.9 GiB
|
ggmlv3-pythia-70m-deduped-q5_1.bin | 108.7 MiB
ggmlv3-pythia-160m-deduped-q5_1.bin | 226.9 MiB
ggmlv3-pythia-410m-deduped-q5_1.bin | 494.0 MiB
ggmlv3-pythia-1b-deduped-q5_1.bin | 943.9 MiB
ggmlv3-pythia-1.4b-deduped-q5_1.bin | 1.3 GiB
ggmlv3-pythia-2.8b-deduped-q5_1.bin | 2.3 GiB

*Tested on KoboldCpp with OpenBLAS enabled.*
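
The table above can be used to pick a file for a given memory budget. The following is a minimal sketch, not part of the repository: the helper names `models_that_fit` and `largest_that_fits` and the `headroom_mib` parameter are hypothetical, the GiB rows are converted at 1 GiB = 1024 MiB from already-rounded values, and the figures are taken as-is without assuming whether they include the 41.3 MiB unloaded baseline.

```python
# RAM usage copied from the table above, in MiB.
# Rows given in GiB are converted at 1024 MiB per GiB; the source values
# are rounded to one decimal place, so these entries are approximate.
RAM_USAGE_MIB = {
    "ggmlv3-pythia-70m-deduped-q4_0.bin": 95.5,
    "ggmlv3-pythia-160m-deduped-q4_0.bin": 201.1,
    "ggmlv3-pythia-410m-deduped-q4_0.bin": 415.1,
    "ggmlv3-pythia-1b-deduped-q4_0.bin": 762.2,
    "ggmlv3-pythia-1.4b-deduped-q4_0.bin": 1.0 * 1024,
    "ggmlv3-pythia-2.8b-deduped-q4_0.bin": 1.9 * 1024,
    "ggmlv3-pythia-70m-deduped-q5_1.bin": 108.7,
    "ggmlv3-pythia-160m-deduped-q5_1.bin": 226.9,
    "ggmlv3-pythia-410m-deduped-q5_1.bin": 494.0,
    "ggmlv3-pythia-1b-deduped-q5_1.bin": 943.9,
    "ggmlv3-pythia-1.4b-deduped-q5_1.bin": 1.3 * 1024,
    "ggmlv3-pythia-2.8b-deduped-q5_1.bin": 2.3 * 1024,
}


def models_that_fit(budget_mib, headroom_mib=0.0):
    """Return file names whose listed RAM usage fits the budget, smallest first.

    headroom_mib is a hypothetical safety margin reserved for the OS and
    other processes; it is not something the table measures.
    """
    limit = budget_mib - headroom_mib
    fitting = [(usage, name) for name, usage in RAM_USAGE_MIB.items() if usage <= limit]
    return [name for _, name in sorted(fitting)]


def largest_that_fits(budget_mib, headroom_mib=0.0):
    """Return the file with the highest listed RAM usage that still fits, or None."""
    fitting = models_that_fit(budget_mib, headroom_mib)
    return fitting[-1] if fitting else None


if __name__ == "__main__":
    for budget in (128, 512, 1024, 4096):
        print(f"{budget} MiB -> {largest_that_fits(budget)}")
```

The numbers were measured on KoboldCpp with OpenBLAS, so other frontends or context lengths may use more or less memory; leaving some headroom is a reasonable precaution.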