
Pythia Deduped Series GGML

This repository contains quantized conversions of EleutherAI's Pythia Deduped checkpoints.

For use with frontends that support GGML-quantized GPT-NeoX models, such as KoboldCpp and Oobabooga's text-generation-webui (with the ctransformers loader).

Last updated on 2023-05-25.

For other versions of the models, see here:

Description:

  • The motivation behind these quantizations is that the LLaMA series lacks sizes below 7B parameters, whereas older model families were commonly available in sizes as small as ~125M parameters. Even with 2-bit quantization, a 7B model is uncomfortable to run on hardware with less than 4 GB of RAM; these smaller Pythia checkpoints fill that gap.

RAM USAGE

| Model | RAM usage |
| --- | --- |
| Unloaded | 41.3 MiB |
| ggmlv3-pythia-70m-deduped-q4_0.bin | 95.5 MiB |
| ggmlv3-pythia-160m-deduped-q4_0.bin | 201.1 MiB |
| ggmlv3-pythia-410m-deduped-q4_0.bin | 415.1 MiB |
| ggmlv3-pythia-1b-deduped-q4_0.bin | 762.2 MiB |
| ggmlv3-pythia-1.4b-deduped-q4_0.bin | 1.0 GiB |
| ggmlv3-pythia-2.8b-deduped-q4_0.bin | 1.9 GiB |
| ggmlv3-pythia-70m-deduped-q5_1.bin | 108.7 MiB |
| ggmlv3-pythia-160m-deduped-q5_1.bin | 226.9 MiB |
| ggmlv3-pythia-410m-deduped-q5_1.bin | 494.0 MiB |
| ggmlv3-pythia-1b-deduped-q5_1.bin | 943.9 MiB |
| ggmlv3-pythia-1.4b-deduped-q5_1.bin | 1.3 GiB |
| ggmlv3-pythia-2.8b-deduped-q5_1.bin | 2.3 GiB |

Tested on KoboldCpp with OpenBLAS enabled.


Dataset used to train the original checkpoints: EleutherAI/the_pile_deduplicated.