
GGML converted versions of EleutherAI's Pythia models

Description:

The Pythia Scaling Suite is a collection of models developed to facilitate interpretability research. It contains two sets of eight models of sizes 70M, 160M, 410M, 1B, 1.4B, 2.8B, 6.9B, and 12B. For each size, there are two models: one trained on the Pile, and one trained on the Pile after the dataset has been globally deduplicated. All 8 model sizes are trained on the exact same data, in the exact same order. We also provide 154 intermediate checkpoints per model, hosted on Hugging Face as branches.

The Pythia model suite was deliberately designed to promote scientific research on large language models, especially interpretability research. Although downstream performance was not a central design goal, we find the models match or exceed the performance of similar- and same-sized models, such as those in the OPT and GPT-Neo suites.
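Because the intermediate checkpoints are hosted as branches, a specific training step of the original (non-converted) Pythia models can be loaded with Hugging Face transformers by passing the branch name as the `revision` argument. A minimal sketch; `step3000` is an example branch name following the `step<N>` scheme used by the EleutherAI repositories:

from transformers import AutoTokenizer, GPTNeoXForCausalLM

# Load the checkpoint saved at training step 3000; intermediate
# checkpoints live on branches named "step<N>".
model = GPTNeoXForCausalLM.from_pretrained("EleutherAI/pythia-70m", revision="step3000")
tokenizer = AutoTokenizer.from_pretrained("EleutherAI/pythia-70m", revision="step3000")

inputs = tokenizer("The meaning of life is", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0]))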

Converted Models:

| Name | Based on | Type | Container | GGML Version |
|------|----------|------|-----------|--------------|
| pythia-1.4b-f16.bin | EleutherAI/pythia-1.4b | F16 | GGML | V3 |
| pythia-1.4b-q4_0.bin | EleutherAI/pythia-1.4b | Q4_0 | GGML | V3 |
| pythia-1.4b-q4_0-ggjt.bin | EleutherAI/pythia-1.4b | Q4_0 | GGJT | V3 |
| pythia-1.4b-q5_1.bin | EleutherAI/pythia-1.4b | Q5_1 | GGML | V3 |
| pythia-1.4b-q5_1-ggjt.bin | EleutherAI/pythia-1.4b | Q5_1 | GGJT | V3 |
| pythia-160m-f16.bin | EleutherAI/pythia-160m | F16 | GGML | V3 |
| pythia-160m-q4_0.bin | EleutherAI/pythia-160m | Q4_0 | GGML | V3 |
| pythia-160m-q4_0-ggjt.bin | EleutherAI/pythia-160m | Q4_0 | GGJT | V3 |
| pythia-160m-q5_1.bin | EleutherAI/pythia-160m | Q5_1 | GGML | V3 |
| pythia-160m-q5_1-ggjt.bin | EleutherAI/pythia-160m | Q5_1 | GGJT | V3 |
| pythia-1b-f16.bin | EleutherAI/pythia-1b | F16 | GGML | V3 |
| pythia-1b-q4_0.bin | EleutherAI/pythia-1b | Q4_0 | GGML | V3 |
| pythia-1b-q4_0-ggjt.bin | EleutherAI/pythia-1b | Q4_0 | GGJT | V3 |
| pythia-1b-q5_1.bin | EleutherAI/pythia-1b | Q5_1 | GGML | V3 |
| pythia-1b-q5_1-ggjt.bin | EleutherAI/pythia-1b | Q5_1 | GGJT | V3 |
| pythia-2.8b-f16.bin | EleutherAI/pythia-2.8b | F16 | GGML | V3 |
| pythia-2.8b-q4_0.bin | EleutherAI/pythia-2.8b | Q4_0 | GGML | V3 |
| pythia-2.8b-q4_0-ggjt.bin | EleutherAI/pythia-2.8b | Q4_0 | GGJT | V3 |
| pythia-2.8b-q5_1.bin | EleutherAI/pythia-2.8b | Q5_1 | GGML | V3 |
| pythia-2.8b-q5_1-ggjt.bin | EleutherAI/pythia-2.8b | Q5_1 | GGJT | V3 |
| pythia-410m-f16.bin | EleutherAI/pythia-410m | F16 | GGML | V3 |
| pythia-410m-q4_0.bin | EleutherAI/pythia-410m | Q4_0 | GGML | V3 |
| pythia-410m-q4_0-ggjt.bin | EleutherAI/pythia-410m | Q4_0 | GGJT | V3 |
| pythia-410m-q5_1.bin | EleutherAI/pythia-410m | Q5_1 | GGML | V3 |
| pythia-410m-q5_1-ggjt.bin | EleutherAI/pythia-410m | Q5_1 | GGJT | V3 |
| pythia-70m-f16.bin | EleutherAI/pythia-70m | F16 | GGML | V3 |
| pythia-70m-q4_0.bin | EleutherAI/pythia-70m | Q4_0 | GGML | V3 |
| pythia-70m-q4_0-ggjt.bin | EleutherAI/pythia-70m | Q4_0 | GGJT | V3 |
| pythia-70m-q5_1.bin | EleutherAI/pythia-70m | Q5_1 | GGML | V3 |
| pythia-70m-q5_1-ggjt.bin | EleutherAI/pythia-70m | Q5_1 | GGJT | V3 |
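
Any single file from the table can also be fetched programmatically with the huggingface_hub library. A minimal sketch, using a file name taken from the table above:

from huggingface_hub import hf_hub_download

# Download one converted model file from this repository;
# returns the local path of the cached file.
path = hf_hub_download(repo_id="rustformers/pythia-ggml", filename="pythia-70m-q4_0-ggjt.bin")
print(path)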

Usage

Python via llm-rs:

Installation

Via pip: pip install llm-rs

Run inference

from llm_rs import AutoModel

# Load the model; any file from the table above can be passed as `model_file`.
model = AutoModel.from_pretrained("rustformers/pythia-ggml", model_file="pythia-70m-q4_0-ggjt.bin")

# Generate a completion for a prompt.
print(model.generate("The meaning of life is"))
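
Sampling behavior can be tuned through a generation configuration. The sketch below assumes a `GenerationConfig` class with `temperature`, `top_k`, `top_p`, and `max_new_tokens` fields, as in the llm-rs Python bindings at the time of writing; the exact import path and field names may differ between versions:

from llm_rs import AutoModel, GenerationConfig

# Assumed llm-rs API: field names may differ across versions of the bindings.
config = GenerationConfig(temperature=0.8, top_k=40, top_p=0.95, max_new_tokens=128)

model = AutoModel.from_pretrained("rustformers/pythia-ggml", model_file="pythia-70m-q4_0-ggjt.bin")
print(model.generate("The meaning of life is", generation_config=config))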

Rust via Rustformers/llm:

Installation

git clone --recurse-submodules https://github.com/rustformers/llm.git
cd llm
cargo build --release

Run inference

# Pythia models use the GPT-NeoX architecture, hence the `gptneox` subcommand
cargo run --release -- gptneox infer -m path/to/model.bin -p "Tell me how cool the Rust programming language is:"