Tags: Text Generation · GGUF · English · smol_llama · llama2 · ggml · quantized · q2_k · q3_k_m · q4_k_m · q5_k_m · q6_k · q8_0

# BEE-spoke-data/smol_llama-220M-GQA-GGUF

Quantized GGUF model files for smol_llama-220M-GQA by BEE-spoke-data.

| Name | Quant method | Size |
| --- | --- | --- |
| smol_llama-220m-gqa.fp16.gguf | fp16 | 436.50 MB |
| smol_llama-220m-gqa.q2_k.gguf | q2_k | 102.60 MB |
| smol_llama-220m-gqa.q3_k_m.gguf | q3_k_m | 115.70 MB |
| smol_llama-220m-gqa.q4_k_m.gguf | q4_k_m | 137.58 MB |
| smol_llama-220m-gqa.q5_k_m.gguf | q5_k_m | 157.91 MB |
| smol_llama-220m-gqa.q6_k.gguf | q6_k | 179.52 MB |
| smol_llama-220m-gqa.q8_0.gguf | q8_0 | 232.28 MB |
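As a rough sanity check, the file sizes above imply the effective bits stored per weight for each quantization. A minimal sketch, assuming the card's 220M total parameter count and interpreting the sizes as binary megabytes (MiB):

```python
# Estimate effective bits per weight from a GGUF file size.
# Assumption: 220M total parameters (from the card); sizes from the
# table above, treated as MiB (2**20 bytes).
N_PARAMS = 220_000_000

def bits_per_weight(size_mb: float, n_params: int = N_PARAMS) -> float:
    """File size in MiB -> approximate bits stored per parameter."""
    return size_mb * 2**20 * 8 / n_params

print(f"fp16:   {bits_per_weight(436.50):.2f} bpw")  # close to the nominal 16
print(f"q4_k_m: {bits_per_weight(137.58):.2f} bpw")
print(f"q2_k:   {bits_per_weight(102.60):.2f} bpw")
```

The estimates come out somewhat above each format's nominal bit width, which is expected: k-quant files carry per-block scale metadata, and some tensors (e.g. embeddings or the output layer) are typically kept at higher precision.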

Original Model Card:

## smol_llama: 220M GQA

Model card is a work in progress; more details to come.

A small decoder-only model with 220M total parameters. This is the first version of the model.

  • hidden size 1024, 10 layers
  • GQA (32 query heads, 8 key-value heads), context length 2048
  • trained from scratch on a single GPU :)
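The GQA configuration above fixes the attention projection shapes. A minimal sketch deriving them from the listed numbers (the head dimension and per-layer parameter count are computed here, not stated on the card):

```python
# Derive attention shapes from the card's config:
# hidden size 1024, 32 query heads, 8 key/value heads (GQA).
hidden_size = 1024
n_heads = 32      # query heads
n_kv_heads = 8    # key/value heads, each shared by 32/8 = 4 query heads

head_dim = hidden_size // n_heads  # dimension of each attention head
kv_dim = n_kv_heads * head_dim     # width of the K and V projections

# Per-layer attention weights (biases omitted):
# Q and O are hidden x hidden; K and V are hidden x kv_dim.
attn_params = 2 * hidden_size * hidden_size + 2 * hidden_size * kv_dim
print(head_dim, kv_dim, attn_params)  # 32 256 2621440
```

Compared with full multi-head attention, the K and V projections shrink by 4x, which also cuts the KV cache to a quarter of its size at the 2048-token context.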

