GPT-2 ChatML GGUF (no_robots SFT)

This repository contains GGUF quantized models converted from the fine-tuned JustACluelessKid2/gpt2-chatml-fp32.

Models Available

  • gpt2-f32.gguf (252.5 MB) - Baseline F16-Embedding GGUF
  • ggml-model-Q8_0.gguf (136.7 MB) - High-fidelity 8-bit quantization
  • ggml-model-IQ4_NL.gguf (84.8 MB) - Highly-optimized 4-bit non-linear quantization
  • ggml-model-IQ4_XS.gguf (82.2 MB) - Imatrix optimized 4-bit quantization
  • ggml-model-Q6_K.gguf (106.7 MB) - High-quality 6-bit quantization
  • ggml-model-Q5_K_M.gguf (98.8 MB) - High-quality 5-bit quantization
  • ggml-model-IQ3_XXS.gguf (64.8 MB) - Imatrix 3-bit quantization (Chromebook-compatible)
  • ggml-model-IQ2_M.gguf (62.5 MB) - Imatrix optimized 2-bit quantization
  • ggml-model-IQ2_XXS.gguf (55.5 MB) - Ultra-low 2-bit quantization

These models were calibrated using an importance matrix computed on 1,000 shuffled conversational sequences.

Downloads last month
2,219
GGUF
Model size
0.1B params
Architecture
gpt2
Hardware compatibility
Log In to add your hardware

2-bit

3-bit

4-bit

5-bit

6-bit

8-bit

32-bit

Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for JustACluelessKid2/gpt2-chatml-fp32-GGUF

Quantized
(1)
this model

Dataset used to train JustACluelessKid2/gpt2-chatml-fp32-GGUF