VibeThinker-3B-LQ8-GGUF

🚨 1.This model was not trained on tool-calling or agent-based programming data. We therefore do not recommend using it for tasks that involve function calling, API orchestration, or autonomous coding agents. For programming tasks, we recommend using this model on competitive programming problems (e.g., LeetCode-style).

2.For harder math reasoning, try AMOBench, a problem set harder than the International Mathematical Olympiad (IMO), with included standard answers. Use it to evaluate VibeThinker against other SOTA models. Note: due to extreme difficulty, set max tokens to 60K–100K.

GitHub  |  ModelScope  |  Technical Report

This repository contains model weights for the unofficial LQ8 quantizations of VibeThinker 3B.

LQ8 is an experimental quantization technique that is still in early beta, designed to provide fp16-level quality with the same or lower memory footprint as Q8_0.

LQ8 is currently compatible with llama.cpp and Ollama out of the box. Please create a discussion if you find a bug.

File Name Quant Type Bit Depth Size Download Link
model-LQ8.gguf LQ8 ~8 bpw 3.33 GB 📥 Download LQ8
Downloads last month
23
GGUF
Model size
3B params
Architecture
qwen2
Hardware compatibility
Log In to add your hardware

We're not able to determine the quantization variants.

Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for reecdev/VibeThinker-3B-LQ8-GGUF

Base model

Qwen/Qwen2.5-3B
Quantized
(49)
this model

Paper for reecdev/VibeThinker-3B-LQ8-GGUF