Qwen3-8B — BonfyreFPQ Native Format

This is Qwen/Qwen3-8B compressed into the BonfyreFPQ v4 native format with the bonfyre-fpq binary.

Compression Summary

File	Tensors	BF16 Source	FPQ Size	File Ratio
model-part1.fpq	81	~3.72 GB	223 MB	~17.1×
model-part2.fpq	114	~3.71 GB	223 MB	~17.1×
model-part3.fpq	111	~3.69 GB	221 MB	~17.1×
model-part4.fpq	92	~2.97 GB	178 MB	~17.1×
model-part5.fpq	1	~1.16 GB	70 MB	~17.1×
Total	399	~15.3 GB	915 MB	~17.1×
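
The File Ratio column can be reproduced from the two size columns. The sketch below does this for the Total row, assuming the table's GB and MB are binary units (GiB and MiB). That reading is an inference from the numbers, not something stated here; with decimal units the same sizes give a slightly lower ratio. The function name file_ratio is illustrative only.

```python
# Reproduce the "File Ratio" for the Total row of the table.
# Assumption: the table's "GB"/"MB" are binary units (GiB/MiB).
GIB = 1024 ** 3
MIB = 1024 ** 2


def file_ratio(source_bytes: float, fpq_bytes: float) -> float:
    """Return how many times smaller the FPQ file is than its source."""
    return source_bytes / fpq_bytes


# Total row: ~15.3 GB of BF16 source, 915 MB of FPQ output.
binary_ratio = file_ratio(15.3 * GIB, 915 * MIB)

# Same figures read as decimal units (10**9 and 10**6 bytes), for comparison.
decimal_ratio = file_ratio(15.3e9, 915e6)

print(f"binary units:  {binary_ratio:.1f}x")
print(f"decimal units: {decimal_ratio:.1f}x")
```

Because the source sizes are rounded to two decimals, the per-file rows can differ from the listed ratio in the last digit.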

FPQ version: v4
Algorithm: COORD@3+QJL (3-bit Lloyd-Max quantization + Johnson-Lindenstrauss transform)
Average bpw: 3.50
FP32-equivalent ratio: 9.1×
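
For readers unfamiliar with Lloyd-Max quantization, the following is a minimal, generic sketch of fitting a 3-bit (8-level) scalar quantizer to sample data. It is not the BonfyreFPQ implementation: the COORD and QJL stages are not modeled, and the function names, the quantile initialisation, and the Gaussian test data are all assumptions made for illustration.

```python
import random


def lloyd_max(samples, bits=3, iters=50):
    """Fit 2**bits reconstruction levels to scalar samples (generic Lloyd-Max)."""
    n_levels = 2 ** bits
    xs = sorted(samples)
    # Initialise the levels at evenly spaced quantiles of the data.
    levels = [xs[(2 * i + 1) * len(xs) // (2 * n_levels)] for i in range(n_levels)]
    for _ in range(iters):
        # Decision boundaries are the midpoints between adjacent levels.
        bounds = [(a + b) / 2 for a, b in zip(levels, levels[1:])]
        sums = [0.0] * n_levels
        counts = [0] * n_levels
        k = 0
        for x in xs:  # xs is sorted, so the bin index only moves forward
            while k < n_levels - 1 and x > bounds[k]:
                k += 1
            sums[k] += x
            counts[k] += 1
        # Each level moves to the mean of the samples in its bin;
        # an empty bin keeps its previous level.
        levels = [s / c if c else old for s, c, old in zip(sums, counts, levels)]
    return levels


def quantize(x, levels):
    """Return the index (a 3-bit code when there are 8 levels) of the nearest level."""
    return min(range(len(levels)), key=lambda i: abs(x - levels[i]))


# Illustrative data only: unit-variance Gaussian samples standing in for weights.
random.seed(0)
data = [random.gauss(0.0, 1.0) for _ in range(5000)]
levels = lloyd_max(data, bits=3)
mse = sum((x - levels[quantize(x, levels)]) ** 2 for x in data) / len(data)

print([round(v, 3) for v in levels])
print(f"mean squared error: {mse:.4f}")
```

Each value is stored as the index of its nearest level, so the raw codes cost 3 bits per weight. The listed average of 3.50 bpw is higher, which presumably reflects additional stored data that this sketch does not account for.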

Each .fpq file is a stand-alone BonfyreFPQ archive. To reconstruct a safetensors shard, decompress the archive with the bonfyre-fpq binary:

bonfyre-fpq decompress model-part1.fpq model-part1.safetensors
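
To decompress all five parts, the same command can be repeated per file. The sketch below builds those commands, assuming bonfyre-fpq is on the PATH and that the output names follow the pattern of the part1 example above. The helper names decompress_commands and run_all are hypothetical, and run_all is defined but not called here, so nothing is executed until you call it yourself.

```python
import subprocess

# Part numbers 1..5, matching model-part1.fpq through model-part5.fpq in the table.
PARTS = range(1, 6)


def decompress_commands(parts=PARTS):
    """Build one 'bonfyre-fpq decompress <in> <out>' command per part."""
    return [
        ["bonfyre-fpq", "decompress", f"model-part{i}.fpq", f"model-part{i}.safetensors"]
        for i in parts
    ]


def run_all(commands):
    """Run each command in order, stopping at the first failure."""
    for cmd in commands:
        subprocess.run(cmd, check=True)


commands = decompress_commands()
for cmd in commands:
    print(" ".join(cmd))  # preview only; call run_all(commands) to execute
```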

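The source is the BF16 safetensors release of Qwen/Qwen3-8B (5 shards, 16.4 GB total, which is about 15.3 GiB and matches the table total if its sizes are in binary units).
Each shard was compressed separately with BonfyreFPQ v4 on macOS/NEON.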
