Model: DeepSeek-R1-Distill-Qwen-7B Sparse 50%

Base model: deepseek-ai/DeepSeek-R1-Distill-Qwen-7B

Method: SparseGPT unstructured pruning with a 50% sparsity target

Calibration data: GSM8K math problems (128 samples)

Sparsity achieved: 42.95%

Quality: Multi-step reasoning appears preserved in spot checks. The model correctly solves simple math word problems with step-by-step explanations.

Hardware used: Kaggle T4 GPU

Purpose: Proof of concept for the PE-MoE architecture, demonstrating that reasoning quality survives SparseGPT pruning at a 50% target (42.95% achieved) on a distilled reasoning model.
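The reported sparsity figure can be checked by counting exactly-zero weights per tensor and then across the whole model. The following is a minimal sketch using NumPy arrays as stand-ins for the checkpoint tensors; the function names and the toy layer names are hypothetical, and the toy values only illustrate how per-layer fractions aggregate into one model-wide number. It does not reproduce the 42.95% figure or explain its cause.

```python
import numpy as np


def layer_sparsity(weights):
    """Return the fraction of exactly-zero entries in one weight array."""
    weights = np.asarray(weights)
    return int(np.count_nonzero(weights == 0)) / weights.size


def overall_sparsity(layers):
    """Return (per-layer fractions, parameter-weighted overall fraction)."""
    per_layer = {name: layer_sparsity(w) for name, w in layers.items()}
    total = sum(np.asarray(w).size for w in layers.values())
    zeros = sum(int(np.count_nonzero(np.asarray(w) == 0)) for w in layers.values())
    return per_layer, zeros / total


# Toy stand-ins (hypothetical names and shapes, not the real checkpoint).
pruned = np.full((4, 8), 0.5, dtype=np.float16)
pruned[:, :4] = 0.0  # half of this tensor is zeroed, i.e. 50% sparse
untouched = np.full((4, 8), 0.5, dtype=np.float16)  # a tensor left dense

layers = {"pruned_linear": pruned, "untouched_tensor": untouched}
per_layer, overall = overall_sparsity(layers)
print(per_layer)
print(overall)
```

With real weights, the same two functions could be applied to each tensor loaded from the safetensors files; the overall value is weighted by parameter count, so tensors with lower sparsity pull the model-wide figure below the per-layer target.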

Limitations:

Sparsity is unstructured; memory benefits require sparse-aware inference

Calibration data was general, not domain-specific

Not quantized; AWQ step not yet applied

42.95% actual sparsity vs 50% target due to layer-wise variation

Quality tested on simple math only, not comprehensive benchmarks
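To make the first limitation concrete, the sketch below compares the storage cost of one dense F16 weight matrix with a generic CSR (compressed sparse row) encoding of the same matrix at 42.95% sparsity. The layer shape, the 4-byte index width, and the choice of CSR are assumptions for illustration only; they are not taken from this model or from any particular inference runtime.

```python
def dense_bytes(n_rows, n_cols, bytes_per_value=2):
    """Bytes needed to store every entry, zeros included (F16 = 2 bytes)."""
    return n_rows * n_cols * bytes_per_value


def csr_bytes(n_rows, n_cols, sparsity, bytes_per_value=2, bytes_per_index=4):
    """Bytes for a CSR encoding: nonzero values, column indices, row pointers."""
    nnz = round(n_rows * n_cols * (1.0 - sparsity))
    values = nnz * bytes_per_value
    col_indices = nnz * bytes_per_index
    row_pointers = (n_rows + 1) * bytes_per_index
    return values + col_indices + row_pointers


# Hypothetical layer shape, chosen only for illustration.
rows, cols = 4096, 4096

dense = dense_bytes(rows, cols)
sparse_43 = csr_bytes(rows, cols, sparsity=0.4295)
sparse_90 = csr_bytes(rows, cols, sparsity=0.90)

print(f"dense F16:            {dense:>12,d} bytes")
print(f"CSR at 42.95% sparse: {sparse_43:>12,d} bytes")
print(f"CSR at 90% sparse:    {sparse_90:>12,d} bytes")
```

Under these assumptions the CSR copy at 42.95% sparsity is larger than the dense one, because each surviving weight carries a 4-byte index next to its 2-byte value. Zeros stored in a dense F16 tensor save nothing on their own, so any memory benefit depends on a more compact sparse format or a runtime built for it.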

Weights: Safetensors format, 8B params, F16