Model: DeepSeek-R1-Distill-Qwen-7B Sparse 50%

Base model: deepseek-ai/DeepSeek-R1-Distill-Qwen-7B

Method: SparseGPT unstructured pruning with a 50% sparsity target

Calibration data: GSM8K math problems (128 samples)

Sparsity achieved: 42.95%
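A sparsity figure like the one above is usually obtained by counting exactly-zero weights across the checkpoint. The sketch below shows that arithmetic on plain Python lists so it runs without dependencies; with PyTorch tensors the same count is typically written as `(t == 0).sum()`. The tensor names and values are hypothetical toy data, not taken from this model, and the card does not state which tensors were pruned.

```python
from typing import Dict, List, Tuple


def overall_sparsity(tensors: Dict[str, List[float]]) -> Tuple[float, Dict[str, float]]:
    """Return (global sparsity, per-tensor sparsity) for named, flattened tensors."""
    per_tensor: Dict[str, float] = {}
    zeros = 0
    total = 0
    for name, values in tensors.items():
        if not values:
            raise ValueError(f"tensor {name!r} is empty")
        n_zero = sum(1 for v in values if v == 0)
        per_tensor[name] = n_zero / len(values)
        zeros += n_zero
        total += len(values)
    return zeros / total, per_tensor


# Hypothetical toy weights: one tensor left dense, one pruned to 50%.
toy = {
    "embed.weight": [0.3, -0.1, 0.7, 0.2],
    "mlp.weight": [0.0, 0.5, 0.0, -0.4, 0.0, 0.9, 0.0, 0.1],
}
global_sparsity, by_tensor = overall_sparsity(toy)
```

In this toy case the pruned tensor sits at exactly 50% while the global figure is lower, because the dense tensor still counts toward the total. This is one way a measured value can land below the per-layer target, though it is only an illustration and not a diagnosis of this checkpoint.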

Quality: Multi-step reasoning appears preserved in spot checks. The model correctly solves simple math word problems with step-by-step explanations; no benchmark scores were collected.

Hardware used: Kaggle T4 GPU

Purpose: Proof of concept for the PE-MoE architecture, demonstrating that reasoning quality on a distilled reasoning model survives SparseGPT pruning at a 50% target (42.95% measured).

Limitations:

Sparsity is unstructured, so memory benefits require sparse-aware inference

Calibration data was generic GSM8K math problems, not tailored to a specific target domain

Not quantized: the AWQ step has not yet been applied

Measured sparsity is 42.95% versus the 50% target, due to layer-wise variation

Quality tested on simple math only, not comprehensive benchmarks
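To make the first limitation concrete, here is a minimal sketch comparing the storage of one weight matrix held densely with the same matrix in a generic CSR layout. It assumes 2-byte (F16) values and 4-byte indices, and the 4096 x 4096 layer shape is hypothetical; real sparse inference formats may use more compact encodings.

```python
def dense_bytes(rows: int, cols: int, value_bytes: int = 2) -> int:
    """Bytes needed to store every entry, zeros included."""
    return rows * cols * value_bytes


def csr_bytes(rows: int, cols: int, sparsity: float,
              value_bytes: int = 2, index_bytes: int = 4) -> int:
    """Bytes for CSR: one value and one column index per nonzero, plus row pointers."""
    nnz = round(rows * cols * (1.0 - sparsity))
    return nnz * (value_bytes + index_bytes) + (rows + 1) * index_bytes


rows, cols = 4096, 4096  # hypothetical layer shape
dense = dense_bytes(rows, cols)
csr_half = csr_bytes(rows, cols, sparsity=0.5)
csr_90 = csr_bytes(rows, cols, sparsity=0.9)
```

Under these assumptions, CSR at 50% sparsity is larger than the dense matrix, because each surviving weight carries a 4-byte index on top of its 2-byte value. The break-even point for this particular layout is roughly two-thirds sparsity. Saving the pruned weights as a dense F16 checkpoint therefore does not reduce memory by itself.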

Format: Safetensors

Model size: 8B params

Tensor type: F16
