Pearl EAGLE3.1 scratch speculator, 200k seq4096

EAGLE3.1 draft (speculator) head, trained from scratch, for use with the target model pearl-ai/Llama-3.1-8B-Instruct-pearl.

Training

Hidden-state samples: 200,000
Sequence length: 4096
Speculator: EAGLE3.1 / eagle3
Flags:

--norm-before-residual
--norm-before-fc
--draft-vocab-size 32000
--epochs 5
--lr 1e-4
--total-seq-len 4096

Final checkpoint source: /persist/pearl_eagle31_train_200k_seq4096/checkpoints/checkpoints_eagle31_scratch/checkpoint_best

Matrix benchmark

Stable old/materialized-B Pearl path, full requested matrix:

K = 2, 3, 4, 5, 6
C = 1, 4, 8, 16, 32, 64
t ~= 50, 180, 350, 700, 1400 (prompt buckets)

Coverage: 5 K values x 6 C values = 30 K/C cells; 30 cells x 5 prompt buckets = 150 rows.
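The coverage count can be checked by enumerating the grid. This is a minimal sketch: the variable names are chosen here for illustration, and the values are copied from the lists above.

```python
from itertools import product

# Values copied from the matrix definition in this card.
K_VALUES = [2, 3, 4, 5, 6]
C_VALUES = [1, 4, 8, 16, 32, 64]
T_BUCKETS = [50, 180, 350, 700, 1400]  # approximate prompt buckets

# One cell per (K, C) pair; one row per (K, C, t) combination.
cells = list(product(K_VALUES, C_VALUES))
rows = list(product(K_VALUES, C_VALUES, T_BUCKETS))

print(len(cells), len(rows))  # 30 150
```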

Best output throughput, averaged across the 5 prompt buckets: K=2, C=64, at 1039.1 output tok/s.
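A "best mean across prompt buckets" figure of this kind could be derived roughly as follows. This is only a sketch: the field names `K`, `C`, `t`, and `output_tok_s` are hypothetical and may not match the columns in the summary files, and the demo rows are made-up numbers, not benchmark results.

```python
from collections import defaultdict
from statistics import mean


def best_mean_throughput(rows):
    """Return ((K, C), mean tok/s) for the cell with the highest mean.

    Each row is a dict with the hypothetical keys "K", "C", "t",
    and "output_tok_s"; the real files may use different names.
    """
    by_cell = defaultdict(list)
    for row in rows:
        # Group per-bucket throughput values under their (K, C) cell.
        by_cell[(row["K"], row["C"])].append(row["output_tok_s"])
    means = {cell: mean(values) for cell, values in by_cell.items()}
    best = max(means, key=means.get)
    return best, means[best]


# Made-up numbers purely to exercise the function.
demo = [
    {"K": 2, "C": 1, "t": 50, "output_tok_s": 10.0},
    {"K": 2, "C": 1, "t": 180, "output_tok_s": 20.0},
    {"K": 3, "C": 1, "t": 50, "output_tok_s": 30.0},
    {"K": 3, "C": 1, "t": 180, "output_tok_s": 40.0},
]
print(best_mean_throughput(demo))
```

Averaging per cell before taking the maximum keeps a single unusually fast prompt bucket from deciding the winner.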

See:

FULL_MATRIX_SUMMARY.md
summary_compact.json
full_matrix_summary.jsonl
benchmark_raw/*/combined_summary.csv

Notes

The B-on-chip fused Pearl R128 path is not included in these final results. It was rebuilt as R128 128x128x128 with stages=2 to fit within H100 shared memory (SMEM), but requests still crash with a CUDA illegal instruction error. The stable results reported here use the old/materialized-B path.


Model size: 1.0B params (Safetensors; tensor types I64, BF16, BOOL)
