Pearl EAGLE3.1 scratch speculator, 200k seq4096

Draft/speculator head trained from scratch for pearl-ai/Llama-3.1-8B-Instruct-pearl.

Training

  • Hidden-state samples: 200,000
  • Sequence length: 4096
  • Speculator: EAGLE3.1 / eagle3
  • Flags:
      --norm-before-residual
      --norm-before-fc
      --draft-vocab-size 32000
      --epochs 5
      --lr 1e-4
      --total-seq-len 4096
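
For reproduction, the flags listed above can be kept as a single argument list. The sketch below only assembles and prints them. The training entry point is not named in this card, so none is assumed here; prepend whichever script or module you actually use.

```python
import shlex

# Flags copied verbatim from the list above; two are boolean switches,
# the rest take one value each.
flags = [
    "--norm-before-residual",
    "--norm-before-fc",
    "--draft-vocab-size", "32000",
    "--epochs", "5",
    "--lr", "1e-4",
    "--total-seq-len", "4096",
]

# Shell-safe string form of the argument list (no entry point included).
command_tail = shlex.join(flags)
print(command_tail)
```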

Final checkpoint source: /persist/pearl_eagle31_train_200k_seq4096/checkpoints/checkpoints_eagle31_scratch/checkpoint_best

Matrix benchmark

Stable old/materialized-B Pearl path, full requested matrix:

K = 2,3,4,5,6
C = 1,4,8,16,32,64
t ~= 50,180,350,700,1400

Coverage: 5 K values x 6 C values = 30 K/C cells; 30 cells x 5 prompt buckets (t) = 150 rows.
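
The row count can be checked by enumerating the grid. This is a minimal sketch using only the K, C, and t values listed above; the variable names are illustrative and are not taken from the benchmark scripts.

```python
from itertools import product

K_VALUES = [2, 3, 4, 5, 6]
C_VALUES = [1, 4, 8, 16, 32, 64]
T_BUCKETS = [50, 180, 350, 700, 1400]  # approximate prompt buckets (t ~=)

# One cell per (K, C) pair, one row per cell and prompt bucket.
cells = list(product(K_VALUES, C_VALUES))
rows = [(k, c, t) for (k, c) in cells for t in T_BUCKETS]

print(len(cells), len(rows))  # 30 150
```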

Best mean output throughput across prompt buckets: K=2, C=64, 1039.1 output tok/s.
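
"Best mean output throughput across prompt buckets" means averaging each K/C cell over its prompt buckets and taking the highest average. A rough sketch of that selection follows. The keys `K`, `C`, and `output_tok_s` are hypothetical and may not match the schema of the summary files, and the sample numbers are made up for illustration, not benchmark results.

```python
from collections import defaultdict
from statistics import mean


def best_cell(rows):
    """Return ((K, C), mean throughput) for the cell with the highest mean.

    rows: iterable of dicts with hypothetical keys 'K', 'C', 'output_tok_s'.
    """
    by_cell = defaultdict(list)
    for row in rows:
        by_cell[(row["K"], row["C"])].append(row["output_tok_s"])
    # Average each cell over its prompt buckets, then pick the maximum.
    means = {cell: mean(values) for cell, values in by_cell.items()}
    best = max(means, key=means.get)
    return best, means[best]


# Made-up rows for illustration only.
example_rows = [
    {"K": 2, "C": 64, "t": 50, "output_tok_s": 100.0},
    {"K": 2, "C": 64, "t": 180, "output_tok_s": 80.0},
    {"K": 3, "C": 64, "t": 50, "output_tok_s": 120.0},
    {"K": 3, "C": 64, "t": 180, "output_tok_s": 40.0},
]
print(best_cell(example_rows))  # ((2, 64), 90.0)
```

Note that a cell with the single highest row (K=3 in the made-up data) can still lose on the mean.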

See:

  • FULL_MATRIX_SUMMARY.md
  • summary_compact.json
  • full_matrix_summary.jsonl
  • benchmark_raw/*/combined_summary.csv

Notes

The B-on-chip fused Pearl R128 path is not included in these final results. It was rebuilt as R128 (128x128x128, stages=2) to fit H100 shared memory (SMEM), but requests still crash with a CUDA illegal-instruction error. The stable results here therefore use the old/materialized-B path.

Safetensors: model size 1.0B params; tensor types I64, BF16, BOOL.

Model tree for Avifenesh/pearl-eagle31-scratch-200k-seq4096