gemma-4-31b-it-heretic-ara-eagle3-ko

This is Eagle-3 draft model for Korean conversation only. For other languages, please use other model.

You may expect 1.5x speed boost at maximum on Korean workload.

ν•œκ΅­μ–΄ λŒ€ν™”μ— μ΅œμ ν™” 된 Eagle-3 Draft λͺ¨λΈμž…λ‹ˆλ‹€. 타 μ–Έμ–΄μ˜ 경우 λ‹€λ₯Έ λͺ¨λΈμ„ μ‚¬μš©ν•˜μ„Έμš”.

ν•œκ΅­μ–΄ μž‘μ—…μ—μ„œ μ΅œλŒ€ 1.5x 정도 속도 κ°œμ„ μ΄ μžˆμŠ΅λ‹ˆλ‹€.

Model Overview

How it was made

  • Training framework: Speculators
  • Datasets: Private (Korean 8: English 2, 60k, No Reasoning)
  • Training hardware: 1 DGX Spark
Name Value
Learning Rate 1e-4
Scheduler Type Cosine
Warmup steps 50
Sequence length 4096
Epochs 4
Vocab size 32000

Usage

Tested with vLLM on DGX Spark (sm121)

vllm serve hell0ks/gemma-4-31b-it-heretic-ara-FP8 --port 8000 --reasoning-parser gemma4 --enable-auto-tool-choice --tool-call-parser gemma4 --speculative-config '{"model": "hell0ks/gemma-4-31b-it-heretic-ara-eagle3-ko", "num_speculative_tokens": 3, "method": "eagle3"}'
Downloads last month
61
Safetensors
Model size
2B params
Tensor type
I64
Β·
BF16
Β·
BOOL
Β·
Inference Providers NEW
This model isn't deployed by any Inference Provider. πŸ™‹ Ask for provider support