Announcing SLM-Bench: Evaluation Benchmark for SLM-10M

#3
by PY-AI-Dev - opened
Liodon AI org

We've released SLM-Bench, a benchmark designed specifically for evaluating sub-10M-parameter models such as SLM-10M.

What It Covers

6 categories, 500 questions each (3,000 total):

  • Arithmetic: math with plausible distractors
  • Pattern: sequence completion
  • Grammar: syntactic understanding
  • Vocabulary: WordNet-based word knowledge
  • Logic: syllogistic reasoning
  • Word Analogy: relational reasoning

100% Programmatically Generated

Zero manual annotation. The entire benchmark is generated by code, making it fully reproducible and transparent.
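
To give a feel for what "generated by code" can look like, here is a minimal sketch of a seeded generator for arithmetic questions with plausible distractors. It is not the actual SLM-Bench generator: the function names, the item layout (`question`, `choices`, `answer_index`), the operand range, and the distractor rules are all assumptions made for illustration.

```python
import random


def make_arithmetic_item(rng, num_choices=4):
    """Build one multiple-choice addition item (hypothetical item format)."""
    a = rng.randint(2, 99)
    b = rng.randint(2, 99)
    answer = a + b

    # Assumed distractor rules: near misses a weak model might plausibly pick
    # (off by one, off by ten, subtraction instead of addition).
    candidates = [answer + 1, answer - 1, answer + 10, answer - 10, abs(a - b)]
    distractors = []
    for c in candidates:
        if c >= 0 and c != answer and c not in distractors:
            distractors.append(c)

    rng.shuffle(distractors)
    choices = distractors[: num_choices - 1] + [answer]
    rng.shuffle(choices)

    return {
        "question": f"{a} + {b} =",
        "choices": choices,
        "answer_index": choices.index(answer),
    }


def generate_arithmetic(seed, n):
    """Generate n items from a fixed seed so the set can be regenerated exactly."""
    rng = random.Random(seed)
    return [make_arithmetic_item(rng) for _ in range(n)]


items = generate_arithmetic(seed=0, n=500)
```

Because every random draw comes from a single `random.Random(seed)` instance, calling `generate_arithmetic` again with the same seed rebuilds the same 500 items, which is the kind of reproducibility a fully programmatic benchmark allows.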

Usage

lm_eval --model hf \
  --model_args pretrained=liodon-ai/slm-10m,trust_remote_code=True \
  --tasks slm_bench \
  --device cuda:0 --batch_size 64
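
If you would rather launch the same run from Python, something like the sketch below should work through lm-evaluation-harness's `simple_evaluate` entry point. The helper names `build_eval_kwargs` and `run_slm_bench` are made up for this example, and it assumes the `slm_bench` task is already visible to your lm_eval installation, which this post does not spell out.

```python
def build_eval_kwargs(model_id="liodon-ai/slm-10m"):
    """Mirror the CLI flags above as keyword arguments (hypothetical helper)."""
    return {
        "model": "hf",
        "model_args": f"pretrained={model_id},trust_remote_code=True",
        "tasks": ["slm_bench"],
        "device": "cuda:0",
        "batch_size": 64,
    }


def run_slm_bench():
    """Run the evaluation; needs lm_eval installed, a CUDA device, and the task registered."""
    import lm_eval  # imported lazily so the helper above works without it

    results = lm_eval.simple_evaluate(**build_eval_kwargs())
    # Per-task metrics are stored under the "results" key of the returned dict.
    return results["results"]


kwargs = build_eval_kwargs()
```

Calling `run_slm_bench()` downloads the model and runs all tasks, so expect it to take about as long as the CLI command.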

Links

Apache-2.0 license.
