Announcing SLM-Bench: Evaluation Benchmark for SLM-10M
#3
by PY-AI-Dev - opened
We've released SLM-Bench, a benchmark designed specifically for evaluating sub-10M-parameter models like SLM-10M.
What It Covers
6 categories, 500 questions each (3,000 total):
- Arithmetic: math with plausible distractors
- Pattern: sequence completion
- Grammar: syntactic understanding
- Vocabulary: WordNet-based word knowledge
- Logic: syllogistic reasoning
- Word Analogy: relational reasoning
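To give a feel for what "plausible distractors" could mean in the Arithmetic category, here is a minimal sketch. It is not the actual SLM-Bench generator: the function name `make_arithmetic_item`, the operand range, and the specific error types (off by one, off by ten, subtracting instead of adding) are all assumptions made for illustration.

```python
import random


def make_arithmetic_item(rng: random.Random, n_choices: int = 4) -> dict:
    """Build one multiple-choice addition question with near-miss distractors.

    Hypothetical helper, not taken from the SLM-Bench source.
    """
    a, b = rng.randint(10, 99), rng.randint(10, 99)
    answer = a + b

    # Assumed "plausible" mistakes: off by one, off by ten,
    # and using subtraction instead of addition.
    candidates = [answer + 1, answer - 1, answer + 10, answer - 10, abs(a - b)]

    # Keep only unique wrong answers, preserving order.
    distractors = []
    for c in candidates:
        if c != answer and c not in distractors:
            distractors.append(c)

    # Pick the wrong options, add the right one, and shuffle positions.
    choices = rng.sample(distractors, n_choices - 1) + [answer]
    rng.shuffle(choices)

    return {
        "question": f"{a} + {b} =",
        "choices": choices,
        "answer_index": choices.index(answer),
    }


if __name__ == "__main__":
    item = make_arithmetic_item(random.Random(0))
    print(item["question"], item["choices"], "correct:", item["answer_index"])
```

Distractors built from typical mistakes are harder to rule out than random numbers, which is presumably the point of the category.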
100% Programmatically Generated
Zero manual annotation. The entire benchmark is generated by code, making it fully reproducible and transparent.
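As a rough illustration of why code-generated items are reproducible, the sketch below builds a small Pattern-style set from a fixed seed and fingerprints it. Everything here is assumed for the example (the names `make_pattern_item`, `generate_split`, and `fingerprint`, the seed value, and the arithmetic-sequence format); the real SLM-Bench generator may work differently.

```python
import hashlib
import json
import random


def make_pattern_item(rng: random.Random) -> dict:
    """One sequence-completion item: four terms shown, fifth is the answer."""
    start = rng.randint(1, 20)
    step = rng.randint(2, 9)
    seq = [start + i * step for i in range(5)]
    prompt = ", ".join(str(x) for x in seq[:4]) + ", ?"
    return {"prompt": prompt, "answer": seq[4]}


def generate_split(seed: int, n: int) -> list:
    """Generate n items from a private RNG so the seed fully determines them."""
    rng = random.Random(seed)
    return [make_pattern_item(rng) for _ in range(n)]


def fingerprint(items: list) -> str:
    """SHA-256 over a canonical JSON dump, usable as a dataset checksum."""
    blob = json.dumps(items, sort_keys=True).encode("utf-8")
    return hashlib.sha256(blob).hexdigest()


if __name__ == "__main__":
    first = generate_split(seed=1234, n=500)   # seed is an arbitrary example
    second = generate_split(seed=1234, n=500)
    # Same seed, same items, same checksum.
    print(fingerprint(first) == fingerprint(second))
```

Publishing a checksum like this next to the generator would let anyone confirm they regenerated the same questions, assuming the same Python version and generator code.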
Usage
lm_eval --model hf \
    --model_args pretrained=liodon-ai/slm-10m,trust_remote_code=True \
    --tasks slm_bench \
    --device cuda:0 --batch_size 64
Links
Apache-2.0 license.