MMLU - 77

#3 opened by orendar

New open-source SOTA!
I just ran 5-shot MMLU with lm-evaluation-harness; results below:

| Groups | Version | Filter | n-shot | Metric | Value | Stderr |
|---|---|---|---:|---|---:|---|
| mmlu | N/A | none | 0 | acc | 0.7735 | ± 0.0034 |
| - humanities | N/A | none | 5 | acc | 0.7337 | ± 0.0062 |
| - other | N/A | none | 5 | acc | 0.8182 | ± 0.0067 |
| - social_sciences | N/A | none | 5 | acc | 0.8687 | ± 0.0060 |
| - stem | N/A | none | 5 | acc | 0.6958 | ± 0.0078 |

It's even better than mistral-medium. The complete set of benchmarks is here: https://huggingface.co/mistral-community/Mixtral-8x22B-v0.1/discussions/4
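
For anyone wanting to reproduce this, here is a minimal sketch using the harness's Python API. It assumes lm-evaluation-harness v0.4+ with the Hugging Face backend; the `model_args` shown (dtype, etc.) are illustrative guesses, not necessarily my exact settings.

```python
# Rough reproduction sketch, not the exact invocation behind the table above:
# assumes lm-evaluation-harness v0.4+ and the Hugging Face ("hf") backend.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=mistral-community/Mixtral-8x22B-v0.1,dtype=bfloat16",
    tasks=["mmlu"],    # expands to all MMLU subtasks plus the four groups
    num_fewshot=5,     # 5-shot, matching the table above
    batch_size="auto",
)

# The aggregate mmlu entry and the per-group scores (humanities, other,
# social_sciences, stem) land in results["results"]; exact layout can
# vary slightly between harness versions.
print(results["results"]["mmlu"])
```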
