Recreating MMLU scores
#2 by theblackcat102 - opened
Do you guys use lm-evaluation-harness for MMLU evaluation? I'm not getting the stark improvement shown in the fineweb-edu image when using this checkpoint.
We do not. I've added a note to the top of this file detailing how you can reproduce our setup: https://huggingface.co/datasets/HuggingFaceFW/fineweb/blob/main/lighteval_tasks.py
@guipenedo Thanks for the quick reply, I will check it out
theblackcat102 changed discussion status to closed