Normalization for MuSR object placement
#1040 by recojt - opened
Hello!
I have a question regarding score normalization. On this page (https://huggingface.co/docs/leaderboards/open_llm_leaderboard/normalization), you mention that the lower bound for MuSR object placement is 0.5. However, looking at the dataset (e.g. https://huggingface.co/datasets/TAUR-Lab/MuSR/viewer), you can see that the number of choices ranges between 2 and 5.
For the normalization on the leaderboard, are you actually using 5 or the average number of choices across the dataset?
For object placement, we use 0.2 as the lower bound (1/5, since we're assuming each question has 5 choices). We might want to make this a bit more precise in the future, cc @alozowski
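To make the difference concrete, here is a minimal sketch of how the two baselines would change a normalized score. It assumes the usual rescaling where the random baseline maps to 0 and a perfect score maps to 100, with anything below the baseline clipped to 0. The function names, the `choice_counts` list, and the raw accuracy are all hypothetical and chosen for illustration; this is not the leaderboard's actual code or the real MuSR distribution.

```python
def normalize(raw, lower_bound, upper_bound=1.0):
    """Rescale a raw accuracy so lower_bound -> 0 and upper_bound -> 100.

    Scores below the random baseline are clipped to 0 (an assumption
    about how below-chance results are handled).
    """
    if raw < lower_bound:
        return 0.0
    return (raw - lower_bound) / (upper_bound - lower_bound) * 100


def expected_random_accuracy(choice_counts):
    """Expected accuracy of uniform random guessing when each question
    has its own number of choices: the mean of 1/n over all questions."""
    return sum(1 / n for n in choice_counts) / len(choice_counts)


# Hypothetical per-question choice counts (NOT the real dataset).
choice_counts = [2, 3, 4, 5, 5]

fixed_baseline = 1 / 5                                    # assumes 5 choices everywhere
mean_baseline = expected_random_accuracy(choice_counts)   # per-question average

raw = 0.6  # hypothetical raw accuracy
fixed_score = normalize(raw, fixed_baseline)
mean_score = normalize(raw, mean_baseline)

print(f"fixed baseline {fixed_baseline:.3f} -> normalized {fixed_score:.1f}")
print(f"mean baseline  {mean_baseline:.3f} -> normalized {mean_score:.1f}")
```

If some questions have fewer than 5 choices, the per-question baseline is higher than 0.2, so the same raw accuracy gives a lower normalized score than the fixed 1/5 assumption does.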