Why is the score for RACE so low?

#18

by scinerd68 - opened Mar 15

scinerd68

Mar 15

If I understand correctly, the score for this dataset is just accuracy on multiple-choice questions. In that case, 49% as the highest score is really low. Also, according to the Orca 2 paper, that model reached ~80% accuracy on this dataset. Am I misunderstanding something?
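
To make the question concrete, here is a minimal sketch of what "accuracy on multiple-choice questions" could mean. It assumes each answer option gets a numeric score (for example a log-likelihood) and the highest-scoring option counts as the prediction. The function name `multiple_choice_accuracy` and all scores below are made up for illustration; this is not taken from the leaderboard or Harness code.

```python
from typing import Sequence


def multiple_choice_accuracy(
    option_scores: Sequence[Sequence[float]],
    gold_indices: Sequence[int],
) -> float:
    """Fraction of questions whose highest-scoring option is the gold answer.

    Hypothetical helper: assumes one score per answer option, higher is better.
    """
    if len(option_scores) != len(gold_indices):
        raise ValueError("need exactly one gold index per question")
    if not gold_indices:
        raise ValueError("need at least one question")

    correct = 0
    for scores, gold in zip(option_scores, gold_indices):
        # Predicted option = index of the highest score for this question.
        predicted = max(range(len(scores)), key=scores.__getitem__)
        correct += int(predicted == gold)
    return correct / len(gold_indices)


# Made-up scores for four questions with four options each.
scores = [
    [-1.2, -0.4, -2.0, -3.1],
    [-0.9, -1.5, -0.3, -2.2],
    [-2.0, -1.0, -1.1, -0.5],
    [-0.2, -1.0, -1.5, -2.0],
]
gold = [1, 0, 3, 0]

accuracy = multiple_choice_accuracy(scores, gold)
print(f"accuracy = {accuracy:.2f}")
```

With four options per question, random guessing under this kind of scoring would land near 25%, which is a useful reference point when reading the leaderboard numbers.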

pminervini

hallucinations-leaderboard org Apr 10

Hey @thangphan68, we use the Harness implementation of RACE, which is available here: https://github.com/EleutherAI/lm-evaluation-harness/tree/main/lm_eval/tasks/race

@thangphan68 if you have suggestions on how to improve it, I can integrate your changes and do a pull request on the Harness repo!
