Why is there a difference in the evaluation benchmarks between yours and those of the leaderboard?

by diegottt - opened Sep 25, 2024

MATH: 80 (yours) vs 0 (leaderboard)
GPQA: 45.5 (yours) vs 9.62 (leaderboard)
etc.

Qwen org Sep 26, 2024
