Independent MMLU-Pro evaluation
#4, opened by yaronr
Hi Qwen team,
I'm pleased to share our independent evaluation of the model using our implementation of the MMLU-Pro benchmark.
The results show that the model performs impressively across multiple categories compared with other models.
I hope you find this useful.
Hi, how are you?
@Deathgod7890 I'm fine, thank you.
What are you doing?
You can find additional analysis of the detailed categories under a new tab, 'Unity Subjects'.