Reproducibility issue

by mlabonne - opened Dec 29, 2023

Dec 29, 2023

Hi @zyh3826, I'm playing with mergekit and wanted to reproduce your results with this model. Unfortunately, I only got an average score of 48.54 (vs. your 73.3) on the Open LLM Leaderboard.

Did you do extra steps or is there something I might have missed? Thank you.

dillfrescott

Jan 9

I'm trying the same layer combination with another model and getting complete gibberish. Amazing that this even works at all.

mlabonne

Jan 9

It's weird because their model does perform very well on the Nous benchmark suite: https://huggingface.co/spaces/mlabonne/Yet_Another_LLM_Leaderboard
