Bad test results using lm-evaluation-harness

#68
by smart-liu - opened

I have run some tests on Gemma-2B and Gemma-7B using the lm-evaluation-harness package, but got terrible results with Gemma-7B. The same code runs fine with other models. Is there anything I should be aware of?
Results for Gemma-2B:
[screenshot of Gemma-2B evaluation results]

Results for Gemma-7B:
[screenshot of Gemma-7B evaluation results]

My bad, I hadn't updated the lm-evaluation-harness framework, so there was a mistake (missing [BOS] token). After updating the framework, it works well and matches the results in the paper.
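For reference, here is a minimal sketch of how one might force the BOS token explicitly through the harness's Python API (the task, batch size, and the `add_bos_token` model arg of the `hf` backend are illustrative assumptions; recent harness versions already enable BOS for Gemma automatically):

```python
# Sketch only: evaluate Gemma-7B with lm-evaluation-harness, prepending [BOS] explicitly.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    # add_bos_token=True asks the HF backend to prepend the BOS token.
    model_args="pretrained=google/gemma-7b,add_bos_token=True",
    tasks=["hellaswag"],  # example task; replace with the tasks you actually run
    batch_size=8,
)
print(results["results"])
```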

smart-liu changed discussion status to closed

@smart-liu Hey can you tell me what you mean by lm-harness frame update? You mean you updated the package?

Yes, updating the package to the latest version solved the problem.

You should have seen a log entry in your lm-evaluation-harness run telling you that the Gemma-2 model is sensitive to the BOS token.
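If you want to sanity-check BOS handling outside the harness, a quick sketch with transformers (the prompt is arbitrary) is to inspect what the tokenizer produces:

```python
# Sketch: confirm the Gemma tokenizer prepends the BOS token by default.
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("google/gemma-7b")
ids = tok("The capital of France is")["input_ids"]
print(ids[0] == tok.bos_token_id)  # expected True when BOS is being added
```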
