How is lm-evaluation-harness run for chat and instruct models?

#49

by winddude - opened Jun 6, 2023

winddude

Jun 6, 2023

How are you running lm-evaluation-harness for chat and instruct models?

clefourrier

Open LLM Leaderboard org Jul 7, 2023

Hi! Exactly the same as for the other models, in order to get something as reproducible as possible :)

clefourrier changed discussion status to closed Jul 15, 2023

andercorral

Jan 5, 2024

edited Jan 5, 2024

Hi, what about the special tags (e.g., [INST]) that are used during fine-tuning? lm-evaluation-harness does not add them to its prompts. If the evaluation prompts do not follow the format used during fine-tuning, it may not be a fair comparison.
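To make the concern concrete, here is a minimal sketch of the gap between a raw benchmark prompt and the same prompt wrapped in instruction tags. It assumes a Llama-2-style template (`[INST] ... [/INST]` with an optional `<<SYS>>` block); the helper `wrap_inst` and the sample prompt are hypothetical and are not part of lm-evaluation-harness or the leaderboard code.

```python
from typing import Optional


def wrap_inst(prompt: str, system: Optional[str] = None) -> str:
    """Wrap a plain prompt in Llama-2-style [INST] tags.

    Hypothetical helper for illustration only. Other models use different
    templates, so the exact tags here are an assumption.
    """
    prompt = prompt.strip()
    if system:
        # Assumed Llama-2-style system block placed inside the first turn.
        prompt = f"<<SYS>>\n{system}\n<</SYS>>\n\n{prompt}"
    return f"[INST] {prompt} [/INST]"


# A raw prompt of the kind a benchmark might send unchanged to every model.
raw_prompt = "Question: What is 2 + 2?\nAnswer:"

# The same prompt in the format an instruct model may have seen in fine-tuning.
templated_prompt = wrap_inst(raw_prompt)

print(raw_prompt)
print(templated_prompt)
```

An instruct model scored on `raw_prompt` sees input that differs from its fine-tuning distribution, which is the mismatch raised in this comment.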

clefourrier

Open LLM Leaderboard org Jan 5, 2024

Hi!
Using system prompts for evaluations has been discussed. It's something we'll add during Q1, but it is not done at the moment.
It's a known limitation of the leaderboard.
