Is it possible to share the commit hash or a link to the lm-evaluation-harness repository used to evaluate the current leaderboard? I am trying to reproduce the Open LLM Leaderboard results in an offline environment.
Thanks.
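For the offline part, the Hugging Face libraries can be told not to reach the Hub. A minimal sketch, assuming the model weights and the MMLU data are already in the local cache:

# Prevent transformers/datasets/huggingface_hub from contacting the Hub
# (check which of these your library versions honor)
export HF_HUB_OFFLINE=1
export TRANSFORMERS_OFFLINE=1
export HF_DATASETS_OFFLINE=1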
@win7785 Hi! We are using this version of the harness: b281b0921b636bc36ad05c0b0b0763bd6dd43463.
It's also in the About section of the leaderboard :)
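If it helps, that commit can also be pinned directly at install time; a sketch, assuming a standard pip + git setup:

# Install the harness pinned to the exact commit used by the leaderboard
pip install git+https://github.com/EleutherAI/lm-evaluation-harness.git@b281b0921b636bc36ad05c0b0b0763bd6dd43463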
@SaylorTwift @clefourrier Thanks for your reply :)
I tested the LLaMA-7B model using the commit hash above and scored 0.3563 on MMLU (5-shot). The leaderboard showed a score of 0.383, which is no longer visible.
Are there any updates to the lm-eval-harness version the HF team is using, or to the Open LLM Leaderboard?
Here are the commands I used:
git clone https://github.com/EleutherAI/lm-evaluation-harness.git
cd lm-evaluation-harness
git checkout b281b0921b636bc36ad05c0b0b0763bd6dd43463
python main.py --model=hf-causal --model_args="pretrained={path_of_llama-7b}" --tasks="hendrycks*" --num_fewshot=5 --batch_size=2 --no_cache
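When comparing against the leaderboard number, note that the MMLU score is the average over the hendrycksTest subtasks. A sketch of recomputing that average from the harness output, assuming the run wrote a results.json with the usual {"results": {task: {"acc": ...}}} layout (the filename and exact keys are assumptions):

# Macro-average 'acc' across all evaluated subtasks in the results file
jq '[.results[] | .acc] | add / length' results.json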
Hi! We are investigating a small discrepancy for LLaMA models; see the full discussion in this thread.
Thanks for sharing :)