Re-Evaluate models with old Llama 3 generation config
Hello,
some models like Neural-Daredevil still ship the old generation config, which specifies 128001 (<|end_of_text|>) as the EOS token when it should be 128009 (<|eot_id|>). For Llama 3 Instruct, this is set correctly (see here: https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct/blob/main/generation_config.json#L3)
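For reference, a minimal excerpt of the corrected generation_config.json (based on the linked Llama 3 Instruct file, assuming it is unchanged upstream — note that eos_token_id is a list containing both stop tokens):

```json
{
  "bos_token_id": 128000,
  "eos_token_id": [128001, 128009]
}
```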
The old generation_config.json that models like Neural-Daredevil use leaves the model unable to stop generating during evaluation, which results in unexpectedly low scores:
Here's an example for IFEval.
For models like Neural-Daredevil-abliterated, the generation_config.json has to be replaced with the one linked above for proper evaluation. NeuralDaredevil got special attention from me because I really like it, so I have opened a PR that fixes this (https://huggingface.co/mlabonne/NeuralDaredevil-8B-abliterated/discussions/8/files), but there might be more Llama 3 models out there with the old generation file.
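Since other Llama 3 fine-tunes may carry the same stale config, one quick way to screen a model's generation_config.json is to check whether <|eot_id|> (128009) appears among its EOS token ids. A minimal sketch, assuming the file has already been downloaded locally (the `has_stale_eos` helper is hypothetical, not part of any leaderboard tooling):

```python
import json

# Old-style Llama 3 configs list only <|end_of_text|> (128001) as EOS.
# Instruct-tuned models emit <|eot_id|> (128009) at the end of each turn,
# so without it generation never stops during evaluation.
EOT_ID = 128009  # token id of <|eot_id|> in the Llama 3 tokenizer

def has_stale_eos(config: dict) -> bool:
    """Return True if a parsed generation_config.json is missing <|eot_id|>.

    In real configs, "eos_token_id" may be a single int or a list of ints.
    """
    eos = config.get("eos_token_id")
    if eos is None:
        return True
    ids = eos if isinstance(eos, list) else [eos]
    return EOT_ID not in ids

# Example: an old-style config vs. the fixed one.
stale = {"bos_token_id": 128000, "eos_token_id": 128001}
fixed = {"bos_token_id": 128000, "eos_token_id": [128001, 128009]}
print(has_stale_eos(stale))  # True  -> needs the PR-style fix
print(has_stale_eos(fixed))  # False -> evaluates correctly
```

To check a local file, `has_stale_eos(json.load(open("generation_config.json")))` would do.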
Hi @Dampfinchen,
Once this model is fixed with the new token management, feel free to resubmit it (and select the new commit), and it will get re-evaluated.
However, it would be good if people could be careful with their submissions as it's costly to re-run badly submitted models.
Hello @clefourrier
mlabonne/NeuralDaredevil-8B-abliterated
The model has been fixed. Would you be so kind as to flush the old test result so I can resubmit it? Since I'm not the model creator, I cannot create a new commit.
Thank you!
If it's been merged, you can simply take the hash of the merge commit and submit with it.
(We don't delete previous run results.)
Good to know, thanks @clefourrier and @Dampfinchen