[Help needed] Re-labelling models to separate different kinds of fine-tuning
@jaspercatapang suggested we should separate instruction-tuned from (vanilla) fine-tuned, and I agree!
If you want to give a hand, please open a PR and change the information in the TYPE_METADATA
dict in this file, and I'll merge it asap!
I've re-labelled the models in this PR here; it might need some reformatting.
For consistency, I followed a simple guide:
- If the model type is pre-trained or RL, keep it as-is.
- If the model card mentions that the model follows instructions, the new model type is instruction-tuned.
- If the model card makes no reference to instruction-following, the new model type is fine-tuned.
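The guide above can be sketched as a small helper. This is just an illustration of the rules, not the leaderboard's actual code: the function name, type strings, and the keyword check on the model-card text are all assumptions.

```python
# Hypothetical sketch of the re-labelling guide above.
# The type strings and the "instruct" keyword check are assumptions.
def relabel(current_type: str, model_card: str) -> str:
    """Map an existing model type to the proposed labelling scheme."""
    if current_type in ("pretrained", "RL"):
        return current_type            # rule 1: keep pre-trained / RL as-is
    if "instruct" in model_card.lower():
        return "instruction-tuned"     # rule 2: card mentions instruction-following
    return "fine-tuned"                # rule 3: no reference to instructions

print(relabel("fine-tuned", "This model follows instructions."))  # instruction-tuned
print(relabel("fine-tuned", "Fine-tuned on domain data."))        # fine-tuned
```

In practice the check would be done by reading each model card by hand, of course; the keyword match just makes the decision rule explicit.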
If there are errors in my re-labelling, please open a PR to modify it. Thank you.
That's amazing, thank you!
I'll leave your PR open for the week in case the community wants to comment on it or adjust it, and merge it on Friday!
My only suggestion is that maybe there should be a "dialog-tuned" category. Instruction tuning does not imply tuning for multi-turn dialog, a.k.a. "chat". RLHF almost always means dialog-tuned; I am not aware of anyone doing RLHF for something that isn't a chat model. Essentially, instruction tuning alone implies a single-turn dialog: one instruction, one response. If a model card says the authors made a chat model or tuned for dialog, that implies more than just instruction tuning.
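If a separate "dialog-tuned" category were adopted, the earlier guide might extend like the sketch below. Everything here is a hypothetical illustration of the proposal, including the precedence of the rules and the keyword checks:

```python
# Hypothetical extension of the labelling guide with a "dialog-tuned"
# category, per the suggestion above. Keywords and precedence are assumptions.
def relabel_with_dialog(current_type: str, model_card: str) -> str:
    """Map an existing model type to the extended labelling scheme."""
    if current_type == "pretrained":
        return "pretrained"            # keep pre-trained as-is
    if current_type == "RL":
        return "dialog-tuned"          # RLHF almost always means a chat model
    card = model_card.lower()
    if "chat" in card or "dialog" in card:
        return "dialog-tuned"          # card explicitly mentions chat/dialog
    if "instruct" in card:
        return "instruction-tuned"     # single turn: one instruction, one response
    return "fine-tuned"                # no instruction/dialog signal
```

The dialog check comes before the instruction check on purpose: per the argument above, dialog tuning implies more than instruction tuning, so it should win when a card mentions both.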