Why two versions in a day?
#1, opened by supercharge19
What is the difference between them? Or did you make a mistake training the first version?
Both models are trained on the same preference dataset. The only difference is that v1 has a learning rate of 5e-5 and v2 has a learning rate of 5e-7.
Here's the axolotl configuration I used if you're interested: https://gist.github.com/mlabonne/0f781fd9eb47b7d5e4778d285f4a6aee You can find results here: https://huggingface.co/spaces/mlabonne/Yet_Another_LLM_Leaderboard
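For a rough sense of what that learning-rate gap means, here is a minimal sketch that compares the size of one parameter update under the two settings. It assumes plain SGD for simplicity (the actual optimizer is set in the linked config and may behave differently), and the gradient value and the `sgd_update` helper are hypothetical, chosen only for illustration.

```python
# Only the two learning rates come from this thread; everything else is illustrative.
LR_V1 = 5e-5  # learning rate reported for v1
LR_V2 = 5e-7  # learning rate reported for v2


def sgd_update(grad: float, lr: float) -> float:
    """Return the change applied to a parameter by one plain SGD step."""
    return -lr * grad


grad = 2.0  # hypothetical gradient value, not taken from any real run

update_v1 = sgd_update(grad, LR_V1)
update_v2 = sgd_update(grad, LR_V2)

# Under plain SGD the update scales linearly with the learning rate,
# so this ratio equals LR_V1 / LR_V2 regardless of the gradient.
ratio = update_v1 / update_v2
print(f"v1 update: {update_v1:.2e}, v2 update: {update_v2:.2e}, ratio: {ratio:.1f}")
```

With everything else held equal, v1 moves the weights about two orders of magnitude further per step than v2, which is why the two runs can end up with different benchmark results even on the same dataset.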
Hi,
So "NeuralOmniBeagle-7B-v2" is based on mlabonne/OmniBeagle-7B?
Thanks
Yes, it applies DPO to this model.
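For readers unfamiliar with DPO (Direct Preference Optimization), here is a minimal sketch of the per-example loss as it is usually defined. It is not the training code used for this model: the function name, the example log-probabilities, and `beta=0.1` are assumptions for illustration, and the reference model is assumed to be the frozen starting model.

```python
import math


def dpo_loss(
    policy_chosen_logp: float,
    policy_rejected_logp: float,
    ref_chosen_logp: float,
    ref_rejected_logp: float,
    beta: float = 0.1,  # assumed value, not taken from the author's config
) -> float:
    """Per-example DPO loss: -log(sigmoid(beta * (chosen_margin - rejected_margin)))."""
    # How much more (or less) likely each answer is under the policy than under the reference.
    chosen_margin = policy_chosen_logp - ref_chosen_logp
    rejected_margin = policy_rejected_logp - ref_rejected_logp
    logits = beta * (chosen_margin - rejected_margin)
    # -log(sigmoid(x)) == softplus(-x); this form is numerically stable.
    x = -logits
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


# Hypothetical sequence log-probabilities, for illustration only.
loss_at_start = dpo_loss(-10.0, -12.0, -10.0, -12.0)   # policy identical to reference
loss_after_shift = dpo_loss(-8.0, -14.0, -10.0, -12.0)  # policy now favors the chosen answer
print(loss_at_start, loss_after_shift)
```

When the policy matches the reference, the loss is log(2). It decreases as the policy assigns relatively more probability to the chosen answer and less to the rejected one.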