Thank you very much for this model, I have questions
#1, opened by NickyNicky
I would like to know how you fine-tuned the model.
Did you use the Hugging Face TRL library with GRPO?
Could you share the libraries you used for training?
Thank you so much.
We used LLaMA-Factory. The code is coming soon at https://github.com/open-thoughts/open-thoughts
Thank you for your model! Did you use only SFT, or other methods as well (like DPO, KTO, or PPO)?
We only used SFT for this model.