finetuning

#2
by ArturRenzenbrink - opened

Dear Team,

How could I continue to finetune this model?

Kind regards

WhyHow org

Hey @ArturRenzenbrink !

It's actually gotten easier since we did it, as a lot more is now understood about how R1, and reasoning models in general, work. The best guide is here:

https://unsloth.ai/blog/r1-reasoning

Unsloth are great and really understand this space; I can't recommend their content enough. PatientSeek is a finetuned version of the Llama distilled model of DeepSeek R1, so you can download it and follow the linked guide, and it should work fine.
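For a rough idea of what that involves: assuming you go the GRPO route the guide covers, most of the work on your side is writing reward functions. Below is a minimal sketch of a format reward in plain Python. The name `format_reward` and the exact pattern are illustrative assumptions rather than anything from our training setup, and the `(completions, **kwargs)` signature follows my understanding of what TRL-style GRPO trainers expect, so check it against the guide before relying on it.

```python
import re

# Assumption: R1-style outputs wrap their reasoning in <think>...</think>
# and then give a non-empty final answer. If your chat template already
# prefills the opening <think> tag in the prompt, relax the pattern.
THINK_PATTERN = re.compile(r"^\s*<think>(.+?)</think>\s*(\S.*)$", re.DOTALL)


def format_reward(completions, **kwargs):
    """Return 1.0 per completion that has a reasoning block and an answer, else 0.0."""
    rewards = []
    for completion in completions:
        # Conversational datasets pass a list of message dicts,
        # plain-text datasets pass strings; handle both.
        if isinstance(completion, list):
            text = completion[0]["content"]
        else:
            text = completion
        rewards.append(1.0 if THINK_PATTERN.match(text) else 0.0)
    return rewards


if __name__ == "__main__":
    samples = [
        "<think>Fever plus cough, consider pneumonia.</think> Order a chest X-ray.",
        "Order a chest X-ray.",
    ]
    print(format_reward(samples))
```

A function like this only scores the output format. You would normally combine it with a task-specific reward that checks whether the final answer is actually right for your data.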

  • tom
tomsmoker changed discussion status to closed
