Smaller model for fine-tuning

#23
by vmirea - opened

I am testing the model on an RTX 4090 (24 GB) and it's great. I use TheBloke/Phind-CodeLlama-34B-v2-GGUF/phind-codellama-34b-v2.Q5_0.gguf
I would love to fine-tune the model on my own code, but I'm getting out-of-memory (OOM) errors.
Because of this, I would love to have a smaller version that I could fine-tune on my personal data to see the results.
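
A rough back-of-the-envelope sketch of why fine-tuning hits OOM on this card while inference almost fits. Everything here is an assumption for illustration: 34e9 parameters (taken from the "34B" in the model name), about 5.5 bits per weight for Q5_0, and the common rule of thumb of about 16 bytes per parameter for full fine-tuning with Adam in mixed precision (weights, gradients, and optimizer state, ignoring activations). The function names are made up for this example.

```python
GIB = 1024 ** 3

def quantized_weights_gib(n_params: float, bits_per_weight: float) -> float:
    """Approximate memory needed just to hold the quantized weights."""
    return n_params * bits_per_weight / 8 / GIB

def full_finetune_gib(n_params: float, bytes_per_param: float = 16.0) -> float:
    """Approximate memory for weights + gradients + Adam state.

    16 bytes/param is a rule of thumb (an assumption), and it leaves out
    activations, so the real requirement is higher.
    """
    return n_params * bytes_per_param / GIB

N_PARAMS = 34e9   # assumed from the "34B" in the model name
VRAM_GIB = 24     # the 24 GB card, treated here as 24 GiB for simplicity

inference_gib = quantized_weights_gib(N_PARAMS, bits_per_weight=5.5)
training_gib = full_finetune_gib(N_PARAMS)

print(f"Q5_0 weights only:    ~{inference_gib:.1f} GiB")
print(f"Full fine-tune state: ~{training_gib:.0f} GiB")
print(f"Available VRAM:        {VRAM_GIB} GiB")
```

Under these assumptions the quantized weights alone nearly fill the card, and the state needed for full fine-tuning is more than an order of magnitude beyond it, which is why a smaller model would help.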

Meanwhile, I have found out that when loading the model you can put 30 layers on the GPU and the rest on the CPU.
Any idea if something similar can be done during training? It could fix the OOM errors, but I don't know whether it's possible or how to do it.
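
For reference, a minimal sketch of the inference-time split mentioned above, assuming the GGUF file is loaded with llama-cpp-python (the post does not say which loader is used, so that is an assumption). `n_gpu_layers` is that library's parameter for the number of layers offloaded to the GPU; the helper names are hypothetical.

```python
def offload_kwargs(model_path: str, n_gpu_layers: int = 30) -> dict:
    """Build the loader arguments for a partial GPU offload.

    n_gpu_layers layers go to the GPU, the remaining layers stay on the CPU.
    In llama-cpp-python, -1 means "offload every layer".
    """
    if n_gpu_layers < -1:
        raise ValueError("n_gpu_layers must be -1 (all layers) or >= 0")
    return {"model_path": model_path, "n_gpu_layers": n_gpu_layers}

def load_partially_offloaded(model_path: str, n_gpu_layers: int = 30):
    """Load the model with some layers on the GPU (requires llama-cpp-python)."""
    from llama_cpp import Llama  # imported lazily so the helper above works without it
    return Llama(**offload_kwargs(model_path, n_gpu_layers))

# Example with the file from this post (path is wherever the file was downloaded):
# llm = load_partially_offloaded("phind-codellama-34b-v2.Q5_0.gguf", n_gpu_layers=30)
```

This only covers loading the model for inference, so it does not by itself answer the training question.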
