Azure VM not launching Mixtral

#37
by pierrerichard - opened

I have a VM on Azure where I want to fine-tune Mixtral. Before doing this, I wanted to launch the model just to test it. I used the basic script from the "Run the model" section of the official repo.
The problem is that when I run this script, after the model is downloaded locally, it gets loaded into RAM, but the program crashes because 58 GB of RAM is not enough.
When I tried the script from the "Lower precision using (8-bit & 4-bit) using bitsandbytes" section, to use fewer resources and load the model on the GPU, the program says there is no GPU on my VM, even though there is one: an NVIDIA Tesla T4.
I am running Debian 11 and I installed the CUDA drivers, but nothing seems to work.

How can I make the model run on my T4 GPU?
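
For reference, a quick sanity check (a minimal sketch, just plain PyTorch, nothing Mixtral-specific) to confirm whether PyTorch sees the card at all:

```python
import torch

# A CPU-only PyTorch build or a broken driver install would explain the "no GPU" error.
print(torch.version.cuda)         # CUDA version PyTorch was built against (None for CPU-only builds)
print(torch.cuda.is_available())  # should be True
print(torch.cuda.device_count())  # should be 1 for a single T4
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))  # e.g. "Tesla T4"
```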

I would recommend loading it according to the documentation with the appropriate device_map="auto"! This will offload the model to CPU RAM if you don't have enough VRAM, and then to disk if you still don't have enough!
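
Something along these lines (a sketch, assuming transformers, accelerate and bitsandbytes are installed; the Instruct checkpoint ID and the offload folder are just examples):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "mistralai/Mixtral-8x7B-Instruct-v0.1"  # example checkpoint

model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=BitsAndBytesConfig(load_in_4bit=True),  # 4-bit weights to shrink the memory footprint
    device_map="auto",          # fill the GPU first, then spill over to CPU RAM, then to disk
    offload_folder="offload",   # example path used for disk offload
)
tokenizer = AutoTokenizer.from_pretrained(model_id)
```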

Can you / do you want to try it with ollama? It has been improved a lot lately and runs fast on an RTX 3090! But maybe that's not what you're looking for 😅

Thanks ArthurZ! What you said worked perfectly, I was just missing this flag : D

Hi WiLDCaT6, yes I checked out ollama, but I wanted to try loading the model on its own ahah :)

Yes, that's what I thought, thanks for replying. That will be my next goal; for now ollama + Django works very well for my 'little' configuration and Mixtral 😅
