Azure VM not launching Mixtral

#37
by pierrerichard - opened

I have a VM on Azure where I want to fine-tune Mixtral. Before doing this, I wanted to launch the model just to test it. I used the basic script from the "Run the model" section of the official repo.
The problem is that when I run this script, after the model is downloaded locally, it gets loaded into RAM, but the program crashes because 58 GB of RAM is not enough.
When I tried the script from the "Lower precision using (8-bit & 4-bit) using bitsandbytes" section, to use fewer resources and load the model on the GPU, the program says there is no GPU on my VM, even though there is one: an NVIDIA Tesla T4.
I am running Debian 11 and I installed the CUDA drivers, but nothing seems to work.

How can I make the model run on my T4 GPU?
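
For reference, a quick sanity check (a minimal sketch, just plain PyTorch, nothing Mixtral-specific) to confirm whether PyTorch sees the card at all:

```python
import torch

# A CPU-only PyTorch build or a broken driver install would explain the "no GPU" error.
print(torch.version.cuda)         # CUDA version PyTorch was built against (None for CPU-only builds)
print(torch.cuda.is_available())  # should be True
print(torch.cuda.device_count())  # should be 1 for a single T4
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))  # e.g. "Tesla T4"
```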

I would recommend loading it according to the documentation with the appropriate device_map="auto"! This will offload the model to CPU RAM if you don't have enough VRAM, and then to disk if you still don't have enough!
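
Something along these lines (a sketch, assuming transformers, accelerate and bitsandbytes are installed; the Instruct checkpoint ID and the offload folder are just examples):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "mistralai/Mixtral-8x7B-Instruct-v0.1"  # example checkpoint

model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=BitsAndBytesConfig(load_in_4bit=True),  # 4-bit weights to shrink the memory footprint
    device_map="auto",          # fill the GPU first, then spill over to CPU RAM, then to disk
    offload_folder="offload",   # example path used for disk offload
)
tokenizer = AutoTokenizer.from_pretrained(model_id)
```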

Can you / do you want to try it with ollama? It has been improved a lot lately and runs fast on an RTX 3090! But maybe that's not what you're looking for 😅

Thanks ArthurZ! What you said worked perfectly, I was just missing this flag : D

Hi WiLDCaT6, yes I checked out ollama, but I wanted to try loading the model on its own ahah :)

Yes, that's what I thought, thanks for replying. That will be my next goal; for now ollama + Django works very well for my 'little' configuration and Mixtral 😅
