How much RAM and VRAM does the model require?

#5
by supercharge19 - opened

I tried the original model but had problems.
CUDA went out of memory with the 400 KB image provided in the example code; it tried to allocate 12 GB of VRAM.
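
For context, a rough back-of-envelope on weight memory alone (assuming roughly 9.6B parameters for Qwen-VL; activations and the KV cache come on top) suggests why full precision will not fit on a small card:

```python
# Back-of-envelope for model weight memory only; activations and the
# KV cache come on top. The 9.6B parameter count is an assumption for
# Qwen-VL, not a measured value.
def weight_memory_gb(n_params: float, bytes_per_param: float) -> float:
    return n_params * bytes_per_param / 1024**3

n_params = 9.6e9
for name, bpp in [("fp16", 2), ("int8", 1), ("int4", 0.5)]:
    print(f"{name}: ~{weight_memory_gb(n_params, bpp):.1f} GB")
# fp16: ~17.9 GB, int8: ~8.9 GB, int4: ~4.5 GB
```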

Then I tried loading it in 8-bit, but I still got the error.
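
For reference, this is the 8-bit path I mean; a minimal sketch assuming the transformers + bitsandbytes route, with the public Qwen-VL-Chat checkpoint as the model id. Even in 8-bit the weights alone are several GB, so a small card can still OOM:

```python
# Minimal sketch of 8-bit loading, assuming the transformers +
# bitsandbytes route with the public Qwen-VL-Chat checkpoint.
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "Qwen/Qwen-VL-Chat"
quant = BitsAndBytesConfig(
    load_in_8bit=True,
    llm_int8_enable_fp32_cpu_offload=True,  # let layers that don't fit stay in fp32 on CPU
)

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=quant,
    device_map="auto",  # accelerate places layers across GPU and CPU RAM
    trust_remote_code=True,
)
```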

Then I tried loading the model without a GPU (16 GB of RAM), and the process was killed while reading the image. I am going to try the GGUF 8-bit version now; hopefully that can be used, roughly as sketched below.
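
This is what I have in mind for the GGUF attempt; a hypothetical llama-cpp-python sketch where the filename and layer count are placeholders, and whether the vision side works depends on the build shipping a matching multimodal projector for this model:

```python
# Hypothetical llama-cpp-python sketch for an 8-bit GGUF quant. The
# filename and layer count are placeholders; vision support depends on
# the build and on a matching multimodal projector file.
from llama_cpp import Llama

llm = Llama(
    model_path="qwen-vl-q8_0.gguf",  # placeholder path to the quantized file
    n_ctx=2048,
    n_gpu_layers=20,  # offload as many layers as a 3 GB card can hold
)
out = llm("Describe the image.", max_tokens=128)
print(out["choices"][0]["text"])
```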

It didn't feel like it needed that much, but I can run a test locally to check. Which image are you trying? Is it this one they share in their GitHub code?

https://qianwen-res.oss-cn-beijing.aliyuncs.com/Qwen-VL/assets/demo.jpeg

The picture of the dog and the woman. I was only able to try the GGUF model (not the full model; I guess the full model did not fit in the 3 GB of VRAM on my 1050 laptop). Even with the 8-bit GGUF quant it ran for about 10 minutes before I stopped it. How long does it normally take to get a response?
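
In the meantime, one way to check whether it is actually generating or just thrashing swap is to stream tokens and time them. A small sketch, assuming llama-cpp-python and a placeholder model path:

```python
# Stream generation to see whether the model is producing tokens at all;
# llama-cpp-python accepts stream=True and yields one chunk per token.
# The model path is a placeholder.
import time
from llama_cpp import Llama

llm = Llama(model_path="qwen-vl-q8_0.gguf", n_ctx=1024, n_gpu_layers=0)

t0 = time.time()
n_tokens = 0
for chunk in llm("Describe the image.", max_tokens=32, stream=True):
    print(chunk["choices"][0]["text"], end="", flush=True)
    n_tokens += 1
elapsed = time.time() - t0
print(f"\n{n_tokens} tokens in {elapsed:.1f}s ({n_tokens / elapsed:.2f} tok/s)")
```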
