Does LLaVA support multi-GPU inference?

#6 opened by ZealLin

Hi, thanks for your contribution to the open-source LLM community. I recently tried llava-v1.6-mistral-7b with the newest transformers (4.39). Everything works fine if I load the model onto a single GPU, i.e. device_map={'': 0}, but if I dispatch the model across multiple GPUs, i.e. device_map='auto', inference raises CUDA errors. It looks similar to this issue: https://github.com/haotian-liu/LLaVA/issues/1319#issuecomment-2017359760
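
For context, my setup is roughly the sketch below; the model id, dtype, image URL, and prompt are only illustrative, and the only thing that changes between the working and failing runs is the device_map argument:

```python
# Minimal sketch (transformers 4.39; model id, dtype, image URL and prompt
# below are illustrative, not my exact script).
import torch
import requests
from PIL import Image
from transformers import LlavaNextProcessor, LlavaNextForConditionalGeneration

model_id = "llava-hf/llava-v1.6-mistral-7b-hf"
processor = LlavaNextProcessor.from_pretrained(model_id)

# Works: all weights placed on a single GPU.
model = LlavaNextForConditionalGeneration.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    device_map={"": 0},
)

# Fails for me with CUDA errors: weights dispatched across GPUs by accelerate.
# model = LlavaNextForConditionalGeneration.from_pretrained(
#     model_id,
#     torch_dtype=torch.float16,
#     device_map="auto",
# )

url = "https://llava-vl.github.io/static/images/view.jpg"
image = Image.open(requests.get(url, stream=True).raw)
prompt = "[INST] <image>\nWhat is shown in this image? [/INST]"

inputs = processor(text=prompt, images=image, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=100)
print(processor.decode(output[0], skip_special_tokens=True))
```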

I want to confirm: does this model support loading across multiple GPUs?

Llava Hugging Face org

Feel free to open an issue on the Transformers library regarding the use of device_map="auto".

@ZealLin I also need multi-GPU inference. Are you planning to open an issue, and if not, can I open one?

@seyongh I haven't dug into this issue yet, so feel free to open an issue on the Transformers repo if you like.
