
Run inference on CPU

#1
by hythythyt3 - opened

Hello, is running this model on CPU/RAM possible?

Yes. You will need two modifications:

  1. Comment out the `.cuda()` calls in /root/.cache/huggingface/modules/transformers_modules/OpenGVLab/Mini-InternVL-Chat-4B-V1-5/6f97087daec17e4b033d4d846c0b64c09c4268cd/modeling_internvl_chat.py, and make sure your demo code does not call `.cuda()` either.
  2. Change `use_flash_attn` to `false` in /root/.cache/huggingface/hub/models--OpenGVLab--Mini-InternVL-Chat-4B-V1-5/snapshots/6f97087daec17e4b033d4d846c0b64c09c4268cd/config.json (a minimal loading sketch follows this list).
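For reference, here is a minimal sketch of loading the model entirely on CPU. It assumes that `from_pretrained` forwards the `use_flash_attn` kwarg to the model's custom config, as the InternVL model cards suggest; if that does not work with your transformers version, fall back to editing config.json as in step 2.

```python
import torch
from transformers import AutoModel, AutoTokenizer

path = "OpenGVLab/Mini-InternVL-Chat-4B-V1-5"

# Load everything on CPU: no .cuda()/.to("cuda") anywhere,
# and flash attention disabled (it requires a CUDA GPU).
model = AutoModel.from_pretrained(
    path,
    torch_dtype=torch.float32,  # float32 is the safest dtype on CPU
    low_cpu_mem_usage=True,
    use_flash_attn=False,       # same effect as editing config.json
    trust_remote_code=True,     # the model ships custom code
).eval()

tokenizer = AutoTokenizer.from_pretrained(path, trust_remote_code=True)

# From here, follow the model card's demo (e.g. model.chat(...)),
# keeping all input tensors on CPU instead of calling .cuda().
```

Expect generation to be much slower than on a GPU, and note that a 4B-parameter model in float32 needs roughly 16 GB of RAM; `torch.bfloat16` halves that if your CPU and PyTorch build support it.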
