runtime error

you can pin a revision.
Downloading (…)_generation_utils.py: 0%| | 0.00/14.9k [00:00<?, ?B/s]
Downloading (…)_generation_utils.py: 100%|██████████| 14.9k/14.9k [00:00<00:00, 68.7MB/s]
A new version of the following files was downloaded from https://huggingface.co/Qwen/Qwen-VL-Chat-Int4:
- qwen_generation_utils.py
. Make sure to double-check they do not contain any added malicious code. To avoid downloading new versions of the code file, you can pin a revision.
A new version of the following files was downloaded from https://huggingface.co/Qwen/Qwen-VL-Chat-Int4:
- modeling_qwen.py
- visual.py
- qwen_generation_utils.py
. Make sure to double-check they do not contain any added malicious code. To avoid downloading new versions of the code file, you can pin a revision.
/home/user/.pyenv/versions/3.10.13/lib/python3.10/site-packages/torch/cuda/__init__.py:138: UserWarning: CUDA initialization: The NVIDIA driver on your system is too old (found version 11080). Please update your GPU driver by downloading and installing a new version from the URL: http://www.nvidia.com/Download/index.aspx Alternatively, go to: https://pytorch.org to install a PyTorch version that has been compiled with your version of the CUDA driver. (Triggered internally at ../c10/cuda/CUDAFunctions.cpp:108.)
  return torch._C._cuda_getDeviceCount() > 0
Traceback (most recent call last):
  File "/home/user/app/app.py", line 15, in <module>
    model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen-VL-Chat-Int4", device_map="auto", trust_remote_code=True).eval()
  File "/home/user/.pyenv/versions/3.10.13/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 560, in from_pretrained
    return model_class.from_pretrained(
  File "/home/user/.pyenv/versions/3.10.13/lib/python3.10/site-packages/transformers/modeling_utils.py", line 2702, in from_pretrained
    raise RuntimeError("GPU is required to quantize or run quantize model.")
RuntimeError: GPU is required to quantize or run quantize model.
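
The log above points at two separate things: the remote code files are not pinned to a revision, and the Int4 checkpoint refuses to load because PyTorch cannot see a usable CUDA device (the driver warning comes just before the RuntimeError). The following is only a sketch of how app.py might guard against both, not the Space author's actual code. The helper names build_load_kwargs and load_quantized_model are hypothetical, and the revision value must be a real commit hash or tag from the model repository, which this log does not provide.

```python
from typing import Optional

# Repository name taken from the traceback above.
MODEL_ID = "Qwen/Qwen-VL-Chat-Int4"


def build_load_kwargs(revision: Optional[str] = None) -> dict:
    """Build keyword arguments for from_pretrained (hypothetical helper)."""
    kwargs = {"device_map": "auto", "trust_remote_code": True}
    if revision is not None:
        # Pinning a commit hash or tag keeps newer remote code files
        # from being downloaded on later runs.
        kwargs["revision"] = revision
    return kwargs


def load_quantized_model(revision: Optional[str] = None):
    """Load the checkpoint, exiting early with a clear message if no GPU is usable."""
    # Imported lazily so build_load_kwargs can be used without torch installed.
    import torch
    from transformers import AutoModelForCausalLM

    if not torch.cuda.is_available():
        # Same condition that leads to the RuntimeError in the log, reported up front.
        raise SystemExit(
            "No usable CUDA device found. The Int4 checkpoint needs a GPU "
            "and a driver that matches the installed PyTorch build."
        )
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, **build_load_kwargs(revision))
    return model.eval()
```

Calling load_quantized_model() with no argument keeps the original behaviour of tracking the latest revision. Passing revision set to a commit hash copied from the model page would pin the remote code. Neither change removes the need for GPU hardware with a sufficiently recent driver.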
