Doesn't work

#2
by XtewaldX - opened

Runtime error

config.json: 0%| | 0.00/2.37k [00:00<?, ?B/s]
config.json: 100%|██████████| 2.37k/2.37k [00:00<00:00, 13.9MB/s]

model.bin: 0%| | 0.00/484M [00:00<?, ?B/s]

model.bin: 2%|▏ | 11.7M/484M [00:01<01:10, 6.74MB/s]
model.bin: 100%|█████████▉| 484M/484M [00:02<00:00, 234MB/s]

tokenizer.json: 0%| | 0.00/2.20M [00:00<?, ?B/s]
tokenizer.json: 100%|██████████| 2.20M/2.20M [00:00<00:00, 135MB/s]

vocabulary.txt: 0%| | 0.00/460k [00:00<?, ?B/s]
vocabulary.txt: 100%|██████████| 460k/460k [00:00<00:00, 59.3MB/s]
Traceback (most recent call last):
  File "/home/user/app/app.py", line 34, in <module>
    model = WhisperModel(model_size, device="cuda", compute_type="float16")
  File "/home/user/.local/lib/python3.10/site-packages/faster_whisper/transcribe.py", line 130, in __init__
    self.model = ctranslate2.models.Whisper(
RuntimeError: CUDA failed with error CUDA driver version is insufficient for CUDA runtime version
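
Until the GPU issue is sorted out, a minimal fallback sketch for the failing call, assuming the app controls it (model_size here is a hypothetical placeholder for whatever the app already configures), is to catch the failure and retry on CPU:

from faster_whisper import WhisperModel

model_size = "large-v2"  # hypothetical; use the size the app already passes in

try:
    # GPU path: needs a CUDA driver at least as new as the CUDA runtime
    # that the installed ctranslate2 wheel was built against.
    model = WhisperModel(model_size, device="cuda", compute_type="float16")
except RuntimeError:
    # CPU fallback; int8 keeps memory and latency reasonable without CUDA.
    model = WhisperModel(model_size, device="cpu", compute_type="int8")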

I am having the same problem on Linux Mint 22 (based on Ubuntu 24). CPU operation works, but GPU does not. I don't know if it makes a difference, but I am using podman rather than docker.
Running nvidia-smi inside a container reports the same CUDA version as my host PC, 12.4. I upgraded the drivers to version 550 to make sure CUDA was not too old. This is how I checked the container GPU: podman run --rm --device nvidia.com/gpu=all --security-opt=label=disable ubuntu nvidia-smi
It looks like the container is looking for CUDA 11.8, so I don't know why 12.4 doesn't work.
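
One thing worth checking: the CUDA version the Python wheels inside the image were built against is independent of the 12.4 driver on the host, and the error message compares exactly those two things. A minimal sketch to print both sides, assuming the image ships PyTorch:

import torch

# torch.version.cuda is the CUDA runtime version the wheel was built
# against; this is what the container "looks for", regardless of the
# host driver version that nvidia-smi reports.
print("PyTorch wheel built against CUDA:", torch.version.cuda)
print("PyTorch can use the GPU:", torch.cuda.is_available())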

root@d5ffe373a0a9:/app# ll /usr/local/
total 12
drwxr-xr-x 1 root root   6 Nov 10  2023 ./
drwxr-xr-x 1 root root  10 Oct  4  2023 ../
drwxr-xr-x 1 root root 654 Jan 30  2024 bin/
lrwxrwxrwx 1 root root  22 Nov 10  2023 cuda -> /etc/alternatives/cuda/
lrwxrwxrwx 1 root root  25 Nov 10  2023 cuda-11 -> /etc/alternatives/cuda-11/
drwxr-xr-x 1 root root  34 Nov 10  2023 cuda-11.8/

I tried a test script I found, and it says CUDA is not available to PyTorch. That would be a problem! Is there any other way to check whether the GPU/CUDA is being passed through into the container properly?

root@e27de5c88794:/app/uploads# cat test_cuda.py 
import torch

def check_cuda():
    # Reports whether PyTorch can see a GPU through the container runtime.
    print("Is CUDA available in PyTorch:", torch.cuda.is_available())
    if torch.cuda.is_available():
        print("Number of CUDA devices:", torch.cuda.device_count())
        for i in range(torch.cuda.device_count()):
            print(f"CUDA Device #{i}: {torch.cuda.get_device_name(i)}")

if __name__ == "__main__":
    check_cuda()

which gives:

root@e27de5c88794:/app/uploads# python3 test_cuda.py 
Is CUDA available in PyTorch: False
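
Since faster-whisper sits on top of ctranslate2 rather than PyTorch, one more check that bypasses torch entirely, assuming the ctranslate2 wheel installed in the image:

import ctranslate2

# ctranslate2 is the backend the failing WhisperModel call actually uses,
# so this reflects the same CUDA stack as the traceback above.
n = ctranslate2.get_cuda_device_count()
print("CUDA devices visible to ctranslate2:", n)
if n > 0:
    print("Supported compute types on CUDA:",
          ctranslate2.get_supported_compute_types("cuda"))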
