CUDA error

#66
by katebor - opened

Hello!

We are running experiments and get the same error for both the 2b and 9b instruction-tuned versions. We tried truncation and setting a max length, and memory is not the problem either. We also ran the failing instances on CPU, and everything works just fine. We cannot figure out why this happens. Here is the error message:

RuntimeError: CUDA error: misaligned address
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Compile with TORCH_USE_CUDA_DSA to enable device-side assertions.
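
Following the hint in the error message, a minimal sketch of turning on synchronous kernel launches from Python so that the stack trace points at the call that actually failed. The helper name `enable_blocking_launches` is hypothetical, and the sketch assumes the variable has to be set before CUDA is initialized, which is why it is set before `torch` is imported.

```python
import os
import sys


def enable_blocking_launches():
    """Set CUDA_LAUNCH_BLOCKING=1 for this process.

    Returns True if torch had not been imported yet when the variable was
    set, False otherwise (in which case it may be too late to take effect).
    """
    torch_already_loaded = "torch" in sys.modules
    os.environ["CUDA_LAUNCH_BLOCKING"] = "1"
    return not torch_already_loaded


if __name__ == "__main__":
    if not enable_blocking_launches():
        print("warning: torch was imported first; restart and set the variable earlier")
    # Import torch and run the failing generation call only after this point.
```

The same effect can be had by exporting `CUDA_LAUNCH_BLOCKING=1` in the shell before launching the script. Execution becomes slower, so this is meant for debugging only.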

Google org

Hi @katebor,

The model card code ran successfully on Google Colab using a T4 GPU. We used the specific library versions listed in the gist file.

  transformers = '4.51.2'
  torch = '2.6.0+cu124'

It would be great if you could try running it with the same setup. If you still encounter any issues, feel free to reach out.
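
To check whether a local environment matches the versions listed above, a small sketch using only the standard library might look like this. The helper names `base_version` and `check_versions` are hypothetical. The comparison ignores the local build suffix (such as `+cu124`), on the assumption that the installed package metadata may not carry it.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions reported in the reply above.
EXPECTED = {"transformers": "4.51.2", "torch": "2.6.0+cu124"}


def base_version(v):
    """Drop a local build suffix such as '+cu124'."""
    return v.split("+", 1)[0]


def check_versions(expected):
    """Return {package: (installed_or_None, expected, matches)}."""
    report = {}
    for name, want in expected.items():
        try:
            have = version(name)
        except PackageNotFoundError:
            have = None  # package is not installed in this environment
        ok = have is not None and base_version(have) == base_version(want)
        report[name] = (have, want, ok)
    return report


if __name__ == "__main__":
    for name, (have, want, ok) in check_versions(EXPECTED).items():
        status = "ok" if ok else "MISMATCH"
        print(f"{name}: installed={have} expected={want} [{status}]")
```

A mismatch here does not prove that the version difference causes the error; it only narrows down what differs from the setup that ran successfully.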

Thank you.
