RuntimeError: NVML_SUCCESS == r INTERNAL ASSERT FAILED at "../c10/cuda/CUDACachingAllocator.cpp":995, please report a bug to PyTorch.

#54
by venkatesh-thiru - opened

I get this error when I try to run the SD3.5 Large model on an A100 80GB GPU in MIG mode. The Medium model works fine on the same setup. I have been looking for a solution with no luck so far.
