CUDA kernels for 4-bit/8-bit quantization are incompatible with standard PyTorch device movement, necessitating device-specific handling
2d782dd
verified
madhavanvenkatesh
committed on
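As a rough illustration of the device-specific handling the commit message describes, the sketch below skips the usual `.to(device)` call when a model reports that it was loaded in 4-bit or 8-bit. It is not the code from this commit. It assumes the model exposes Hugging Face Transformers-style flags `is_loaded_in_4bit` / `is_loaded_in_8bit`; the helper names `is_bnb_quantized` and `place_model` are hypothetical.

```python
def is_bnb_quantized(model):
    """Return True if the model reports 4-bit or 8-bit loading.

    Assumption: the model carries Transformers-style boolean flags
    `is_loaded_in_4bit` / `is_loaded_in_8bit`. Missing flags count as False.
    """
    return bool(
        getattr(model, "is_loaded_in_4bit", False)
        or getattr(model, "is_loaded_in_8bit", False)
    )


def place_model(model, device):
    """Move `model` to `device` unless it is quantized.

    Quantized weights are assumed to have been placed on their device at
    load time, so the model is returned untouched in that case.
    """
    if is_bnb_quantized(model):
        # Skip .to(): the quantized kernels may not support device movement.
        return model
    # Regular full-precision path: standard PyTorch-style device movement.
    return model.to(device)
```

A caller would then write `model = place_model(model, "cuda:0")` in place of a bare `model.to("cuda:0")`. Whether `.to()` fails outright or is merely unsupported for a given quantized model depends on the library versions in use, so treat the flag check as a conservative guard.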