RuntimeError: cutlassF: no kernel found to launch!
#32, opened by kehkok
I am facing the issue in the title when running the model with `torch.bfloat16`. Can you suggest what's wrong?
Below is my dev environment:
- NVIDIA V100 GPU device
- Python 3.10.12
- CUDA 12.4 with Driver 550.54.14
- accelerate==0.28.0
- torch==2.1.2
- transformers==4.38.2
Also, the code below executes successfully, so my setup is able to use the V100:
model = AutoModelForCausalLM.from_pretrained("google/gemma-2b-it", device_map="auto", torch_dtype=torch.float16, token=access_token)
However, it fails for the code below, which uses `torch.bfloat16`:
...
model = AutoModelForCausalLM.from_pretrained("google/gemma-2b-it", device_map="auto", torch_dtype=torch.bfloat16, token=access_token)
input_text = "Write me a poem about Machine Learning."
input_ids = tokenizer(input_text, return_tensors="pt").to("cuda")
outputs = model.generate(**input_ids)
print(tokenizer.decode(outputs[0]))
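For context, bfloat16 CUDA kernels generally require compute capability 8.0 or newer (Ampere), while the V100 is compute capability 7.0, so bf16 attention kernels like `cutlassF` may simply not exist for it. A minimal sketch to check what the current GPU supports (assuming only that `torch` is installed):

```python
import torch

# bf16 CUDA kernels typically require compute capability >= 8.0 (Ampere);
# the V100 is sm_70, which can trigger "no kernel found to launch" errors.
if torch.cuda.is_available():
    major, minor = torch.cuda.get_device_capability()
    print(f"compute capability: {major}.{minor}")
    print(f"bf16 supported: {torch.cuda.is_bf16_supported()}")
else:
    print("no CUDA device visible")
```

If `bf16 supported` comes back `False`, sticking with `torch_dtype=torch.float16` (as in the working snippet above) would be the expected workaround on this hardware.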