Can we use torch for the attention implementation?

#8
by LouiSum - opened

Currently the LM is using Triton for the attention implementation. Can we change it to torch in the config?

Yes, the model supports both torch and triton for the attn_impl kwarg, and 'torch' is the default.

So just don't pass the attn_impl kwarg to the AutoModelForCausalLM.from_pretrained call and it will default to the torch attention implementation!
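
For reference, a minimal sketch of both load paths. The repo id below is a placeholder (substitute your checkpoint), and it assumes the model's custom code accepts attn_impl as a from_pretrained kwarg, as described above:

```python
from transformers import AutoModelForCausalLM

# Default path: omit attn_impl and the model uses the torch attention implementation.
model = AutoModelForCausalLM.from_pretrained(
    "replit/replit-code-v1-3b",  # placeholder repo id; substitute your checkpoint
    trust_remote_code=True,      # needed for models that ship custom modeling code
)

# Explicit path: opt into the Triton kernel by passing attn_impl.
model_triton = AutoModelForCausalLM.from_pretrained(
    "replit/replit-code-v1-3b",
    trust_remote_code=True,
    attn_impl="triton",
)
```
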

Lmk if you continue to have trouble with this!

It works. Thanks

madhavatreplit changed discussion status to closed
