Optional flash attention import
#2 · opened by korotas
Hi! I want to use your great model on Kaggle, but I've run into a chain of problems:
- The code imports flash_attn unconditionally in the flash_attention.py file, even though the import in the main modeling file is optional;
- So when the model is downloaded, the Hugging Face Hub tells me that the flash_attn module needs to be installed;
- In the Kaggle environment flash_attn 2 cannot be used (because of the P100 GPU), and flash_attn 1 refuses to install.
So my question is: could you please make the flash_attn import optional in the flash_attention.py file as well, so that the model can be used without flash_attn?
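For reference, a minimal sketch of what such an optional-import guard could look like (the `HAS_FLASH_ATTN` flag is an illustrative assumption on my part, not the repo's actual code):

```python
# Hypothetical guard at the top of flash_attention.py.
# Assumption: downstream code checks HAS_FLASH_ATTN (or use_flash_attn)
# before calling into flash_attn, so an import failure is non-fatal.
try:
    import flash_attn  # noqa: F401
    HAS_FLASH_ATTN = True
except ImportError:
    flash_attn = None
    HAS_FLASH_ATTN = False
```

With a guard like this, the module loads on machines without flash_attn, and only fails if a code path that actually requires it is executed.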
OK, I found an alternative solution. I install flash_attn 2 (which only throws an error during inference, not during installation), download the model, and then disable flash attention on every attention module:

```python
# Turn off flash attention in each InternAttention module
for m in model.modules():
    if m.__class__.__name__ == 'InternAttention':
        m.use_flash_attn = False
```

After that, the model works in the Kaggle environment.