
Optional flash attention import

#2
by korotas - opened

Hi! I want to use your great model on Kaggle, but I have a chain of problems:

  1. The code imports flash_attn unconditionally in the flash_attention.py file, although in the main modeling file the flash-attention import is optional;
  2. So when this model is downloaded, the Hugging Face Hub tells me that the flash_attn module needs to be installed;
  3. In the Kaggle environment flash_attn 2 cannot be used (the P100 GPU is not supported) and flash_attn 1 refuses to install.

So my question is, can you please make flash_attn import optional in the flash_attention.py file so that the model can be used without flash_attn?
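The requested fix can be sketched as a standard optional-import guard. This is only an illustration of the pattern, not the repository's actual code; the flag name `HAS_FLASH_ATTN` is an assumption introduced here:

```python
# Optional-import guard for flash_attention.py (illustrative sketch).
# If flash_attn is not installed, the import fails silently and a flag
# records its absence so callers can fall back to vanilla attention.
try:
    import flash_attn  # optional dependency
    HAS_FLASH_ATTN = True
except ImportError:
    flash_attn = None
    HAS_FLASH_ATTN = False
```

The modeling code would then check `HAS_FLASH_ATTN` before taking the flash-attention path, instead of failing at import time.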

OK, I found an alternative solution. I install flash_attn 2 (which throws an error only during inference, not during installation), download the model, and then:

for m in model.modules():
    # disable flash attention on every InternAttention module
    if m.__class__.__name__ == 'InternAttention':
        m.use_flash_attn = False

After that model works in Kaggle environment.

May I ask, does the code I wrote not work? If Flash Attention is not present in the environment, the code should not use it.

