The model type whisper is not supported to be used with BetterTransformer

#21
by Venkatesh4342 - opened

Until yesterday this was working; now the error above is popping up.

Whisper Distillation org
edited Dec 29, 2023

Hey @Venkatesh4342 ! Whisper now has native support for PyTorch SDPA flash attention. To use it, first upgrade your version of PyTorch to 2.1.2: https://pytorch.org/get-started/locally/

Then update Transformers to use main: https://huggingface.co/docs/transformers/installation#install-from-source

Transformers will then use PyTorch SDPA by default, alongside faster Torch STFT pre-processing, which should give you a nice speed-up overall: https://huggingface.co/docs/transformers/perf_infer_gpu_one#flashattention-and-memory-efficient-attention-through-pytorchs-scaleddotproductattention
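For reference, the kernel Transformers dispatches to is PyTorch's `torch.nn.functional.scaled_dot_product_attention`. A minimal sketch of the call (shapes chosen arbitrarily for illustration), compared against a plain attention implementation:

```python
import torch
import torch.nn.functional as F

# Arbitrary shapes: (batch, heads, seq_len, head_dim)
q = torch.randn(2, 8, 16, 64)
k = torch.randn(2, 8, 16, 64)
v = torch.randn(2, 8, 16, 64)

# Fused scaled dot-product attention (PyTorch >= 2.0); on supported
# GPUs this dispatches to a flash-attention kernel automatically.
out = F.scaled_dot_product_attention(q, k, v)

# Naive reference implementation for comparison.
scores = q @ k.transpose(-2, -1) / (q.size(-1) ** 0.5)
ref = torch.softmax(scores, dim=-1) @ v

print(torch.allclose(out, ref, atol=1e-5))
```

On CPU the fused and naive paths agree up to floating-point tolerance; the speed-up comes from the fused kernels selected on GPU.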

Otherwise, using the latest version of Optimum should resolve the issue with BetterTransformer.
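To make the choice explicit rather than relying on the default, recent Transformers versions accept an `attn_implementation` argument at load time. A minimal sketch (the `openai/whisper-tiny` checkpoint is used here only to keep the download small; substitute your own):

```python
from transformers import WhisperForConditionalGeneration

# Request PyTorch SDPA attention explicitly at load time
# (requires a recent Transformers release and PyTorch >= 2.1).
model = WhisperForConditionalGeneration.from_pretrained(
    "openai/whisper-tiny", attn_implementation="sdpa"
)

print(model.config.model_type)  # whisper
```

With this path there is no need to call BetterTransformer at all; the model uses SDPA natively.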

sanchit-gandhi changed discussion status to closed

Thanks for the quick response @sanchit-gandhi, it worked.