Update generation_config.json

#7

Corrected the alignment heads after analyzing the cross-attention weights with DTW, averaged over 20 samples from "librispeech". In tests, the timestamp alignments were better than those from "whisper-small.en", especially on shorter samples (8-10 seconds).
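For readers unfamiliar with the approach, here is a rough sketch of what DTW over cross-attention weights looks like. It is not the script used for this PR: the attention tensor is synthetic, the `(num_heads, num_tokens, num_frames)` layout is an assumption, and `dtw_path` and `align_tokens` are hypothetical helper names chosen for illustration.

```python
import numpy as np


def dtw_path(cost):
    """Return a minimum-cost monotonic path through a (tokens x frames) cost matrix."""
    n_tokens, n_frames = cost.shape
    # acc[i, j] holds the cheapest cumulative cost to reach cell (i-1, j-1).
    acc = np.full((n_tokens + 1, n_frames + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, n_tokens + 1):
        for j in range(1, n_frames + 1):
            best_prev = min(acc[i - 1, j - 1], acc[i - 1, j], acc[i, j - 1])
            acc[i, j] = cost[i - 1, j - 1] + best_prev

    # Backtrack from the last cell to the first, always taking the cheapest predecessor.
    i, j = n_tokens, n_frames
    path = [(i - 1, j - 1)]
    while (i, j) != (1, 1):
        candidates = [
            (acc[i - 1, j - 1], i - 1, j - 1),
            (acc[i - 1, j], i - 1, j),
            (acc[i, j - 1], i, j - 1),
        ]
        _, i, j = min(candidates, key=lambda c: c[0])
        path.append((i - 1, j - 1))
    return path[::-1]


def align_tokens(attn, heads):
    """Average the selected heads and align tokens to frames with DTW.

    attn: array of shape (num_heads, num_tokens, num_frames) (assumed layout).
    heads: indices of the candidate alignment heads (hypothetical selection).
    """
    selected = attn[heads]
    # Standardize each head so that no single head dominates the average
    # (an assumption of this sketch, not necessarily the PR's normalization).
    mean = selected.mean(axis=-1, keepdims=True)
    std = selected.std(axis=-1, keepdims=True)
    averaged = ((selected - mean) / (std + 1e-8)).mean(axis=0)
    # High attention should be cheap to pass through, so negate it for the cost.
    return dtw_path(-averaged)


# Synthetic example: 4 tokens attending to 12 audio frames along a rough diagonal.
frames = np.arange(12)
centers = np.array([1, 4, 7, 10])
clean = np.exp(-0.5 * (frames[None, :] - centers[:, None]) ** 2)
rng = np.random.default_rng(0)
attn = np.stack([clean + 0.05 * rng.random(clean.shape) for _ in range(3)])

path = align_tokens(attn, heads=[0, 2])
```

Each `(token, frame)` pair in `path` says which audio frame a token is aligned to. Comparing these paths across different head selections is one way to judge which heads give cleaner timestamps.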

Whisper Distillation org

Super cool, thanks for the PR @rpinto ! Do you have a script to reproduce these results to inspect the cross-attention weights from DTW to verify the alignment improves? I'd love to run it for the other distil-whisper models too to confirm we have the optimal alignment!

Whisper Distillation org

Thanks for the PR @rpinto !

Do you have a script to reproduce these results to inspect the cross-attention weights from DTW to verify the alignment improves?

I'm also quite interested in this.

Sure thing, I will upload something later today when I get back from work.

