Does this model use an attention mask?

by gaunernst - opened Aug 16, 2023

Aug 16, 2023

preprocessor_config.json sets "return_attention_mask": false. However, the original code at https://github.com/asappresearch/sew seems to use an attention mask (see https://github.com/asappresearch/sew/blob/master/sew_asapp/data/audio_feat_dataset.py). Can anyone confirm whether this model (and the other models in the SEW and SEW-D series) really does not use an attention mask?

Thank you.

gaunernst

Aug 26, 2023

For anyone wondering the same thing: SEW and SEW-D use group norm in their 1D-CNN feature extractor, similar to Wav2Vec 2.0-Base. Group norm computes its statistics along the time axis, so padded zeros shift the mean and variance used to normalize the real frames. Thus, although using an attention mask is possible, zero-padded input will give different results from non-padded input, even with an attention mask.
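To illustrate the point, here is a minimal toy sketch in plain Python. It is not the SEW code: `normalize_over_time` is a hypothetical helper that mimics what group norm does to a single channel (normalizing over the time axis), and the feature values are made up. It only shows that appending zeros changes the normalized values of the real frames, which an attention mask applied later cannot undo.

```python
from statistics import fmean


def normalize_over_time(signal, eps=1e-5):
    """Normalize one channel over its time axis, as group norm does per group.

    Hypothetical helper for illustration; no learned scale or shift.
    """
    mean = fmean(signal)
    var = fmean([(x - mean) ** 2 for x in signal])
    return [(x - mean) / (var + eps) ** 0.5 for x in signal]


# Made-up single-channel features for a short utterance.
features = [1.0, 2.0, 3.0, 4.0]
# The same utterance zero-padded to the length of a longer one in the batch.
padded = features + [0.0] * 4

out_unpadded = normalize_over_time(features)
# Keep only the frames that correspond to real (non-padded) input.
out_padded = normalize_over_time(padded)[: len(features)]

# The zeros changed the mean and variance, so the real frames differ.
max_diff = max(abs(a - b) for a, b in zip(out_unpadded, out_padded))
print(f"max difference on real frames: {max_diff:.3f}")
```

In practice this suggests running such models with batch size 1 or with same-length inputs if exact agreement with non-padded inference matters.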

gaunernst changed discussion status to closed Aug 26, 2023
