Does this model use an attention mask?

#2
by gaunernst - opened

preprocessor_config.json sets "return_attention_mask": false. However, the original code at https://github.com/asappresearch/sew seems to use an attention mask (see https://github.com/asappresearch/sew/blob/master/sew_asapp/data/audio_feat_dataset.py). Can anyone confirm whether this model (and the other models in the SEW and SEW-D series) really does not use an attention mask?

Thank you.

For those who are wondering the same thing: SEW and SEW-D use group norm in their 1D-CNN feature extractor, similar to Wav2Vec 2.0-Base. Group norm computes its mean and variance over the time axis, so zero-padded frames shift those statistics for the whole sequence. Thus, although using an attention mask is possible, zero-padded input will give different results from non-padded input, even with an attention mask.
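To make the effect concrete, here is a minimal pure-Python sketch, not the actual SEW code. It normalizes a single channel over time, which is roughly what group norm does within one group. The feature values and the `normalize_over_time` helper are hypothetical, and treating padded frames as exact zeros at this stage is a simplifying assumption.

```python
import math

def normalize_over_time(x, eps=1e-5):
    """Normalize one channel over its time axis (hypothetical helper)."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]

# Hypothetical feature values for one channel of one utterance.
features = [0.5, 1.5, -0.5, 2.0]
# The same utterance zero-padded to length 8, as it would be in a batch.
padded = features + [0.0] * 4

out_plain = normalize_over_time(features)
# Keep only the real (non-padded) frames for comparison.
out_padded = normalize_over_time(padded)[:len(features)]

# The padded zeros changed the mean and variance, so the real frames differ.
max_diff = max(abs(a - b) for a, b in zip(out_plain, out_padded))
print(max_diff)
```

An attention mask applied later in the transformer cannot undo this, because the real frames have already been normalized with statistics that include the padding.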

gaunernst changed discussion status to closed
