Does this model use attention mask?
#2 by gaunernst - opened
In preprocessor_config.json, it states that "return_attention_mask": false. However, when I look at the original code (https://github.com/asappresearch/sew), it seems to use an attention mask (https://github.com/asappresearch/sew/blob/master/sew_asapp/data/audio_feat_dataset.py). Can anyone confirm whether this model (and the other models in the SEW and SEW-D series) really does not use an attention mask?
Thank you.
For those who are wondering the same thing: SEW and SEW-D use group norm in their 1D-CNN feature extractor, similar to Wav2Vec 2.0-Base. Group norm computes its mean and variance over the time dimension, so zero-padded frames shift those statistics. Thus, although passing an attention mask is possible, zero-padded input will give different results from non-padded input, even with an attention mask.
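To illustrate the point about padding, here is a minimal sketch in plain Python. It is not the SEW or Wav2Vec 2.0 code: `normalize_over_time` is a hypothetical helper that stands in for group norm on a single channel (no learned scale or bias), and the input values are made-up toy numbers. It only shows that normalizing over the whole time axis gives different outputs for the real frames once zeros are appended.

```python
import math


def normalize_over_time(signal, eps=1e-5):
    """Normalize one channel with mean/variance pooled over all time steps.

    Simplified stand-in for group norm on a single channel: padded frames
    are included in the statistics, exactly like real frames.
    """
    mean = sum(signal) / len(signal)
    var = sum((x - mean) ** 2 for x in signal) / len(signal)
    return [(x - mean) / math.sqrt(var + eps) for x in signal]


# Hypothetical toy feature values for one channel (not real audio).
unpadded = [1.0, 2.0, 3.0, 4.0]
padded = unpadded + [0.0] * 4  # zero-padded to a longer batch length

out_unpadded = normalize_over_time(unpadded)
# Keep only the positions that correspond to real (non-padded) frames.
out_padded = normalize_over_time(padded)[: len(unpadded)]

# Largest difference between the two outputs on the real frames.
max_diff = max(abs(a - b) for a, b in zip(out_unpadded, out_padded))

print("unpadded:", out_unpadded)
print("padded:  ", out_padded)
print("outputs differ:", max_diff > 1e-3)
```

An attention mask applied later in the transformer cannot undo this, because the difference is already baked into the features for the real frames before attention runs.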
gaunernst changed discussion status to closed