Clarification: w2v-BERT 2.0 was first presented in SeamlessM4T v1 (not v2)

#21

by zuazo - opened Feb 23, 2024

Feb 23, 2024

Please note that the w2v-BERT 2.0 model was initially introduced in the "SeamlessM4T v1" paper, specifically in Section 4.1, available at https://arxiv.org/abs/2308.11596.

While the "SeamlessM4T v2" paper also discusses this model, it does not delve into the same level of detail as the v1 paper.

ylacombe

Mar 15, 2024

Thanks for the note! Would you like to open a Hub PR to correct this?

zuazo

Mar 15, 2024

Done here: https://huggingface.co/facebook/w2v-bert-2.0/discussions/23

Lellouch

Jun 20, 2024

The architecture is the same, but in Seamless v2 it is trained on 4.5M hours of audio, while in v1 it is trained on 1M hours of audio. And I think they only open-sourced the weights for v2.
