Optimize seamlessM4T medium model for faster performance

by sanjitaa - opened Sep 19, 2023

Sep 19, 2023

I am trying to use the SeamlessM4T medium model in my project for speech-to-text translation, but I want the model to predict faster. Inference currently takes too long. What are the best ways to speed it up?

sanjitaa

Sep 21, 2023

Can someone help me with this?

elbayadm

AI at Meta org Dec 13, 2023

@sanjitaa What task are you working on? Is it speech-to-speech? If so, I recommend using the v2 model here (it's 3x faster than large-v1).

sanjitaa

Dec 14, 2023

@elbayadm Yes, I am working on speech-to-speech. How does the v2 model perform? The performance of the SeamlessM4T v1 model was not good.

elbayadm

AI at Meta org Dec 15, 2023

edited Dec 15, 2023

The v2 model is better (more accurate in terms of ASR-BLEU, and faster); see this table from the paper:

These are averages across directions. If you have a particular translation direction in mind, check Tables 69-71 (pages 120-122) in the appendix here.

sanjitaa

Dec 21, 2023

@elbayadm How can I implement this model for speech-to-speech translation in my application?
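
One possible starting point, not something confirmed in this thread, is a minimal sketch of speech-to-speech translation through the Hugging Face Transformers integration. It assumes the checkpoint name `facebook/seamless-m4t-v2-large`, a mono waveform already sampled at 16 kHz, and a Transformers version that ships `SeamlessM4Tv2Model`; the `audios=` keyword may be named differently in other versions. The helper name `translate_speech` is hypothetical.

```python
EXPECTED_SAMPLE_RATE = 16_000  # SeamlessM4T expects 16 kHz mono input audio
MODEL_ID = "facebook/seamless-m4t-v2-large"  # assumed checkpoint name on the Hub


def translate_speech(waveform, sampling_rate, tgt_lang="eng", model_id=MODEL_ID):
    """Translate a mono waveform into speech in tgt_lang (hypothetical helper).

    Returns the synthesized speech as a 1-D NumPy array.
    """
    if sampling_rate != EXPECTED_SAMPLE_RATE:
        raise ValueError(
            f"resample to {EXPECTED_SAMPLE_RATE} Hz first, got {sampling_rate} Hz"
        )

    # Imported lazily so the sample-rate check works without the heavy dependencies.
    import torch
    from transformers import AutoProcessor, SeamlessM4Tv2Model

    device = "cuda" if torch.cuda.is_available() else "cpu"
    processor = AutoProcessor.from_pretrained(model_id)
    model = SeamlessM4Tv2Model.from_pretrained(model_id).to(device)

    # Convert the raw waveform into model input features.
    inputs = processor(
        audios=waveform, sampling_rate=sampling_rate, return_tensors="pt"
    ).to(device)

    with torch.no_grad():
        # With speech generation enabled (the default), the first element of
        # the returned tuple is the synthesized waveform.
        speech = model.generate(**inputs, tgt_lang=tgt_lang)[0]
    return speech.cpu().numpy().squeeze()
```

For brevity this sketch reloads the model on every call. In an application, loading the processor and model once at startup and reusing them avoids paying that cost on each request.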
