Optimize seamlessM4T medium model for faster performance

#7
by sanjitaa - opened

I am trying to use the SeamlessM4T medium model in my project for speech-to-text translation, but inference takes too long. What are the best ways to make the model respond/predict faster?

Can someone help me with this?

AI at Meta org

@sanjitaa What task are you working on? Is it speech-to-speech? If so, I recommend using the v2 model here (it's 3x faster than large-v1).
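Since the 3x figure is relative to large-v1 rather than medium, it may be worth timing both models on your own hardware before switching. A minimal sketch, using only the Python standard library; `measure_latency` is a hypothetical helper name, and `fn` stands for whatever function wraps your model call:

```python
import statistics
import time


def measure_latency(fn, *args, warmup=1, repeats=5, **kwargs):
    """Return the median wall-clock time (in seconds) of calling fn.

    `fn` is any callable, e.g. a function that runs one translation.
    Warm-up calls are executed first and not timed, so one-off costs
    (lazy initialisation, caches) do not skew the result.
    """
    for _ in range(warmup):
        fn(*args, **kwargs)

    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn(*args, **kwargs)
        samples.append(time.perf_counter() - start)

    # The median is less sensitive to an occasional slow run than the mean.
    return statistics.median(samples)
```

Calling it with the same audio clip for each model gives two numbers you can compare directly.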

@elbayadm Yes, I am working on speech-to-speech. How does the v2 model perform? The performance of the v1 SeamlessM4T model was not very good.

The v2 model is better (more accurate in terms of ASR-BLEU, and faster); see this table from the paper:
Screenshot 2023-12-15 at 2.17.37 PM.png
These are averages across directions, but if you have a particular translation direction in mind, check Tables 69-71 (pages 120-122) in the appendix here.

@elbayadm How can I implement this model for speech-to-speech translation in my application?
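One possible starting point, as a rough sketch rather than an official recipe: it assumes the Hugging Face `transformers` library (with its `SeamlessM4Tv2Model` class), the `facebook/seamless-m4t-v2-large` checkpoint, and 16 kHz mono input audio. None of these are stated in this thread. The helper names `load_v2` and `speech_to_speech` are made up for illustration, and the `audios=` keyword may be named differently depending on your `transformers` version.

```python
def load_v2(checkpoint="facebook/seamless-m4t-v2-large"):
    """Load the processor and model (assumed checkpoint name).

    The import is local so this file can be imported even when
    transformers is not installed.
    """
    from transformers import AutoProcessor, SeamlessM4Tv2Model

    processor = AutoProcessor.from_pretrained(checkpoint)
    model = SeamlessM4Tv2Model.from_pretrained(checkpoint)
    return processor, model


def speech_to_speech(processor, model, waveform, tgt_lang, sampling_rate=16000):
    """Translate one input waveform into speech in `tgt_lang`.

    `waveform` is the raw audio samples (assumed 16 kHz mono).
    `tgt_lang` is a three-letter language code such as "fra".
    Returns the first element of the generate() output, which for
    speech generation is the translated waveform.
    """
    inputs = processor(
        audios=waveform, sampling_rate=sampling_rate, return_tensors="pt"
    )
    output = model.generate(**inputs, tgt_lang=tgt_lang)
    return output[0]


# Example usage (downloads the checkpoint on first run):
#   processor, model = load_v2()
#   translated = speech_to_speech(processor, model, waveform, "fra")
#   audio = translated.cpu().numpy().squeeze()
```

Loading the model once at application start-up and reusing it for every request avoids paying the load time on each call. Moving the model and inputs to a GPU is left out here to keep the sketch short.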
