Accuracy issues with retrained tiny model for Marathi numbers

#80
by SameerMahajan - opened

Our use case is the one in https://youtu.be/L3L4mEszzTs

Basically, we want to build a 300-class classifier for the numbers 1 through 300. We want our complete Android app, including the model, to stay under a couple of hundred MB, hence I picked the tiny model.

Approach

./samples/6/6_33.wav {'text': "' Sa'am."}
./samples/6/6_34.wav {'text': "' Peace."}
./samples/6/6_35.wav {'text': "' Sa'am."}
./samples/6/6_36.wav {'text': "' Peace."}
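
To see how far off the outputs are per number, lines like the ones above can be grouped by their expected label. The following is only a rough sketch, not the script used to produce the output: it assumes the parent directory name in the path (here `6`) is the spoken number, and the helper names `parse_line` and `summarize` are made up for illustration.

```python
import ast
import re

# Assumed line format: "<path>.wav <Python dict literal>", as in the output above.
LINE_RE = re.compile(r"^(?P<path>\S+\.wav)\s+(?P<result>\{.*\})\s*$")


def parse_line(line):
    """Return (expected_label, transcript), or None if the line does not match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    # Assumption: the parent directory name is the spoken number.
    expected = int(match.group("path").split("/")[-2])
    transcript = ast.literal_eval(match.group("result"))["text"]
    return expected, transcript.strip()


def summarize(lines):
    """Group transcripts by the expected number taken from the file path."""
    by_label = {}
    for line in lines:
        parsed = parse_line(line)
        if parsed is not None:
            label, text = parsed
            by_label.setdefault(label, []).append(text)
    return by_label


sample_lines = [
    """./samples/6/6_33.wav {'text': "' Sa'am."}""",
    """./samples/6/6_34.wav {'text': "' Peace."}""",
    """./samples/6/6_35.wav {'text': "' Sa'am."}""",
    """./samples/6/6_36.wav {'text': "' Peace."}""",
]

for label, texts in summarize(sample_lines).items():
    print(label, sorted(set(texts)))
```

For the four lines above, every file under `6` comes back as one of two English-looking strings, which makes it easy to spot that the output is neither Marathi text nor a digit.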

Any ideas on what I am missing, whether there is a basic problem with the approach, and how it can be addressed?

thanks,
Sameer

Various code snippets for this are in my GitHub repo at https://github.com/sameermahajan/whisper if you want to review them, try them out, or experiment.

@sanchit-gandhi ?
