Which punctuator is best?

#7
by MonsterMMORPG - opened

Currently I am using felflare/bert-restore-punctuation and oliverguhr/fullstop-punctuation-multilang-large.

I feel like felflare/bert-restore-punctuation is a little bit better.

I am using a punctuator to fix the transcriptions generated by Whisper for my YouTube channel videos.

My YouTube channel (technology, education and programming): https://www.youtube.com/SECourses

In some cases Whisper starts to lose its ability to punctuate, and I don't know why. The punctuation then has to be fixed, otherwise the result is very poor as a subtitle.

So if there is a better alternative punctuator at the moment, could anyone let me know?

And this is my video on how to use Whisper, if anyone is interested: https://youtu.be/msj3wuYf3d8

@MonsterMMORPG Well, it depends on your data, I guess. The felflare/bert-restore-punctuation model can also restore casing and was trained on different data than this model. To find out which model works best for you, just create a test dataset and compare the results of both models.

Keep me posted if you have some results.
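A minimal sketch of such a comparison, in case it helps. It is not code from either model's repository. It assumes you have a few correctly punctuated reference transcripts, that each punctuator can be wrapped as a plain function from text to text, and that the punctuator only adds punctuation or casing without changing the words. The names to_labels, strip_punctuation, score and compare are hypothetical helpers, and the two "models" in the demo are toy stand-ins, not the real ones.

```python
import string

MARKS = ".,?!:;"  # punctuation marks that are scored (an assumption; adjust as needed)


def to_labels(text):
    """Split text into (lowercased word, trailing mark) pairs; mark is '' if none."""
    pairs = []
    for token in text.split():
        core = token.rstrip(MARKS)
        mark = token[len(core):][:1]  # first mark after the word, '' if none
        word = core.strip(string.punctuation).lower()
        if word:
            pairs.append((word, mark))
    return pairs


def strip_punctuation(text):
    """Build the unpunctuated, lowercased input that is fed to a punctuator."""
    return " ".join(word for word, _ in to_labels(text))


def score(reference, prediction):
    """Precision, recall and F1 over the punctuation marks that follow each word."""
    ref = to_labels(reference)
    hyp = to_labels(prediction)
    if [w for w, _ in ref] != [w for w, _ in hyp]:
        raise ValueError("prediction changed the words; cannot align")
    tp = fp = fn = 0
    for (_, r), (_, h) in zip(ref, hyp):
        if h and h == r:
            tp += 1
        else:
            if h:
                fp += 1  # a mark was predicted that is wrong or not in the reference
            if r:
                fn += 1  # a reference mark was missed or replaced
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


def compare(models, references):
    """Average F1 per model; `models` maps a name to a text -> text callable."""
    results = {}
    for name, restore in models.items():
        scores = [score(ref, restore(strip_punctuation(ref))) for ref in references]
        results[name] = sum(s["f1"] for s in scores) / len(scores)
    return results


if __name__ == "__main__":
    references = ["Hello, how are you? I am fine."]
    # Toy stand-ins; replace each with a wrapper around a real punctuator.
    toy_models = {
        "no_punctuation": lambda text: text,
        "final_period_only": lambda text: text + ".",
    }
    print(compare(toy_models, references))
```

To compare the real models, replace the toy entries with wrappers around whatever call each model exposes. Because the input is lowercased and casing is ignored when scoring, this only measures punctuation; casing restoration would need a separate check.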

oliverguhr changed discussion status to closed

@oliverguhr So this is the best model you have at the moment. Do you have plans to release an improved version?

Do you know of any newer, perhaps better one?

Thank you for the replies.

I released the code for people to improve this model on their data. I do not plan to upgrade this model at the moment.
