Regroup options?

by Dgoryeo - opened Sep 25, 2024

Sep 25, 2024

Hi,

Thank you so much for this very promising work.

I just came across your work and ran a couple of tests. I got quite long transcription lines and long durations. Would it be possible to let the user set the options for stable_ts's regroup function to allow for shorter transcription lines? Or are there other ways to break the output into smaller lines/durations, such as splitting at a space or comma?

Thanks again for your superb work!

asahi417

Kotoba Technologies org Oct 18, 2024

Hi, thanks for the feedback! I have investigated stable-ts, and it turns out that the kotoba-whisper models are unable to generate word-level (character-level) timestamps, so stable-ts cannot perform its fine-grained timestamp reorganization. Given that, stable-ts shouldn't be used with these models, as it would just make the timestamps coarser.
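
For anyone who still wants shorter lines without word-level timestamps, one rough workaround is to split each segment's text at spaces or punctuation and divide the segment's duration in proportion to character counts. The sketch below is only an illustration under stated assumptions: `split_segment` is a hypothetical helper (not part of kotoba-whisper or stable-ts), segments are assumed to be dicts with `start`, `end`, and `text` keys, and the interpolated timestamps are approximations, not real alignments.

```python
import re


def split_segment(segment, max_chars=20, delimiters="、。,. "):
    """Split one segment into shorter lines at delimiter characters.

    Hypothetical helper: timestamps are linearly interpolated by character
    count, so they only approximate when each line was actually spoken.
    """
    text = segment["text"]
    start, end = segment["start"], segment["end"]
    if not text:
        return []

    # Cut after each delimiter, keeping the delimiter with the preceding piece.
    delim_class = "[" + re.escape(delimiters) + "]"
    not_delim_class = "[^" + re.escape(delimiters) + "]"
    pieces = [p for p in re.findall(not_delim_class + "*" + delim_class + "?", text) if p]

    # Greedily merge pieces into lines of at most max_chars characters.
    # A single piece longer than max_chars is kept whole (no delimiter to cut at).
    lines, current = [], ""
    for piece in pieces:
        if current and len(current) + len(piece) > max_chars:
            lines.append(current)
            current = piece
        else:
            current += piece
    if current:
        lines.append(current)

    # Assign each line a share of the duration proportional to its length.
    total_chars = len(text)
    duration = end - start
    result, offset = [], 0
    for line in lines:
        line_start = start + duration * offset / total_chars
        offset += len(line)
        line_end = start + duration * offset / total_chars
        result.append({"start": line_start, "end": line_end, "text": line})
    return result


if __name__ == "__main__":
    example = {"start": 0.0, "end": 10.0, "text": "aaaa,bbbb,cc"}
    for line in split_segment(example, max_chars=5):
        print(line)
```

Because the split points are guessed from text length rather than from the audio, this trades timing accuracy for shorter lines; it will drift most on segments that contain long pauses.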

asahi417 changed discussion status to closed Oct 18, 2024
