openai/whisper-large-v3 · Update README.md

reach-vb

Jun 7, 2024

edited Jun 7, 2024

Replicating the model cards for Distil-whisper-large-v3 over to whisper-large-v3

Update README.md (1eaca33a)

sanchit-gandhi

Jun 7, 2024

edited Jun 7, 2024

Thanks for bringing these changes over from distil-large-v3 and updating accordingly! Unlike distil-whisper, large-v3 is multilingual, and we frequently get asked how to set the language/task args. It would be great to add a note on forcing or automatically detecting the source audio language, and on switching between the transcribe and translate tasks.

(nit) we can also install Transformers from the latest release rather than from main. I'll update this for distil-large-v3 now as well.

reach-vb

Jun 7, 2024

edited Jun 7, 2024

GG! It does already mention this in the Short Transcription section:

The above arguments can be used in isolation or in combination. For example, to perform the task of speech transcription where the source audio is in French, and we want to return sentence-level timestamps, the following can be used:

result = pipe(sample, return_timestamps=True, generate_kwargs={"language": "french", "task": "translate"})
print(result["chunks"])

Let me know if this doesn't make sense. (this was already there in the model card)
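For anyone landing here with the language/task question, a minimal sketch of how the two arguments combine. `build_generate_kwargs` is a hypothetical helper written for this thread, not part of Transformers; it only assembles the dict passed as `generate_kwargs`. Leaving `language` out lets Whisper detect the source language itself, `"transcribe"` keeps the output in the source language, and `"translate"` returns English text. Note that the quoted snippet passes `"task": "translate"`, so it would return English rather than French text.

```python
from typing import Optional

# The two tasks discussed in this thread.
VALID_TASKS = {"transcribe", "translate"}


def build_generate_kwargs(language: Optional[str] = None, task: str = "transcribe") -> dict:
    """Hypothetical helper: assemble the dict passed as `generate_kwargs`.

    language=None omits the key, so the language is detected automatically.
    task="transcribe" keeps the source language; task="translate" yields English.
    """
    if task not in VALID_TASKS:
        raise ValueError(f"task must be one of {sorted(VALID_TASKS)}, got {task!r}")
    kwargs = {"task": task}
    if language is not None:
        kwargs["language"] = language  # force the source language, e.g. "french"
    return kwargs


# Usage with the `pipe` and `sample` objects from the model card (not defined here):
# result = pipe(sample, return_timestamps=True,
#               generate_kwargs=build_generate_kwargs("french", "transcribe"))
# print(result["chunks"])
```

Calling `build_generate_kwargs()` with no arguments gives plain transcription with automatic language detection.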

sanchit-gandhi

Jun 10, 2024

Perfect, thanks @reach-vb 🙌

sanchit-gandhi changed pull request status to merged Jun 10, 2024