sanchit-gandhi/whisper-large-v2 · To handle videos longer than one hour and to transcribe them in segments, we need to make several modifications to the yt_transcribe function.

Changes made:

Segmentation: The audio is split into segments of a specified length (default is 30 seconds). Processing one segment at a time keeps memory use bounded on long videos, and the 30-second default matches the input window that Whisper works on.
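As a rough illustration of the segmentation step, here is a minimal sketch. The name split_into_segments and its parameters are my own for this example, not part of the original code, and it assumes the audio is already decoded into a flat sequence of samples (a list or a NumPy array both work, since only slicing is used).

```python
from typing import Iterator, Sequence


def split_into_segments(
    samples: Sequence[float],
    sampling_rate: int,
    segment_seconds: float = 30.0,
) -> Iterator[Sequence[float]]:
    """Yield consecutive fixed-length slices of `samples`.

    Hypothetical helper: the final segment may be shorter than the others.
    """
    if sampling_rate <= 0 or segment_seconds <= 0:
        raise ValueError("sampling_rate and segment_seconds must be positive")
    # Number of samples that make up one segment.
    step = int(sampling_rate * segment_seconds)
    if step == 0:
        raise ValueError("segment is shorter than one sample")
    for start in range(0, len(samples), step):
        yield samples[start:start + step]
```

For example, 70 seconds of audio at a sampling rate of 16000 would come out as two 30-second segments followed by one 10-second segment.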

FFmpeg Integration: I've added a placeholder for the ffmpeg_read function. It should use FFmpeg to decode the audio for each segment and convert it to the desired format and sampling rate. You'll need to implement this function based on your requirements.
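One possible shape for ffmpeg_read is sketched below. The signature (raw bytes in, mono float32 samples out) is an assumption on my part, as are the helper names build_ffmpeg_command and decode_f32le. It pipes the payload through the ffmpeg CLI, which must be installed and on PATH. Note that the transformers library also ships a function of the same name in transformers.pipelines.audio_utils, which may be simpler to import than to reimplement.

```python
import subprocess
import sys
from array import array
from typing import List


def build_ffmpeg_command(sampling_rate: int) -> List[str]:
    """Build an ffmpeg call that reads stdin and writes mono f32le PCM to stdout."""
    return [
        "ffmpeg", "-i", "pipe:0",
        "-ac", "1",                  # downmix to a single channel
        "-ar", str(sampling_rate),   # resample to the requested rate
        "-f", "f32le",               # raw 32-bit little-endian floats
        "-hide_banner", "-loglevel", "quiet",
        "pipe:1",
    ]


def decode_f32le(raw: bytes) -> array:
    """Interpret raw bytes as little-endian float32 samples."""
    samples = array("f")
    # Drop any trailing partial sample so frombytes() does not raise.
    samples.frombytes(raw[: len(raw) - len(raw) % samples.itemsize])
    if sys.byteorder == "big":
        samples.byteswap()
    return samples


def ffmpeg_read(payload: bytes, sampling_rate: int) -> array:
    """Assumed signature: decode an encoded audio payload via the ffmpeg CLI."""
    try:
        completed = subprocess.run(
            build_ffmpeg_command(sampling_rate),
            input=payload,
            stdout=subprocess.PIPE,
            check=False,
        )
    except FileNotFoundError as error:
        raise RuntimeError("ffmpeg was not found on PATH") from error
    samples = decode_f32le(completed.stdout)
    if len(samples) == 0:
        raise ValueError("ffmpeg produced no audio; the input may be malformed")
    return samples
```

The sketch uses only the standard library, so it returns an array of floats. If the model expects a NumPy array, the result can be converted on the caller's side.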

Transcription: Each segment is transcribed separately, and the results are combined to produce the full transcription of the video.
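The transcribe-and-combine step could look roughly like this. Everything here is illustrative: transcribe_segments and combine_transcripts are names I made up, and transcribe_segment stands in for whatever callable actually runs the model on one segment and returns its text.

```python
from typing import Callable, Iterable, List, Sequence, Tuple


def transcribe_segments(
    segments: Iterable[Sequence[float]],
    transcribe_segment: Callable[[Sequence[float]], str],
    segment_seconds: float = 30.0,
) -> List[Tuple[float, str]]:
    """Transcribe each segment in order and keep its start offset in seconds.

    `transcribe_segment` is a placeholder for the real model call.
    """
    results = []
    for index, segment in enumerate(segments):
        text = transcribe_segment(segment).strip()
        if text:  # skip segments that produced no text (e.g. silence)
            results.append((index * segment_seconds, text))
    return results


def combine_transcripts(results: Iterable[Tuple[float, str]]) -> str:
    """Join the per-segment texts into one transcription."""
    return " ".join(text for _, text in results)
```

Keeping the start offsets makes it easy to show timestamps later. One caveat: cutting at fixed boundaries can split a word between two segments. The transformers automatic-speech-recognition pipeline accepts a chunk_length_s argument that does chunking internally, which may be worth trying as an alternative to manual splitting.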

Note: The _return_yt_html_embed function is referenced but not provided, so make sure you have an implementation of it. The same applies to ffmpeg_read, which is only a placeholder until you implement the audio extraction and conversion with FFmpeg.

To handle videos longer than one hour and to transcribe them in segments, we need to make several modifications to the yt_transcribe function.