--- license: apache-2.0 language: - he - en --- Background - This ASR model was trained on a private dataset containing approximately 310 hours of high-quality Hebrew data. Data was transcribed using professional transcription services. Model name decoding: \-\-\-\ This specific model is a faster-whisper variant, large-v2 variant, trained on version 1 of our private dataset (pd1), and saved after one epoch. Running the model - ``` # Initialize the model import faster_whisper model = faster_whisper.WhisperModel('ivrit-ai/faster-whisper-v2-pd1-e1') # Transcribe a media file segs, _ = model.transcribe(mp3_file, language='he') for seg in segs: print(seg.text) ``` The segment object contains more data such as timestamps. Feel free to explore them.