Model Card for ivrit-ai/whisper-large-v3

This model is a Hebrew finetune (continued training) of the OpenAI Whisper Large v3 model.

Model Details

Model Description

  • Developed by: ivrit-ai
  • Language(s) (NLP): Hebrew
  • License: Apache-2.0
  • Finetuned from model: openai/whisper-large-v3

Bias, Risks, and Limitations

The language detection capability of this model was degraded during training; it is intended for transcribing mostly-Hebrew audio. The language token should be explicitly set to Hebrew.

Additionally, the translation task was not trained and has also degraded. This model cannot translate in any reasonable capacity.

How to Get Started with the Model

Please follow the original model card (openai/whisper-large-v3) for usage details, replacing the model name with this model's name. You can also find other weight formats and quantizations on the ivrit-ai Hugging Face page.
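
As a rough illustration of the advice above, the following sketch uses the Transformers `pipeline` API and forces the Hebrew language token, since language detection is degraded. It is not taken from the original model card: the helper names `hebrew_generate_kwargs` and `transcribe`, the 30-second chunk length, and the example file name are assumptions made for illustration.

```python
MODEL_ID = "ivrit-ai/whisper-large-v3"


def hebrew_generate_kwargs():
    """Generation arguments that pin the language token to Hebrew.

    Language detection is degraded in this model, so the language is set
    explicitly; translation is not supported, so the task is "transcribe".
    """
    return {"language": "he", "task": "transcribe"}


def transcribe(audio_path, model_id=MODEL_ID):
    """Transcribe one audio file (hypothetical helper, not from the model card)."""
    # Imported lazily so that importing this module does not load Transformers.
    from transformers import pipeline

    asr = pipeline(
        "automatic-speech-recognition",
        model=model_id,
        chunk_length_s=30,  # assumption: chunk long audio into 30 s windows
    )
    result = asr(audio_path, generate_kwargs=hebrew_generate_kwargs())
    return result["text"]


# Example call (downloads the model weights on first use):
# print(transcribe("interview.wav"))
```

Decoding a file path this way relies on ffmpeg being installed; passing a NumPy array of samples is an alternative.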

Training Details

Training Data

This model was trained on the "Crowd Transcribe" and "Crowd Recital" datasets described under Preprocessing.

Training Procedure

This model is a weighted average of the lowest-eval-loss checkpoints from two separate runs with the same setup. Training code can be found on the ivrit-ai GitHub.
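
Checkpoint averaging of this kind can be sketched as follows. This is only an illustration, not the project's training code: the function name `weighted_average_state_dicts` is hypothetical, and the weights shown are placeholders because the card does not state the actual ones. The function works on any values that support `*` and `+`, such as PyTorch tensors from `model.state_dict()`; plain floats are used here to keep the example self-contained.

```python
def weighted_average_state_dicts(state_dicts, weights):
    """Return a parameter-wise weighted average of several checkpoints."""
    if len(state_dicts) != len(weights):
        raise ValueError("need exactly one weight per checkpoint")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    normalized = [w / total for w in weights]  # make the weights sum to 1

    keys = state_dicts[0].keys()
    for sd in state_dicts[1:]:
        if sd.keys() != keys:
            raise ValueError("checkpoints must share the same parameter names")

    averaged = {}
    for key in keys:
        acc = state_dicts[0][key] * normalized[0]
        for sd, w in zip(state_dicts[1:], normalized[1:]):
            acc = acc + sd[key] * w
        averaged[key] = acc
    return averaged


# Toy stand-ins for the best checkpoints of two runs.
run_a = {"layer.weight": 1.0, "layer.bias": 2.0}
run_b = {"layer.weight": 3.0, "layer.bias": 4.0}
merged = weighted_average_state_dicts([run_a, run_b], weights=[0.5, 0.5])
```

With real checkpoints the averaged dictionary would be loaded back with `model.load_state_dict(merged)`.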

Preprocessing

The "Crowd Recital" dataset contains timestamps and previous text in the input format Whisper expects. Timestamps were used across all 50h of this dataset, and 50% of the previous text was used.
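
One possible reading of "50% of the previous text was used" is that each training example keeps its previous-text context with probability 0.5. The sketch below illustrates that interpretation only; the actual logic lives in the training code repository, and the field names `text` and `prev_text` as well as the function `drop_prev_text` are hypothetical.

```python
import random


def drop_prev_text(example, keep_prob=0.5, rng=random):
    """Return a copy of `example`, blanking its previous text with probability 1 - keep_prob."""
    out = dict(example)  # copy so the input example is left untouched
    if rng.random() >= keep_prob:
        out["prev_text"] = ""
    return out


rng = random.Random(0)  # seeded for reproducibility
examples = [
    {"text": f"segment {i}", "prev_text": f"segment {i - 1}"}
    for i in range(1, 4001)
]
processed = [drop_prev_text(ex, rng=rng) for ex in examples]
kept_share = sum(bool(ex["prev_text"]) for ex in processed) / len(processed)
print(f"examples that kept previous text: {kept_share:.2%}")
```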

The "Crowd Transcribe" dataset has no timestamps or previous text, so its preprocessing only included mel-spectrogram feature extraction and text encoding.

Preprocessing code can be found within the training code repository.

Datasets were interleaved at a 0.95:0.05 ratio (crowd-transcribe:crowd-recital).

Training Hyperparameters

  • Training regime: bf16 mixed precision with SDPA (scaled dot-product attention)
  • Learning Rate: 1e-5 with linear decay and 800 warmup steps, scheduled for 3 epochs
  • Batch Size: 32
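
The learning-rate schedule listed above (linear warmup to 1e-5 over 800 steps, then linear decay to zero) can be written out as a small function. This is a sketch under stated assumptions: the card does not give the total number of optimizer steps, so `total_steps` below is a placeholder, and `linear_schedule_lr` is a hypothetical name. In Transformers, `get_linear_schedule_with_warmup` produces a schedule of this shape.

```python
def linear_schedule_lr(step, peak_lr=1e-5, warmup_steps=800, total_steps=10_000):
    """Learning rate at `step` for linear warmup followed by linear decay.

    `total_steps` is a placeholder; the real value depends on dataset size,
    batch size (32 in this card), and the number of scheduled epochs.
    """
    if step < warmup_steps:
        # Warmup: ramp linearly from 0 up to the peak learning rate.
        return peak_lr * step / warmup_steps
    # Decay: ramp linearly from the peak down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / (total_steps - warmup_steps)


for step in (0, 400, 800, 5_400, 10_000):
    print(step, linear_schedule_lr(step))
```

Because the run was stopped before the scheduled end, the learning rate would not have decayed all the way to zero under this schedule.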

Training Hardware / Duration

  • GPU Type: Single Nvidia L40S machine
  • Duration: 24h run, stopped at 2 epochs

Evaluation

Please refer to the ivrit-ai/hebrew-transcription-leaderboard for evaluation results.
