mozilla-ai/whisper-small-gl

@@ -1,95 +1,35 @@
 ---
-library_name: transformers
-license: apache-2.0
 base_model: openai/whisper-small
-tags:
-- generated_from_trainer
 datasets:
-- common_voice_17_0
-metrics:
-- wer
 model-index:
-- name: whisper-small-gl
   results:
   - task:
-      name: Automatic Speech Recognition
       type: automatic-speech-recognition
     dataset:
-      name: common_voice_17_0
-      type: common_voice_17_0
-      config: gl
-      split: None
-      args: gl
     metrics:
-    - name: Wer
-      type: wer
-      value: 13.681457327541507
 ---
-<!-- This model card has been generated automatically according to the information the Trainer had access to. You
-should probably proofread and complete it, then remove this comment. -->
-# whisper-small-gl
-This model is a fine-tuned version of [openai/whisper-small](https://huggingface.co/openai/whisper-small) on the common_voice_17_0 dataset.
-It achieves the following results on the evaluation set:
-- Loss: 0.2102
-- Model Preparation Time: 0.0048
-- Wer: 13.6815
-## Model description
-More information needed
-## Intended uses & limitations
-More information needed
-## Training and evaluation data
-More information needed
-## Training procedure
-### Training hyperparameters
-The following hyperparameters were used during training:
-- learning_rate: 1e-05
-- train_batch_size: 16
-- eval_batch_size: 8
-- seed: 42
-- gradient_accumulation_steps: 2
-- total_train_batch_size: 32
-- optimizer: Use OptimizerNames.ADAMW_TORCH with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
-- lr_scheduler_type: linear
-- lr_scheduler_warmup_steps: 50
-- training_steps: 1500
-- mixed_precision_training: Native AMP
-### Training results
-| Training Loss | Epoch  | Step | Validation Loss | Model Preparation Time | Wer     |
-|:-------------:|:------:|:----:|:---------------:|:----------------------:|:-------:|
-| 0.6456        | 0.0910 | 100  | 0.3671          | 0.0048                 | 22.6031 |
-| 0.3198        | 0.1821 | 200  | 0.3064          | 0.0048                 | 19.2674 |
-| 0.2694        | 0.2731 | 300  | 0.2810          | 0.0048                 | 17.8548 |
-| 0.2549        | 0.3641 | 400  | 0.2612          | 0.0048                 | 16.6173 |
-| 0.2284        | 0.4552 | 500  | 0.2510          | 0.0048                 | 16.0931 |
-| 0.2298        | 0.5462 | 600  | 0.2402          | 0.0048                 | 15.4248 |
-| 0.2229        | 0.6372 | 700  | 0.2325          | 0.0048                 | 15.1667 |
-| 0.2116        | 0.7283 | 800  | 0.2254          | 0.0048                 | 14.8106 |
-| 0.2093        | 0.8193 | 900  | 0.2208          | 0.0048                 | 14.4523 |
-| 0.199         | 0.9103 | 1000 | 0.2168          | 0.0048                 | 14.2172 |
-| 0.1881        | 1.0009 | 1100 | 0.2140          | 0.0048                 | 14.0444 |
-| 0.1189        | 1.0919 | 1200 | 0.2128          | 0.0048                 | 13.8969 |
-| 0.118         | 1.1830 | 1300 | 0.2108          | 0.0048                 | 14.2841 |
-| 0.1149        | 1.2740 | 1400 | 0.2107          | 0.0048                 | 13.9568 |
-| 0.1141        | 1.3650 | 1500 | 0.2102          | 0.0048                 | 13.6815 |
-### Framework versions
-- Transformers 4.49.0
-- Pytorch 2.6.0+cu124
-- Datasets 3.3.1
-- Tokenizers 0.21.0

 ---
 base_model: openai/whisper-small
 datasets:
+- mozilla-foundation/common_voice_17_0
+language: gl
+library_name: transformers
+license: apache-2.0
 model-index:
+- name: Finetuned openai/whisper-small on Galician
   results:
   - task:
       type: automatic-speech-recognition
+      name: Speech-to-Text
     dataset:
+      name: Common Voice (Galician)
+      type: common_voice
     metrics:
+    - type: wer
+      value: 13.681
 ---
+# Finetuned openai/whisper-small on 35141 Galician training audio samples from mozilla-foundation/common_voice_17_0.
+This model was created from the Mozilla.ai Blueprint:
+[speech-to-text-finetune](https://github.com/mozilla-ai/speech-to-text-finetune).
+## Evaluation results on 9990 audio samples of Galician:
+### Baseline model (before finetuning) on Galician
+- Word Error Rate: 40.812
+- Loss: 1.506
+### Finetuned model (after finetuning) on Galician
+- Word Error Rate: 13.681
+- Loss: 0.21
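
The Word Error Rate figures above can be illustrated with a minimal sketch. It assumes the standard definition (word-level edit distance divided by the number of reference words, times 100); the card does not say which library or text normalization produced the reported numbers, so this will not necessarily reproduce them exactly. The function names and the sample sentences are hypothetical and chosen only for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER in percent for one reference/hypothesis pair.

    Assumption: whitespace tokenization and no text normalization,
    which may differ from what the evaluation in this card used.
    """
    ref_words = reference.split()
    hyp_words = hypothesis.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")

    # Levenshtein distance over words, keeping only one row of the DP table.
    previous = list(range(len(hyp_words) + 1))
    for i, ref_word in enumerate(ref_words, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp_words, start=1):
            substitution = previous[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current

    return 100.0 * previous[-1] / len(ref_words)


def relative_reduction(baseline: float, finetuned: float) -> float:
    """Return the relative improvement of `finetuned` over `baseline` in percent."""
    return 100.0 * (baseline - finetuned) / baseline


if __name__ == "__main__":
    # Hypothetical sentence pair: one word missing out of four.
    print(word_error_rate("o gato come peixe", "o gato come"))
    # Relative WER reduction using the two figures reported in this card.
    print(round(relative_reduction(40.812, 13.681), 1))
```

With the reported values, the relative reduction works out to roughly 66%. Corpus-level WER is usually computed by summing errors over all utterances and dividing by the total number of reference words, rather than by averaging per-utterance scores, so a per-sentence helper like this one is only a starting point.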