---
language:
- pl
license: apache-2.0
tags:
- whisper-event
- generated_from_trainer
datasets:
- mozilla-foundation/common_voice_11_0
- google/fleurs
base_model: openai/whisper-small
model-index:
- name: Whisper Small PL
  results:
  - task:
      type: automatic-speech-recognition
      name: Automatic Speech Recognition
    dataset:
      name: mozilla-foundation/common_voice_11_0
      type: mozilla-foundation/common_voice_11_0
      config: pl
      split: test
    metrics:
    - type: wer
      value: 14.57
      name: WER
    - type: wer_without_norm
      value: 33.57
      name: WER unnormalized
    - type: cer
      value: 4.02
      name: CER
    - type: mer
      value: 14.37
      name: MER
  - task:
      type: automatic-speech-recognition
      name: Automatic Speech Recognition
    dataset:
      name: facebook/voxpopuli
      type: facebook/voxpopuli
      config: pl
      split: test
    metrics:
    - type: wer
      value: 15.73
      name: WER
    - type: wer_without_norm
      value: 34.51
      name: WER unnormalized
    - type: cer
      value: 7.73
      name: CER
    - type: mer
      value: 15.28
      name: MER
  - task:
      type: automatic-speech-recognition
      name: Automatic Speech Recognition
    dataset:
      name: google/fleurs
      type: google/fleurs
      config: pl_pl
      split: test
    metrics:
    - type: wer
      value: 16.79
      name: WER
    - type: wer_without_norm
      value: 35.69
      name: WER unnormalized
    - type: cer
      value: 4.99
      name: CER
    - type: mer
      value: 16.55
      name: MER
---


# Whisper Small PL

This model is a fine-tuned version of [openai/whisper-small](https://huggingface.co/openai/whisper-small) for Polish automatic speech recognition, trained on the Common Voice 11.0 and FLEURS datasets.
It achieves the following results on the evaluation set:
- eval_loss: 0.3571
- eval_wer: 14.8004
- eval_runtime: 2233.4204
- eval_samples_per_second: 3.714
- eval_steps_per_second: 0.232
- epoch: 4.03
- step: 3000
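
The card reports word error rate (WER) and character error rate (CER), both with and without text normalization. The following is a minimal sketch of how such metrics are commonly computed from an edit distance. It is not the evaluation script used for this model. The `simple_normalize` helper and the sample sentences are hypothetical, and the real evaluation most likely used a different normalizer.

```python
import re


def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (substitutions, insertions, deletions)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate in percent: word-level edits divided by reference word count."""
    ref_words = reference.split()
    return 100.0 * edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate in percent: character-level edits divided by reference length."""
    return 100.0 * edit_distance(list(reference), list(hypothesis)) / len(reference)


def simple_normalize(text):
    # Hypothetical normalizer (an assumption, not the one used for this card):
    # lowercase, drop punctuation, collapse whitespace.
    return " ".join(re.sub(r"[^\w\s]", "", text.lower()).split())


# Hypothetical reference transcript and model output.
reference = "Dzień dobry, jak się masz?"
hypothesis = "dzień dobry jak sie masz"

raw_wer = wer(reference, hypothesis)
norm_wer = wer(simple_normalize(reference), simple_normalize(hypothesis))
norm_cer = cer(simple_normalize(reference), simple_normalize(hypothesis))
print(f"WER unnormalized: {raw_wer:.2f}  WER: {norm_wer:.2f}  CER: {norm_cer:.2f}")
```

In this toy example, casing and punctuation differences count as word errors before normalization, which is one plausible reason the unnormalized WER in the metadata is much higher than the normalized WER.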

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 1e-05
- train_batch_size: 24
- eval_batch_size: 16
- seed: 42
- gradient_accumulation_steps: 2
- total_train_batch_size: 48
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 500
- training_steps: 8000
- mixed_precision_training: Native AMP
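
To make the hyperparameters above more concrete, here is a small sketch of two things they imply: the learning rate at a given step under a linear schedule with warmup, and how the total train batch size relates to the per-device batch size and gradient accumulation. The function name `linear_schedule_lr` is hypothetical, the schedule formula is the usual linear warmup followed by linear decay to zero, and the single-device conclusion is an inference from the listed numbers rather than something the card states.

```python
def linear_schedule_lr(step, base_lr=1e-5, warmup_steps=500, total_steps=8000):
    """Learning rate at `step` for linear warmup followed by linear decay to zero.

    Defaults mirror the values listed on this card; the formula itself is
    assumed to be the standard linear-with-warmup schedule.
    """
    if step < warmup_steps:
        # Warmup: ramp linearly from 0 up to base_lr.
        return base_lr * step / max(1, warmup_steps)
    # Decay: ramp linearly from base_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Effective batch size = per-device batch size * accumulation steps * devices.
train_batch_size = 24
gradient_accumulation_steps = 2
total_train_batch_size = 48
implied_devices = total_train_batch_size // (train_batch_size * gradient_accumulation_steps)

lr_at_reported_step = linear_schedule_lr(3000)
print(f"implied devices: {implied_devices}")
print(f"learning rate at step 3000: {lr_at_reported_step:.3e}")
```

Under these assumptions, the checkpoint evaluated at step 3000 was trained with a learning rate that had already decayed to about two thirds of its peak value.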

### Framework versions

- Transformers 4.26.0.dev0
- PyTorch 1.13.0+cu117
- Datasets 2.7.1.dev0
- Tokenizers 0.13.2