marianbasti committed
Commit: c81bab3
Parent: f4d8cf2

Update README.md

Files changed (1):
1. README.md +7 -6
README.md CHANGED
@@ -8,10 +8,11 @@ library_name: transformers
  pipeline_tag: automatic-speech-recognition
  tags:
  - spanish
+ - español
  - speech
  - recognition
  - whisper
- - distl-whisper
+ - distil-whisper
  ---
 
  # distil-whisper-large-v3-es
@@ -155,7 +156,7 @@ print(result["text"])
  ```
  ## Training
 
- The model was trained for 40,000 optimisation steps (or 0.98 epochs), on a single RTX3090 for ~30 hours, using the following training parameters:
+ The model was trained for 60,000 optimisation steps (or around 1.47 epochs), on a single RTX3090 for ~60 hours, using the following training parameters:
  ```
  --teacher_model_name_or_path "openai/whisper-large-v3"
  --train_dataset_name "mozilla-foundation/common_voice_16_1"
@@ -166,14 +167,14 @@ The model was trained for 40,000 optimisation steps (or 0.98 epochs), on a singl
  --eval_dataset_config_name "es"
  --eval_split_name "validation"
  --eval_text_column_name "sentence"
- --eval_steps 5000
- --save_steps 5000
+ --eval_steps 10000
+ --save_steps 10000
  --warmup_steps 500
  --learning_rate 1e-4
  --lr_scheduler_type "linear"
  --logging_steps 25
  --save_total_limit 1
- --max_steps 40000
+ --max_steps 60000
  --wer_threshold 10
  --per_device_train_batch_size 8
  --per_device_eval_batch_size 8
@@ -192,7 +193,7 @@ The model was trained for 40,000 optimisation steps (or 0.98 epochs), on a singl
 
  ## Results
 
- The distilled model performs with a 5.874% normalized WER. Further training would yield better results
+ The distilled model performs with a 5.11% WER (10.15% orthogonal WER).
 
  ## License
 
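A quick consistency check on the step and epoch figures changed above: the old numbers (40,000 steps at 0.98 epochs) and the new ones (60,000 steps at around 1.47 epochs) imply the same training-set size, assuming an effective batch size of 8 taken from `--per_device_train_batch_size` on a single GPU with no gradient accumulation (an assumption; no accumulation flag appears in the excerpt). A minimal sketch:

```python
# Consistency check for the card's step/epoch figures. Assumes an effective
# batch size of 8 (per_device_train_batch_size on one GPU, no gradient
# accumulation -- an assumption, since no accumulation flag is shown above).
effective_batch_size = 8

# Old card: 40,000 steps at ~0.98 epochs -> implied training-set size.
implied_examples = 40_000 * effective_batch_size / 0.98
print(f"implied training examples: {implied_examples:,.0f}")  # ~326,531

# New card: 60,000 steps over the same data.
epochs = 60_000 * effective_batch_size / implied_examples
print(f"epochs at 60,000 steps: {epochs:.2f}")  # ~1.47, matching the update
```

Both readings point to roughly 326,000 training examples, so the updated epoch count is arithmetically consistent with the old one.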
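For context, the `print(result["text"])` line that anchors the second hunk comes from the card's usage snippet, which presumably resembles the following sketch. The repo id `marianbasti/distil-whisper-large-v3-es` is inferred from the committer name and the model title, and `audio.mp3` is a placeholder path; both are assumptions.

```python
from transformers import pipeline

# Repo id inferred from the committer name and model title (an assumption);
# adjust if the model is published under a different namespace.
asr = pipeline(
    "automatic-speech-recognition",
    model="marianbasti/distil-whisper-large-v3-es",
)

# "audio.mp3" is a placeholder path to a Spanish speech recording.
result = asr("audio.mp3")
print(result["text"])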
 
```
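On the updated Results line, "orthogonal WER" presumably means orthographic (unnormalized) WER, which would explain why it is higher than the 5.11% figure: casing and punctuation mismatches count as word errors. A minimal sketch of how the two scores might be computed, using the `evaluate` WER metric and Whisper's multilingual `BasicTextNormalizer`, with placeholder strings standing in for real references and predictions:

```python
import evaluate
from transformers.models.whisper.english_normalizer import BasicTextNormalizer

wer_metric = evaluate.load("wer")
normalizer = BasicTextNormalizer()  # lowercases, strips punctuation/symbols

# Placeholder data; a real evaluation would use Common Voice transcripts
# and the model's transcriptions.
references = ["Hola, ¿cómo estás?"]
predictions = ["hola cómo estás"]

# Orthographic WER: raw strings, so casing and punctuation count as errors.
orthographic = wer_metric.compute(references=references, predictions=predictions)

# Normalized WER: both sides are normalized before scoring.
normalized = wer_metric.compute(
    references=[normalizer(r) for r in references],
    predictions=[normalizer(p) for p in predictions],
)
print(f"orthographic WER: {orthographic:.2%}  normalized WER: {normalized:.2%}")
```

The normalized score is lower because casing and punctuation differences no longer count as word errors.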