---
license: apache-2.0
base_model: facebook/wav2vec2-large-xlsr-53
tags:
- generated_from_trainer
datasets:
- common_voice
metrics:
- wer
model-index:
- name: wav2vec2-commonvoice-20subset-xlsr-53-gpu2
  results:
  - task:
      name: Automatic Speech Recognition
      type: automatic-speech-recognition
    dataset:
      name: common_voice
      type: common_voice
      config: zh-CN
      split: test[:20%]
      args: zh-CN
    metrics:
    - name: Wer
      type: wer
      value: 0.9377853881278538
---

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

# wav2vec2-commonvoice-20subset-xlsr-53-gpu2

This model is a fine-tuned version of [facebook/wav2vec2-large-xlsr-53](https://huggingface.co/facebook/wav2vec2-large-xlsr-53) on the `zh-CN` configuration of the common_voice dataset.
It achieves the following results on the evaluation set (the `test[:20%]` split):
- Loss: 1.7751
- Wer: 0.9378
- Cer: 0.2802
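
The gap between Wer and Cer above is easier to read with the two metrics side by side. The following is a minimal sketch, not the evaluation code used for this card: it assumes both metrics are edit-distance ratios and that WER splits text on whitespace. The example sentences and the helper names (`edit_distance`, `wer`, `cer`) are hypothetical. Under these assumptions, a Chinese transcript without spaces counts as a single "word", so one wrong character is a full word error.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (substitute, insert, delete)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate: edit distance over whitespace-separated tokens."""
    ref_words = reference.split()
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate: edit distance over individual characters."""
    return edit_distance(reference, hypothesis) / len(reference)


# Hypothetical example: one wrong character in a six-character sentence.
reference = "今天天气很好"
hypothesis = "今天天汽很好"

print(f"WER = {wer(reference, hypothesis):.4f}")  # the whole sentence is one token
print(f"CER = {cer(reference, hypothesis):.4f}")  # one of six characters is wrong
```

If the evaluation transcripts are indeed unsegmented, Cer is probably the more informative of the two numbers for this model.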

## Model description

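An automatic speech recognition model for Chinese (`zh-CN`), obtained by fine-tuning facebook/wav2vec2-large-xlsr-53. More information needed.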

## Intended uses & limitations

More information needed

## Training and evaluation data

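Evaluation used the `test[:20%]` split of the `zh-CN` configuration of common_voice. The training split is not recorded here. More information needed.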

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 13
- eval_batch_size: 2
- seed: 42
- gradient_accumulation_steps: 2
- total_train_batch_size: 26
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 500
- num_epochs: 300
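
To make the interaction of these settings concrete, here is a small sketch of the effective batch size and of a linear schedule with warmup. It is an illustration under stated assumptions, not the Trainer's code: the function name `linear_schedule_lr` is hypothetical, and `TOTAL_STEPS = 63_000` is an estimate (the results table logs step 62800 at epoch 299.05, which suggests about 210 optimizer steps per epoch over 300 epochs).

```python
BASE_LR = 5e-5          # learning_rate
WARMUP_STEPS = 500      # lr_scheduler_warmup_steps
TOTAL_STEPS = 63_000    # assumption: ~210 steps/epoch * 300 epochs

# Gradients from 2 micro-batches of 13 samples are accumulated per update.
train_batch_size = 13
gradient_accumulation_steps = 2
total_train_batch_size = train_batch_size * gradient_accumulation_steps


def linear_schedule_lr(step):
    """Learning rate at `step`: linear warmup, then linear decay to zero."""
    if step < WARMUP_STEPS:
        return BASE_LR * step / WARMUP_STEPS
    remaining = max(0, TOTAL_STEPS - step)
    return BASE_LR * remaining / (TOTAL_STEPS - WARMUP_STEPS)


print("effective batch size:", total_train_batch_size)
for step in (0, 250, 500, 31_750, 63_000):
    print(step, linear_schedule_lr(step))
```

The learning rate rises from zero to 5e-05 over the first 500 steps and then falls linearly, reaching zero at the assumed final step.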

### Training results

| Training Loss | Epoch  | Step  | Validation Loss | Wer    | Cer    |
|:-------------:|:------:|:-----:|:---------------:|:------:|:------:|
| No log        | 1.9    | 400   | 32.9239         | 1.0    | 1.0    |
| 69.7146       | 3.81   | 800   | 6.6878          | 1.0    | 1.0    |
| 7.0732        | 5.71   | 1200  | 6.4976          | 1.0    | 1.0    |
| 6.4558        | 7.62   | 1600  | 6.4214          | 1.0    | 1.0    |
| 6.2755        | 9.52   | 2000  | 6.2492          | 1.0143 | 0.9682 |
| 6.2755        | 11.43  | 2400  | 5.8545          | 1.0525 | 0.9396 |
| 5.8857        | 13.33  | 2800  | 4.4603          | 1.0742 | 0.7200 |
| 4.54          | 15.24  | 3200  | 3.7454          | 1.0297 | 0.6146 |
| 3.5614        | 17.14  | 3600  | 3.2387          | 1.0126 | 0.5582 |
| 2.9773        | 19.05  | 4000  | 2.8934          | 1.0068 | 0.5186 |
| 2.9773        | 20.95  | 4400  | 2.6116          | 0.9977 | 0.4880 |
| 2.5488        | 22.86  | 4800  | 2.4307          | 0.9932 | 0.4716 |
| 2.2665        | 24.76  | 5200  | 2.2844          | 0.9874 | 0.4532 |
| 2.0508        | 26.67  | 5600  | 2.1050          | 0.9886 | 0.4270 |
| 1.7944        | 28.57  | 6000  | 1.9768          | 0.9857 | 0.4150 |
| 1.7944        | 30.48  | 6400  | 1.8712          | 0.9789 | 0.3984 |
| 1.6074        | 32.38  | 6800  | 1.8050          | 0.9749 | 0.3916 |
| 1.4656        | 34.29  | 7200  | 1.7572          | 0.9783 | 0.3824 |
| 1.3429        | 36.19  | 7600  | 1.6546          | 0.9686 | 0.3677 |
| 1.2215        | 38.1   | 8000  | 1.6265          | 0.9726 | 0.3653 |
| 1.2215        | 40.0   | 8400  | 1.6046          | 0.9640 | 0.3625 |
| 1.1133        | 41.9   | 8800  | 1.5787          | 0.9737 | 0.3601 |
| 1.0702        | 43.81  | 9200  | 1.5449          | 0.9652 | 0.3503 |
| 0.9732        | 45.71  | 9600  | 1.5307          | 0.9572 | 0.3473 |
| 0.8858        | 47.62  | 10000 | 1.4962          | 0.9566 | 0.3394 |
| 0.8858        | 49.52  | 10400 | 1.5053          | 0.9561 | 0.3423 |
| 0.8067        | 51.43  | 10800 | 1.5053          | 0.9595 | 0.3389 |
| 0.7418        | 53.33  | 11200 | 1.4833          | 0.9566 | 0.3321 |
| 0.6962        | 55.24  | 11600 | 1.4927          | 0.9583 | 0.3312 |
| 0.6395        | 57.14  | 12000 | 1.4833          | 0.9509 | 0.3263 |
| 0.6395        | 59.05  | 12400 | 1.4908          | 0.9543 | 0.3263 |
| 0.5834        | 60.95  | 12800 | 1.4937          | 0.9521 | 0.3244 |
| 0.5422        | 62.86  | 13200 | 1.5123          | 0.9498 | 0.3206 |
| 0.4907        | 64.76  | 13600 | 1.5149          | 0.9515 | 0.3216 |
| 0.4525        | 66.67  | 14000 | 1.5079          | 0.9475 | 0.3206 |
| 0.4525        | 68.57  | 14400 | 1.5305          | 0.9469 | 0.3167 |
| 0.4229        | 70.48  | 14800 | 1.5427          | 0.9532 | 0.3235 |
| 0.3835        | 72.38  | 15200 | 1.5402          | 0.9452 | 0.3143 |
| 0.3642        | 74.29  | 15600 | 1.5569          | 0.9475 | 0.3151 |
| 0.3378        | 76.19  | 16000 | 1.5744          | 0.9463 | 0.3169 |
| 0.3378        | 78.1   | 16400 | 1.5578          | 0.9503 | 0.3122 |
| 0.3238        | 80.0   | 16800 | 1.5748          | 0.9481 | 0.3116 |
| 0.2997        | 81.9   | 17200 | 1.5708          | 0.9509 | 0.3139 |
| 0.2841        | 83.81  | 17600 | 1.5944          | 0.9521 | 0.3128 |
| 0.2573        | 85.71  | 18000 | 1.5941          | 0.9543 | 0.3108 |
| 0.2573        | 87.62  | 18400 | 1.6095          | 0.9515 | 0.3095 |
| 0.2496        | 89.52  | 18800 | 1.6170          | 0.9475 | 0.3102 |
| 0.2342        | 91.43  | 19200 | 1.6399          | 0.9469 | 0.3130 |
| 0.2261        | 93.33  | 19600 | 1.6241          | 0.9475 | 0.3099 |
| 0.2062        | 95.24  | 20000 | 1.6309          | 0.9446 | 0.3098 |
| 0.2062        | 97.14  | 20400 | 1.6360          | 0.9521 | 0.3061 |
| 0.2009        | 99.05  | 20800 | 1.6280          | 0.9526 | 0.3081 |
| 0.1916        | 100.95 | 21200 | 1.6606          | 0.9452 | 0.3053 |
| 0.1841        | 102.86 | 21600 | 1.6677          | 0.9475 | 0.3030 |
| 0.1794        | 104.76 | 22000 | 1.6625          | 0.9475 | 0.3039 |
| 0.1794        | 106.67 | 22400 | 1.6524          | 0.9481 | 0.3061 |
| 0.1718        | 108.57 | 22800 | 1.6761          | 0.9469 | 0.3085 |
| 0.174         | 110.48 | 23200 | 1.6778          | 0.9543 | 0.3048 |
| 0.1586        | 112.38 | 23600 | 1.6784          | 0.9503 | 0.3024 |
| 0.1595        | 114.29 | 24000 | 1.6844          | 0.9543 | 0.3021 |
| 0.1595        | 116.19 | 24400 | 1.6888          | 0.9463 | 0.3035 |
| 0.1494        | 118.1  | 24800 | 1.6767          | 0.9498 | 0.2984 |
| 0.141         | 120.0  | 25200 | 1.6898          | 0.9441 | 0.3044 |
| 0.139         | 121.9  | 25600 | 1.6812          | 0.9463 | 0.2990 |
| 0.1361        | 123.81 | 26000 | 1.6965          | 0.9446 | 0.2997 |
| 0.1361        | 125.71 | 26400 | 1.7046          | 0.9435 | 0.3014 |
| 0.1285        | 127.62 | 26800 | 1.6941          | 0.9463 | 0.2988 |
| 0.1273        | 129.52 | 27200 | 1.6980          | 0.9492 | 0.3008 |
| 0.1215        | 131.43 | 27600 | 1.7161          | 0.9424 | 0.2988 |
| 0.1188        | 133.33 | 28000 | 1.7033          | 0.9424 | 0.2976 |
| 0.1188        | 135.24 | 28400 | 1.7159          | 0.9446 | 0.2966 |
| 0.1183        | 137.14 | 28800 | 1.7157          | 0.9424 | 0.2965 |
| 0.118         | 139.05 | 29200 | 1.7073          | 0.9429 | 0.2932 |
| 0.1081        | 140.95 | 29600 | 1.7453          | 0.9424 | 0.2979 |
| 0.1064        | 142.86 | 30000 | 1.7120          | 0.9441 | 0.2964 |
| 0.1064        | 144.76 | 30400 | 1.7219          | 0.9418 | 0.2970 |
| 0.1028        | 146.67 | 30800 | 1.7217          | 0.9458 | 0.2960 |
| 0.1008        | 148.57 | 31200 | 1.7296          | 0.9481 | 0.2965 |
| 0.101         | 150.48 | 31600 | 1.7179          | 0.9412 | 0.2939 |
| 0.096         | 152.38 | 32000 | 1.7267          | 0.9418 | 0.2928 |
| 0.096         | 154.29 | 32400 | 1.7336          | 0.9401 | 0.2938 |
| 0.0898        | 156.19 | 32800 | 1.7229          | 0.9338 | 0.2921 |
| 0.0934        | 158.1  | 33200 | 1.7236          | 0.9406 | 0.2907 |
| 0.09          | 160.0  | 33600 | 1.7300          | 0.9378 | 0.2954 |
| 0.09          | 161.9  | 34000 | 1.7358          | 0.9435 | 0.2927 |
| 0.09          | 163.81 | 34400 | 1.7349          | 0.9452 | 0.2948 |
| 0.0886        | 165.71 | 34800 | 1.7336          | 0.9475 | 0.2935 |
| 0.0854        | 167.62 | 35200 | 1.7307          | 0.9429 | 0.2906 |
| 0.0829        | 169.52 | 35600 | 1.7329          | 0.9446 | 0.2947 |
| 0.0868        | 171.43 | 36000 | 1.7490          | 0.9446 | 0.2905 |
| 0.0868        | 173.33 | 36400 | 1.7322          | 0.9418 | 0.2929 |
| 0.0832        | 175.24 | 36800 | 1.7477          | 0.9441 | 0.2924 |
| 0.0792        | 177.14 | 37200 | 1.7541          | 0.9418 | 0.2897 |
| 0.0774        | 179.05 | 37600 | 1.7504          | 0.9424 | 0.2908 |
| 0.0754        | 180.95 | 38000 | 1.7516          | 0.9458 | 0.2925 |
| 0.0754        | 182.86 | 38400 | 1.7633          | 0.9469 | 0.2912 |
| 0.0779        | 184.76 | 38800 | 1.7526          | 0.9429 | 0.2928 |
| 0.0733        | 186.67 | 39200 | 1.7387          | 0.9412 | 0.2916 |
| 0.0765        | 188.57 | 39600 | 1.7464          | 0.9412 | 0.2900 |
| 0.0725        | 190.48 | 40000 | 1.7581          | 0.9384 | 0.2887 |
| 0.0725        | 192.38 | 40400 | 1.7424          | 0.9429 | 0.2872 |
| 0.0701        | 194.29 | 40800 | 1.7372          | 0.9401 | 0.2887 |
| 0.0707        | 196.19 | 41200 | 1.7570          | 0.9424 | 0.2904 |
| 0.0679        | 198.1  | 41600 | 1.7523          | 0.9418 | 0.2896 |
| 0.0649        | 200.0  | 42000 | 1.7767          | 0.9389 | 0.2891 |
| 0.0649        | 201.9  | 42400 | 1.7509          | 0.9412 | 0.2875 |
| 0.0654        | 203.81 | 42800 | 1.7480          | 0.9446 | 0.2878 |
| 0.0652        | 205.71 | 43200 | 1.7489          | 0.9395 | 0.2866 |
| 0.0642        | 207.62 | 43600 | 1.7609          | 0.9446 | 0.2871 |
| 0.0665        | 209.52 | 44000 | 1.7644          | 0.9412 | 0.2887 |
| 0.0665        | 211.43 | 44400 | 1.7583          | 0.9366 | 0.2882 |
| 0.0591        | 213.33 | 44800 | 1.7510          | 0.9384 | 0.2869 |
| 0.0593        | 215.24 | 45200 | 1.7632          | 0.9406 | 0.2874 |
| 0.0654        | 217.14 | 45600 | 1.7562          | 0.9418 | 0.2864 |
| 0.0571        | 219.05 | 46000 | 1.7585          | 0.9389 | 0.2850 |
| 0.0571        | 220.95 | 46400 | 1.7542          | 0.9389 | 0.2853 |
| 0.0576        | 222.86 | 46800 | 1.7625          | 0.9395 | 0.2857 |
| 0.0564        | 224.76 | 47200 | 1.7652          | 0.9384 | 0.2857 |
| 0.0566        | 226.67 | 47600 | 1.7698          | 0.9395 | 0.2829 |
| 0.0539        | 228.57 | 48000 | 1.7684          | 0.9378 | 0.2845 |
| 0.0539        | 230.48 | 48400 | 1.7737          | 0.9395 | 0.2849 |
| 0.0539        | 232.38 | 48800 | 1.7549          | 0.9349 | 0.2832 |
| 0.0533        | 234.29 | 49200 | 1.7548          | 0.9401 | 0.2839 |
| 0.0534        | 236.19 | 49600 | 1.7661          | 0.9372 | 0.2841 |
| 0.0514        | 238.1  | 50000 | 1.7680          | 0.9361 | 0.2844 |
| 0.0514        | 240.0  | 50400 | 1.7620          | 0.9384 | 0.2854 |
| 0.0504        | 241.9  | 50800 | 1.7744          | 0.9384 | 0.2841 |
| 0.0522        | 243.81 | 51200 | 1.7774          | 0.9344 | 0.2838 |
| 0.0486        | 245.71 | 51600 | 1.7739          | 0.9384 | 0.2839 |
| 0.0497        | 247.62 | 52000 | 1.7732          | 0.9389 | 0.2840 |
| 0.0497        | 249.52 | 52400 | 1.7705          | 0.9401 | 0.2842 |
| 0.0489        | 251.43 | 52800 | 1.7707          | 0.9424 | 0.2841 |
| 0.0496        | 253.33 | 53200 | 1.7754          | 0.9412 | 0.2830 |
| 0.0478        | 255.24 | 53600 | 1.7684          | 0.9429 | 0.2830 |
| 0.0515        | 257.14 | 54000 | 1.7675          | 0.9418 | 0.2815 |
| 0.0515        | 259.05 | 54400 | 1.7745          | 0.9429 | 0.2819 |
| 0.0474        | 260.95 | 54800 | 1.7783          | 0.9378 | 0.2820 |
| 0.0476        | 262.86 | 55200 | 1.7744          | 0.9418 | 0.2813 |
| 0.0448        | 264.76 | 55600 | 1.7715          | 0.9406 | 0.2822 |
| 0.0462        | 266.67 | 56000 | 1.7708          | 0.9389 | 0.2822 |
| 0.0462        | 268.57 | 56400 | 1.7719          | 0.9395 | 0.2820 |
| 0.0443        | 270.48 | 56800 | 1.7777          | 0.9384 | 0.2816 |
| 0.0445        | 272.38 | 57200 | 1.7750          | 0.9372 | 0.2807 |
| 0.0425        | 274.29 | 57600 | 1.7819          | 0.9412 | 0.2820 |
| 0.0449        | 276.19 | 58000 | 1.7765          | 0.9406 | 0.2807 |
| 0.0449        | 278.1  | 58400 | 1.7783          | 0.9366 | 0.2803 |
| 0.0434        | 280.0  | 58800 | 1.7719          | 0.9424 | 0.2809 |
| 0.0447        | 281.9  | 59200 | 1.7700          | 0.9355 | 0.2802 |
| 0.0465        | 283.81 | 59600 | 1.7747          | 0.9395 | 0.2802 |
| 0.0447        | 285.71 | 60000 | 1.7764          | 0.9384 | 0.2811 |
| 0.0447        | 287.62 | 60400 | 1.7799          | 0.9378 | 0.2807 |
| 0.0432        | 289.52 | 60800 | 1.7800          | 0.9384 | 0.2809 |
| 0.0431        | 291.43 | 61200 | 1.7785          | 0.9389 | 0.2807 |
| 0.0422        | 293.33 | 61600 | 1.7792          | 0.9395 | 0.2811 |
| 0.0418        | 295.24 | 62000 | 1.7749          | 0.9384 | 0.2807 |
| 0.0418        | 297.14 | 62400 | 1.7738          | 0.9384 | 0.2805 |
| 0.0416        | 299.05 | 62800 | 1.7751          | 0.9378 | 0.2802 |
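
The table shows validation loss bottoming out early while Cer keeps improving, so the "best" checkpoint depends on the metric chosen. The sketch below copies a handful of rows from the table (not all of them) and picks the best row per metric. The card does not say which criterion, if any, was used to select the final checkpoint, so this is only an illustration.

```python
# (step, training_loss, validation_loss, wer, cer), copied from the table above
rows = [
    (10000, 0.8858, 1.4962, 0.9566, 0.3394),
    (11200, 0.7418, 1.4833, 0.9566, 0.3321),
    (12000, 0.6395, 1.4833, 0.9509, 0.3263),
    (32800, 0.0898, 1.7229, 0.9338, 0.2921),
    (59200, 0.0447, 1.7700, 0.9355, 0.2802),
    (62800, 0.0416, 1.7751, 0.9378, 0.2802),
]

# min() returns the first row among ties, i.e. the earliest step.
best_by_loss = min(rows, key=lambda r: r[2])
best_by_wer = min(rows, key=lambda r: r[3])
best_by_cer = min(rows, key=lambda r: r[4])

print("lowest validation loss at step", best_by_loss[0])
print("lowest WER at step", best_by_wer[0])
print("lowest CER at step", best_by_cer[0])
```

Among these rows, validation loss is lowest around step 11200 while training loss continues to fall afterwards, a pattern consistent with overfitting on the loss even as the character-level error rate still decreases.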


### Framework versions

- Transformers 4.31.0
- PyTorch 1.13.1+cu117
- Datasets 2.13.1
- Tokenizers 0.13.3