ivanlau/wav2vec2-large-xls-r-300m-cantonese

@@ -1,10 +1,6 @@
 ---
-language:
-- zh-HK
 license: apache-2.0
 tags:
-- automatic-speech-recognition
-- mozilla-foundation/common_voice_8_0
 - generated_from_trainer
 datasets:
 - common_voice
@@ -18,10 +14,10 @@ should probably proofread and complete it, then remove this comment. -->
 #
-This model is a fine-tuned version of [facebook/wav2vec2-xls-r-300m](https://huggingface.co/facebook/wav2vec2-xls-r-300m) on the MOZILLA-FOUNDATION/COMMON_VOICE_8_0 - ZH-HK dataset.
 It achieves the following results on the evaluation set:
-- Loss: 40.6968
-- Wer: 1.0
 ## Model description
@@ -41,40 +37,31 @@ More information needed
 The following hyperparameters were used during training:
 - learning_rate: 0.0003
-- train_batch_size: 8
-- eval_batch_size: 1
 - seed: 42
 - gradient_accumulation_steps: 2
-- total_train_batch_size: 16
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
 - lr_scheduler_warmup_steps: 500
-- num_epochs: 1.0
 - mixed_precision_training: Native AMP
 ### Training results
-| Training Loss | Epoch | Step | Validation Loss | Wer |
-|:-------------:|:-----:|:----:|:---------------:|:---:|
-| No log        | 0.05  | 10   | 239.0014        | 1.0 |
-| No log        | 0.1   | 20   | 235.8207        | 1.0 |
-| No log        | 0.15  | 30   | 226.9009        | 1.0 |
-| No log        | 0.21  | 40   | 198.0769        | 1.0 |
-| No log        | 0.26  | 50   | 166.6728        | 1.0 |
-| No log        | 0.31  | 60   | 149.1445        | 1.0 |
-| No log        | 0.36  | 70   | 138.4403        | 1.0 |
-| No log        | 0.41  | 80   | 131.7249        | 1.0 |
-| No log        | 0.46  | 90   | 125.5583        | 1.0 |
-| No log        | 0.51  | 100  | 119.7515        | 1.0 |
-| No log        | 0.56  | 110  | 113.7283        | 1.0 |
-| No log        | 0.62  | 120  | 107.2455        | 1.0 |
-| No log        | 0.67  | 130  | 100.2172        | 1.0 |
-| No log        | 0.72  | 140  | 92.5585         | 1.0 |
-| No log        | 0.77  | 150  | 84.2573         | 1.0 |
-| No log        | 0.82  | 160  | 75.2953         | 1.0 |
-| No log        | 0.87  | 170  | 65.6953         | 1.0 |
-| No log        | 0.92  | 180  | 55.7544         | 1.0 |
-| No log        | 0.97  | 190  | 45.7297         | 1.0 |
 ### Framework versions

 ---
 license: apache-2.0
 tags:
 - generated_from_trainer
 datasets:
 - common_voice
 #
+This model is a fine-tuned version of [facebook/wav2vec2-xls-r-300m](https://huggingface.co/facebook/wav2vec2-xls-r-300m) on the common_voice dataset.
 It achieves the following results on the evaluation set:
+- Loss: 2.6726
+- Wer: 0.9815
 ## Model description
 The following hyperparameters were used during training:
 - learning_rate: 0.0003
+- train_batch_size: 32
+- eval_batch_size: 16
 - seed: 42
 - gradient_accumulation_steps: 2
+- total_train_batch_size: 64
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
 - lr_scheduler_warmup_steps: 500
+- num_epochs: 10.0
 - mixed_precision_training: Native AMP
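
The relationship between the batch-size fields and the shape of the linear schedule can be sketched in a few lines of Python. This is a minimal illustration, not the training script itself: the function names are hypothetical, a single device is assumed (consistent with 32 × 2 = 64 in the updated card and 8 × 2 = 16 in the previous one), and the total of 1830 optimizer steps is taken from the last row of the training results table. The schedule shown is the common "linear warmup, then linear decay to zero" shape; whether the run followed it exactly is an assumption.

```python
def effective_batch_size(per_device_batch, grad_accum_steps, num_devices=1):
    """Samples contributing to one optimizer update.

    num_devices=1 is an assumption; the card does not state the device count.
    """
    return per_device_batch * grad_accum_steps * num_devices


def linear_lr(step, base_lr=3e-4, warmup_steps=500, total_steps=1830):
    """Learning rate at a given optimizer step (illustrative sketch).

    Ramps linearly from 0 to base_lr over warmup_steps, then decays
    linearly to 0 at total_steps.
    """
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    remaining = max(0, total_steps - step)
    return base_lr * remaining / (total_steps - warmup_steps)


if __name__ == "__main__":
    # Updated card: train_batch_size 32, gradient_accumulation_steps 2
    print(effective_batch_size(32, 2))
    # Previous card: train_batch_size 8, gradient_accumulation_steps 2
    print(effective_batch_size(8, 2))
    # Learning rate at the end of each epoch (183 steps per epoch in the table)
    for epoch in range(1, 11):
        print(epoch, linear_lr(epoch * 183))
```

With these values, warmup covers the first 500 of 1830 steps, so the peak learning rate is reached partway through the third epoch.
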
 ### Training results
+| Training Loss | Epoch | Step | Validation Loss | Wer    |
+|:-------------:|:-----:|:----:|:---------------:|:------:|
+| No log        | 1.0   | 183  | 47.8442         | 1.0    |
+| No log        | 2.0   | 366  | 6.3109          | 1.0    |
+| 41.8902       | 3.0   | 549  | 6.2392          | 1.0    |
+| 41.8902       | 4.0   | 732  | 5.9739          | 1.1123 |
+| 41.8902       | 5.0   | 915  | 4.9014          | 1.9474 |
+| 5.5817        | 6.0   | 1098 | 3.9892          | 1.0188 |
+| 5.5817        | 7.0   | 1281 | 3.5080          | 1.0104 |
+| 5.5817        | 8.0   | 1464 | 3.0797          | 0.9905 |
+| 3.5579        | 9.0   | 1647 | 2.8111          | 0.9836 |
+| 3.5579        | 10.0  | 1830 | 2.6726          | 0.9815 |
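
The Wer column rises above 1.0 at epochs 4 and 5 (1.1123 and 1.9474). That is possible because word error rate divides the number of substitutions, deletions, and insertions by the reference length, so a hypothesis with many extra tokens can score above 1. The sketch below illustrates this with a plain edit-distance computation. It assumes whitespace tokenization and a non-empty reference, and `word_error_rate` is a hypothetical helper; the metric implementation actually used for this card is not shown here and may differ.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the number of reference words.

    Assumes whitespace-separated tokens and a non-empty reference.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis tokens.
    prev = list(range(len(hyp) + 1))
    for i, ref_tok in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_tok in enumerate(hyp, start=1):
            cost = 0 if ref_tok == hyp_tok else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Identical strings: no errors.
    print(word_error_rate("a b", "a b"))
    # Empty hypothesis: every reference word is a deletion.
    print(word_error_rate("a b", ""))
    # Hypothesis twice as long with no matches: errors exceed reference length.
    print(word_error_rate("a b", "x y z w"))
```

An empty hypothesis yields exactly 1.0 under this definition, which is one way a flat Wer of 1.0 can arise early in training, though the card does not say what the model actually produced at those steps.
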
 ### Framework versions