AndrewMcDowell/wav2vec2-xls-r-1b-japanese-hiragana-katakana

@@ -1,10 +1,6 @@
 ---
-language:
-- ja
 license: apache-2.0
 tags:
-- automatic-speech-recognition
-- mozilla-foundation/common_voice_8_0
 - generated_from_trainer
 datasets:
 - common_voice
@@ -18,11 +14,11 @@ should probably proofread and complete it, then remove this comment. -->
 #
-This model is a fine-tuned version of [facebook/wav2vec2-xls-r-1b](https://huggingface.co/facebook/wav2vec2-xls-r-1b) on the MOZILLA-FOUNDATION/COMMON_VOICE_8_0 - JA dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.6643
-- Wer: 1.0242
-- Cer: 0.1827
 ## Model description
@@ -41,35 +37,25 @@ More information needed
 ### Training hyperparameters
 The following hyperparameters were used during training:
-- learning_rate: 0.0001
 - train_batch_size: 32
 - eval_batch_size: 8
 - seed: 42
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
-- lr_scheduler_warmup_steps: 2000
 - num_epochs: 50.0
 - mixed_precision_training: Native AMP
 ### Training results
-| Training Loss | Epoch | Step  | Validation Loss | Wer    | Cer    |
-|:-------------:|:-----:|:-----:|:---------------:|:------:|:------:|
-| 1.9321        | 3.14  | 1000  | 1.0116          | 0.9823 | 0.2635 |
-| 2.0934        | 6.29  | 2000  | 1.1241          | 1.0222 | 0.2932 |
-| 2.0389        | 9.43  | 3000  | 1.2067          | 1.1325 | 0.3345 |
-| 1.9569        | 12.58 | 4000  | 0.9818          | 1.0090 | 0.2657 |
-| 1.8409        | 15.72 | 5000  | 1.0382          | 1.6480 | 0.3741 |
-| 1.7449        | 18.87 | 6000  | 0.9962          | 1.6268 | 0.3454 |
-| 1.7349        | 22.01 | 7000  | 0.9560          | 0.9850 | 0.2597 |
-| 1.6857        | 25.16 | 8000  | 0.8722          | 0.9669 | 0.2325 |
-| 1.5637        | 28.3  | 9000  | 0.7636          | 1.8071 | 0.3422 |
-| 1.5088        | 31.45 | 10000 | 0.7290          | 1.0398 | 0.2085 |
-| 1.4298        | 34.59 | 11000 | 0.7576          | 1.0166 | 0.2104 |
-| 1.3716        | 37.74 | 12000 | 0.7046          | 1.1275 | 0.2138 |
-| 1.3185        | 40.88 | 13000 | 0.7011          | 1.1696 | 0.2179 |
-| 1.28          | 44.03 | 14000 | 0.6754          | 1.1316 | 0.2024 |
-| 1.2368        | 47.17 | 15000 | 0.6925          | 1.0517 | 0.1923 |
 ### Framework versions

 ---
 license: apache-2.0
 tags:
 - generated_from_trainer
 datasets:
 - common_voice
 #
+This model is a fine-tuned version of [facebook/wav2vec2-xls-r-1b](https://huggingface.co/facebook/wav2vec2-xls-r-1b) on the common_voice dataset.
 It achieves the following results on the evaluation set:
+- Loss: 0.6183
+- Wer: 1.0213
+- Cer: 0.1797
 ## Model description
 ### Training hyperparameters
 The following hyperparameters were used during training:
+- learning_rate: 7.5e-05
 - train_batch_size: 32
 - eval_batch_size: 8
 - seed: 42
+- gradient_accumulation_steps: 4
+- total_train_batch_size: 128
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
+- lr_scheduler_warmup_steps: 1500
 - num_epochs: 50.0
 - mixed_precision_training: Native AMP
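
The hyperparameters above can be related to one another with a little arithmetic. The following is a minimal sketch, not the training script itself: `num_devices = 1` and `total_steps = 4000` are assumptions made for illustration (the card states neither), and `linear_schedule_lr` is a hypothetical helper that mirrors the usual linear warmup followed by linear decay.

```python
# Values taken from the hyperparameter list in the card.
train_batch_size = 32
gradient_accumulation_steps = 4
peak_lr = 7.5e-05
warmup_steps = 1500

# Assumption: a single device. The card does not state the device count.
num_devices = 1

# Effective batch size per optimizer step.
total_train_batch_size = train_batch_size * gradient_accumulation_steps * num_devices


def linear_schedule_lr(step, total_steps=4000):
    """Learning rate at a given optimizer step (hypothetical helper).

    total_steps=4000 is an illustrative assumption, not a value from the card.
    """
    if step < warmup_steps:
        # Linear warmup from 0 up to peak_lr.
        return peak_lr * step / max(1, warmup_steps)
    # Linear decay from peak_lr down to 0 at total_steps.
    remaining = (total_steps - step) / max(1, total_steps - warmup_steps)
    return peak_lr * max(0.0, remaining)


if __name__ == "__main__":
    print("effective batch size:", total_train_batch_size)
    for s in (0, 750, 1500, 3000, 4000):
        print(s, linear_schedule_lr(s))
```

The epoch column appears consistent with this reading: step 1000 falls at epoch 12.65 in this run versus epoch 3.14 in the previous one, roughly a factor of four, which would match accumulating gradients over 4 batches per optimizer step.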
 ### Training results
+| Training Loss | Epoch | Step | Validation Loss | Wer    | Cer    |
+|:-------------:|:-----:|:----:|:---------------:|:------:|:------:|
+| 1.7019        | 12.65 | 1000 | 1.0510          | 0.9832 | 0.2589 |
+| 1.6385        | 25.31 | 2000 | 0.6670          | 0.9915 | 0.1851 |
+| 1.4344        | 37.97 | 3000 | 0.6183          | 1.0213 | 0.1797 |
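
A WER above 1.0 alongside a CER below 0.2, as in the table above, is arithmetically possible: both metrics divide the edit distance (substitutions, deletions, insertions) by the reference length, so extra hypothesis tokens can push the ratio past 1. The sketch below illustrates this under the assumption that words are split on whitespace, which the card does not confirm. The helper names and the example strings are hypothetical, not taken from the evaluation set.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences of tokens."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        curr = [i]
        for j, h in enumerate(hyp, 1):
            curr.append(min(
                prev[j] + 1,             # deletion
                curr[j - 1] + 1,         # insertion
                prev[j - 1] + (r != h),  # substitution or match
            ))
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate, assuming whitespace-separated words."""
    ref_words = reference.split()
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate over raw characters, spaces included."""
    return edit_distance(list(reference), list(hypothesis)) / len(reference)


if __name__ == "__main__":
    # Hypothetical example: the reference has no spaces, so it is one "word".
    reference = "こんにちは"
    hypothesis = "こん にちは"  # same characters, one spurious space
    print("WER:", wer(reference, hypothesis))
    print("CER:", cer(reference, hypothesis))
```

In this toy case a single inserted space turns one reference word into two wrong ones, while only one character differs. For unsegmented Japanese text, CER is therefore likely the more informative of the two columns.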
 ### Framework versions