PereLluis13 committed on
Commit 3b10db1
1 Parent(s): 89fc6f3

update model
README.md CHANGED
@@ -77,20 +77,20 @@ model-index:
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
 should probably proofread and complete it, then remove this comment. -->

- # wav2vec2-xls-r-300m-ca-lm

- This model is a fine-tuned version of [facebook/wav2vec2-xls-r-300m](https://huggingface.co/facebook/wav2vec2-xls-r-300m) on the MOZILLA-FOUNDATION/COMMON_VOICE_8_0 - CA dataset.
- It achieves the following results on the averaged across datasets test set (without the LM):
- - Loss: 0.2758
- - Wer: 0.1792

 ## Model description

- More information needed

 ## Intended uses & limitations

- More information needed

 ## Training and evaluation data

@@ -98,6 +98,8 @@ More information needed
 ## Training procedure

 ### Training hyperparameters

 The following hyperparameters were used during training:
@@ -110,10 +112,12 @@ The following hyperparameters were used during training:
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
 - lr_scheduler_warmup_steps: 2000
- - num_epochs: 6.0
 - mixed_precision_training: Native AMP

- ### Training results (without LM)

 | Training Loss | Epoch | Step | Validation Loss | Wer |
 |:-------------:|:-----:|:-----:|:---------------:|:------:|
@@ -162,10 +166,32 @@ The following hyperparameters were used during training:
 | 1.0805 | 11.45 | 21500 | 0.2561 | 0.1524 |
 | 1.0722 | 11.72 | 22000 | 0.2540 | 0.1566 |
 | 1.0763 | 11.99 | 22500 | 0.2549 | 0.1572 |

 ### Framework versions

 - Transformers 4.16.0.dev0
 - Pytorch 1.10.1+cu102
- - Datasets 1.18.1
 - Tokenizers 0.11.0
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
 should probably proofread and complete it, then remove this comment. -->

+ # wav2vec2-xls-r-300m-ca

+ This model is a fine-tuned version of [facebook/wav2vec2-xls-r-300m](https://huggingface.co/facebook/wav2vec2-xls-r-300m) on the MOZILLA-FOUNDATION/COMMON_VOICE_8_0 - CA, [tv3_parla](https://huggingface.co/datasets/collectivat/tv3_parla) and [parlament_parla](https://huggingface.co/datasets/projecte-aina/parlament_parla) datasets.
+ It achieves the following results on the evaluation set (for the three datasets, without the LM; see the evaluation sketch below):
+ - Loss: 0.2472
+ - Wer: 0.1499
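A WER figure like the one above comes from plain greedy CTC decoding, without the LM. The snippet below is only a sketch of such an evaluation loop, not the exact script behind this card; the repo id `PereLluis13/wav2vec2-xls-r-300m-ca`, the subset size, and the bare lowercasing of references are all assumptions.

```python
import torch
from datasets import Audio, load_dataset, load_metric
from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor

repo = "PereLluis13/wav2vec2-xls-r-300m-ca"  # assumed repo id
processor = Wav2Vec2Processor.from_pretrained(repo)
model = Wav2Vec2ForCTC.from_pretrained(repo).eval()

# Common Voice 8.0 Catalan test split (a gated dataset: may need use_auth_token=True),
# resampled to the 16 kHz the model expects.
ds = load_dataset("mozilla-foundation/common_voice_8_0", "ca", split="test")
ds = ds.cast_column("audio", Audio(sampling_rate=16_000))

predictions, references = [], []
for sample in ds.select(range(100)):  # small subset for a quick check
    inputs = processor(sample["audio"]["array"], sampling_rate=16_000, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits
    pred_ids = torch.argmax(logits, dim=-1)       # greedy CTC decoding, no LM
    predictions.append(processor.batch_decode(pred_ids)[0])
    references.append(sample["sentence"].lower())

wer = load_metric("wer")
print(f"WER: {wer.compute(predictions=predictions, references=references):.4f}")
```

Decoding with an external language model (e.g. via pyctcdecode) is a separate step and would change these numbers.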

 ## Model description

+ Please check the original [facebook/wav2vec2-xls-r-300m](https://huggingface.co/facebook/wav2vec2-xls-r-300m) model card; this is just a fine-tuned version of that model.

 ## Intended uses & limitations

+ Like any model trained on crowdsourced data, this model may reflect the biases and particularities of the data it was trained on. Moreover, since it is a speech recognition model, it may underperform for some lower-resourced dialects of the Catalan language.

 ## Training and evaluation data

 ## Training procedure

+ The data is preprocessed to remove characters not in the Catalan alphabet. Moreover, numbers are verbalized using code provided by [@ccoreilly](https://github.com/ccoreilly), which can be found in the text/ folder or [here](https://github.com/CollectivaT-dev/catotron-cpu/blob/master/text/numbers_ca.py). A sketch of this preprocessing is shown below.
+
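A minimal sketch of that preprocessing, assuming a simplified character inventory and a stand-in for the linked numbers_ca.py helper; both `CATALAN_CHARS` and `verbalize_number` below are illustrative, not the actual implementation:

```python
import re

# Assumption: a simplified Catalan character inventory; the real script may differ.
CATALAN_CHARS = set("abcdefghijklmnopqrstuvwxyzàèéíïòóúüç·' -")

def verbalize_number(match):
    # Stand-in for the numbers_ca.py helper linked above, which spells
    # digit sequences out as Catalan words (e.g. "25" -> "vint-i-cinc").
    return match.group(0)

def preprocess(sentence: str) -> str:
    sentence = sentence.lower()
    # Verbalize digits first, so they survive the character filter below.
    sentence = re.sub(r"\d+", verbalize_number, sentence)
    # Drop anything outside the Catalan alphabet (plus space, hyphen, apostrophe).
    sentence = "".join(ch for ch in sentence if ch in CATALAN_CHARS)
    return re.sub(r"\s+", " ", sentence).strip()

# With the stand-in this prints "té gats i gossos"; the real helper
# would spell the numbers out instead of letting the filter drop them.
print(preprocess("Té 3 gats i 2 gossos."))
```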
 ### Training hyperparameters

 The following hyperparameters were used during training:
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
 - lr_scheduler_warmup_steps: 2000
+ - num_epochs: 18.0
 - mixed_precision_training: Native AMP
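These settings map fairly directly onto `transformers.TrainingArguments`. A hypothetical reconstruction follows; only the fields listed above are grounded, while `output_dir`, `learning_rate` and batch sizes are placeholders, since they are not shown in this excerpt:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="./wav2vec2-xls-r-300m-ca",  # placeholder
    learning_rate=3e-4,                     # placeholder: not listed above
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_steps=2000,
    num_train_epochs=18.0,
    fp16=True,  # "Native AMP" mixed-precision training
)
```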

+ ### Training results
+
+ Check the Tensorboard tab for the training profile and evaluation results over the course of training. The model was evaluated on the test splits of each of the datasets used during training.

 | Training Loss | Epoch | Step | Validation Loss | Wer |
 |:-------------:|:-----:|:-----:|:---------------:|:------:|
 | 1.0805 | 11.45 | 21500 | 0.2561 | 0.1524 |
 | 1.0722 | 11.72 | 22000 | 0.2540 | 0.1566 |
 | 1.0763 | 11.99 | 22500 | 0.2549 | 0.1572 |
+ | 1.0835 | 12.25 | 23000 | 0.2586 | 0.1521 |
+ | 1.0883 | 12.52 | 23500 | 0.2583 | 0.1519 |
+ | 1.0888 | 12.79 | 24000 | 0.2551 | 0.1582 |
+ | 1.0933 | 13.05 | 24500 | 0.2628 | 0.1537 |
+ | 1.0799 | 13.32 | 25000 | 0.2600 | 0.1508 |
+ | 1.0804 | 13.59 | 25500 | 0.2620 | 0.1475 |
+ | 1.0814 | 13.85 | 26000 | 0.2537 | 0.1517 |
+ | 1.0693 | 14.12 | 26500 | 0.2560 | 0.1542 |
+ | 1.0724 | 14.38 | 27000 | 0.2540 | 0.1574 |
+ | 1.0704 | 14.65 | 27500 | 0.2548 | 0.1626 |
+ | 1.0729 | 14.92 | 28000 | 0.2548 | 0.1601 |
+ | 1.0724 | 15.18 | 28500 | 0.2511 | 0.1512 |
+ | 1.0655 | 15.45 | 29000 | 0.2498 | 0.1490 |
+ | 1.0608 | 15.98 | 30000 | 0.2487 | 0.1481 |
+ | 1.0541 | 16.52 | 31000 | 0.2468 | 0.1504 |
+ | 1.0584 | 17.05 | 32000 | 0.2467 | 0.1493 |
+ | 1.0507 | 17.58 | 33000 | 0.2481 | 0.1517 |
+

 ### Framework versions

 - Transformers 4.16.0.dev0
 - Pytorch 1.10.1+cu102
+ - Datasets 1.18.3
 - Tokenizers 0.11.0
+
+ # Thanks
+
+ I want to thank both [@ccoreilly](https://github.com/ccoreilly) and [@gullabi](https://github.com/gullabi), who have contributed their own resources and knowledge to making this model possible.
eval_results.json CHANGED
@@ -1,9 +1,9 @@
 {
- "epoch": 12.0,
- "eval_loss": 0.25491979718208313,
- "eval_runtime": 392.0567,
 "eval_samples": 4297,
- "eval_samples_per_second": 10.96,
- "eval_steps_per_second": 0.344,
- "eval_wer": 0.15725760362438562
 }
 {
+ "epoch": 18.0,
+ "eval_loss": 0.2472492903470993,
+ "eval_runtime": 373.4142,
 "eval_samples": 4297,
+ "eval_samples_per_second": 11.507,
+ "eval_steps_per_second": 0.362,
+ "eval_wer": 0.14990076581772083
 }
pytorch_model.bin CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
- oid sha256:38f9952471847b9dbd693d34fa642974ebb6a016e7677a8ccfb3e3458f45e32a
 size 1262112241
 version https://git-lfs.github.com/spec/v1
+ oid sha256:382829868e73fa85ab4aea6f9cfa1e2258955546556cfbc4c1aa0ac435d86981
 size 1262112241
runs/Feb01_18-08-21_job-336a688f-553a-4e6e-83b3-ad5d10274b51/1643741534.116655/events.out.tfevents.1643741534.job-336a688f-553a-4e6e-83b3-ad5d10274b51.3348585.1 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:24cb9f2a1f8cd9f07b463f6996a54a600e48f99b5d21f1cabc83dc60826e1698
+ size 4814
runs/Feb01_18-08-21_job-336a688f-553a-4e6e-83b3-ad5d10274b51/events.out.tfevents.1643741534.job-336a688f-553a-4e6e-83b3-ad5d10274b51.3348585.0 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:b43e306271bfa60c6f6bf01ab0eae8c36a521e4e5cb0e8a55687eb99b5562c56
+ size 10554
runs/Feb04_14-58-29_job-336a688f-553a-4e6e-83b3-ad5d10274b51/1643989411.4467487/events.out.tfevents.1643989411.job-336a688f-553a-4e6e-83b3-ad5d10274b51.728502.1 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:a74d331c074b656f428d24db8dce4efbe1b7f1dde7e244fac7207dc29ae942c3
+ size 4814
runs/Feb04_14-58-29_job-336a688f-553a-4e6e-83b3-ad5d10274b51/events.out.tfevents.1643989411.job-336a688f-553a-4e6e-83b3-ad5d10274b51.728502.0 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:7dd620d37e0221c513ea85f339709d22725b2e11e30fd5c4c973ce761b4e5e24
+ size 7529
runs/Feb04_14-58-29_job-336a688f-553a-4e6e-83b3-ad5d10274b51/events.out.tfevents.1644061137.job-336a688f-553a-4e6e-83b3-ad5d10274b51.728502.2 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:5ba70b885ace8400ed4e1f00fd81e27187a8a5dabaf13d4b441e91a0dff4eb3c
+ size 364
special_tokens_map.json CHANGED
@@ -1 +1 @@
- {"bos_token": "<s>", "eos_token": "</s>", "unk_token": "[UNK]", "pad_token": "[PAD]", "additional_special_tokens": [{"content": "<s>", "single_word": false, "lstrip": false, "rstrip": false, "normalized": true}, {"content": "</s>", "single_word": false, "lstrip": false, "rstrip": false, "normalized": true}, {"content": "<s>", "single_word": false, "lstrip": false, "rstrip": false, "normalized": true}, {"content": "</s>", "single_word": false, "lstrip": false, "rstrip": false, "normalized": true}, {"content": "<s>", "single_word": false, "lstrip": false, "rstrip": false, "normalized": true}, {"content": "</s>", "single_word": false, "lstrip": false, "rstrip": false, "normalized": true}, {"content": "<s>", "single_word": false, "lstrip": false, "rstrip": false, "normalized": true}, {"content": "</s>", "single_word": false, "lstrip": false, "rstrip": false, "normalized": true}, {"content": "<s>", "single_word": false, "lstrip": false, "rstrip": false, "normalized": true}, {"content": "</s>", "single_word": false, "lstrip": false, "rstrip": false, "normalized": true}, {"content": "<s>", "single_word": false, "lstrip": false, "rstrip": false, "normalized": true}, {"content": "</s>", "single_word": false, "lstrip": false, "rstrip": false, "normalized": true}]}
+ {"bos_token": "<s>", "eos_token": "</s>", "unk_token": "[UNK]", "pad_token": "[PAD]", "additional_special_tokens": [{"content": "<s>", "single_word": false, "lstrip": false, "rstrip": false, "normalized": true}, {"content": "</s>", "single_word": false, "lstrip": false, "rstrip": false, "normalized": true}]}
train_results.json CHANGED
@@ -1,8 +1,8 @@
 {
- "epoch": 12.0,
- "train_loss": 0.5676147035501541,
- "train_runtime": 172546.67,
 "train_samples": 240334,
- "train_samples_per_second": 16.714,
- "train_steps_per_second": 0.131
 }
 {
+ "epoch": 18.0,
+ "train_loss": 0.16521977071390054,
+ "train_runtime": 71350.5908,
 "train_samples": 240334,
+ "train_samples_per_second": 60.63,
+ "train_steps_per_second": 0.474
 }
trainer_state.json CHANGED
@@ -1,8 +1,8 @@
 {
 "best_metric": null,
 "best_model_checkpoint": null,
- "epoch": 11.999600585807482,
- "global_step": 22524,
 "is_hyper_param_search": false,
 "is_local_process_zero": true,
 "is_world_process_zero": true,
@@ -683,18 +683,273 @@
 "step": 22500
 },
 {
- "epoch": 12.0,
- "step": 22524,
- "total_flos": 6.281601139352125e+20,
- "train_loss": 0.5676147035501541,
- "train_runtime": 172546.67,
- "train_samples_per_second": 16.714,
- "train_steps_per_second": 0.131
 }
 ],
- "max_steps": 22524,
- "num_train_epochs": 12,
- "total_flos": 6.281601139352125e+20,
 "trial_name": null,
 "trial_params": null
 }
 {
 "best_metric": null,
 "best_model_checkpoint": null,
+ "epoch": 17.99960058580748,
+ "global_step": 33786,
 "is_hyper_param_search": false,
 "is_local_process_zero": true,
 "is_world_process_zero": true,
 "step": 22500
 },
 {
+ "epoch": 12.25,
+ "learning_rate": 2.5471119360724847e-05,
+ "loss": 1.0835,
+ "step": 23000
+ },
+ {
+ "epoch": 12.25,
+ "eval_loss": 0.25863561034202576,
+ "eval_runtime": 369.3533,
+ "eval_samples_per_second": 11.634,
+ "eval_steps_per_second": 0.366,
+ "eval_wer": 0.15212444278188222,
+ "step": 23000
+ },
+ {
+ "epoch": 12.52,
+ "learning_rate": 2.4293714213804817e-05,
+ "loss": 1.0883,
+ "step": 23500
+ },
+ {
+ "epoch": 12.52,
+ "eval_loss": 0.25827670097351074,
+ "eval_runtime": 370.2467,
+ "eval_samples_per_second": 11.606,
+ "eval_steps_per_second": 0.365,
+ "eval_wer": 0.15193740453256024,
+ "step": 23500
+ },
+ {
+ "epoch": 12.79,
+ "learning_rate": 2.3113949537532244e-05,
+ "loss": 1.0888,
+ "step": 24000
+ },
+ {
+ "epoch": 12.79,
+ "eval_loss": 0.2551300823688507,
+ "eval_runtime": 367.9843,
+ "eval_samples_per_second": 11.677,
+ "eval_steps_per_second": 0.367,
+ "eval_wer": 0.15819279487099555,
+ "step": 24000
+ },
+ {
+ "epoch": 13.05,
+ "learning_rate": 2.1934184861259672e-05,
+ "loss": 1.0933,
+ "step": 24500
+ },
+ {
+ "epoch": 13.05,
+ "eval_loss": 0.2628032863140106,
+ "eval_runtime": 369.9671,
+ "eval_samples_per_second": 11.615,
+ "eval_steps_per_second": 0.365,
+ "eval_wer": 0.1537142679011191,
+ "step": 24500
+ },
+ {
+ "epoch": 13.32,
+ "learning_rate": 2.07544201849871e-05,
+ "loss": 1.0799,
+ "step": 25000
+ },
+ {
+ "epoch": 13.32,
+ "eval_loss": 0.2600410580635071,
+ "eval_runtime": 374.9827,
+ "eval_samples_per_second": 11.459,
+ "eval_steps_per_second": 0.36,
+ "eval_wer": 0.150752828953521,
+ "step": 25000
+ },
+ {
+ "epoch": 13.59,
+ "learning_rate": 1.957701503806707e-05,
+ "loss": 1.0804,
+ "step": 25500
+ },
+ {
+ "epoch": 13.59,
+ "eval_loss": 0.26200664043426514,
+ "eval_runtime": 369.1646,
+ "eval_samples_per_second": 11.64,
+ "eval_steps_per_second": 0.366,
+ "eval_wer": 0.14753161465964235,
+ "step": 25500
+ },
+ {
+ "epoch": 13.85,
+ "learning_rate": 1.8397250361794498e-05,
+ "loss": 1.0814,
+ "step": 26000
+ },
+ {
+ "epoch": 13.85,
+ "eval_loss": 0.2537305951118469,
+ "eval_runtime": 368.6655,
+ "eval_samples_per_second": 11.656,
+ "eval_steps_per_second": 0.366,
+ "eval_wer": 0.15170880222783337,
+ "step": 26000
+ },
+ {
+ "epoch": 14.12,
+ "learning_rate": 1.7217485685521926e-05,
+ "loss": 1.0693,
+ "step": 26500
+ },
+ {
+ "epoch": 14.12,
+ "eval_loss": 0.25602129101753235,
+ "eval_runtime": 368.3159,
+ "eval_samples_per_second": 11.667,
+ "eval_steps_per_second": 0.367,
+ "eval_wer": 0.15421303656597773,
+ "step": 26500
+ },
+ {
+ "epoch": 14.38,
+ "learning_rate": 1.6037721009249354e-05,
+ "loss": 1.0724,
+ "step": 27000
+ },
+ {
+ "epoch": 14.38,
+ "eval_loss": 0.2540068030357361,
+ "eval_runtime": 369.0094,
+ "eval_samples_per_second": 11.645,
+ "eval_steps_per_second": 0.366,
+ "eval_wer": 0.15736151376289784,
+ "step": 27000
+ },
+ {
+ "epoch": 14.65,
+ "learning_rate": 1.4857956332976782e-05,
+ "loss": 1.0704,
+ "step": 27500
+ },
+ {
+ "epoch": 14.65,
+ "eval_loss": 0.25483617186546326,
+ "eval_runtime": 365.0658,
+ "eval_samples_per_second": 11.77,
+ "eval_steps_per_second": 0.37,
+ "eval_wer": 0.16258819373006225,
+ "step": 27500
+ },
+ {
+ "epoch": 14.92,
+ "learning_rate": 1.3678191656704208e-05,
+ "loss": 1.0729,
+ "step": 28000
+ },
+ {
+ "epoch": 14.92,
+ "eval_loss": 0.254844069480896,
+ "eval_runtime": 367.5842,
+ "eval_samples_per_second": 11.69,
+ "eval_steps_per_second": 0.367,
+ "eval_wer": 0.16009435040576908,
+ "step": 28000
+ },
+ {
+ "epoch": 15.18,
+ "learning_rate": 1.2498426980431636e-05,
+ "loss": 1.0724,
+ "step": 28500
+ },
+ {
+ "epoch": 15.18,
+ "eval_loss": 0.25110504031181335,
+ "eval_runtime": 367.3861,
+ "eval_samples_per_second": 11.696,
+ "eval_steps_per_second": 0.367,
+ "eval_wer": 0.15124120660452842,
+ "step": 28500
+ },
+ {
+ "epoch": 15.45,
+ "learning_rate": 1.1318662304159062e-05,
+ "loss": 1.0655,
+ "step": 29000
+ },
+ {
+ "epoch": 15.45,
+ "eval_loss": 0.24978148937225342,
+ "eval_runtime": 375.4183,
+ "eval_samples_per_second": 11.446,
+ "eval_steps_per_second": 0.36,
+ "eval_wer": 0.14903831166806944,
+ "step": 29000
+ },
+ {
+ "epoch": 15.98,
+ "learning_rate": 8.963852010319007e-06,
+ "loss": 1.0608,
+ "step": 30000
+ },
+ {
+ "epoch": 15.98,
+ "eval_loss": 0.24873663485050201,
+ "eval_runtime": 370.6074,
+ "eval_samples_per_second": 11.594,
+ "eval_steps_per_second": 0.364,
+ "eval_wer": 0.14812390244916196,
+ "step": 30000
+ },
+ {
+ "epoch": 16.52,
+ "learning_rate": 6.604322657773862e-06,
+ "loss": 1.0541,
+ "step": 31000
+ },
+ {
+ "epoch": 16.52,
+ "eval_loss": 0.2467627078294754,
+ "eval_runtime": 371.5001,
+ "eval_samples_per_second": 11.567,
+ "eval_steps_per_second": 0.363,
+ "eval_wer": 0.15039953448257948,
+ "step": 31000
+ },
+ {
+ "epoch": 17.05,
+ "learning_rate": 4.244793305228717e-06,
+ "loss": 1.0584,
+ "step": 32000
+ },
+ {
+ "epoch": 17.05,
+ "eval_loss": 0.2466605007648468,
+ "eval_runtime": 370.8863,
+ "eval_samples_per_second": 11.586,
+ "eval_steps_per_second": 0.364,
+ "eval_wer": 0.1493084780282012,
+ "step": 32000
+ },
+ {
+ "epoch": 17.58,
+ "learning_rate": 1.8852639526835713e-06,
+ "loss": 1.0507,
+ "step": 33000
+ },
+ {
+ "epoch": 17.58,
+ "eval_loss": 0.2480592578649521,
+ "eval_runtime": 373.0281,
+ "eval_samples_per_second": 11.519,
+ "eval_steps_per_second": 0.362,
+ "eval_wer": 0.15173997526938704,
+ "step": 33000
+ },
+ {
+ "epoch": 18.0,
+ "step": 33786,
+ "total_flos": 9.499341430600616e+20,
+ "train_loss": 0.16521977071390054,
+ "train_runtime": 71350.5908,
+ "train_samples_per_second": 60.63,
+ "train_steps_per_second": 0.474
 }
 ],
+ "max_steps": 33786,
+ "num_train_epochs": 18,
+ "total_flos": 9.499341430600616e+20,
 "trial_name": null,
 "trial_params": null
 }
training_args.bin CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
- oid sha256:d5c09474639eff781a9fbbd58d81fc04a95748d863d0d33f663b0592e8c64a21
 size 3055
 version https://git-lfs.github.com/spec/v1
+ oid sha256:eb4880e33458fbd00defffaa2d4a3e6ec806898ed51633382a67c03e0088e649
 size 3055