Commit 7c41b61 (parent: bf78042) by kingabzpro

update model card README.md

Files changed (1): README.md (+21, -60)
README.md CHANGED
@@ -1,57 +1,11 @@
 ---
-language:
-- ur
-
-license: apache-2.0
 tags:
-- automatic-speech-recognition
-- robust-speech-event
+- generated_from_trainer
 datasets:
 - common_voice
-metrics:
-- wer
-- cer
 model-index:
 - name: wav2vec2-large-xlsr-53-urdu
-  results:
-  - task:
-      type: automatic-speech-recognition # Required. Example: automatic-speech-recognition
-      name: Urdu Speech Recognition # Optional. Example: Speech Recognition
-    dataset:
-      type: common_voice # Required. Example: common_voice. Use dataset id from https://hf.co/datasets
-      name: Urdu # Required. Example: Common Voice zh-CN
-      args: ur # Optional. Example: zh-CN
-    metrics:
-    - type: wer # Required. Example: wer
-      value: 66.2 # Required. Example: 20.90
-      name: Test WER # Optional. Example: Test WER
-      args:
-      - learning_rate: 0.0003
-      - train_batch_size: 16
-      - eval_batch_size: 8
-      - seed: 42
-      - gradient_accumulation_steps: 2
-      - total_train_batch_size: 32
-      - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
-      - lr_scheduler_type: linear
-      - lr_scheduler_warmup_steps: 200
-      - num_epochs: 50
-      - mixed_precision_training: Native AMP # Optional. Example for BLEU: max_order
-    - type: cer # Required. Example: wer
-      value: 31.7 # Required. Example: 20.90
-      name: Test CER # Optional. Example: Test WER
-      args:
-      - learning_rate: 0.0003
-      - train_batch_size: 16
-      - eval_batch_size: 8
-      - seed: 42
-      - gradient_accumulation_steps: 2
-      - total_train_batch_size: 32
-      - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
-      - lr_scheduler_type: linear
-      - lr_scheduler_warmup_steps: 200
-      - num_epochs: 50
-      - mixed_precision_training: Native AMP # Optional. Example for BLEU: max_order
+  results: []
 ---
 
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
@@ -59,18 +13,25 @@ should probably proofread and complete it, then remove this comment. -->
 
 # wav2vec2-large-xlsr-53-urdu
 
-This model is a fine-tuned version of [m3hrdadfi/wav2vec2-large-xlsr-persian-v3](https://huggingface.co/m3hrdadfi/wav2vec2-large-xlsr-persian-v3) on the common_voice dataset.
+This model is a fine-tuned version of [Harveenchadha/vakyansh-wav2vec2-urdu-urm-60](https://huggingface.co/Harveenchadha/vakyansh-wav2vec2-urdu-urm-60) on the common_voice dataset.
 It achieves the following results on the evaluation set:
-- Loss: 1.5727
-- Wer: 0.6620
-- Cer: 0.3166
+- Loss: 11.4593
+- Wer: 0.5772
+- Cer: 0.3384
+
+## Model description
+
+More information needed
+
+## Intended uses & limitations
+
+More information needed
 
+## Training and evaluation data
 
 More information needed
-The training and valid dataset is 0.58 hours. It was hard to train any model on lower number of so I decided to take Persian checkpoint and finetune the XLSR model.
 
 ## Training procedure
-Trained on m3hrdadfi/wav2vec2-large-xlsr-persian-v3 due to lesser number of samples. Persian and Urdu are quite similar.
 
 ### Training hyperparameters
 
@@ -91,12 +52,12 @@ The following hyperparameters were used during training:
 
 | Training Loss | Epoch | Step | Validation Loss | Wer | Cer |
 |:-------------:|:-----:|:----:|:---------------:|:------:|:------:|
-| 2.9707 | 8.33 | 100 | 1.2689 | 0.8463 | 0.4373 |
-| 0.746 | 16.67 | 200 | 1.2370 | 0.7214 | 0.3486 |
-| 0.3719 | 25.0 | 300 | 1.3885 | 0.6908 | 0.3381 |
-| 0.2411 | 33.33 | 400 | 1.4780 | 0.6690 | 0.3186 |
-| 0.1841 | 41.67 | 500 | 1.5557 | 0.6629 | 0.3241 |
-| 0.165 | 50.0 | 600 | 1.5727 | 0.6620 | 0.3166 |
+| 13.2136 | 8.33 | 100 | 9.5424 | 0.7672 | 0.4381 |
+| 2.6996 | 16.67 | 200 | 8.4317 | 0.6661 | 0.3620 |
+| 1.371 | 25.0 | 300 | 9.5518 | 0.6443 | 0.3701 |
+| 0.639 | 33.33 | 400 | 9.4132 | 0.6129 | 0.3609 |
+| 0.4452 | 41.67 | 500 | 10.8330 | 0.5920 | 0.3473 |
+| 0.3233 | 50.0 | 600 | 11.4593 | 0.5772 | 0.3384 |
 
 
 ### Framework versions
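
A minimal usage-and-scoring sketch follows. It is not part of the commit above: the repo id `kingabzpro/wav2vec2-large-xlsr-53-urdu`, the audio file name, and the reference transcript are illustrative assumptions. It loads the fine-tuned checkpoint with `transformers` and scores a prediction with the same WER/CER metrics the card reports, using `jiwer`.

```python
# Illustrative sketch only (not from the commit). Assumes the checkpoint is
# published as "kingabzpro/wav2vec2-large-xlsr-53-urdu" and that transformers,
# torchaudio, and jiwer are installed.
import torch
import torchaudio
from jiwer import cer, wer
from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor

model_id = "kingabzpro/wav2vec2-large-xlsr-53-urdu"  # assumed repo id
processor = Wav2Vec2Processor.from_pretrained(model_id)
model = Wav2Vec2ForCTC.from_pretrained(model_id)
model.eval()

# Load a clip and resample to the 16 kHz rate wav2vec 2.0 expects.
waveform, sample_rate = torchaudio.load("sample_ur.wav")  # placeholder file
waveform = torchaudio.functional.resample(waveform, sample_rate, 16_000).squeeze(0)

inputs = processor(waveform.numpy(), sampling_rate=16_000, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

# Greedy CTC decoding: take the most likely token at each frame.
predicted_ids = torch.argmax(logits, dim=-1)
prediction = processor.batch_decode(predicted_ids)[0]

# Score against a known transcript with the metrics reported in the card.
reference = "..."  # ground-truth Urdu transcript for the clip
print("prediction:", prediction)
print("WER:", wer(reference, prediction))
print("CER:", cer(reference, prediction))
```

Greedy decoding is used here for simplicity; the WER/CER values in the card itself come from the training run logged in the table above.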