patrickvonplaten committed: Update README.md
Commit a633061 · 1 parent(s): 6ba311a
Files changed (1): README.md (+9, -13)
README.md CHANGED

@@ -20,8 +20,9 @@ should probably proofread and complete it, then remove this comment. -->
 
 This model is a fine-tuned version of [facebook/wav2vec2-xls-r-300m](https://huggingface.co/facebook/wav2vec2-xls-r-300m) on the MOZILLA-FOUNDATION/COMMON_VOICE_7_0 - SV-SE dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.6482
-- Wer: 0.6389
+
+- Loss: 0.2604
+- Wer: 0.2334
 
 ## Model description
 
@@ -41,14 +42,14 @@ More information needed
 
 The following hyperparameters were used during training:
 - learning_rate: 7.5e-05
-- train_batch_size: 8
-- eval_batch_size: 8
+- train_batch_size: 4
+- eval_batch_size: 4
 - seed: 42
 - distributed_type: multi-GPU
 - num_devices: 8
-- gradient_accumulation_steps: 4
-- total_train_batch_size: 256
-- total_eval_batch_size: 64
+- gradient_accumulation_steps: 1
+- total_train_batch_size: 32
+- total_eval_batch_size: 32
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
 - lr_scheduler_warmup_steps: 2000
@@ -57,12 +58,7 @@ The following hyperparameters were used during training:
 
 ### Training results
 
-| Training Loss | Epoch | Step | Validation Loss | Wer    |
-|:-------------:|:-----:|:----:|:---------------:|:------:|
-| 3.3873        | 11.62 | 500  | 3.3324          | 1.0    |
-| 2.9207        | 23.25 | 1000 | 2.8924          | 0.9987 |
-| 2.1463        | 34.88 | 1500 | 1.3063          | 0.9139 |
-| 1.607         | 46.51 | 2000 | 0.7205          | 0.6856 |
+See Tensorboard
 
 
 ### Framework versions
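As a sanity check on the hyperparameter change above, the total batch sizes follow from the per-device sizes by simple arithmetic: per-device batch size × number of devices × gradient accumulation steps. A minimal sketch (the helper name is illustrative, not part of the training script):

```python
# Hypothetical helper: effective batch size implied by the hyperparameters
# in this commit. The function name is illustrative only.
def total_train_batch_size(per_device_batch_size: int,
                           num_devices: int,
                           gradient_accumulation_steps: int) -> int:
    """Effective batch size = per-device size x devices x accumulation steps."""
    return per_device_batch_size * num_devices * gradient_accumulation_steps

# Old configuration: 8 per device, 8 GPUs, 4 accumulation steps
assert total_train_batch_size(8, 8, 4) == 256
# New configuration: 4 per device, 8 GPUs, 1 accumulation step
assert total_train_batch_size(4, 8, 1) == 32
```

This confirms the diff is internally consistent: both the old (256) and new (32) `total_train_batch_size` values match the per-device settings they accompany.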
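The schedule named in the hyperparameters (`lr_scheduler_type: linear` with `lr_scheduler_warmup_steps: 2000`) ramps the learning rate up linearly, then decays it linearly to zero. A minimal sketch; `total_steps` is an assumption for illustration, since the actual training step count is not stated in this diff:

```python
# Sketch of a linear schedule with warmup, matching the hyperparameters
# lr_scheduler_type: linear and lr_scheduler_warmup_steps: 2000.
# total_steps=10000 is an assumed value, not taken from the commit.
def linear_warmup_lr(step: int,
                     base_lr: float = 7.5e-05,
                     warmup_steps: int = 2000,
                     total_steps: int = 10000) -> float:
    """Ramp LR from 0 to base_lr over warmup_steps, then decay to 0."""
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    remaining = max(total_steps - step, 0)
    return base_lr * (remaining / (total_steps - warmup_steps))

assert linear_warmup_lr(0) == 0.0          # start of warmup
assert linear_warmup_lr(2000) == 7.5e-05   # peak at end of warmup
assert linear_warmup_lr(10000) == 0.0      # fully decayed
```

This mirrors the shape of the linear-with-warmup schedule commonly used with Trainer-style fine-tuning; the actual implementation lives in the training framework, not in this model card.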