arslanarjumand/wav2vec-read-aloud

Files changed (4)

README.md +23 -16
config.json +2 -2
model.safetensors +2 -2
training_args.bin +2 -2

README.md CHANGED

@@ -15,11 +15,11 @@ should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [arslanarjumand/wav2vec-reptiles](https://huggingface.co/arslanarjumand/wav2vec-reptiles) on the None dataset.
 It achieves the following results on the evaluation set:
-- Loss: 180.5618
-- Pcc Accuracy: 0.7344
-- Pcc Fluency: 0.7572
-- Pcc Total Score: 0.7949
-- Pcc Content: 0.7727
 ## Model description
@@ -38,7 +38,7 @@ More information needed
 ### Training hyperparameters
 The following hyperparameters were used during training:
-- learning_rate: 2.5e-05
 - train_batch_size: 4
 - eval_batch_size: 6
 - seed: 42
@@ -46,26 +46,33 @@ The following hyperparameters were used during training:
 - total_train_batch_size: 16
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: cosine
-- lr_scheduler_warmup_ratio: 0.5
 - num_epochs: 15
-- mixed_precision_training: Native AMP
 ### Training results
 | Training Loss | Epoch | Step | Validation Loss | Pcc Accuracy | Pcc Fluency | Pcc Total Score | Pcc Content |
 |:-------------:|:-----:|:----:|:---------------:|:------------:|:-----------:|:---------------:|:-----------:|
-| 323.2938      | 2.13  | 500  | 333.4772        | 0.4645       | 0.5166      | 0.5181          | 0.4915      |
-| 274.2192      | 4.27  | 1000 | 259.5493        | 0.5725       | 0.6371      | 0.6430          | 0.6182      |
-| 287.9362      | 6.4   | 1500 | 291.9187        | 0.6475       | 0.6895      | 0.7121          | 0.6902      |
-| 273.6328      | 8.54  | 2000 | 229.1164        | 0.6884       | 0.7243      | 0.7522          | 0.7285      |
-| 211.4504      | 10.67 | 2500 | 223.4485        | 0.7087       | 0.7420      | 0.7727          | 0.7499      |
-| 162.7622      | 12.81 | 3000 | 180.6950        | 0.7302       | 0.7557      | 0.7918          | 0.7695      |
-| 194.6916      | 14.94 | 3500 | 180.5618        | 0.7344       | 0.7572      | 0.7949          | 0.7727      |
 ### Framework versions
 - Transformers 4.37.0
 - Pytorch 2.1.2
-- Datasets 2.17.1
 - Tokenizers 0.15.1

 This model is a fine-tuned version of [arslanarjumand/wav2vec-reptiles](https://huggingface.co/arslanarjumand/wav2vec-reptiles) on the None dataset.
 It achieves the following results on the evaluation set:
+- Loss: 182.3516
+- Pcc Accuracy: 0.6684
+- Pcc Fluency: 0.6499
+- Pcc Total Score: 0.7110
+- Pcc Content: 0.6788
 ## Model description
 ### Training hyperparameters
 The following hyperparameters were used during training:
+- learning_rate: 5.5e-05
 - train_batch_size: 4
 - eval_batch_size: 6
 - seed: 42
 - total_train_batch_size: 16
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: cosine
+- lr_scheduler_warmup_ratio: 0.4
 - num_epochs: 15
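
The cosine schedule with the new 0.4 warmup ratio and 5.5e-05 peak learning rate can be sketched as below. This is a minimal illustration, assuming linear warmup from zero followed by a half-cosine decay to zero. `total_steps` is a hypothetical value: the card does not state the total number of optimizer steps, and 7750 is only an estimate from the last logged row (step 7500 at epoch 14.51 of 15).

```python
import math


def cosine_lr_with_warmup(step, total_steps, base_lr=5.5e-05, warmup_ratio=0.4):
    """Learning rate at `step` under linear warmup then cosine decay (sketch)."""
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr during warmup.
        return base_lr * step / max(1, warmup_steps)
    # Fraction of the decay phase completed, in [0, 1].
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    # Half-cosine decay from base_lr down to 0.
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    total_steps = 7750  # assumption, not taken from the card
    for step in (0, 1000, 3100, 5000, 7750):
        print(step, cosine_lr_with_warmup(step, total_steps))
```

With a 0.4 ratio, the peak learning rate is reached 40% of the way through training, so the decay phase covers the remaining 60%.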
 ### Training results
 | Training Loss | Epoch | Step | Validation Loss | Pcc Accuracy | Pcc Fluency | Pcc Total Score | Pcc Content |
 |:-------------:|:-----:|:----:|:---------------:|:------------:|:-----------:|:---------------:|:-----------:|
+| 2719.4074     | 0.97  | 500  | 2790.7349       | 0.1171       | 0.1116      | 0.1218          | 0.1245      |
+| 386.8535      | 1.93  | 1000 | 361.3293        | 0.1481       | 0.1332      | 0.1511          | 0.1445      |
+| 273.8093      | 2.9   | 1500 | 304.4040        | 0.2869       | 0.2915      | 0.3062          | 0.2849      |
+| 280.8214      | 3.87  | 2000 | 277.9273        | 0.4065       | 0.4344      | 0.4465          | 0.4131      |
+| 264.1531      | 4.84  | 2500 | 265.5385        | 0.5012       | 0.5234      | 0.5490          | 0.5117      |
+| 211.6362      | 5.8   | 3000 | 226.9335        | 0.5675       | 0.5768      | 0.6171          | 0.5817      |
+| 217.8737      | 6.77  | 3500 | 218.1019        | 0.6089       | 0.5984      | 0.6525          | 0.6194      |
+| 180.3319      | 7.74  | 4000 | 201.4108        | 0.6296       | 0.6142      | 0.6721          | 0.6395      |
+| 174.7695      | 8.7   | 4500 | 201.3474        | 0.6427       | 0.6297      | 0.6872          | 0.6542      |
+| 182.4466      | 9.67  | 5000 | 189.6567        | 0.6566       | 0.6333      | 0.6957          | 0.6619      |
+| 184.7177      | 10.64 | 5500 | 182.7654        | 0.6628       | 0.6405      | 0.7033          | 0.6713      |
+| 174.6915      | 11.61 | 6000 | 181.2284        | 0.6635       | 0.6479      | 0.7077          | 0.6755      |
+| 187.671       | 12.57 | 6500 | 180.5753        | 0.6676       | 0.6486      | 0.7099          | 0.6773      |
+| 166.4409      | 13.54 | 7000 | 181.2506        | 0.6682       | 0.6493      | 0.7105          | 0.6781      |
+| 176.7043      | 14.51 | 7500 | 182.3516        | 0.6684       | 0.6499      | 0.7110          | 0.6788      |
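
The `Pcc` columns above are presumably Pearson correlation coefficients between predicted and reference scores; the card does not define the abbreviation, so treat that reading as an assumption. Under that assumption, a minimal pure-Python sketch of the metric looks like this (`pearson_corr` is a hypothetical helper, not the evaluation code used for this model):

```python
import math


def pearson_corr(preds, labels):
    """Pearson correlation between two equal-length sequences of numbers."""
    n = len(preds)
    if n != len(labels) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_p = sum(preds) / n
    mean_l = sum(labels) / n
    # Sums of centered products and squares (the 1/n factors cancel out).
    cov = sum((p - mean_p) * (l - mean_l) for p, l in zip(preds, labels))
    var_p = sum((p - mean_p) ** 2 for p in preds)
    var_l = sum((l - mean_l) ** 2 for l in labels)
    denom = math.sqrt(var_p * var_l)
    if denom == 0:
        return float("nan")  # undefined when either sequence is constant
    return cov / denom


if __name__ == "__main__":
    # Illustrative values only, not data from this model.
    print(pearson_corr([1.0, 2.0, 3.0, 4.0], [1.5, 1.9, 3.2, 3.8]))
```

A value near 1 means predictions rise and fall with the reference scores, which is why the metric is read alongside, rather than instead of, the validation loss.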
 ### Framework versions
 - Transformers 4.37.0
 - Pytorch 2.1.2
+- Datasets 2.18.0
 - Tokenizers 0.15.1

config.json CHANGED

@@ -11,7 +11,7 @@
   ],
   "attention_dropout": 0.0094,
   "bos_token_id": 1,
-  "classifier_proj_size": 768,
   "codevector_dim": 768,
   "conformer_conv_dropout": 0.1,
   "contrastive_logits_temperature": 0.1,
@@ -56,7 +56,7 @@
   "num_attention_heads": 16,
   "num_codevector_groups": 2,
   "num_codevectors_per_group": 320,
-  "num_hidden_layers": 24,
   "num_negatives": 100,
   "output_hidden_size": 1024,
   "pad_token_id": 0,

   ],
   "attention_dropout": 0.0094,
   "bos_token_id": 1,
+  "classifier_proj_size": 100,
   "codevector_dim": 768,
   "conformer_conv_dropout": 0.1,
   "contrastive_logits_temperature": 0.1,
   "num_attention_heads": 16,
   "num_codevector_groups": 2,
   "num_codevectors_per_group": 320,
+  "num_hidden_layers": 8,
   "num_negatives": 100,
   "output_hidden_size": 1024,
   "pad_token_id": 0,

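The two keys changed in `config.json` can also be listed programmatically. This is a minimal sketch, assuming both versions are loaded as plain dictionaries; `diff_config` is a hypothetical helper, and the dictionaries contain only keys visible in the hunks above, not the full config.

```python
def diff_config(old, new):
    """Return {key: (old_value, new_value)} for every key whose value differs."""
    keys = sorted(set(old) | set(new))
    return {k: (old.get(k), new.get(k)) for k in keys if old.get(k) != new.get(k)}


# Subset of keys shown in the diff (the real config has more entries).
old_config = {
    "classifier_proj_size": 768,
    "num_attention_heads": 16,
    "num_hidden_layers": 24,
    "output_hidden_size": 1024,
}
new_config = {
    "classifier_proj_size": 100,
    "num_attention_heads": 16,
    "num_hidden_layers": 8,
    "output_hidden_size": 1024,
}

changes = diff_config(old_config, new_config)
for key, (before, after) in changes.items():
    print(f"{key}: {before} -> {after}")
```
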
model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:3b0b673eb880d4c8d4ce1a725874267182d7bc3b1ff32d8b5061035cbe10c10a
-size 2325236000

 version https://git-lfs.github.com/spec/v1
+oid sha256:4616ad557e7adb6f769d2533776c1da4db84a1246ef782350b8b429dfe0ea901
+size 794371536
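
The size change recorded in the two LFS pointers can be quantified with a short calculation. This sketch uses only the byte counts shown above. Attributing the reduction to the drop from 24 to 8 hidden layers is a plausible reading of the `config.json` change, not something the diff itself states.

```python
# Byte counts copied from the old and new Git LFS pointer files.
old_size_bytes = 2_325_236_000
new_size_bytes = 794_371_536

saved_bytes = old_size_bytes - new_size_bytes
ratio = new_size_bytes / old_size_bytes

print(f"new file is {ratio:.1%} of the old size")
print(f"saved {saved_bytes / 1e9:.2f} GB")
```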

training_args.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:ad61c98ac9e74083e7bf784e4b8953d284c8a3cf81d10f9c5fd2dfeec8b834da
-size 4664

 version https://git-lfs.github.com/spec/v1
+oid sha256:160842d9ad9fb8aa42c24765e59cfa16095af1c479acc4c5f26155826d12c7d9
+size 4728