Add metrics
README.md
CHANGED
@@ -66,4 +66,32 @@ We were using two RTX-3090 video cards for training, and it took about one mont
 * mel_loss_coeff: 45
 * mrd_loss_coeff: 1.0
 * batch_size: 20
-* num_samples: 32768
+* num_samples: 32768
+
+## Evaluation
+
+
+Evaluation used the metrics from the original repo. After 210 epochs we achieve:
+
+* val_loss: 3.703
+* f1_score: 0.950
+* mel_loss: 0.248
+* periodicity_loss: 0.127
+* pesq_score: 3.399
+* pitch_loss: 38.26
+* utmos_score: 3.146
+
+
+## Citation
+
+
+If this code contributes to your research, please cite the work:
+
+```
+@article{siuzdak2023vocos,
+  title={Vocos: Closing the gap between time-domain and Fourier-based neural vocoders for high-quality audio synthesis},
+  author={Siuzdak, Hubert},
+  journal={arXiv preprint arXiv:2306.00814},
+  year={2023}
+}
+```
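
The metrics this change adds to the README are plain `* name: value` bullets, so they can be loaded into a dictionary when comparing checkpoints. The following is a minimal sketch under that assumption. `parse_metrics` and `METRIC_LINE` are hypothetical helpers, not part of the repository, and the values are the ones listed in the Evaluation section.

```python
import re

# Metric bullets as listed in the Evaluation section of the README.
README_METRICS = """\
* val_loss: 3.703
* f1_score: 0.950
* mel_loss: 0.248
* periodicity_loss: 0.127
* pesq_score: 3.399
* pitch_loss: 38.26
* utmos_score: 3.146
"""

# Hypothetical pattern (an assumption about the README layout):
# "* name: value", where whitespace around the colon is optional.
METRIC_LINE = re.compile(r"^\*\s*(\w+)\s*:\s*([-+]?\d+(?:\.\d+)?)\s*$")


def parse_metrics(text: str) -> dict[str, float]:
    """Collect every '* name: value' bullet in text into a dict of floats."""
    metrics: dict[str, float] = {}
    for line in text.splitlines():
        match = METRIC_LINE.match(line.strip())
        if match:  # lines that are not metric bullets are skipped
            metrics[match.group(1)] = float(match.group(2))
    return metrics


metrics = parse_metrics(README_METRICS)
```

Lines that do not match the bullet pattern are ignored, so the same helper can be run over the whole README text rather than only the Evaluation section.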