Update README.md
README.md
CHANGED
@@ -137,7 +137,9 @@ The model was trained on licensed data using the following sources: the [Meta Mu

 ## Evaluation results

-Below are the objective metrics obtained on MusicCaps with the released model. Note that for the publicly released models, we
+Below are the objective metrics obtained on MusicCaps with the released model. Note that for the publicly released models, we used the state-of-the-art music source separation method,
+namely the open source [Hybrid Transformer for Music Source Separation](https://github.com/facebookresearch/demucs) (HT-Demucs),
+in order to keep only instrumental tracks. This explains the difference in objective metrics with the models used in the paper.

 | Model | Frechet Audio Distance | KLD | Text Consistency |
 |---|---|---|---|
@@ -150,7 +152,7 @@ More information can be found in the paper [Masked Audio Generation using a Sing

 ## Limitations and biases

-**Data:** The data sources used to train the model are created by music professionals and covered by legal agreements with the right holders. The model is trained on
+**Data:** The data sources used to train the model are created by music professionals and covered by legal agreements with the right holders. The model is trained on 15K hours of data, we believe that scaling the model on larger datasets can further improve the performance of the model.

 **Mitigations:** Tracks that include vocals have been removed from the data source using corresponding tags, and using a state-of-the-art music source separation method, namely using the open source [Hybrid Transformer for Music Source Separation](https://github.com/facebookresearch/demucs) (HT-Demucs).