alonzi committed
Commit f1961dc
1 Parent(s): d671b54

Update README.md

Files changed (1)
  1. README.md +4 -2
README.md CHANGED
@@ -137,7 +137,9 @@ The model was trained on licensed data using the following sources: the [Meta Mu
 
 ## Evaluation results
 
- Below are the objective metrics obtained on MusicCaps with the released model. Note that for the publicly released models, we had all the datasets go through a state-of-the-art music source separation method, namely using the open source [Hybrid Transformer for Music Source Separation](https://github.com/facebookresearch/demucs) (HT-Demucs), in order to keep only the instrumental part. This explains the difference in objective metrics with the models used in the paper.
+ Below are the objective metrics obtained on MusicCaps with the released model. Note that for the publicly released models, we used the state-of-the-art music source separation method,
+ namely the open source [Hybrid Transformer for Music Source Separation](https://github.com/facebookresearch/demucs) (HT-Demucs),
+ in order to keep only instrumental tracks. This explains the difference in objective metrics with the models used in the paper.
 
 | Model | Frechet Audio Distance | KLD | Text Consistency |
 |---|---|---|---|
@@ -150,7 +152,7 @@ More information can be found in the paper [Masked Audio Generation using a Sing
 
 ## Limitations and biases
 
- **Data:** The data sources used to train the model are created by music professionals and covered by legal agreements with the right holders. The model is trained on 20K hours of data, we believe that scaling the model on larger datasets can further improve the performance of the model.
+ **Data:** The data sources used to train the model are created by music professionals and covered by legal agreements with the right holders. The model is trained on 15K hours of data, we believe that scaling the model on larger datasets can further improve the performance of the model.
 
 **Mitigations:** Tracks that include vocals have been removed from the data source using corresponding tags, and using a state-of-the-art music source separation method, namely using the open source [Hybrid Transformer for Music Source Separation](https://github.com/facebookresearch/demucs) (HT-Demucs).
 
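The vocal-removal step referenced in both hunks relies on the open source demucs package linked above. A minimal sketch of that two-stem separation, assuming `pip install demucs` and the pretrained `htdemucs` weights; the input filename is a hypothetical placeholder:

```python
# Hedged sketch: split a track into vocals / no_vocals with HT-Demucs,
# keeping only the instrumental ("no_vocals") stem.
import demucs.separate

demucs.separate.main([
    "--two-stems", "vocals",  # separate vocals from everything else
    "-n", "htdemucs",         # Hybrid Transformer Demucs weights
    "input_track.mp3",        # hypothetical input file
])
# Stems are written under separated/htdemucs/input_track/
# as vocals.wav and no_vocals.wav; the latter is the instrumental part.
```

The CLI equivalent is `demucs --two-stems=vocals -n htdemucs input_track.mp3`.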
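For readers unfamiliar with the first metric in the table: Fréchet Audio Distance compares Gaussian statistics of embeddings extracted from reference and generated audio. A minimal sketch of the distance computation itself, assuming the embeddings (for example, VGGish frames as in the original FAD setup) have already been extracted; the array names are hypothetical:

```python
# Hedged sketch of the Frechet Audio Distance between two embedding sets,
# each of shape (n_samples, embedding_dim). Lower values are better.
import numpy as np
from scipy import linalg

def frechet_audio_distance(ref_emb: np.ndarray, gen_emb: np.ndarray) -> float:
    mu_r, mu_g = ref_emb.mean(axis=0), gen_emb.mean(axis=0)
    sigma_r = np.cov(ref_emb, rowvar=False)
    sigma_g = np.cov(gen_emb, rowvar=False)
    # Matrix square root of the covariance product; drop tiny imaginary residue.
    covmean = linalg.sqrtm(sigma_r @ sigma_g)
    if np.iscomplexobj(covmean):
        covmean = covmean.real
    diff = mu_r - mu_g
    return float(diff @ diff + np.trace(sigma_r + sigma_g - 2.0 * covmean))
```

Lower values are also better for KLD, while Text Consistency is higher-is-better.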