declare-lab-sutd
committed on
Commit fd78313
1 Parent(s): 65a680d
Update README.md
README.md
CHANGED
@@ -66,4 +66,4 @@ This will generate two samples for each of the three text prompts.
 
 TANGO is trained on the small AudioCaps dataset so it may not generate good audio samples related to concepts that it has not seen in training (e.g. _singing_). For the same reason, TANGO is not always able to finely control its generations over textual control prompts. For example, the generations from TANGO for prompts _Chopping tomatoes on a wooden table_ and _Chopping potatoes on a metal table_ are very similar. _Chopping vegetables on a table_ also produces similar audio samples. Training text-to-audio generation models on larger datasets is thus required for the model to learn the composition of textual concepts and varied text-audio mappings.
 
-We are training another
+We are training another version of TANGO on larger datasets to enhance its generalization, compositional and controllable generation ability.