Update README.md
README.md
CHANGED
@@ -8,9 +8,14 @@ tags:
 
 This is an ongoing project. It is a modified version of the Higgs-Boson audio tokenizer that you can fully train. All scripts have been tested.
 A few notes, however:
+
 - This is not backward compatible with the original checkpoint (I think you can tweak it to be, but you have to adhere to the Boson community license if you do).
+
 - I highly recommend pretraining the model without the mel and adversarial setup first. It saves a significant amount of compute and time and speeds up convergence. Raise the batch size as much as you can before the adversarial phase.
+
 - For the semantic teacher, I am using ```utter-project/mHuBERT-147```, which has good multilingual support. If you want the original setup, you can change it in the config.
+
+- The loss weights and hyperparameters may not be ideal; feel free to play around with different values.
 
 I will train a checkpoint on a large enough dataset one of these days, after figuring out a few things first. But the setup is solid.
 
@@ -30,4 +35,4 @@ take a look at the notebook
 # Batch inference
 Take a look at boson_codeit.py
 
-Happy using / training (~~inshallah~~).
+Happy using / training (~~inshallah~~).
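
The two-phase recommendation in the notes (pretrain without the mel and adversarial setup, then switch them on) could be sketched roughly as below. This is only an illustration and is not taken from the repo's training scripts: `LossWeights`, `weights_for_step`, `total_loss`, the loss names, and every numeric value are hypothetical, so check the actual config for the real keys.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class LossWeights:
    """Hypothetical loss weights; the repo's config keys may differ."""
    reconstruction: float = 1.0
    semantic: float = 1.0
    mel: float = 0.0          # off during the pretraining phase
    adversarial: float = 0.0  # off during the pretraining phase


def weights_for_step(step: int, pretrain_steps: int) -> LossWeights:
    """Phase 1 (step < pretrain_steps): no mel or adversarial terms.
    Phase 2: both terms enabled with illustrative weights of 1.0."""
    if step < pretrain_steps:
        return LossWeights()
    return LossWeights(mel=1.0, adversarial=1.0)


def total_loss(losses: dict, weights: LossWeights) -> float:
    """Weighted sum of the individual loss values; missing terms count as 0."""
    return sum(w * losses.get(name, 0.0) for name, w in asdict(weights).items())
```

With this shape, the training loop would call `weights_for_step(step, pretrain_steps)` once per step, so the mel and adversarial branches (and the discriminator) can be skipped entirely while their weights are zero.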