Respair/Higgs_Codec_Extended

audio_tokenizer


Respair committed on Aug 13

Commit 58828de (verified) · 1 Parent(s): f6dc3d5

Update README.md

Files changed (1)

README.md +6 -1

@@ -8,9 +8,14 @@ tags:
 This is an on-going project. it is a modified version of Higgs-Boson audio tokenizer, you can fully train it. all scripts have been tested.
 a Few notes however:
   - this is not backward compatible with the original checkpoint (I think you can tweak it to be, but you have to adhere to Boson community license if you do.)
   - I highly recommend you to pretrain the model without the mel and adversarial setup first. it saves you a significant amount of compute, time and speed-up your convergence. raise the batch size as much as you can before the adversarial phase.
   - for the semantic teacher, I am using ```utter-project/mHuBERT-147``` which has a good multilingual support. if you want the original setup you can change it in the config.
+  - The loss weights and hyperparameters may not be ideal, feel free to play around with different values.
 I will train a checkpoint on a larger enough dataset one of these days after figuring out a few things first. but the setup is solid.
@@ -30,4 +35,4 @@ take a look at the notebook
 # Batch inference
 take a look at boson_codeit.py
-Happy using / training (~~inshallah~~).
+Happy using / training (~~inshallah~~).
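
Reviewer note: the README's advice to pretrain without the mel and adversarial setup first, and to tune the loss weights, can be pictured as a two-phase weight schedule. The sketch below is only an illustration and is not taken from this repository: `LossWeights`, `loss_weights_for_step`, `total_loss`, the loss term names, and every weight value are hypothetical, and it assumes that a reconstruction term and a semantic distillation term are what remain active during the first phase.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class LossWeights:
    """Hypothetical per-term loss weights; the values used below are placeholders."""
    reconstruction: float
    semantic: float
    mel: float
    adversarial: float


def loss_weights_for_step(step: int, pretrain_steps: int) -> LossWeights:
    """Return the weights for a given training step in a two-phase schedule."""
    if step < pretrain_steps:
        # Phase 1 (assumed): mel and adversarial terms switched off.
        return LossWeights(reconstruction=1.0, semantic=1.0, mel=0.0, adversarial=0.0)
    # Phase 2 (assumed): all terms enabled; tune these values freely.
    return LossWeights(reconstruction=1.0, semantic=1.0, mel=1.0, adversarial=1.0)


def total_loss(terms: dict[str, float], weights: LossWeights) -> float:
    """Weighted sum of the loss terms; terms with zero weight contribute nothing."""
    return sum(w * terms[name] for name, w in asdict(weights).items() if w != 0.0)


if __name__ == "__main__":
    terms = {"reconstruction": 2.0, "semantic": 1.0, "mel": 5.0, "adversarial": 3.0}
    for step in (0, 1000):
        weights = loss_weights_for_step(step, pretrain_steps=1000)
        print(step, weights, total_loss(terms, weights))
```

In a real training loop the zero-weight terms would ideally not be computed at all during the first phase, since skipping the discriminator and mel passes is where the compute saving described in the README would come from.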