ecker committed (verified)
Commit d82ee7c · 1 Parent(s): 374fde8

Update README.md

Files changed (1)
  1. README.md +3 -6
README.md CHANGED
@@ -100,12 +100,9 @@ This repo contains the following configurations under `./models/`:
  * The "confidence" issue on voices it hasn't seen / hasn't seen much of is much more noticeable as RVQ level 0 is much more susceptible to it.
  * Unlike the base model, this is trained with the current dataset without iteratively dripfeeding additional sources (like tacking on Emilia afterwards).
  * ...except STT, this received no STT training out of fear of botching the model.
- * ~~Weights will be added as the model is trained.~~
- * I don't think the model can perform well at the current size.
- * Longer utterances degrade and stutter.
- * While more training seems to make it adhere to the prompt better, more training does not make the output more stable.
- * It seems the exact same as the previous-erroneously-trained model (where it was actually trained to predict the next token, rather than the token in place).
- * I would say that a bigger model might help; ignoring RVQ levels 1+ and solely focusing on NAR RVQ level 0 does not seem to matter.
+ * Weights will be added as the model is trained.
+ * This *was* expected to be a dud, but one very, very small oversight in the sampling code proved to be the culprit......
+ * In other words, the model *does* work.
 
  Some additional configurations have been explored with, but experiments have not been fruitful:
  * Exotic wrappers like `BitNet` seemed to yield little gains in inferencing, somehow. The memory savings is pretty much unnecessary as the models are already manageable at ~200M parameters.
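The removed note above contrasts being "trained to predict the next token" with predicting "the token in place". As a purely illustrative aside (a toy sketch, not code from this repo; the token values are made up), this is how the two training targets differ in alignment:

```python
# Toy sketch only; not code from this repo.
tokens = [5, 9, 2, 7, 3]  # hypothetical RVQ level 0 codes for one utterance

# Autoregressive ("next token") objective: the target at position t is the
# token at position t+1, so inputs and targets are shifted by one.
ar_inputs, ar_targets = tokens[:-1], tokens[1:]

# In-place (NAR-style) objective: the model predicts the token at the very
# position it occupies, so targets line up one-to-one with the positions.
# (In practice the in-place inputs would be masked or come from conditioning,
# not the ground-truth codes themselves.)
nar_targets = tokens

print(ar_inputs, ar_targets)  # [5, 9, 2, 7] [9, 2, 7, 3]
print(nar_targets)            # [5, 9, 2, 7, 3]
```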