license: mit
# stories15M_MOE
This model is [ModelCloud/tinyllama-15M-stories](https://huggingface.co/ModelCloud/tinyllama-15M-stories) duplicated 4 times to form a mixture-of-experts (MoE) model with 4 experts.
The model is intended for testing, not for production use (unless your product is some kind of bedtime storyteller).

The router weights are randomly initialized.
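
To illustrate what "identical experts plus a random router" can imply, here is a minimal toy sketch in plain Python. It is not this model's code. It assumes top-2 routing with the gate weights renormalized over the selected experts, and it uses a single linear layer as a stand-in for each expert; the sizes and the names `moe_forward`, `HIDDEN`, and `TOP_K` are hypothetical.

```python
import math
import random

random.seed(0)

HIDDEN = 8       # toy hidden size (assumption, not the real model's)
NUM_EXPERTS = 4  # the 4 experts described above
TOP_K = 2        # assumed number of experts used per token


def rand_matrix(rows, cols):
    """Return a rows x cols matrix of small random values."""
    return [[random.gauss(0.0, 0.02) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


# One base weight matrix, copied into every expert.
base_weight = rand_matrix(HIDDEN, HIDDEN)
experts = [[row[:] for row in base_weight] for _ in range(NUM_EXPERTS)]

# Router: one randomly initialized row of weights per expert.
router = rand_matrix(NUM_EXPERTS, HIDDEN)


def moe_forward(x):
    """Route x to the top-k experts and mix their outputs."""
    logits = matvec(router, x)
    top = sorted(range(NUM_EXPERTS), key=lambda i: logits[i], reverse=True)[:TOP_K]
    # Softmax over the selected experts only, so the gates sum to 1.
    peak = max(logits[i] for i in top)
    exps = [math.exp(logits[i] - peak) for i in top]
    total = sum(exps)
    gates = [e / total for e in exps]
    out = [0.0] * HIDDEN
    for gate, idx in zip(gates, top):
        expert_out = matvec(experts[idx], x)
        out = [o + gate * y for o, y in zip(out, expert_out)]
    return out, top


x = [random.gauss(0.0, 1.0) for _ in range(HIDDEN)]
moe_out, chosen = moe_forward(x)
dense_out = matvec(base_weight, x)
max_diff = max(abs(a - b) for a, b in zip(moe_out, dense_out))

print("experts chosen:", chosen)
print("max |moe - dense|:", max_diff)
```

Under these assumptions the mixed output matches the single-expert output up to floating-point error, whichever experts the random router picks. Whether the same holds for this checkpoint depends on the routing implementation actually used.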