
LoliCore 1B

This is a very small MoE (Mixture of Experts) model that I am using to experiment with different MLP settings. In this repo specifically, I use a Jump module (which passes the hidden state directly to the next layer) to test whether it works in an MoE.
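
To make the idea concrete, here is a minimal pure-Python sketch of one way a jump path could sit inside an MoE layer. It is not the code used in this repo. It assumes that the Jump module is one extra routing target next to the MLP experts, that routing is top-1, and that the MLP experts are added residually. The names `MLPExpert`, `JumpExpert`, and `MoELayer` are hypothetical and chosen only for illustration.

```python
import math
import random


def matvec(w, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def random_matrix(rows, cols, rng):
    """Small random weights; a stand-in for trained parameters."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


class MLPExpert:
    """A plain two-layer MLP expert with a ReLU in between."""

    def __init__(self, dim, hidden, rng):
        self.w_in = random_matrix(hidden, dim, rng)
        self.w_out = random_matrix(dim, hidden, rng)

    def __call__(self, x):
        h = [max(0.0, v) for v in matvec(self.w_in, x)]
        return matvec(self.w_out, h)


class JumpExpert:
    """Hypothetical jump path: hands the hidden state on unchanged."""

    def __call__(self, x):
        return list(x)


class MoELayer:
    """Top-1 MoE layer whose last routing slot is the jump path (assumption)."""

    def __init__(self, dim, hidden, num_mlp_experts, seed=0):
        rng = random.Random(seed)
        self.experts = [MLPExpert(dim, hidden, rng) for _ in range(num_mlp_experts)]
        self.experts.append(JumpExpert())
        # One router row per routing target, including the jump slot.
        self.router = random_matrix(len(self.experts), dim, rng)

    def route(self, x):
        """Return the index of the chosen expert and all router probabilities."""
        probs = softmax(matvec(self.router, x))
        idx = max(range(len(probs)), key=probs.__getitem__)
        return idx, probs

    def __call__(self, x):
        idx, probs = self.route(x)
        expert = self.experts[idx]
        if isinstance(expert, JumpExpert):
            # Jump: the next layer sees exactly the incoming hidden state.
            return expert(x)
        # MLP expert: gate-weighted output added to the residual stream.
        out = expert(x)
        return [xi + probs[idx] * oi for xi, oi in zip(x, out)]
```

Calling `MoELayer(dim=4, hidden=8, num_mlp_experts=2)` on a 4-element list returns a list of the same length. Whenever the router picks the last slot, the output equals the input, so that token costs no MLP compute in this layer.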