Models with 59M parameters trained on 7.0B tokens.
- StepLaw/StepLaw-N_59M-D_7.0B-LR1.105e-02-BS262144 (Text Generation)
- StepLaw/StepLaw-N_59M-D_7.0B-LR1.105e-02-BS524288 (Text Generation)
- StepLaw/StepLaw-N_59M-D_7.0B-LR1.105e-02-BS65536 (Text Generation)
- StepLaw/StepLaw-N_59M-D_7.0B-LR1.105e-02-BS1048576 (Text Generation)
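
The repository names above appear to encode each run's configuration. As a minimal sketch, assuming N is the parameter count, D the token count, LR the learning rate, and BS the batch size (the listing does not spell these out, and the unit of BS is not stated), the names can be split into fields like this. The helper `parse_repo_id` and the regular expression are hypothetical, inferred only from the four names in this list:

```python
import re

# Pattern inferred from the four repository names listed above.
# Field meanings (N, D, LR, BS) are assumptions, not documented here.
NAME_RE = re.compile(
    r"StepLaw-N_(?P<n>[^-]+)"
    r"-D_(?P<d>[^-]+)"
    r"-LR(?P<lr>\d+(?:\.\d+)?e[+-]?\d+)"
    r"-BS(?P<bs>\d+)$"
)


def parse_repo_id(repo_id: str) -> dict:
    """Split a StepLaw-style repository id into its apparent fields."""
    match = NAME_RE.search(repo_id)
    if match is None:
        raise ValueError(f"unrecognized repository id: {repo_id!r}")
    return {
        "params": match["n"],         # kept as text, e.g. "59M"
        "tokens": match["d"],         # kept as text, e.g. "7.0B"
        "lr": float(match["lr"]),     # scientific notation parsed to float
        "batch_size": int(match["bs"]),
    }


REPO_IDS = [
    "StepLaw/StepLaw-N_59M-D_7.0B-LR1.105e-02-BS262144",
    "StepLaw/StepLaw-N_59M-D_7.0B-LR1.105e-02-BS524288",
    "StepLaw/StepLaw-N_59M-D_7.0B-LR1.105e-02-BS65536",
    "StepLaw/StepLaw-N_59M-D_7.0B-LR1.105e-02-BS1048576",
]

# Order the runs by batch size, the only field that differs in this list.
runs = sorted((parse_repo_id(r) for r in REPO_IDS), key=lambda run: run["batch_size"])
```

Under that reading, the four entries share the same size, token budget, and learning rate, and differ only in batch size.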