Models with 119M parameters trained on 7.0B tokens.
StepLaw/StepLaw-N_119M-D_7.0B-LR1.381e-03-BS262144
Text Generation
StepLaw/StepLaw-N_119M-D_7.0B-LR1.381e-03-BS524288
Text Generation
StepLaw/StepLaw-N_119M-D_7.0B-LR1.381e-03-BS65536
Text Generation
StepLaw/StepLaw-N_119M-D_7.0B-LR1.381e-03-BS1048576
Text Generation
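
The repository names above appear to encode each run's settings. As a minimal sketch, the snippet below splits a name into its fields, assuming that `N` is the parameter count and `D` the token count (both consistent with the description), and that `LR` and `BS` stand for learning rate and batch size (an inference from the names, not something the listing states). The helper `parse_repo_id` and its regular expression are hypothetical and written only for illustration.

```python
import re

# Hypothetical pattern: field meanings are inferred from the names in this listing.
PATTERN = re.compile(
    r"N_(?P<params>[\d.]+[KMB])"
    r"-D_(?P<tokens>[\d.]+[KMB])"
    r"-LR(?P<lr>[\d.]+e[+-]?\d+)"
    r"-BS(?P<bs>\d+)$"
)

# Multipliers for the K/M/B suffixes used in the N and D fields.
SUFFIX = {"K": 1e3, "M": 1e6, "B": 1e9}


def _scaled(text):
    """Convert a string such as '119M' or '7.0B' to a float."""
    return float(text[:-1]) * SUFFIX[text[-1]]


def parse_repo_id(repo_id):
    """Split a repo ID into its (assumed) hyperparameter fields."""
    match = PATTERN.search(repo_id)
    if match is None:
        raise ValueError(f"unrecognized repo ID: {repo_id!r}")
    return {
        "params": _scaled(match["params"]),
        "tokens": _scaled(match["tokens"]),
        "learning_rate": float(match["lr"]),  # assumed meaning of LR
        "batch_size": int(match["bs"]),       # assumed meaning of BS
    }


REPO_IDS = [
    "StepLaw/StepLaw-N_119M-D_7.0B-LR1.381e-03-BS262144",
    "StepLaw/StepLaw-N_119M-D_7.0B-LR1.381e-03-BS524288",
    "StepLaw/StepLaw-N_119M-D_7.0B-LR1.381e-03-BS65536",
    "StepLaw/StepLaw-N_119M-D_7.0B-LR1.381e-03-BS1048576",
]

# List the runs in order of increasing BS value.
for config in sorted(map(parse_repo_id, REPO_IDS), key=lambda c: c["batch_size"]):
    print(config)
```

Read this way, the four entries share the same `N`, `D`, and `LR` values and differ only in the `BS` field.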