baichuan / train args
| Hyperparameter  | Value      |
|-----------------|------------|
| n_parameters    | 7000559616 |
| n_layers        | 32         |
| n_heads         | 32         |
| d_model         | 4096       |
| vocab size      | 64000      |
| sequence length | 4096       |
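
As a rough sanity check, the parameter count can be reproduced from the other values listed above. The sketch below assumes details that this file does not state: a LLaMA-style decoder with no bias terms, four d_model x d_model attention projections per layer, a gated feed-forward block with three projections of width 11008 (the `d_ff` value is an assumption, not taken from this file), two weight-only norms per layer plus one final norm, and untied input and output embeddings. The function name `count_parameters` is hypothetical.

```python
def count_parameters(n_layers, d_model, vocab_size, d_ff):
    """Count weights for an assumed LLaMA-style decoder without biases."""
    attention = 4 * d_model * d_model      # Q, K, V and output projections
    mlp = 3 * d_model * d_ff               # gate, up and down projections (assumed gated MLP)
    norms = 2 * d_model                    # two weight-only norms per layer (assumed)
    per_layer = attention + mlp + norms
    embeddings = 2 * vocab_size * d_model  # untied input embedding and output head (assumed)
    final_norm = d_model                   # one norm after the last layer (assumed)
    return n_layers * per_layer + embeddings + final_norm


# Values from the table; d_ff=11008 is an assumption, not listed in this file.
total = count_parameters(n_layers=32, d_model=4096, vocab_size=64000, d_ff=11008)
head_dim = 4096 // 32  # d_model / n_heads

print(total)     # 7000559616
print(head_dim)  # 128
```

Under these assumptions the total matches n_parameters exactly, and each attention head has dimension 128. A different feed-forward width or tied embeddings would give a different count.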