mamba-7b-rw / config.json
sedrick-keh-tri
push jsons
2501def
80 Bytes
{
  "d_model": 4096,
  "n_layer": 64,
  "vocab_size": 50432,
  "seq_len": 2048
}
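
As an illustration of how these four fields might be consumed, here is a minimal Python sketch that parses the JSON above and derives one size figure from it. The `ModelConfig` dataclass and the `embedding_params` helper are hypothetical names introduced for this example and are not part of any library or of this repository. The sketch also assumes the conventional reading of the keys (`d_model` as hidden width, `vocab_size` as the number of token embeddings), which the file itself does not state.

```python
import json
from dataclasses import dataclass

# The config exactly as it appears in config.json.
CONFIG_TEXT = """
{
  "d_model": 4096,
  "n_layer": 64,
  "vocab_size": 50432,
  "seq_len": 2048
}
"""


@dataclass(frozen=True)
class ModelConfig:
    """Hypothetical container for the four fields in config.json."""
    d_model: int
    n_layer: int
    vocab_size: int
    seq_len: int


def load_config(text: str) -> ModelConfig:
    """Parse the JSON text; fails with TypeError on missing or extra keys."""
    return ModelConfig(**json.loads(text))


def embedding_params(cfg: ModelConfig) -> int:
    """Size of a vocab_size x d_model token-embedding table.

    Assumption: the model uses one embedding vector of width d_model
    per vocabulary entry. The config does not confirm this layout.
    """
    return cfg.vocab_size * cfg.d_model


cfg = load_config(CONFIG_TEXT)
print(cfg)
print("embedding table entries:", embedding_params(cfg))
```

Under that assumption the embedding table alone holds 50432 × 4096 = 206,569,472 values. The remaining parameters depend on the per-layer block design, which cannot be derived from this file.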