TRI-ML/mamba-7b-rw · Fix keys in `model.safetensors` file to make it loadable with AutoModelForCausalLM.from_pretrained

I renamed the keys in the `model.safetensors` file as follows, so that the model loads properly with `AutoModelForCausalLM.from_pretrained`:

if k.startswith('layers.'):
    k = k.replace('layers.', 'backbone.layers.')
elif k.startswith('norm_f.'):
    k = k.replace('norm_f.', 'backbone.norm_f.')
elif k.startswith('embeddings.'):
    k = k.replace('embeddings.', 'backbone.embeddings.')
elif k.startswith('model.'):
    k = k.replace('model.', '')
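For reference, the mapping above can be wrapped into a small helper that is easy to check on its own. This is a minimal sketch, not the exact script used for this PR: the names `rename_key` and `rename_state_dict` are hypothetical, the example keys in the usage note are illustrative rather than copied from the checkpoint, and only the leading prefix is rewritten (which matches the `replace` calls above as long as the prefix string occurs once in the key).

```python
# Prefixes that must move under `backbone.`, per the mapping above.
BACKBONE_PREFIXES = ("layers.", "norm_f.", "embeddings.")


def rename_key(k: str) -> str:
    """Return the key name expected by the `backbone.*` layout."""
    for prefix in BACKBONE_PREFIXES:
        if k.startswith(prefix):
            return "backbone." + k
    if k.startswith("model."):
        # Strip only the leading `model.` wrapper.
        return k[len("model."):]
    return k  # Anything else is left untouched.


def rename_state_dict(state_dict: dict) -> dict:
    """Apply rename_key to every entry, refusing to silently merge two keys."""
    renamed = {}
    for k, v in state_dict.items():
        new_k = rename_key(k)
        if new_k in renamed:
            raise ValueError(f"key collision after renaming: {new_k!r}")
        renamed[new_k] = v
    return renamed
```

With illustrative keys, `rename_key("layers.0.mixer.in_proj.weight")` gives `"backbone.layers.0.mixer.in_proj.weight"` and `rename_key("model.lm_head.weight")` gives `"lm_head.weight"`. The collision check is an extra safeguard of this sketch, so that two source keys mapping to the same name raise an error instead of overwriting each other.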

Additionally, I used `model.save_pretrained` to shard the weights into 6 smaller safetensors files (instead of one large one).

After this PR is merged, the existing `model.safetensors` file should be deleted (and perhaps the `pytorch_model.bin` file as well).
