Errors in loading Mamba-370M: Unexpected key(s) in state_dict: "backbone.norm_f.bias" and MambaLMHeadModel.__init__() got an unexpected keyword argument 'iteration'

#2
by luciaquirke - opened

Hello, I have installed the mamba-ssm fork but I'm coming across two errors while trying to load the model. The first occurs during normal loading:

model = MambaLMHeadModel.from_pretrained("Zyphra/Mamba-370M")

The error message:

    model = MambaLMHeadModel.from_pretrained(
  File "/home/lucia/miniconda3/envs/3.10/lib/python3.10/site-packages/mamba_ssm/models/mixer_seq_simple.py", line 246, in from_pretrained
    model.load_state_dict(load_state_dict_hf(pretrained_model_name, device=device, dtype=dtype))
  File "/home/lucia/miniconda3/envs/3.10/lib/python3.10/site-packages/torch/nn/modules/module.py", line 2153, in load_state_dict
    raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for MambaLMHeadModel:
        Unexpected key(s) in state_dict: "backbone.norm_f.bias". 
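
For readers hitting the same message: "Unexpected key(s)" means the checkpoint holds parameters that the instantiated model class does not define. Below is a minimal sketch of that comparison in plain Python. diff_state_dict_keys is a hypothetical helper (not part of mamba_ssm or torch), and apart from "backbone.norm_f.bias" the key names are illustrative assumptions.

```python
def diff_state_dict_keys(model_keys, checkpoint_keys):
    """Return (missing, unexpected) key lists, mirroring how a strict load reports mismatches."""
    model_keys, checkpoint_keys = set(model_keys), set(checkpoint_keys)
    missing = sorted(model_keys - checkpoint_keys)      # defined by the model, absent from the checkpoint
    unexpected = sorted(checkpoint_keys - model_keys)   # present in the checkpoint, unknown to the model
    return missing, unexpected


# Illustrative key names (assumed); only "backbone.norm_f.bias" comes from the error above.
model_keys = ["backbone.norm_f.weight", "lm_head.weight"]
checkpoint_keys = ["backbone.norm_f.weight", "backbone.norm_f.bias", "lm_head.weight"]

missing, unexpected = diff_state_dict_keys(model_keys, checkpoint_keys)
print("missing:", missing)
print("unexpected:", unexpected)
```

With a real model, the two inputs would be model.state_dict().keys() and the keys of the loaded checkpoint dictionary.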

The second occurs while loading a specified iteration:

model = MambaLMHeadModel.from_pretrained("Zyphra/Mamba-370M", iteration=10_000)

The error message:

    model = MambaLMHeadModel.from_pretrained(
  File "/home/lucia/miniconda3/envs/3.10/lib/python3.10/site-packages/mamba_ssm/models/mixer_seq_simple.py", line 245, in from_pretrained
    model = cls(config, device=device, dtype=dtype, **kwargs)
TypeError: MambaLMHeadModel.__init__() got an unexpected keyword argument 'iteration'

A potential workaround for loading a specific iteration seems to be downloading it to the HF cache manually and then loading it from file: hf_hub_download(repo_id="Zyphra/Mamba-370M", filename=f"iter_{str(iteration).zfill(7)}/pytorch_model.bin"). I will try this if I manage to resolve the first issue.
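
A minimal sketch of the workaround described above, assuming the repo really stores checkpoints under iter_<zero-padded iteration>/pytorch_model.bin (that layout is taken from the post and not verified here). The two helper names are hypothetical; hf_hub_download comes from the huggingface_hub package and is imported lazily so the filename logic can be checked offline.

```python
def iteration_checkpoint_filename(iteration: int) -> str:
    """Build the in-repo path for a given training iteration (assumed layout)."""
    return f"iter_{str(iteration).zfill(7)}/pytorch_model.bin"


def download_iteration_checkpoint(iteration: int, repo_id: str = "Zyphra/Mamba-370M") -> str:
    """Download one checkpoint file into the HF cache and return its local path."""
    from huggingface_hub import hf_hub_download  # lazy import: needs network access when called

    return hf_hub_download(repo_id=repo_id, filename=iteration_checkpoint_filename(iteration))


print(iteration_checkpoint_filename(10_000))  # iter_0010000/pytorch_model.bin
```

The returned local path could then be passed to torch.load to obtain a state dict, provided the model class matches the checkpoint.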

Any ideas for resolving these would be super helpful!

Hello! The error messages indicate that you're not actually using our fork. How did you install it?

You should be installing it in the following way:

  1. git clone https://github.com/Zyphra/mamba.git
  2. cd mamba
  3. pip install -e . (this installs from the cloned repo in editable mode). I've double-checked, and pip install . should also work from inside the folder with our fork.
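
One way to check which copy of the package Python will actually import is to look up where it resolves on disk. This is a minimal sketch using only the standard library; locate_package is a hypothetical helper name. With an editable install, the path would be expected to point into the cloned mamba folder rather than site-packages.

```python
import importlib.util


def locate_package(name: str):
    """Return the file path a top-level package resolves to, or None if it cannot be found."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None
    return spec.origin  # path to the package's __init__.py (None for namespace packages)


# Prints None if mamba_ssm is not installed in the current environment.
print(locate_package("mamba_ssm"))
```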

Hello, I thought I had run pip install . from inside the folder with the fork, without success, but pip install -e . fixed it! (?) Thanks so much!

luciaquirke changed discussion status to closed
