mosaicml/mpt-1b-redpajama-200b · `AttributeError: 'ModuleDict' object has no attribute 'get_input_embeddings'` when finetuning

When I try to finetune with the train.py script from Mosaic's llm-foundry (using a more or less default finetuning YAML), I get `AttributeError: 'ModuleDict' object has no attribute 'get_input_embeddings'`.
Any ideas how to work around this? The full traceback is below.
╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ /home/matthias/src/replit-finetune/../llm-foundry/scripts/train/train.py:326 in <module>         │
│                                                                                                  │
│   323 │   │   yaml_cfg = om.load(f)                                                              │
│   324 │   cli_cfg = om.from_cli(args_list)                                                       │
│   325 │   cfg = om.merge(yaml_cfg, cli_cfg)                                                      │
│ ❱ 326 │   main(cfg)                                                                              │
│   327                                                                                            │
│                                                                                                  │
│ /home/matthias/src/replit-finetune/../llm-foundry/scripts/train/train.py:215 in main             │
│                                                                                                  │
│   212 │   │   │   │   cfg.model, cfg.lora, tokenizer)                                            │
│   213 │   │   │   print_trainable_parameters(model)  # should not be 100%                        │
│   214 │   │   else:  # standard model                                                            │
│ ❱ 215 │   │   │   model = build_composer_model(cfg.model, tokenizer)                             │
│   216 │   cfg.n_params = sum(p.numel() for p in model.parameters())                              │
│   217 │   print(f'{cfg.n_params=:.2e}')                                                          │
│   218                                                                                            │
│                                                                                                  │
│ /home/matthias/src/replit-finetune/../llm-foundry/scripts/train/train.py:70 in                   │
│ build_composer_model                                                                             │
│                                                                                                  │
│    67 │   if model_cfg.name not in COMPOSER_MODEL_REGISTRY:                                      │
│    68 │   │   raise ValueError(                                                                  │
│    69 │   │   │   f'Not sure how to build model with name={model_cfg.name}')                     │
│ ❱  70 │   return COMPOSER_MODEL_REGISTRY[model_cfg.name](model_cfg, tokenizer)                   │
│    71                                                                                            │
│    72                                                                                            │
│    73 def build_composer_peft_model(                                                             │
│                                                                                                  │
│ /home/matthias/src/llm-foundry/llmfoundry/models/hf/hf_causal_lm.py:181 in __init__              │
│                                                                                                  │
│   178 │   │   │   │   f'om_model_config must be either a DictConfig, PeftModel, or PreTrainedM   │
│   179 │   │   │   )                                                                              │
│   180 │   │                                                                                      │
│ ❱ 181 │   │   composer_model = super().__init__(model=model,                                     │
│   182 │   │   │   │   │   │   │   │   │   │     shift_labels=True,                               │
│   183 │   │   │   │   │   │   │   │   │   │     tokenizer=tokenizer,                             │
│   184 │   │   │   │   │   │   │   │   │   │     metrics=train_metrics,                           │
│                                                                                                  │
│ /home/matthias/src/llm-foundry/llmfoundry/models/hf/model_wrapper.py:65 in __init__              │
│                                                                                                  │
│    62 │   │                                                                                      │
│    63 │   │   # Note: We need to add the FSDP related attributes to the model AFTER the super    │
│    64 │   │   # so that the (possible) embedding resizing doesn't destroy them                   │
│ ❱  65 │   │   prepare_hf_model_for_fsdp(self.model, init_device)                                 │
│    66 │   │                                                                                      │
│    67 │   │   # This provides support for meta initialization when using FSDP                    │
│    68 │   │   self.model.param_init_fn = lambda module: self.model._init_weights(                │
│                                                                                                  │
│ /home/matthias/src/llm-foundry/llmfoundry/models/hf/hf_fsdp.py:118 in prepare_hf_model_for_fsdp  │
│                                                                                                  │
│   115 │   else:                                                                                  │
│   116 │   │   # many common decoder-only model do not set the flag                               │
│   117 │   │   # model.config.is_decoder, so we can't trust it                                    │
│ ❱ 118 │   │   prepare_hf_causal_lm_model_for_fsdp(model, init_device)                            │
│   119                                                                                            │
│   120                                                                                            │
│   121 def prepare_hf_causal_lm_model_for_fsdp(model: PreTrainedModel,                            │
│                                                                                                  │
│ /home/matthias/src/llm-foundry/llmfoundry/models/hf/hf_fsdp.py:136 in                            │
│ prepare_hf_causal_lm_model_for_fsdp                                                              │
│                                                                                                  │
│   133 │   lm_head = model.get_output_embeddings()                                                │
│   134 │   # some models (OPT) implement .get_input_embeddings for the causal subclass            │
│   135 │   # but all of them implement it for the base model                                      │
│ ❱ 136 │   tied_embeddings = causal_base_model.get_input_embeddings()  # type: ignore             │
│   137 │   modules = {                                                                            │
│   138 │   │   'base_model': causal_base_model,                                                   │
│   139 │   │   'model_block': model_block,                                                        │
│                                                                                                  │
│ /home/matthias/src/replit-finetune/env/lib/python3.10/site-packages/torch/nn/modules/module.py:1 │
│ 614 in __getattr__                                                                               │
│                                                                                                  │
│   1611 │   │   │   modules = self.__dict__['_modules']                                           │
│   1612 │   │   │   if name in modules:                                                           │
│   1613 │   │   │   │   return modules[name]                                                      │
│ ❱ 1614 │   │   raise AttributeError("'{}' object has no attribute '{}'".format(                  │
│   1615 │   │   │   type(self).__name__, name))                                                   │
│   1616 │                                                                                         │
│   1617 │   def __setattr__(self, name: str, value: Union[Tensor, 'Module']) -> None:             │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
AttributeError: 'ModuleDict' object has no attribute 'get_input_embeddings'
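
For anyone reading along, the last two frames suggest that `causal_base_model` resolved to a plain `torch.nn.ModuleDict`, which has no `get_input_embeddings` method. Below is a minimal sketch of that failure mode and one possible fallback. It does not use llm-foundry at all: `ToyCausalLM` and `find_input_embeddings` are hypothetical names made up for illustration, and the `transformer` / `wte` layout is an assumption about how this checkpoint's custom model code is organized, not something the traceback confirms.

```python
import torch.nn as nn


class ToyCausalLM(nn.Module):
    """Hypothetical stand-in for a model whose base module is a bare ModuleDict."""

    def __init__(self, vocab_size: int = 16, d_model: int = 4):
        super().__init__()
        # Assumption: the base model is a ModuleDict holding the token embedding
        # under the key 'wte', rather than a module with its own accessor methods.
        self.transformer = nn.ModuleDict(
            {'wte': nn.Embedding(vocab_size, d_model)})

    def get_input_embeddings(self) -> nn.Module:
        # Only the top-level model knows where the embedding lives.
        return self.transformer['wte']


def find_input_embeddings(model: nn.Module, base_model: nn.Module) -> nn.Module:
    """Try the base model first, then fall back to the top-level model."""
    for candidate in (base_model, model):
        # getattr with a default swallows the AttributeError that
        # nn.Module.__getattr__ raises for unknown attributes.
        getter = getattr(candidate, 'get_input_embeddings', None)
        if callable(getter):
            return getter()
    raise AttributeError(
        'neither the base model nor the top-level model '
        'exposes get_input_embeddings')


model = ToyCausalLM()

# Reproduces the same kind of error as in the traceback above.
try:
    model.transformer.get_input_embeddings()
except AttributeError as err:
    print(f'direct call failed: {err}')

# The fallback finds the embedding through the top-level model instead.
embeddings = find_input_embeddings(model, model.transformer)
print(type(embeddings).__name__)
```

Whether a fallback like this is appropriate inside `prepare_hf_causal_lm_model_for_fsdp` depends on how the rest of that function uses the base model, so treat it as a diagnostic aid rather than a patch.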