
How to load a saved model after fine-tuning

#1
by odellus - opened

I fine-tuned mpt-7b-storysummarizer on my own dataset, and I was wondering how I'm supposed to load the model back in after saving. The contents of the saved directory are the following:

    config.json
    training_args.bin
    generation_config.json
    pytorch_model.bin

This isn't the standard output for QLoRA adapters, so when I try to use the standard method for QLoRA models

from peft import PeftModel

adapter_name = <path_to_saved_model_dir>
tokenizer, model = load_model()  # I've got the base model loading successfully
model = PeftModel.from_pretrained(model, adapter_name)

it of course complains that adapter_config.json is not there.
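For comparison, my understanding is that those adapter files only get written when save_pretrained is called on the PEFT-wrapped model itself, rather than on the base model. A rough sketch of what I mean (the LoRA hyperparameters, the Wqkv target module, and the output path are just illustrative guesses, not my actual setup):

from my_package import load_model
from peft import LoraConfig, get_peft_model

tokenizer, model = load_model()  # base model, as above

lora_config = LoraConfig(
    r=8,                      # example values, not my real config
    lora_alpha=16,
    target_modules=["Wqkv"],  # my guess at MPT's attention projection name
    task_type="CAUSAL_LM",
)
peft_model = get_peft_model(model, lora_config)

# ... fine-tuning happens here ...

# Saving the PEFT wrapper (not the underlying base model) is what writes
# adapter_config.json and adapter_model.bin for PeftModel.from_pretrained.
peft_model.save_pretrained("<adapter_output_dir>")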

So how do I load the fine-tuned model I saved? Loading it like a normal Hugging Face model doesn't work any better. Any help would be greatly appreciated.
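To be concrete, by "loading it like a normal Hugging Face model" I mean something along these lines (a minimal sketch; the paths and repo name are placeholders, and I'm assuming trust_remote_code=True is needed because mpt ships custom modeling code):

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

save_dir = "<path_to_saved_model_dir>"

# Load the saved checkpoint as a standalone model; mpt needs
# trust_remote_code=True since its modeling code lives in the repo.
model = AutoModelForCausalLM.from_pretrained(
    save_dir,
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
)

# The saved directory has no tokenizer files, so the tokenizer still
# comes from the original model repo (placeholder name here).
tokenizer = AutoTokenizer.from_pretrained("<original_model_repo>")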

Update

I'm using this to load the saved model:

import torch
from my_package import load_model

tokenizer, model = load_model()
# Overwrite the base model's weights with the fine-tuned checkpoint
state_dict = torch.load('/path/to/saved/pytorch_model.bin')
model.load_state_dict(state_dict)
# Do a bunch of inference now

but something seems to be broken, because with the saved weights generation terminates immediately: the model just emits EOS and that's it. Is this the correct way to load the saved weights? It feels like a hack, as I've always used some form of from_pretrained from transformers.
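One sanity check I'm thinking of running is to confirm the checkpoint keys actually line up with the model before blaming the generation settings. A minimal sketch (strict=False makes load_state_dict report mismatched keys instead of raising):

import torch
from my_package import load_model

tokenizer, model = load_model()
state_dict = torch.load('/path/to/saved/pytorch_model.bin', map_location='cpu')

# strict=False returns the keys that didn't match instead of raising,
# which shows whether the checkpoint really covers the whole model.
result = model.load_state_dict(state_dict, strict=False)
print('missing keys:', result.missing_keys)
print('unexpected keys:', result.unexpected_keys)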
