Cannot merge and save unsupervised model

#1
by Mengyao00 - opened

Hi, I followed your code and want to merge and then save the unsupervised model, instead of using the adapter config, but I got an error:

model.save_pretrained("llm2vec/McGill-NLP/to_del")
 File "/usr/local/lib/python3.8/dist-packages/transformers/modeling_utils.py", line 2352, in save_pretrained
   state_dict = model_to_save.get_adapter_state_dict()
 File "/usr/local/lib/python3.8/dist-packages/transformers/integrations/peft.py", line 417, in get_adapter_state_dict
   adapter_name = self.active_adapter()
 File "/usr/local/lib/python3.8/dist-packages/transformers/integrations/peft.py", line 395, in active_adapter
   return self.active_adapters()[0]
 File "/usr/local/lib/python3.8/dist-packages/transformers/integrations/peft.py", line 385, in active_adapters
   if isinstance(active_adapters, str):
UnboundLocalError: local variable 'active_adapters' referenced before assignment

Here is my code to reproduce the error:

from llm2vec import LLM2Vec

import torch
from transformers import AutoTokenizer, AutoModel, AutoConfig
from peft import PeftModel

# Loading base Mistral model, along with custom code that enables bidirectional connections in decoder-only LLMs. MNTP LoRA weights are merged into the base model.
tokenizer = AutoTokenizer.from_pretrained(
    "McGill-NLP/LLM2Vec-Mistral-7B-Instruct-v2-mntp"
)
config = AutoConfig.from_pretrained(
    "McGill-NLP/LLM2Vec-Mistral-7B-Instruct-v2-mntp", trust_remote_code=True
)
model = AutoModel.from_pretrained(
    "McGill-NLP/LLM2Vec-Mistral-7B-Instruct-v2-mntp",
    trust_remote_code=True,
    config=config,
    torch_dtype=torch.bfloat16,
    device_map="cuda" if torch.cuda.is_available() else "cpu",
)
model = PeftModel.from_pretrained(
    model,
    "McGill-NLP/LLM2Vec-Mistral-7B-Instruct-v2-mntp",
)
model = model.merge_and_unload()  # This can take several minutes on cpu

# Loading unsupervised SimCSE model. This loads the trained LoRA weights on top of MNTP model. Hence the final weights are -- Base model + MNTP (LoRA) + SimCSE (LoRA).
model = PeftModel.from_pretrained(
    model, "McGill-NLP/LLM2Vec-Mistral-7B-Instruct-v2-mntp-unsup-simcse"
)
model = model.merge_and_unload()  # This can take several minutes on cpu
model.save_pretrained("llm2vec/McGill-NLP/to_del")
vaibhavad (McGill NLP Group org)

Hi @Mengyao00 ,

Can you share which versions of peft, transformers and huggingface-hub you are using?

The code works in our environment, which has the following library versions:
transformers==4.38.1
peft==0.8.2
huggingface-hub==0.19.4
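
If it helps, here is a quick way to print the versions installed in your environment (a minimal sketch; importlib.metadata is in the standard library from Python 3.8 onward):

from importlib.metadata import version

# Print the installed version of each relevant distribution
for pkg in ("transformers", "peft", "huggingface-hub"):
    print(pkg, version(pkg))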

Hi @vaibhavad, thank you for your response. Could you please confirm that you can run my code above? I added model.save_pretrained("llm2vec/McGill-NLP/to_del") at the end; only this line causes the error. Your code for loading the PEFT model works fine.

vaibhavad (McGill NLP Group org)

Hi @Mengyao00 ,

Sorry, I misunderstood your issue before. Yes, your code throws the same error on my system. On further inspection, it seems to be an issue in the peft library, where not all PEFT attributes are properly removed by the merge_and_unload function. Right now, the quickest workaround is to save the model in the following way.

model = model.merge_and_unload()  # This can take several minutes on cpu
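# Workaround: save_pretrained follows the adapter-saving path (and hits the
# crash above) whenever this private flag is set; after merge_and_unload the
# flag is stale, so reset it before saving.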
model._hf_peft_config_loaded = False
model.save_pretrained("llm2vec/McGill-NLP/to_del")
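
For reference, the traceback is consistent with a common pattern: in active_adapters, the local variable is only assigned inside a loop that searches the model for PEFT layers, and after merge_and_unload no such layers remain, so the name is never bound. A minimal illustrative sketch of the pattern (not the actual transformers source; is_peft_layer is a hypothetical stand-in):

def active_adapters_sketch(model):
    # The local is only bound if at least one PEFT layer is found.
    for module in model.modules():
        if is_peft_layer(module):  # hypothetical predicate
            active_adapters = module.active_adapter
            break
    # After merge_and_unload() removes all PEFT layers, the loop body never
    # runs and the next line raises UnboundLocalError.
    if isinstance(active_adapters, str):
        active_adapters = [active_adapters]
    return active_adapters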

I will raise an issue with the peft library after a more detailed examination.
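
Once saved, you can sanity-check the checkpoint by loading it back (a sketch, assuming the custom bidirectional Mistral code resolves via trust_remote_code; you may also want to call tokenizer.save_pretrained on the same path):

import torch
from transformers import AutoModel

# Reload the merged checkpoint from the local save directory
reloaded = AutoModel.from_pretrained(
    "llm2vec/McGill-NLP/to_del",
    trust_remote_code=True,
    torch_dtype=torch.bfloat16,
)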

vaibhavad changed discussion status to closed
