Can't load phi-2 model anymore after recent code updates

#79
by luciodery - opened

Hi, I am using transformers 4.28.0 (I need to keep this version of transformers for reproducibility).
It seems that the most recent code updates broke my code :(

They lead to tensor name mismatches, as shown in the attached screenshot.
Could you help me figure out how to fix this?
[Screenshot attached: Screen Shot 2024-01-12 at 1.42.19 AM.png]

Hi,
I am having a similar issue.
I haven't figured out a solution yet, but it seems to be caused by an update from 9 hours ago: commit cb2f4533604d8b67de604e7df03bfe6f3ca22869.

My issue was fixed by setting trust_remote_code=True.
FYI, mine was probably unrelated to the update -- I simply had not set an argument that I previously did.

Microsoft org

Hello @kiyoonyoo and @luciodery !

Please always pass trust_remote_code=True if you are on a transformers version below 4.37.0.

Regards,
Gustavo.
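For anyone unsure which case they fall into, the rule above can be captured in a tiny version check. This is a stdlib-only sketch; the helper name and the naive version parsing are illustrative, not part of transformers:

```python
def needs_remote_code(transformers_version: str) -> bool:
    """Return True if this transformers version predates 4.37.0 (where the
    Phi implementation was merged upstream), meaning trust_remote_code=True
    is still required when loading phi-2.
    """
    # Compare only the numeric major.minor.patch components.
    parts = tuple(int(p) for p in transformers_version.split(".")[:3])
    return parts < (4, 37, 0)

print(needs_remote_code("4.28.0"))  # the version pinned in this thread
print(needs_remote_code("4.37.0"))
```

In practice you would call it as `needs_remote_code(transformers.__version__)` after `import transformers`.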

gugarosa changed discussion status to closed
luciodery changed discussion status to open

Hi @gugarosa,
This was not the issue; I already had trust_remote_code=True. Also, I was not using AutoModel -- I am directly importing PhiForCausalLM.
Please advise. Thanks!

> Hello @kiyoonyoo and @luciodery !
>
> Please always use trust_remote_code=True if below transformers==4.37.0.
>
> Regards,
> Gustavo.

Hi @gugarosa,

The setup you described does not run with the most recent update of phi-2. I've tried transformers versions < 4.37.0 to no avail, and I had not encountered any issues at all before this update. Here's the basic setup and the error message:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Initialize the model and tokenizer
model_name = "microsoft/phi-2"
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype="auto", device_map="cuda", trust_remote_code=True)
```

```
ImportError                               Traceback (most recent call last)
in <cell line: 4>()
      2 model_name = "microsoft/phi-2"
      3 tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
----> 4 model = AutoModelForCausalLM.from_pretrained(model_name,
      5     torch_dtype="auto",
      6     device_map="cuda",

11 frames

/usr/local/lib/python3.10/dist-packages/transformers/modeling_utils.py in
     52 )
     53 from .safetensors_conversion import auto_conversion
---> 54 from .utils import (
     55     ADAPTER_SAFE_WEIGHTS_NAME,
     56     ADAPTER_WEIGHTS_NAME,

ImportError: cannot import name 'is_torch_sdpa_available' from 'transformers.utils' (/usr/local/lib/python3.10/dist-packages/transformers/utils/__init__.py)
```

Maybe try `!pip install transformers==4.36.1`?

How do I load a finetuned phi-2 model from before this update? @gugarosa

Or better yet, convert it so there’s no friction when other people try to load it? Just rename the layers based on this commit?

> Hello @kiyoonyoo and @luciodery !
>
> Please always use trust_remote_code=True if below transformers==4.37.0.
>
> Regards,
> Gustavo.

OK, so I got a workaround from a thread here [1] -- instead of `pip install transformers==<version>`, install from source from the Hugging Face transformers repo:

!pip install git+https://github.com/huggingface/transformers

Reference:
[1] https://huggingface.co/DiscoResearch/mixtral-7b-8expert/discussions/9

Is there a workaround using `revision`?

> Is there a workaround using `revision`?

Yes.

Just a reminder: if you try the workaround with `revision`, also set `code_revision`.
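Putting the revision workaround together: a sketch of pinning both the checkpoint files and the remote modeling code to the same pre-update commit. The hash here is a placeholder (look up a commit from before the breaking update in the microsoft/phi-2 repo history on the Hub), and the loading calls are shown commented out because they download the full weights:

```python
# Placeholder: replace with an actual microsoft/phi-2 commit hash from
# *before* the breaking update (see the repo's commit history on the Hub).
PRE_UPDATE_COMMIT = "<pre-update-commit-hash>"

# `revision` pins which checkpoint files are downloaded; `code_revision`
# pins which remote modeling_phi.py is executed. They should match.
pinned_kwargs = dict(
    revision=PRE_UPDATE_COMMIT,
    code_revision=PRE_UPDATE_COMMIT,
    trust_remote_code=True,
)

# Usage (commented out: downloads the full model weights):
# from transformers import AutoModelForCausalLM, AutoTokenizer
# tokenizer = AutoTokenizer.from_pretrained("microsoft/phi-2", **pinned_kwargs)
# model = AutoModelForCausalLM.from_pretrained(
#     "microsoft/phi-2", torch_dtype="auto", **pinned_kwargs
# )
```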

Microsoft org

> How do I load a finetuned phi-2 model from before this update? @gugarosa
>
> Or better yet, convert it so there's no friction when other people try to load it? Just rename the layers based on this commit?

This conversion script should work out for your case: https://github.com/huggingface/transformers/blob/main/src/transformers/models/phi/convert_phi_weights_to_hf.py
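For intuition, what such a conversion script boils down to is renaming every tensor in the checkpoint's state_dict. Below is a minimal sketch of that mechanism; the two mapping entries are made up for illustration, and the real, complete old-to-new name table lives in convert_phi_weights_to_hf.py itself:

```python
# Illustrative-only mapping; NOT the real phi-2 table (see the linked
# conversion script for the authoritative old-name -> new-name entries).
ILLUSTRATIVE_MAPPING = {
    "transformer.": "model.",
    ".h.": ".layers.",
}

def rename_keys(state_dict: dict, mapping: dict) -> dict:
    """Return a copy of state_dict with each tensor name rewritten via
    ordered substring replacement, the same mechanism a converter uses."""
    renamed = {}
    for key, tensor in state_dict.items():
        for old, new in mapping.items():
            key = key.replace(old, new)
        renamed[key] = tensor
    return renamed

# Toy demo with a float standing in for a real tensor:
old_sd = {"transformer.h.0.weight": 1.0}
new_sd = rename_keys(old_sd, ILLUSTRATIVE_MAPPING)
print(new_sd)  # the key becomes "model.layers.0.weight"
```

After renaming, you would save the new state_dict (torch.save or safetensors) so it loads cleanly into the updated architecture.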

Microsoft org

Besides, I have updated the README.md to clearly indicate the two alternatives we have for loading Phi-2 (until 4.37.0 is officially released).

> How do I load a finetuned phi-2 model from before this update? @gugarosa
>
> Or better yet, convert it so there's no friction when other people try to load it? Just rename the layers based on this commit?
>
> This conversion script should work out for your case: https://github.com/huggingface/transformers/blob/main/src/transformers/models/phi/convert_phi_weights_to_hf.py

Legend. Thank you!

gugarosa changed discussion status to closed
