How did you create this GGUF?

#1
by Venkman42 - opened

I'm trying to do it myself, but I keep getting errors like:

Traceback (most recent call last):
  File "/content/llama.cpp/convert.py", line 1483, in <module>
    main()
  File "/content/llama.cpp/convert.py", line 1419, in main
    model_plus = load_some_model(args.model)
  File "/content/llama.cpp/convert.py", line 1280, in load_some_model
    model_plus = merge_multifile_models(models_plus)
  File "/content/llama.cpp/convert.py", line 730, in merge_multifile_models
    model = merge_sharded([mp.model for mp in models_plus])
  File "/content/llama.cpp/convert.py", line 709, in merge_sharded
    return {name: convert(name) for name in names}
  File "/content/llama.cpp/convert.py", line 709, in <dictcomp>
    return {name: convert(name) for name in names}
  File "/content/llama.cpp/convert.py", line 684, in convert
    lazy_tensors: list[LazyTensor] = [model[name] for model in models]
  File "/content/llama.cpp/convert.py", line 684, in <listcomp>
    lazy_tensors: list[LazyTensor] = [model[name] for model in models]
KeyError: 'transformer.embd.wte.weight'

Traceback (most recent call last):
  File "/content/./quantizeHFmodel/quantizeHFmodel.py", line 33, in <module>
    download_and_quantize_model(model_id)
  File "/content/./quantizeHFmodel/quantizeHFmodel.py", line 18, in download_and_quantize_model
    subprocess.run(["python", "llama.cpp/convert.py", local_dir, "--outtype", "f16", "--outfile", fp16_file], check=True)
  File "/usr/lib/python3.10/subprocess.py", line 526, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['python', 'llama.cpp/convert.py', 'phi-2-orange', '--outtype', 'f16', '--outfile', 'phi-2-orange/phi-2-orange.f16.gguf']' returned non-zero exit status 1.

Could you please tell me if you did anything different?
Thanks in advance :)

hey there @Venkman42!

I believe that since Microsoft made some modeling code changes to Phi-2's architecture, llama.cpp was updated to reflect them, and as a side effect, this model is no longer compatible with the current implementation as far as conversion goes. I suppose you might be able to go back to the revision I specified in the README, which predates those changes, and things should still work, at least in theory!

Converted using llama.cpp revision de473f5, the last compatible version before the changes reflecting Microsoft's new Phi-2 modeling code landed in llama.cpp.
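For what it's worth, here's a rough sketch of how you could pin that revision and retry, in the same subprocess style as your quantize script. The directory and file names are assumptions lifted from your traceback, not something I've verified:

```python
# Rough sketch, not verified: pin llama.cpp to the pre-change revision,
# then run the same conversion as in the failing command above.
# Directory and file names are assumptions taken from the traceback.
import subprocess

LLAMA_CPP_DIR = "llama.cpp"
REVISION = "de473f5"          # last llama.cpp revision known to convert this model
MODEL_DIR = "phi-2-orange"    # local directory holding the HF model files
FP16_FILE = f"{MODEL_DIR}/phi-2-orange.f16.gguf"

# Check out the older revision before converting.
subprocess.run(["git", "-C", LLAMA_CPP_DIR, "checkout", REVISION], check=True)

# Same invocation as in the error output, now against the pinned revision.
subprocess.run(
    ["python", f"{LLAMA_CPP_DIR}/convert.py", MODEL_DIR,
     "--outtype", "f16", "--outfile", FP16_FILE],
    check=True,
)
```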

This won't make any difference on the latest version (I tried it myself to be sure!), but I would also recommend running the convert-hf-to-gguf.py script: my (admittedly fuzzy) understanding is that the regular convert.py is intended for llama/mistral architectures, while the HF-focused converter makes assumptions that match models run via HF's transformers Python library. I have worked this logic into a simple script here.
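Roughly, the swap might look like this (again just a sketch; the model directory is assumed, and it's worth verifying the flags with --help on the revision you check out):

```python
# Minimal sketch, assuming the same local model directory as above.
# Swaps convert.py for convert-hf-to-gguf.py; flags mirror the earlier
# call, but double-check them with --help on your checked-out revision.
import subprocess

MODEL_DIR = "phi-2-orange"  # assumed local HF model directory
FP16_FILE = f"{MODEL_DIR}/phi-2-orange.f16.gguf"

subprocess.run(
    ["python", "llama.cpp/convert-hf-to-gguf.py", MODEL_DIR,
     "--outtype", "f16", "--outfile", FP16_FILE],
    check=True,
)
```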

Let me know if you have other questions!

Britt

I got it to work! Thanks a lot for your help and insight :)

Glad to hear it! Enjoy :D

brittlewis12 changed discussion status to closed

I credited you on my new model merge, https://huggingface.co/Venkman42/Phiter, for helping me with the GGUFs. I hope that's okay :) Feel free to check it out. It outperforms Phi-2 Orange on the "YALL" leaderboard by Maxime Labonne.

congrats!! seems quite promising from some light early tests, gotta love small models that punch above their weight! appreciate the nod :D

Thank you, it was a complete surprise. It was my first working model merge, so I guess I can't complain 😁 Surprisingly good for a simple merge done by a rookie, haha.