https://huggingface.co/Cran-May/Shi-Ci-Vision

#213
by Cran-May - opened

Cran-May/Shi-Ci-Vision

I do not think this is supported by llama.cpp, but I will try.

mradermacher changed discussion status to closed

Unfortunately, the model can't be downloaded:

requests.exceptions.HTTPError: 403 Client Error: Forbidden for url: ...

Feels like a bug on the huggingface side to me.

I'm experiencing the same situation. It's ridiculous. I'll finish re-uploading in an hour.

mradermacher changed discussion status to open

It's available now.

Yup, but, as I feared:

ERROR:hf-to-gguf:Model MiniCPMV is not supported

The "nearest" supported architecture seems to be MiniCPMForCausalLM.
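The architecture name in that error comes from the `architectures` field of the model's `config.json`. As a minimal sketch, the check can be done before downloading the weights. The `ASSUMED_SUPPORTED` set below is a hypothetical, deliberately incomplete stand-in for whatever the converter actually registers; it is not taken from llama.cpp.

```python
import json
from pathlib import Path

# Hypothetical stand-in for the converter's registry of supported
# architectures. The real list lives in llama.cpp and is much longer.
ASSUMED_SUPPORTED = {"MiniCPMForCausalLM", "LlamaForCausalLM"}


def read_architectures(model_dir):
    """Return the `architectures` list from <model_dir>/config.json."""
    config_path = Path(model_dir) / "config.json"
    config = json.loads(config_path.read_text(encoding="utf-8"))
    return list(config.get("architectures", []))


def unsupported(architectures, supported=ASSUMED_SUPPORTED):
    """Return the architecture names that are not in the supported set."""
    return [name for name in architectures if name not in supported]


if __name__ == "__main__":
    # "MiniCPMV" is the name reported in the error above.
    print(unsupported(["MiniCPMV"]))
```

With a local checkout, `unsupported(read_architectures("Shi-Ci-Vision"))` would list the names the assumed set does not cover.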

mradermacher changed discussion status to closed

If the non-vision part is fully compatible with MiniCPMForCausalLM, I could try changing the architecture.

Not quite.

ValueError: Can not map tensor 'llm.model.embed_tokens.weight'

Possibly this is a regression in llama.cpp that simply has never been fixed: https://github.com/ggerganov/llama.cpp/issues/5276
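One unconfirmed reading of the `Can not map tensor` error is that the language-model tensors carry an `llm.` prefix that a mapping written for MiniCPMForCausalLM does not expect. The sketch below only illustrates that idea on a list of names: `strip_prefix` is a hypothetical helper, `vision.example.weight` is a made-up name for a non-language tensor, and whether renaming alone would let the conversion succeed is untested.

```python
def strip_prefix(names, prefix="llm."):
    """Split tensor names into (renamed, skipped).

    Names starting with `prefix` are returned without it; all other
    names (for example vision-tower tensors) are returned unchanged
    in the `skipped` list.
    """
    renamed, skipped = [], []
    for name in names:
        if name.startswith(prefix):
            renamed.append(name[len(prefix):])
        else:
            skipped.append(name)
    return renamed, skipped


if __name__ == "__main__":
    names = [
        "llm.model.embed_tokens.weight",  # name from the error message
        "vision.example.weight",          # hypothetical non-LLM tensor
    ]
    renamed, skipped = strip_prefix(names)
    print("renamed:", renamed)
    print("skipped:", skipped)
```

If the suspected regression in llama.cpp is the real cause, a rename like this would not help, so treat it as a diagnostic aid rather than a fix.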

Okay... But openbmb/MiniCPM-Llama3-V-2_5-gguf can run in LM Studio.

Maybe we need to use this fork https://github.com/OpenBMB/llama.cpp/tree/minicpm-v2.5/examples/minicpmv

No, sorry. I just tried the OpenBMB/llama.cpp fork, and converting this model to GGUF fails there too, with an error saying that MiniCPMV is not supported:

python .\convert_hf_to_gguf.py Shi-Ci-Vision
INFO:hf-to-gguf:Loading model: Shi-Ci-Vision
ERROR:hf-to-gguf:Model MiniCPMV is not supported

When I change the architecture to MiniCPMForCausalLM, I also get the following error:

ValueError: Can not map tensor 'llm.model.embed_tokens.weight'

Converting the model to a GGUF file is a prerequisite for quantizing it, and that doesn't seem to be possible with current llama.cpp versions. Luckily, the model is not that big, so you should easily be able to run it unquantized on an RTX 3090/RTX 4090.
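As a rough back-of-the-envelope check of the "run it unquantized" suggestion, the weight footprint is approximately the parameter count times the bytes per parameter. The 8 billion parameter count below is an assumption for illustration, not a figure from this thread, and the estimate ignores activations, the KV cache, and framework overhead.

```python
def weight_gib(n_params, bytes_per_param=2):
    """Approximate size of the weights alone, in GiB.

    bytes_per_param=2 corresponds to fp16/bf16 storage.
    """
    return n_params * bytes_per_param / 1024**3


def fits_in_vram(n_params, vram_gib, bytes_per_param=2):
    """True if the weights alone fit; real usage needs extra headroom."""
    return weight_gib(n_params, bytes_per_param) <= vram_gib


if __name__ == "__main__":
    assumed_params = 8_000_000_000  # assumption, not stated in the thread
    print(f"{weight_gib(assumed_params):.2f} GiB of weights at 16-bit")
    # RTX 3090 and RTX 4090 both ship with 24 GB of VRAM.
    print("fits in 24 GiB:", fits_in_vram(assumed_params, 24))
```

Under that assumption the weights take roughly 15 GiB at 16-bit precision, which leaves some headroom on a 24 GB card.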

As their description says:
'''[2024.08.10] 🚀🚀🚀 MiniCPM-Llama3-V 2.5 is now fully supported by official llama.cpp! GGUF models of various sizes are available here.'''
https://huggingface.co/openbmb/MiniCPM-Llama3-V-2_5
But I can't quantize it right now either. Maybe we should try again in a few days.

Someone just created issue #8977 ("Feature Request: MiniCPM 2.6 model support?") in llama.cpp. I recommend you follow it and let us know whether they decide to add support. Regarding openbmb/MiniCPM, I have no idea why it's currently not supported despite their claim that it is. You could create an issue there and ask them if you want. In any case, mradermacher only does quants based on official llama.cpp releases as far as I'm aware, so the outcome of #8977 is more important.

You got even luckier. There is now PR #8967 ("support MiniCPM-V-2.6"). Once it is merged, we can try to quantize this model.
