Using Ctransformers for inference gives an error

by ML610 - opened Oct 23, 2023

Oct 23, 2023

Using Ctransformers for inference gives the below error:

RuntimeError: Failed to create LLM 'llama' from '/root/.cache/huggingface/hub/models--TheBloke--CausalLM-7B-GGUF/blobs/b4cc1474bca5044e278ca58c443b13881dc5c7f4beef151f02f3afe278510f6d'.

even if I set model_type='qwen', I get a similar error:

Failed to create LLM 'qwen' from '/root/.cache/huggingface/hub/models--TheBloke--CausalLM-7B-GGUF/blobs/b4cc1474bca5044e278ca58c443b13881dc5c7f4beef151f02f3afe278510f6d'.

TheBloke

Owner Oct 23, 2023

The originally uploaded GGUFs had an error. Please try re-downloading; the newly uploaded GGUFs are confirmed to work with latest llama.cpp.

However I can't guarantee support with ctransformers at this time as it's not been updated in over 6 weeks. But try it, see what happens.

ML610

Oct 24, 2023

•

edited Oct 24, 2023

Thanks for updating! I was using ctransformers in a colab notebook to try out these models. But now the updated models are crashing the notebook for some reason. I guess we will have to wait for some update toctransformers.

mirAi05

Nov 7, 2023

Thanks for updating! I was using ctransformers in a colab notebook to try out these models. But now the updated models are crashing the notebook for some reason. I guess we will have to wait for some update toctransformers.

Is the issue resolved brother? Could you please tell where are the newly uploaded models?

ML610

Nov 7, 2023

Is the issue resolved brother? Could you please tell where are the newly uploaded models?

Nope. Loading the model crashes the colab notebook. Ctransformers hasn't been updated for over 2 months, I think it needs to be updated to run this model.

mirAi05

Nov 8, 2023

Is the issue resolved brother? Could you please tell where are the newly uploaded models?

Nope. Loading the model crashes the colab notebook. Ctransformers hasn't been updated for over 2 months, I think it needs to be updated to run this model.

If you can share any alternate you found for running these quantized models?

TheBloke

Owner Nov 8, 2023

Unfortunately ctransformers has not been updated in a while, and doesn't support this model. But llama-cpp-python was updated the day before yesterday, and should support it. Please try that.

aryachakraborty

Mar 11, 2024

I am facing the same error , I am using GGUF version of a "fine tuned GEMMA-2B-it model. " model link--> https://huggingface.co/Shritama/GEMMA-2b-GGUF/tree/main
Now while inferencing it 's showing something like this ~
[RuntimeError: Failed to create LLM 'gguf' from 'D:\ISnartech Folder\Project_Folder\Streamlit APP\GgufModels\Q4_K_M.gguf'. ]
please help.

YaTharThShaRma999

Mar 11, 2024

@aryachakraborty as TheBloke said, dont use ctransformers. It is pretty outdated.

Use llama-cpp-python for inferencing in python or just llama.cpp cli inference.

Its faster, supports much more sampling, and more things like grammar, regex.

Ctransformers uses llama.cpp from behind the scene but a much more outdated version.

bhavyatashah

Dec 2, 2024

This comment has been hidden

Upload images, audio, and videos by dragging in the text input, pasting, or clicking here.

Tap or paste here to upload images

· Sign up or log in to comment