Using Ctransformers for inference gives an error

#3
by ML610 - opened

Using Ctransformers for inference gives the below error:

RuntimeError: Failed to create LLM 'llama' from '/root/.cache/huggingface/hub/models--TheBloke--CausalLM-7B-GGUF/blobs/b4cc1474bca5044e278ca58c443b13881dc5c7f4beef151f02f3afe278510f6d'.

Even if I set model_type='qwen', I get a similar error:

Failed to create LLM 'qwen' from '/root/.cache/huggingface/hub/models--TheBloke--CausalLM-7B-GGUF/blobs/b4cc1474bca5044e278ca58c443b13881dc5c7f4beef151f02f3afe278510f6d'.

The originally uploaded GGUFs had an error. Please try re-downloading; the newly uploaded GGUFs are confirmed to work with latest llama.cpp.

However, I can't guarantee support with ctransformers at this time, as it hasn't been updated in over 6 weeks. But try it and see what happens.
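After re-downloading, one quick sanity check is whether the cached file at least starts with a valid GGUF header. The sketch below reads only the 4-byte magic (`GGUF`) and the little-endian 32-bit version that follows it. The function name `read_gguf_version` is made up for illustration. Passing this check does not mean a given loader supports the model's architecture; it only rules out a truncated or non-GGUF file.

```python
import struct
from typing import Optional

GGUF_MAGIC = b"GGUF"  # first four bytes of every GGUF file


def read_gguf_version(path: str) -> Optional[int]:
    """Return the GGUF version from the file header, or None if the file is not GGUF."""
    with open(path, "rb") as f:
        header = f.read(8)
    # A valid header needs the 4-byte magic plus a 4-byte version field.
    if len(header) < 8 or header[:4] != GGUF_MAGIC:
        return None
    (version,) = struct.unpack("<I", header[4:8])  # little-endian uint32
    return version
```

Calling `read_gguf_version` on the blob path from the error message should return a small integer if the file is intact GGUF, and `None` otherwise.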

Thanks for updating! I was using ctransformers in a Colab notebook to try out these models. But now the updated models are crashing the notebook for some reason. I guess we will have to wait for an update to ctransformers.

Is the issue resolved, brother? Could you please tell me where the newly uploaded models are?

Nope. Loading the model crashes the Colab notebook. Ctransformers hasn't been updated for over 2 months; I think it needs to be updated to run this model.

Could you share any alternative you found for running these quantized models?

Unfortunately ctransformers has not been updated in a while, and doesn't support this model. But llama-cpp-python was updated the day before yesterday, and should support it. Please try that.
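For anyone switching over, a minimal sketch of loading a local GGUF file with llama-cpp-python might look like the following. The helper name `generate`, the context size, and the token limit are illustrative choices, not settings recommended in this thread. The path is checked before the library is imported, so a wrong path produces a clear error rather than a loader failure.

```python
from pathlib import Path


def generate(model_path: str, prompt: str, max_tokens: int = 64) -> str:
    """Load a local GGUF file with llama-cpp-python and return one completion."""
    path = Path(model_path)
    if not path.is_file():
        # Fail early with a clear message instead of an opaque loader error.
        raise FileNotFoundError(f"GGUF file not found: {path}")

    # Imported lazily so the path check works even if the package is missing.
    from llama_cpp import Llama  # pip install llama-cpp-python

    # n_ctx=2048 is an assumed context size; adjust it for your model and RAM.
    llm = Llama(model_path=str(path), n_ctx=2048, verbose=False)
    result = llm(prompt, max_tokens=max_tokens)
    return result["choices"][0]["text"]
```

Usage would be something like `generate("path/to/your-model.gguf", "Hello, my name is")`, where the path is a placeholder for whichever quantized file you downloaded.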

I am facing the same error. I am using the GGUF version of a fine-tuned GEMMA-2B-it model. Model link: https://huggingface.co/Shritama/GEMMA-2b-GGUF/tree/main
During inference it shows this:
[RuntimeError: Failed to create LLM 'gguf' from 'D:\ISnartech Folder\Project_Folder\Streamlit APP\GgufModels\Q4_K_M.gguf'. ]
Please help.

@aryachakraborty as TheBloke said, don't use ctransformers. It is pretty outdated.

Use llama-cpp-python for inference in Python, or just use the llama.cpp CLI.

It's faster, supports many more sampling options, and adds features such as grammars and regex.

Ctransformers uses llama.cpp behind the scenes, but a much older version.
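As an illustration of the grammar support mentioned above, here is a small sketch that constrains output to "yes" or "no" using a GBNF grammar in llama-cpp-python. The helper name `answer_yes_no` and the grammar itself are made up for this example, and it assumes a llama-cpp-python version that exposes `LlamaGrammar`.

```python
# GBNF grammar that only allows the literal strings "yes" or "no".
YES_NO_GBNF = 'root ::= "yes" | "no"'


def answer_yes_no(model_path: str, question: str) -> str:
    """Ask the model a question and restrict its reply to yes/no via a grammar."""
    # Imported lazily so this file can be loaded without the package installed.
    from llama_cpp import Llama, LlamaGrammar  # pip install llama-cpp-python

    grammar = LlamaGrammar.from_string(YES_NO_GBNF)
    llm = Llama(model_path=model_path, verbose=False)
    # The grammar is applied during sampling, so only matching tokens are produced.
    result = llm(question, grammar=grammar, max_tokens=4)
    return result["choices"][0]["text"]
```

The grammar limits which tokens can be sampled; it does not make the answer correct, so this suits cases where the output format matters more than free-form text.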
