How to run these quantised models?

#17
by Tarun1986 - opened

I tried running the model using a release build of llama.cpp from git. I get the following error:

main.exe -m ggml-vic13b-uncensored-q4_3.bin -n -1 --color -r "User:" --in-prefix " " -e --prompt "User: Hi\nAI: Hello. I am an AI chatbot. Would you like to talk?\nUser: Sure!\nAI: What would you like to talk about?\nUser:"

main: build = 518 (1f48b0a)
main: seed = 1683545508
llama.cpp: loading model from ggml-vic13b-uncensored-q4_3.bin
error loading model: unrecognized tensor type 5

llama_init_from_file: failed to load model
llama_init_from_gpt_params: error: failed to load m

q4_3 has been removed - use q5_1

Hey eachadea,

I face the same issue even after changing the model to q5_1. Any idea? Before hitting this, I was able to run the ggml-vic13b-q4_1.bin model.

main: seed = 1683693461
llama.cpp: loading model from ./models/ggml-vic13b-q5_1.bin
error loading model: unrecognized tensor type 7

llama_init_from_file: failed to load model
main: error: failed to load model './models/ggml-vic13b-q5_1.bin'

That means you're running an old version of llama.cpp. Do a git pull, then run make.
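The "unrecognized tensor type" message comes from the file loader, so before rebuilding you can confirm the mismatch by inspecting the model's header. A minimal sketch, assuming the GGML container layout llama.cpp used around mid-2023 (a 4-byte magic, followed by a version number for "ggjt" files); the magic constants and layout here are assumptions from that era, not guaranteed for other versions:

```python
import struct

# Magic values from llama.cpp's GGML loader circa mid-2023 (assumed):
#   'ggjt' (0x67676a74) - versioned format; a uint32 version follows the magic
#   'ggml' (0x67676d6c) - original unversioned format
GGJT_MAGIC = 0x67676A74
GGML_MAGIC = 0x67676D6C

def read_ggml_header(path):
    """Return (magic_name, version) for a GGML model file.

    Old llama.cpp builds reject files written in a newer format with
    errors like "unrecognized tensor type N"; checking the header
    version against what your build supports is a quick sanity check.
    """
    with open(path, "rb") as f:
        (magic,) = struct.unpack("<I", f.read(4))
        if magic == GGJT_MAGIC:
            (version,) = struct.unpack("<I", f.read(4))
            return "ggjt", version
        if magic == GGML_MAGIC:
            return "ggml", None  # pre-versioning format, no version field
        return "unknown", None
```

If this reports a format or version newer than your binary was built against, updating and rebuilding llama.cpp (git pull, then make) is the usual fix, as suggested above.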

I am trying to run it in text-generation-webui, but it fails with this error. Any idea?

$ python server.py --auto-devices --chat --model eachadea_ggml-vicuna-13b-4bit
INFO:generated new fontManager
INFO:Gradio HTTP request redirected to localhost :)
bin /data/miniconda3/envs/textgen/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_cpu.so
/data/miniconda3/envs/textgen/lib/python3.10/site-packages/bitsandbytes/cextension.py:33: UserWarning: The installed version of bitsandbytes was compiled without GPU support. 8-bit optimizers, 8-bit multiplication, and GPU quantization are unavailable.
warn("The installed version of bitsandbytes was compiled without GPU support. "
INFO:Loading eachadea_ggml-vicuna-13b-4bit...
Traceback (most recent call last):
File "/data/miniconda3/envs/textgen/lib/python3.10/site-packages/huggingface_hub/utils/_errors.py", line 259, in hf_raise_for_status
response.raise_for_status()
File "/data/miniconda3/envs/textgen/lib/python3.10/site-packages/requests/models.py", line 1021, in raise_for_status
raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 401 Client Error: Unauthorized for url: https://huggingface.co/models/eachadea_ggml-vicuna-13b-4bit/resolve/main/config.json

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
File "/data//envs/textgen/lib/python3.10/site-packages/transformers/utils/hub.py", line 417, in cached_file
resolved_file = hf_hub_download(
File "/data/miniconda3/envs/textgen/lib/python3.10/site-packages/huggingface_hub/utils/_validators.py", line 120, in _inner_fn
return fn(*args, **kwargs)
File "/data/miniconda3/envs/textgen/lib/python3.10/site-packages/huggingface_hub/file_download.py", line 1195, in hf_hub_download
metadata = get_hf_file_metadata(
File "/data/miniconda3/envs/textgen/lib/python3.10/site-packages/huggingface_hub/utils/_validators.py", line 120, in _inner_fn
return fn(*args, **kwargs)
File "/data/miniconda3/envs/textgen/lib/python3.10/site-packages/huggingface_hub/file_download.py", line 1541, in get_hf_file_metadata
hf_raise_for_status(r)
File "/data/miniconda3/envs/textgen/lib/python3.10/site-packages/huggingface_hub/utils/_errors.py", line 291, in hf_raise_for_status
raise RepositoryNotFoundError(message, response) from e
huggingface_hub.utils._errors.RepositoryNotFoundError: 401 Client Error. (Request ID: Root=1-645bdf2a-21dd20f9390cd6aa10267b52)

Repository Not Found for url: https://huggingface.co/models/eachadea_ggml-vicuna-13b-4bit/resolve/main/config.json.
Please make sure you specified the correct repo_id and repo_type.
If you are trying to access a private or gated repo, make sure you are authenticated.
Invalid username or password.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/data//text-generation-webui/server.py", line 919, in
shared.model, shared.tokenizer = load_model(shared.model_name)
File "/data/text-generation-webui/modules/models.py", line 74, in load_model
shared.model_type = find_model_type(model_name)
File "/data//text-generation-webui/modules/models.py", line 62, in find_model_type
config = AutoConfig.from_pretrained(Path(f'{shared.args.model_dir}/{model_name}'), trust_remote_code=shared.args.trust_remote_code)
File "/data//miniconda3/envs/textgen/lib/python3.10/site-packages/transformers/models/auto/configuration_auto.py", line 928, in from_pretrained
config_dict, unused_kwargs = PretrainedConfig.get_config_dict(pretrained_model_name_or_path, **kwargs)
File "/data//miniconda3/envs/textgen/lib/python3.10/site-packages/transformers/configuration_utils.py", line 574, in get_config_dict
config_dict, kwargs = cls._get_config_dict(pretrained_model_name_or_path, **kwargs)
File "/data//miniconda3/envs/textgen/lib/python3.10/site-packages/transformers/configuration_utils.py", line 629, in _get_config_dict
resolved_config_file = cached_file(
File "/data//miniconda3/envs/textgen/lib/python3.10/site-packages/transformers/utils/hub.py", line 433, in cached_file
raise EnvironmentError(
OSError: models/eachadea_ggml-vicuna-13b-4bit is not a local folder and is not a valid model identifier listed on 'https://huggingface.co/models'
If this is a private repository, make sure to pass a token having permission to this repo with use_auth_token or log in with huggingface-cli login and pass use_auth_token=True.
