Trying to use llama-2-7b-chat.Q4_K_M.gguf with/without tensorflow weights

by cgthayer - opened

n00bie question:
The libs think this has tensorflow weights, but "from_tf=True" doesn't resolve.
What am I doing wrong here?

from transformers import AutoModelForCausalLM
model_file = "llama-2-7b-chat.Q4_K_M.gguf"
model = AutoModelForCausalLM.from_pretrained(
    "TheBloke/Llama-2-7b-Chat-GGUF", model_file=model_file, model_type="llama", gpu_layers=50, from_tf=True)

Gives me (on google colab):

OSError Traceback (most recent call last)
in <cell line: 3>()
1 from transformers import LlamaForCausalLM, LlamaTokenizer, AutoModelForCausalLM
2 model_file = "llama-2-7b-chat.Q4_K_M.gguf"
----> 3 model = AutoModelForCausalLM.from_pretrained(
4 "TheBloke/Llama-2-7b-Chat-GGUF", model_file=model_file, model_type="llama", gpu_layers=50, from_tf=True)

1 frames
/usr/local/lib/python3.10/dist-packages/transformers/ in from_pretrained(cls, pretrained_model_name_or_path, config, cache_dir, ignore_mismatched_sizes, force_download, local_files_only, token, revision, use_safetensors, *model_args, **kwargs)
3384 }
3385 if has_file(pretrained_model_name_or_path, TF2_WEIGHTS_NAME, **has_file_kwargs):
-> 3386 raise EnvironmentError(
3387 f"{pretrained_model_name_or_path} does not appear to have a file named"
3388 f" {_add_variant(WEIGHTS_NAME, variant)} but there is a file for TensorFlow weights."

OSError: TheBloke/Llama-2-7b-Chat-GGUF does not appear to have a file named pytorch_model.bin but there is a file for TensorFlow weights. Use from_tf=True to load this model from those weights.

I get this error with or without "from_tf=True". Did this parameter name change without an update to the EnvironmentError?

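@cgthayer yeah, the problem is that the transformers library's from_pretrained does not load GGUF files this way (the model_file, model_type and gpu_layers arguments come from ctransformers, not transformers). Also, I would not recommend using Llama 2 7B since a MUCH better Llama 3 8B came out. It's at least 2-3x better and not as censored. For GGUF files, just search "llama 3 8b gguf" on Hugging Face.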

To use GGUF models, you can use llama.cpp or anything built on it (text-generation-webui, llama-cpp-python, LM Studio, and many more).
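
Here is a minimal sketch of the llama-cpp-python route. It assumes the package is installed (pip install llama-cpp-python) and the .gguf file has already been downloaded to local disk. The helper names read_gguf_version and load_gguf are made up for illustration and are not part of any library. The header check relies on GGUF files starting with the four bytes "GGUF" followed by a uint32 version, and assumes a little-endian file.

```python
import struct

GGUF_MAGIC = b"GGUF"  # first four bytes of a GGUF file


def read_gguf_version(path):
    """Return the GGUF format version if `path` looks like a GGUF file, else None."""
    with open(path, "rb") as f:
        header = f.read(8)
    if len(header) < 8 or header[:4] != GGUF_MAGIC:
        return None
    # Assumption: little-endian file; the version is a uint32 right after the magic.
    return struct.unpack("<I", header[4:8])[0]


def load_gguf(path, n_gpu_layers=0):
    """Load a local GGUF file with llama-cpp-python (hypothetical helper)."""
    if read_gguf_version(path) is None:
        raise ValueError(f"{path} does not look like a GGUF file")
    # Imported lazily so the header check above works without the package installed.
    from llama_cpp import Llama
    return Llama(model_path=str(path), n_gpu_layers=n_gpu_layers)
```

With the file from the question in the working directory, something like llm = load_gguf("llama-2-7b-chat.Q4_K_M.gguf", n_gpu_layers=50) followed by llm("Q: What is GGUF? A:", max_tokens=32)["choices"][0]["text"] should produce a completion. The n_gpu_layers value only has an effect if llama-cpp-python was built with GPU support.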
