Error loading the model using Colab

#2
by prakash1524 - opened

OSError: TheBloke/Llama-2-7B-GGML does not appear to have a file named pytorch_model.bin, tf_model.h5, model.ckpt or flax_model.msgpack.

I'm getting this error while loading the model with the suggested code:

# Load model directly
from transformers import AutoModel
model = AutoModel.from_pretrained("TheBloke/Llama-2-7B-GGML")
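The cause of the error can be sketched in plain Python (the repo file list below is representative of a GGML repo, not an exact listing): transformers only recognises the four weight filenames quoted in the error, and a GGML repo contains none of them.

```python
# Weight filenames that transformers' from_pretrained looks for (per the error above).
TRANSFORMERS_WEIGHT_FILES = {
    "pytorch_model.bin", "tf_model.h5", "model.ckpt", "flax_model.msgpack",
}

# Representative contents of a GGML repo: quantised .bin files with their own naming.
ggml_repo_files = [
    "llama-2-7b.ggmlv3.q4_0.bin",
    "llama-2-7b.ggmlv3.q5_K_M.bin",
    "config.json",
]

# No overlap, so transformers raises the OSError quoted above.
loadable = TRANSFORMERS_WEIGHT_FILES.intersection(ggml_repo_files)
print(loadable)  # -> set()
```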

This repo is not for use with transformers. The GGML files are for llama.cpp, ctransformers, llama-cpp-python, and text-generation-webui.
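For example, the same repo can be loaded with ctransformers instead (a sketch; the model_file name is an assumption, pick an actual filename from the repo's "Files" tab, and note the call downloads several GB):

```python
def load_ggml_llm(repo_id: str, model_file: str):
    """Load a GGML quantisation via ctransformers (not transformers)."""
    from ctransformers import AutoModelForCausalLM  # pip install ctransformers
    return AutoModelForCausalLM.from_pretrained(
        repo_id,
        model_file=model_file,  # which quantisation file in the repo to use
        model_type="llama",
    )

if __name__ == "__main__":
    # The q4_0 filename here is assumed; check the repo's file list.
    llm = load_ggml_llm("TheBloke/Llama-2-7B-GGML", "llama-2-7b.ggmlv3.q4_0.bin")
    print(llm("AI is going to", max_new_tokens=32))
```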


Can we fine-tune this model (the GGML model), or do we have to fine-tune the original model and then convert it to GGML format, as described in some GitHub repos? Just a beginner doubt.
I fine-tuned a sharded model, but I'm not able to fine-tune this one; I get the error: TheBloke/Llama-2-7B-GGML does not appear to have a file named pytorch_model.bin, tf_model.h5, model.ckpt or flax_model.msgpack.

You can't fine-tune a GGML model using Python/transformers code. There is some training support in llama.cpp that you might be able to use, but I don't have any experience with it.

The general procedure is:

  • Fine tune the original unquantised model. This can be done as:
    • a full training in float16 - very expensive
    • a LoRA adapter training in float16 - less expensive
    • or a LoRA training in 4-bit, known as QLoRA - much cheaper.
  • Whichever is chosen, the result will be a new unquantised float16 model.
  • That can then be quantised to GGML format, and that GGML file used for inference.
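The last step above, using llama.cpp from the GGML era, looked roughly like this (a sketch; the paths and the q4_0 quantisation type are placeholders, and current llama.cpp produces GGUF rather than GGML):

```shell
# Convert the merged float16 HF model to a GGML float16 file
python convert.py /path/to/finetuned-fp16-model --outtype f16

# Quantise the float16 GGML file down to 4-bit
./quantize /path/to/finetuned-fp16-model/ggml-model-f16.bin ggml-model-q4_0.bin q4_0
```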
