Model file 'goliath-120b.Q4_K_M.gguf' not found

Opened by belhal

Code

from ctransformers import AutoModelForCausalLM

# Load the GGUF model, offloading 50 layers to the GPU.
llm = AutoModelForCausalLM.from_pretrained(
    "TheBloke/goliath-120b-GGUF",
    model_file="goliath-120b.Q4_K_M.gguf",
    model_type="llama",
    gpu_layers=50,
)

triggers the following error:

Model file 'goliath-120b.Q4_K_M.gguf' not found in '/root/.cache/huggingface/hub/models--TheBloke--goliath-120b-GGUF/snapshots/48761cf00d6a797942f66bb9c120ed6c18998c86'

The files are too big to upload to Hugging Face as a single file. You'll need to download the separate splits and then join them. Instructions are in the README
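For example, a sketch of the workflow (assuming a recent huggingface-cli from huggingface_hub, and the -split-a/-split-b suffixes used by this repo's split files):

# Download both split parts, then join them as described in the README.
huggingface-cli download TheBloke/goliath-120b-GGUF goliath-120b.Q4_K_M.gguf-split-a goliath-120b.Q4_K_M.gguf-split-b --local-dir .
cat goliath-120b.Q4_K_M.gguf-split-* > goliath-120b.Q4_K_M.gguf
# Optionally remove the parts to reclaim disk space.
rm goliath-120b.Q4_K_M.gguf-split-*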

How were the GGUF files split? I want to split them into even smaller pieces so that I can push them as separate layers into Docker and reassemble them later.

If you can cat the files together, that means there is nothing special about the split:

cat goliath-120b.Q6_K.gguf-split-* > goliath-120b.Q6_K.gguf

So just split them further with the Linux split command, or even a simple C program:

https://superuser.com/questions/160364/what-is-the-fastest-and-most-reliable-way-to-split-a-50gb-binary-file-into-chunk
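A minimal sketch with GNU split (the 2G chunk size and .part- prefix are arbitrary examples):

# Split the joined GGUF into 2 GiB pieces with numeric suffixes.
split -b 2G -d goliath-120b.Q6_K.gguf goliath-120b.Q6_K.gguf.part-
# Reassemble later; the glob expands in sorted order, so the pieces concatenate correctly.
cat goliath-120b.Q6_K.gguf.part-* > goliath-120b.Q6_K.gguf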

ChatGPT's answer:
https://chat.openai.com/share/452b83e6-61d7-4233-90e1-4b05cbef6abf

Yes, that's correct. They are plain byte-wise splits made with the UNIX split command, so you can split them again if you want.

Hey @TheBloke, is there an easier way to use the split command to split the file evenly in half than having to specify the size in bytes?
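For what it's worth, GNU coreutils split can do this directly with -n, which divides a file into a fixed number of equal chunks (a sketch, assuming GNU split):

# -n 2 splits the file into two roughly equal halves without
# computing the byte size by hand.
split -n 2 goliath-120b.Q6_K.gguf goliath-120b.Q6_K.gguf.part-
cat goliath-120b.Q6_K.gguf.part-* > goliath-120b.Q6_K.gguf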
