GGUF Version

#1
by avilum - opened

Hey Yam :)

Thanks for the amazing work. I really want to run this model on my MacBook (32GB) using INT4/5/6 quantization and llama.cpp.

I was wondering whether the 11B model can be successfully converted to GGUF format with existing tools (or others?).
Google released GGUF checkpoints of the original model in FP32, followed by many quantized versions.

I think it should be possible to quantize it and run it on commodity hardware such as a MacBook with an M3 and 36GB of RAM (for offline work without GPUs).

Have you tried converting it to a GGUF checkpoint? Anything I should keep in mind before I start?
Or is it better to just use bitsandbytes with your existing model?
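
For context, the bitsandbytes route I mean is roughly the following. This is just a sketch, not something I've run against this repo, and it assumes a CUDA GPU rather than the Mac (bitsandbytes doesn't target Apple Silicon):

```python
# Sketch only: 4-bit load of the existing HF checkpoint via bitsandbytes.
# Assumes a CUDA GPU is available; bitsandbytes does not support Apple Silicon.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "yam-peleg/Hebrew-Gemma-11B-Instruct"

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
)
```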

Thanks,
Avi

@yam-peleg
Update: I successfully converted the models to GGUF format using the main branch of llama.cpp (which has GemmaForCausalLM support).

Do you think we should upload them here or create a new model repo for them?
Let me know what you think. Here are the files (a rough sketch of the conversion flow is below the listing):

-rw-r--r--  1 avi  staff   8.0G Apr  9 15:19 models/yam-peleg--Hebrew-Gemma-11B-Instruct-Q6_0.gguf
-rw-r--r--  1 avi  staff    39G Apr  9 15:10 models/yam-peleg--Hebrew-Gemma-11B-Instruct-f16.gguf
-rw-r--r--  1 avi  staff   8.0G Apr  9 13:47 models/yam-peleg--Hebrew-Gemma-11B-V2-Q6_0.gguf
-rw-r--r--  1 avi  staff    20G Apr  9 13:41 models/yam-peleg--Hebrew-Gemma-11B-V2-f16.gguf
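
For anyone who wants to reproduce this, the flow was essentially the standard llama.cpp conversion. A hedged sketch follows: the paths are illustrative, and I use the standard Q6_K quant name from llama.cpp rather than the filenames above:

```python
# Hedged sketch of the conversion flow (paths are illustrative).
# convert-hf-to-gguf.py and the quantize binary come from the llama.cpp
# main branch (the binary is called llama-quantize in newer versions).
import subprocess

hf_dir = "models/Hebrew-Gemma-11B-Instruct"          # local HF checkpoint dir
f16_gguf = "models/Hebrew-Gemma-11B-Instruct-f16.gguf"
q6_gguf = "models/Hebrew-Gemma-11B-Instruct-Q6.gguf"

# 1) HF checkpoint -> GGUF at f16
subprocess.run(
    ["python", "convert-hf-to-gguf.py", hf_dir,
     "--outfile", f16_gguf, "--outtype", "f16"],
    check=True,
)

# 2) f16 GGUF -> 6-bit K-quant for running on the MacBook
subprocess.run(["./quantize", f16_gguf, q6_gguf, "Q6_K"], check=True)
```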
