Could not load this model

#5
by NetralGD

I tried to load it into text-generation-webui via Transformers, but the model either froze during loading (at 7/12 or even 12/12 shards) or crashed Windows with a MEMORY_MANAGEMENT blue screen.
Maybe my computer just can't handle it.
CONFIG:
CPU: Ryzen 2700
GPU: RTX 4070 (12 GB VRAM)
RAM: 32 GB

Works on Kobold

Hmm, no, it doesn't work.
(screenshot attached: Безымянный.png)

Huh? No.
I tried the KoboldAI client, which runs in the browser.
I also tried koboldcpp, but it only loads GGML and GGUF models; it doesn't work with the original weights, or I haven't found out how to run them.
A GGUF version of the model did work for me in text-generation-webui, but even the Q4 quant was slow, and only with half the context.

Hey there,

The model is about 55GB at full precision, so it will not load via Transformers on your specs; even at 4-bit it will be too large for your GPU.
You'll need to use a GGUF file and split the layers between your GPU and CPU for inference.
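As a rough sketch of what that split looks like (assuming the llama-cpp-python bindings; the file name and layer count are just placeholders, not the exact values for your setup):

```python
from llama_cpp import Llama

# Hypothetical quant file name from the GGUF repo; use whichever file you download.
llm = Llama(
    model_path="Big-Tiger-Gemma-27B-v1-Q4_K_M.gguf",
    n_gpu_layers=20,  # offload as many layers as fit in 12 GB VRAM; the rest run on CPU
    n_ctx=4096,       # a smaller context window also reduces memory use
)

out = llm("Write a short greeting.", max_tokens=64)
print(out["choices"][0]["text"])
```

The key knob is n_gpu_layers: raise it until you run out of VRAM, then back off.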

The "Kobold AI" Main branch is very out of date. We recommend KoboldCPP if you want to run this locally.
Oobabooga Text Gen UI with llamacpp can work too, but make sure it's updated to include Gemma 2 support.
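For reference, a typical KoboldCPP launch on a 12 GB card might look something like this (the file name and layer count are illustrative; tune --gpulayers to whatever fits):

```
python koboldcpp.py --model Big-Tiger-Gemma-27B-v1-Q4_K_M.gguf --usecublas --gpulayers 20 --contextsize 4096
```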

You can get GGUF versions of the model at: https://huggingface.co/TheDrummer/Big-Tiger-Gemma-27B-v1-GGUF (this is also listed in the model card, along with other quants!)
