VRAM

#53
by DataSoul - opened

I totally agree this is a great model, but I'm wondering why it requires significantly more VRAM at runtime than other models with similar parameter counts. It's to the point where I can't use longer contexts on my setup. (I'm using the Q4 GGUF version.)
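One likely contributor to context-dependent VRAM growth is the KV cache, whose size scales with context length and the model's number of KV heads; a model without grouped-query attention (GQA) can need several times more cache memory than a similarly sized peer that uses it. A minimal sketch of the arithmetic, with hypothetical layer counts and head shapes (not this model's actual config):

```python
def kv_cache_bytes(n_layers, n_kv_heads, head_dim, ctx_len, bytes_per_elem=2):
    # K and V caches, one pair per layer:
    # 2 (K+V) * layers * kv_heads * head_dim * context length * element size
    return 2 * n_layers * n_kv_heads * head_dim * ctx_len * bytes_per_elem

# Hypothetical 7B-class model, full multi-head attention (32 KV heads), fp16 cache
mha = kv_cache_bytes(n_layers=32, n_kv_heads=32, head_dim=128, ctx_len=32768)
# Same shape but grouped-query attention with only 8 KV heads
gqa = kv_cache_bytes(n_layers=32, n_kv_heads=8, head_dim=128, ctx_len=32768)
print(mha / 2**30, gqa / 2**30)  # 16.0 4.0 (GiB)
```

Under these assumed shapes, the non-GQA variant needs 4x the cache memory at the same context length, which would explain running out of VRAM long before a GQA peer does even though the weights themselves (the Q4 GGUF) are similar in size.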
