VRAM
#53 opened by DataSoul
I totally agree this is a great model, but I'm wondering why it requires significantly more VRAM at runtime than other models with a similar parameter count. It gets to the point where I can't use longer contexts on my setup. (I am using the Q4 GGUF version.)
Is this because the model doesn't use grouped-query attention (GQA)?
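For context on why GQA matters here: the weights themselves take about the same space at equal parameter count, but the KV cache grows with context length and with the number of K/V heads. A rough back-of-the-envelope sketch (the layer/head numbers below are hypothetical, not this model's actual config):

```python
# Rough KV-cache size estimate. Full multi-head attention (MHA) stores
# K and V for every attention head in every layer; grouped-query
# attention (GQA) shares K/V across groups of heads, shrinking the cache.
# All config numbers below are made up for illustration only.

def kv_cache_bytes(n_layers, n_kv_heads, head_dim, seq_len, bytes_per_elem=2):
    # 2x for keys and values; fp16 -> 2 bytes per element
    return 2 * n_layers * n_kv_heads * head_dim * seq_len * bytes_per_elem

seq_len = 8192
# hypothetical 30B-class config: 48 layers, 64 heads, head_dim 128
mha = kv_cache_bytes(48, 64, 128, seq_len)  # no GQA: K/V for all 64 heads
gqa = kv_cache_bytes(48, 8, 128, seq_len)   # GQA with 8 K/V head groups

print(f"MHA KV cache: {mha / 2**30:.1f} GiB")  # 12.0 GiB
print(f"GQA KV cache: {gqa / 2**30:.1f} GiB")  # 1.5 GiB
```

So at long contexts, a non-GQA model can need several times more VRAM for its KV cache than a GQA model of the same size, which would match what you're seeing even with Q4 weights.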