GGUF version

#2
by johnnnna - opened

Please 🥺

@senseable we need Smaug-72B-v0.1-q2_k_m.gguf (Q2) (and i love you)

@windkkk It's uploading now. I tested it in 2-bit and it's still quite good — the responses are framed the same way but do seem to lack a bit of depth, FYI.

Is it OK to try the SOTA quantization IQ2_XS?

Will Smaug 2 bit fit onto 16 GB GPU?

Thx @senseable <3

Do you guys think the q2 version is that much "dumber" than q4/q5? I might be able to run Smaug 72B q2 on my machine, so I'll compare it to Smaug 34B Q4_M.

share the results when you're done ;)

Will Smaug 2 bit fit onto 16 GB GPU?

No, you'll need something like 40GB to use it, but if you have enough RAM you might get a few tokens per second with partial offloading.
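A rough back-of-envelope check makes the 16GB answer clear: weight size is parameter count times bits per weight. The bits-per-weight figure below is an assumption — Q2_K mixes quant types and in practice lands around 3 effective bits per weight, not 2:

```python
def gguf_size_gb(n_params_billion: float, bits_per_weight: float) -> float:
    """Rough quantized-weight size: params * bits/weight, in GB.
    Ignores KV cache and runtime overhead, which add several more GB."""
    total_bits = n_params_billion * 1e9 * bits_per_weight
    return total_bits / 8 / 1e9

# Assumption: Q2_K averages roughly ~3 effective bits per weight.
print(round(gguf_size_gb(72, 3.0), 1))  # weights alone already exceed 16 GB
```

Even before counting the KV cache and compute buffers, the weights alone are well over 16GB, which is why partial offloading (some layers on GPU, the rest in system RAM) is the only option on that card.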
