VRAM requirements

#2
by practical-dreamer - opened

Pancho, is there any way these 120B models would run on the dual 3090/4090 setups?

Even at 3-bit there's just no way, right?

Would llama.cpp with partial GPU VRAM offload even be an option, or would performance suck too hard for it to be usable?

Just wonderin

-Generic Username

3bpw will need about 48 GB if you run it with the 8-bit cache; that's just enough memory to work. Running it as GGUF is pretty slow, so I wouldn't recommend that.

Hi there! Long time!

As @mpasila said, it will work with 48 GB VRAM. It can run with a full fp16 cache at 3k context, or with the 8-bit cache at maybe 4096 context (I haven't done enough tests on 2x48 GB), but 3bpw was made with 48 GB VRAM in mind.
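As a rough sanity check on the numbers above, the weight footprint of a quantized model is roughly parameter count times bits-per-weight divided by 8; everything else (KV cache, activations, buffers) has to fit in whatever is left over. A minimal sketch, where the 2 GB overhead allowance is my own guessed figure, not something from this thread:

```python
def weight_vram_gb(n_params_b: float, bpw: float, overhead_gb: float = 2.0) -> float:
    """Rough VRAM estimate for quantized weights alone.

    n_params_b: parameter count in billions.
    bpw: bits per weight of the quantization.
    overhead_gb: guessed allowance for buffers/activations (assumption).
    """
    bytes_per_param = bpw / 8  # e.g. 3 bpw -> 0.375 bytes per weight
    return n_params_b * bytes_per_param + overhead_gb

# A 120B model at 3 bpw: 120 * 0.375 = 45 GB of weights, so a 48 GB
# setup leaves only a few GB for the KV cache -- hence the short
# contexts and 8-bit cache mentioned above.
print(round(weight_vram_gb(120, 3.0), 1))
```

This is only a first-order estimate; real loaders add per-layer overhead and the KV cache grows linearly with context length, which is why fp16 cache tops out at a shorter context than the 8-bit cache here.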

Panchovix changed discussion status to closed
