Which quant of this model will fit entirely in VRAM on a single 24GB video card (4090)?
#1 by clevnumb - opened
Running OOBE or SillyTavern (with Tabby) on Windows 11, if that helps. Thanks.
I am running it at 5bpw on a single 3090. At 8192 context it uses around 21GB, so there is plenty of room for more context if nothing else is using the VRAM. (How much VRAM Windows itself uses can vary; check Task Manager.)
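For anyone wanting to sanity-check other quants, here is a rough sketch of the arithmetic. It assumes the weights take about `n_params * bpw / 8` bytes and lumps the KV cache, activations, and whatever Windows is using into a single `overhead_gib` guess. The parameter count, the overhead figure, and the function names are all hypothetical placeholders, not values taken from this model or from any loader, so treat the result as a ballpark only.

```python
GIB = 1024 ** 3  # bytes per GiB


def weight_gib(n_params: float, bpw: float) -> float:
    """Approximate size of the quantized weights in GiB.

    n_params: number of model parameters (hypothetical input, look it up
              on the model card).
    bpw:      bits per weight of the quant, e.g. 4.0 or 5.0.
    """
    return n_params * bpw / 8 / GIB


def fits(n_params: float, bpw: float,
         vram_gib: float = 24.0, overhead_gib: float = 3.0) -> bool:
    """Rough check: weights plus an assumed overhead must fit in VRAM.

    overhead_gib is a guess covering context (KV cache), activations and
    other programs using the card; it grows with context length.
    """
    return weight_gib(n_params, bpw) + overhead_gib <= vram_gib


if __name__ == "__main__":
    n = 32e9  # placeholder parameter count, NOT this model's real size
    for bpw in (4.0, 5.0, 6.0):
        print(f"{bpw} bpw: ~{weight_gib(n, bpw):.1f} GiB weights, "
              f"fits in 24 GiB: {fits(n, bpw)}")
```

The real number to trust is still what Task Manager reports after loading, since context length and cache settings change the overhead a lot.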
lucyknada changed discussion status to closed
Thanks! Do you notice 5bpw being notably better than 4bpw?
4bpw should be fine too, though I haven't compared them side by side.