
System requirements for this model?

#4
by Kelmeilia - opened

I didn't find any information about this model's requirements for computational capacity. Maybe someone with better skills can deduce them from the parameter count or something?

I am wondering if this can be run locally with an RTX 4080 graphics card and 16 GB of VRAM?
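A back-of-the-envelope estimate is possible from the parameter count alone. This sketch only counts the weights themselves (assuming roughly 34B parameters for Poro-34B); real usage adds several GB of overhead for activations and the KV cache:

```python
# Rough VRAM needed for the weights alone, from parameter count
# and bit width. Activations and KV cache add overhead on top.
def weight_memory_gb(n_params: float, bits_per_param: float) -> float:
    return n_params * bits_per_param / 8 / 1e9

n = 34e9  # approximate parameter count of Poro-34B

for label, bits in [("fp16", 16), ("int8", 8), ("4-bit", 4), ("3-bit", 3)]:
    print(f"{label}: ~{weight_memory_gb(n, bits):.0f} GB")
# fp16: ~68 GB, int8: ~34 GB, 4-bit: ~17 GB, 3-bit: ~13 GB
```

So even at 4-bit the weights alone are around 17 GB, which already explains why a 16 GB card is borderline.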

If you want to run it unquantized, then no: it needs a minimum of 48 GB for that. If you use bitsandbytes and load it in 4-bit, it needs about 20 GB, and with GPTQ at 3-bit it needs about 15 GB, but it still ran out of memory on a 16 GB GPU when I tried loading it with TheBloke/Poro-34B-GPTQ:gptq-3bit-128g-actorder_True. So the best bet might be using a GGUF version by TheBloke. If someone makes an EXL2 variant of this model at around 3 bpw or lower, it might actually fit on a 16 GB card no problem.
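For reference, loading in 4-bit with bitsandbytes looks roughly like this (a sketch, not tested here; the repo id `LumiOpen/Poro-34B` is assumed, and this path still needs ~20 GB of VRAM as noted above):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# 4-bit quantization config via bitsandbytes; NF4 with fp16 compute
# is a common choice for inference.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

model_id = "LumiOpen/Poro-34B"  # assumed repo id for this model
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",  # spread layers across available GPUs/CPU
)
```

On a 16 GB card this will likely spill onto CPU via `device_map="auto"` and be slow, which is why a GGUF build running through llama.cpp (with partial GPU offload) is the more practical route.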

jonabur changed discussion status to closed
