Quantization question

#2 opened by jokemodel1710

I'm a beginner at exllamav2 quantization. I've successfully made EXL2 quants of anything up to 70B and 72B, but 123B gives me trouble. I have 48GB of VRAM and 64GB of RAM, and it always fails when it's writing the shards. Do you think I need more RAM? Maybe 128GB? I like this model and would like to make my own quantizations.


Hey @jokemodel1710!
I'm not sure what's causing your issues. I created these quants with 48GB of VRAM and 50GB of RAM, so it's odd that it dies at the very end when it writes the shards. No disk space issues? Have you tried smaller shards? (There's a sketch of what I mean below.)

Are you starting with the original unquantized files as your source?
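For what it's worth, here's a rough sketch of a convert.py run with a smaller shard size. The paths and the 4.0 bpw target are just placeholders, and the flag names are from my copy of exllamav2 (I believe the default shard size is 8192 MB), so double-check against python convert.py -h:

# -i: original unquantized model, -o: scratch/working dir, -cf: compiled output dir,
# -b: target bits per weight, -ss: output shard size in MB
python convert.py -i /models/your-123b-model -o /scratch/exl2-work -cf /models/your-123b-model-4.0bpw-exl2 -b 4.0 -ss 2048

Dropping -ss to 2048 or so keeps every output file small, which also sidesteps any per-file size limits on the destination filesystem.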

+1. I've managed to quant 123B models on my second rig: 64GB system RAM, 24GB VRAM (a single RTX 3090).
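If you want to confirm whether it's actually RAM running out (i.e. the OOM killer ending the process) rather than something else, the standard Linux checks are:

# watch memory usage while the conversion runs
watch -n 2 free -h

# after a failed run, look for OOM-killer messages (may need sudo)
dmesg -T | grep -i -E "out of memory|killed process"

If nothing shows up there, it's probably not a RAM problem.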

Check disk space with

df -h

Verify you can write to the destination as well:
touch /path/to/destination/test
(then delete the test file: rm -f /path/to/destination/test)

One more thing that comes to mind: you're not using FAT32, are you (with its 4GB file-size limit)?
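If you're not sure, GNU df can print the filesystem type of the destination (the flag differs on macOS/BSD):

df -T /path/to/destination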

That being said, click on BigHuggyD's profile; chances are he's already made a quant close enough to the size you're after.
