[Model request] Saily 100b, Saily 220b

#5 opened by Perpetuity7

The size of saily-100b-Q2_K.gguf is 49.6 GB. If you quantize this model with IQ2_XXS (2.06 bpw), I expect I could load all layers on a dual RTX 3090/4090 setup or on Google Colab Pro.

Likewise, saily_220b.Q2_K.gguf is 87.80 GB, so an IQ2_XXS (2.06 bpw) quant should fit entirely on a RunPod A100 80GB.
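
As a rough sanity check of those numbers (a minimal sketch; the parameter counts are guesses taken from the model names, and metadata plus the KV cache add a few more GB on top of the weights):

```python
def gguf_size_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate GGUF weight size: parameters * bits per weight, in GB.

    Ignores metadata and the handful of tensors kept at higher precision.
    """
    return n_params * bits_per_weight / 8 / 1e9

# Parameter counts below are rough assumptions based on the model names.
for name, n_params in [("saily-100b", 100e9), ("saily-220b", 220e9)]:
    print(f"{name} @ IQ2_XXS (2.06 bpw): ~{gguf_size_gb(n_params, 2.06):.0f} GB")

# Approximate output:
#   saily-100b @ IQ2_XXS (2.06 bpw): ~26 GB  -> fits in 2x 24 GB (3090/4090)
#   saily-220b @ IQ2_XXS (2.06 bpw): ~57 GB  -> fits on a single 80 GB A100
```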

Besides that, my personal wish is that you would also make IQ2_XXS GGUF quants of mixtral-7b×8-instruct-LimaRP-zloss and Openbuddy-mixtral-7b×8-v16.3-32k. These would be useful for users with 16 GB graphics cards. Thank you.

100B+ models go beyond the computational resources I have available. But I have now contributed everything required to prepare such models to the llama.cpp project, so hopefully someone with access to machines with more RAM/VRAM/compute can do this.
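
For whoever does pick it up, a minimal sketch of the usual llama.cpp IQ2_XXS workflow (binary names and flags reflect llama.cpp at the time of writing and may change; file names are placeholders):

```python
import subprocess

# Placeholder file names; adjust to the actual model being quantized.
MODEL_F16 = "saily-100b-f16.gguf"
IMATRIX = "saily-100b-imatrix.dat"
OUTPUT = "saily-100b-IQ2_XXS.gguf"

# 1. Compute an importance matrix over some calibration text;
#    2-bit quants like IQ2_XXS rely on this to keep quality usable.
subprocess.run(
    ["./imatrix", "-m", MODEL_F16, "-f", "calibration.txt", "-o", IMATRIX],
    check=True,
)

# 2. Quantize to IQ2_XXS using that importance matrix.
subprocess.run(
    ["./quantize", "--imatrix", IMATRIX, MODEL_F16, OUTPUT, "IQ2_XXS"],
    check=True,
)
```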
