Word of thanks and suggestions for more 120B models

#6
by Xonaz81 - opened

Hello there!

I recently found your collection and I love it! Thanks for all your great work.

Might I ask you to consider adding the new IQ2_XS (2-bit) quants for Goliath-120B and Tess-XL-v1.0-120B? These models seem to be even better, according to both benchmarks and anecdotal evidence posted by others, and 120B models are also more resistant to the quality loss from 2-bit quantization. Thanks in advance and keep up the good work :)

I wasn't planning to make many more models. I'm currently trying to upload a miqu 120B model, and I also want to do BigWeave, but that's pretty much it. There are now also others doing this kind of quantization :)

KnutJaegersberg changed discussion status to closed

Okay, that's fine. Do you create these quants locally or in the cloud? And can you still tell me how (on what) you generated the imatrix files? :)

On a local workstation, CPU only: an i9, but already 3 years old. I don't think the RAM requirements are high, though. With 100 chunks, generating an imatrix on the 20k random words dataset takes 10 hours up to a day, depending on model size. Quantization takes 1-2 hours.

I simply took the file 20k_random_data.txt from this discussion:

https://github.com/ggerganov/llama.cpp/discussions/5006
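The workflow described above can be sketched roughly as follows, assuming a local llama.cpp build. Binary names and flags have varied between llama.cpp versions, and the model filenames here are placeholders, not the actual files used.

```shell
# Sketch only: assumes llama.cpp is built locally and a full-precision
# GGUF of the model is on disk. Filenames are illustrative placeholders.

# 1. Generate the importance matrix on CPU from the 20k random-words
#    dataset, using 100 chunks as mentioned above (the slow step:
#    ~10 hours up to a day, depending on model size).
./imatrix -m model-f16.gguf -f 20k_random_data.txt --chunks 100 -o imatrix.dat

# 2. Quantize to IQ2_XS using the importance matrix (takes 1-2 hours).
./quantize --imatrix imatrix.dat model-f16.gguf model-IQ2_XS.gguf IQ2_XS
```

The imatrix step is what lets the 2-bit quants keep acceptable quality: it records which weights matter most on the calibration text so the quantizer can preserve them.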

appreciated! thanks
