20K context on this AWQ
I haven't tried the original model, but this AWQ fails to follow instructions: when asked to look for "private data" in a ~20K-token input, it just replies with "thanks...". I tried an older Celeste AWQ and it works fine.
I'm not sure whether this is an issue with the quant or the original model. I usually don't worry about what the model does, as long as my inference testing / validation passes.
I was able to get great output using the example in this repo's README.md.
Do you have a method I could use to reproduce the error you discovered?
If it's a problem with the quant, I can improve my validation efforts; otherwise, it's an issue outside of my control.
I tried 3 models: twinllama dpo 3, this model, and Celeste 1.5. Only this model didn't pass the test.
The test I use is something like this:
"Find private information in this block of text that shouldn't be put in the text."
Then I plant things like a name, a birthday, and other details in random places in the text. I use a dummy text generator and paste its output repeatedly until I reach ~20K tokens (~18K words).
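The probe described above can be sketched roughly like this (the filler sentence and planted "private" strings are made up for illustration; the real test uses a dummy text generator):

```python
import random

# Illustrative filler block (~360 words) standing in for generated dummy text
FILLER = "The quick brown fox jumps over the lazy dog. " * 40
# Fake "private" details to bury in the text (assumptions, not the real test data)
PLANTED = ["John Doe", "1990-04-01", "555-867-5309"]

def build_probe(target_words: int = 18_000, seed: int = 0) -> str:
    """Repeat filler until ~target_words, then hide each planted item at a random spot."""
    random.seed(seed)
    blocks, words = [], 0
    while words < target_words:
        blocks.append(FILLER)
        words += len(FILLER.split())
    for item in PLANTED:
        blocks.insert(random.randrange(len(blocks)), item)
    return "\n".join(blocks)

prompt = (
    "Find private information in this block of text that shouldn't be put in the text:\n\n"
    + build_probe()
)
# A passing model should list the planted items instead of replying "thanks...".
```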
I just built AutoAWQ_kernels from source; it took forever.
I will rename this repo, and then reprocess it.
This is the basic idea: unpin / unlock the torch and transformers versions in the AutoAWQ project: https://github.com/SolidRusT/srt-model-quantizing/blob/main/awq/create_virtualenv.sh
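A rough sketch of the unpinning idea, assuming the pins live in a requirements-style file (the file name and versions below are made up; the linked script is the actual approach):

```shell
# Illustrative pinned requirements, standing in for the project's real pins
printf 'torch==2.2.0\ntransformers==4.38.2\n' > requirements-example.txt
# Strip the version constraints so pip resolves against the installed stack
sed -i -E 's/^(torch|transformers)[=<>!~].*$/\1/' requirements-example.txt
cat requirements-example.txt
```

The point is just to let the build use whatever torch/transformers are already in the virtualenv instead of forcing a reinstall.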
All the machines are busy right now, but this will be the next quant in the queue.
Can you quant roleplay Hermes? That model seems interesting.
Absolutely! Now that the pipeline is working and stable, I am LFQ hardcore. I want to hit 500 models today! ^^
I'll queue up anything related to Hermes that I haven't quanted yet.
If you have any specific models, I can prioritize them.
I have 2 machines dedicated to AWQ right now, and each quant takes about 20 minutes or so.
This one is back in the AWQ oven: https://huggingface.co/solidrust/Hermes-3-Llama-3.1-8B-lorablated-AWQ
The author says this merge is better than before, so I guess I'll place it here.
Thanks for your work
kromeurus/L3.1-Siithamo-v0.4-8B
In the second AWQ oven: https://huggingface.co/solidrust/L3.1-Siithamo-v0.4-8B-AWQ
2/32 [01:15<18:54, 37.81s/it]
https://huggingface.co/solidrust/Hermes-3-Llama-3.1-8B-lorablated-AWQ is ready for evaluation.
Closing this thread.