This model is not working

#1 opened by rombodawg

In text-generation-webui, with "Truncate the prompt up to this length" set above 2k, it only outputs 1 token, but at 2k it repeats tokens over and over. I don't know how to fix it.

Did you apply the monkey patch? Refer to https://huggingface.co/kaiokendev/superhot-30b-8k-no-rlhf-test/tree/main for the .py file needed to use it with transformers.
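If you want to apply the patch manually in a script, a minimal sketch looks like this. It assumes the patch file from that repo is saved locally as llama_rope_scaled_monkey_patch.py and exposes a replace_llama_rope_with_scaled_rope() function; double-check the downloaded file for the exact names, and the model path below is just a placeholder.

# Minimal sketch: patch LLaMA's rotary embeddings before loading the model.
# Assumed names: llama_rope_scaled_monkey_patch.py and
# replace_llama_rope_with_scaled_rope() -- verify against the downloaded file.
from llama_rope_scaled_monkey_patch import replace_llama_rope_with_scaled_rope
from transformers import AutoModelForCausalLM, AutoTokenizer

replace_llama_rope_with_scaled_rope()  # must run before from_pretrained

tokenizer = AutoTokenizer.from_pretrained("path/to/superhot-model")  # placeholder path
model = AutoModelForCausalLM.from_pretrained("path/to/superhot-model")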

Yup, I just saw this. I asked TheBloke and he told me the following. I'll leave it here for anyone else that needs help:

Grab the two .py files from any of (TheBloke's) fp16 SuperHOT models and save them with the rest of the model files
Edit config.json for Panchovix's fp16, and add this:
"auto_map": {
"AutoModel": "modelling_llama.LlamaModel",
"AutoModelForCausalLM": "modelling_llama.LlamaForCausalLM",
"AutoModelForSequenceClassification": "modelling_llama.LlamaForSequenceClassification"
},
Now load the model with trust_remote_code=True
Test again with that; it should work.
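For reference, loading it from Python then looks roughly like this (a sketch; the model path is a placeholder, not from the thread):

from transformers import AutoModelForCausalLM, AutoTokenizer

# trust_remote_code=True tells transformers to load the custom classes from
# modelling_llama.py via the "auto_map" entry added to config.json above.
model_path = "path/to/panchovix-fp16-model"  # placeholder local path
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(model_path, trust_remote_code=True)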

It did fix the issue by the way

Nice! Gonna keep the discussion open in case someone else sees it.

Closing as fixed!

Panchovix changed discussion status to closed
