Transformers
GGUF
English
mistral
text-generation-inference

These models do not seem to work with large context

#3
by BloodOfTheRock - opened

See here: https://huggingface.co/LoneStriker/Yarn-Mistral-7b-128k-8.0bpw-h8-exl2/discussions/1

I am using ooba (text-generation-webui) to test these and the exl2 version. They do not seem to function and just produce gibberish when you attempt to use larger context windows. I am not sure whether it is a settings issue (I have looked the settings over and made sure the sequence length etc. are all well above what I am feeding it) or something else. Since the large context window seemed to be the point of these models, I am not sure what their benefit is if larger context does not work.
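One thing worth double-checking is the effective context the model config actually advertises. For YaRN-scaled models, the usable context is the pre-training context multiplied by the RoPE scaling factor, which is easy to miscompute when setting the loader's sequence length. A minimal sketch of that check, using illustrative config values (the field names mirror a typical `config.json` `rope_scaling` block, but the numbers here are assumptions, not taken from this repo):

```python
def effective_context(config: dict) -> int:
    """Return the usable context length implied by a model config dict.

    With YaRN RoPE scaling, the effective context is the original
    pre-training context multiplied by the scaling factor; otherwise
    fall back to max_position_embeddings.
    """
    scaling = config.get("rope_scaling")
    if scaling and scaling.get("type") == "yarn":
        return int(scaling["original_max_position_embeddings"] * scaling["factor"])
    return config["max_position_embeddings"]


# Illustrative values: an 8k base context with a 16x YaRN factor
# should yield a 128k effective context.
example_config = {
    "max_position_embeddings": 32768,
    "rope_scaling": {
        "type": "yarn",
        "factor": 16.0,
        "original_max_position_embeddings": 8192,
    },
}

print(effective_context(example_config))  # 8192 * 16 = 131072
```

If the loader's sequence-length setting is below the prompt length, or the RoPE scaling parameters are not picked up by the backend at all, gibberish output at long contexts is a plausible symptom.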

Same for me.

What do you mean by gibberish? By the way, this is a base model, so it won't perform well on instructions. How are you using it?
