
Very sensitive to any repetition penalty!

#52 opened by jukofyork

Just in case anybody tries to use the quantized GGUF files with llama.cpp that had the DBRX PR merged in today:

Definitely make sure you reduce the repetition penalty down from the default!
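For reference, a sketch of how you'd lower it on the llama.cpp command line at the time of this PR (the binary name and model filename here are assumptions; substitute your own quant):

```shell
# Disable the repetition penalty entirely (1.0 = no penalty);
# llama.cpp's default at the time was 1.1.
./main -m dbrx-instruct.Q4_K_M.gguf --repeat-penalty 1.0 -p "Write a merge sort in Python."
```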

Even with 1.05 it does all sorts of strange stuff like stopping mid-sentence, etc., and with a default of 1.1 or 1.2 it's hilariously lazy and bad! (see my post at the bottom of the llama.cpp PR).
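To see why even a small penalty can bite a model like this, here's a minimal sketch of the classic CTRL-style repetition-penalty transform that llama.cpp applies: the logit of every token already in the context is divided by the penalty if positive, or multiplied by it if negative. The token ids and logit values below are purely illustrative, not from DBRX.

```python
def apply_repetition_penalty(logits, context_ids, penalty):
    """Penalize logits of tokens that already appear in the context."""
    out = list(logits)
    for tok in set(context_ids):
        if out[tok] > 0:
            out[tok] /= penalty   # repeated token becomes less likely
        else:
            out[tok] *= penalty   # negative logit is pushed further down
    return out

logits = [2.0, -1.0, 0.5]
penalized = apply_repetition_penalty(logits, context_ids=[0, 1], penalty=1.2)
# token 0: 2.0 / 1.2 ≈ 1.667, token 1: -1.0 * 1.2 = -1.2, token 2 untouched
```

Note the penalty applies uniformly to every context token, including punctuation and structural tokens, which is one plausible reason a model whose tokenizer or chat format repeats such tokens heavily gets degraded so badly by values that are harmless elsewhere.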
