Odd behaviour with '.

#1
by FiditeNemini - opened

Hi, not a major issue, but I've found that the model seems to consistently use '###s instead of just 's when generating text for possessives. Perhaps the training data was contaminated by a script meant to cleanse other data? For example, instead of generating "John's motorbike", it will generate "John'###s motorbike". Other than that, it's a really decent model, very nicely steerable!

DreamGen org

Hi there!

I have not yet had the chance to properly evaluate v0.5, so it's possible that I have introduced a bug.
Did you have a chance to try v0 (https://huggingface.co/dreamgen/opus-v0-70b), and if so, do you see the same problem there as well?

And would you mind sharing more information on how you are running the model: the raw input (prompt), the sampling params, and the software you use to run it? (You can DM me on Discord if you prefer.)

Best,
DreamGen.

G'day. I haven't tried it on v0 yet, but I'll try it as soon as I can. I'm running it through LMStudio for testing, but have also tried through the CLI using llama.cpp with temp 0.5-0.8, n_predict -1, top_p 90, top_k 40, repeat_penalty 1.125, and the default system prompt format. I added "Use ' not '###s to denote possessives." to the setting part of the prompt to fix the generated text. Works very well now. P.S. I'm also using the q5_k_m GGUF quantised model, if that helps.
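
If the prompt instruction ever fails to stick, the same cleanup could be done on the generated text afterwards. This is only a minimal sketch: the function name `fix_possessives` is hypothetical, and it assumes the stray marker is always exactly `###` directly after an apostrophe, as in the example above.

```python
import re

# Assumption: the artifact is always "###" immediately after an apostrophe
# (straight or curly), as in "John'###s motorbike".
_STRAY_MARKER = re.compile(r"(['’])###")


def fix_possessives(text: str) -> str:
    """Drop the stray '###' that follows an apostrophe, keeping the apostrophe."""
    return _STRAY_MARKER.sub(r"\1", text)


print(fix_possessives("John'###s motorbike."))  # prints: John's motorbike.
```

A "###" that does not follow an apostrophe is left alone, so markdown-style headings in the output would not be affected.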
