About llama2-22b-daydreamer-v3

#2
by w-a-cat - opened

I tested your llama2-22b-daydreamer-v2 and llama2-22b-daydreamer-v3.
What can I say? The models surprised me. These are the first models whose creative impulses need to be restrained; other models have to be coaxed into producing text at all.

daydreamer-v2 is a fairly docile model, but it throws so many frills and bows into the text that it ends up looking overloaded.

I will definitely keep daydreamer-v2 for experimentation. Perhaps this creativity can be used to activate other networks' ability to write beautiful text.

daydreamer-v3 is more restrained in its embellishment and writes more beautifully, but it has lost the ability to follow the task: it can easily forget the characters' names and even the plot given in the task. Unfortunately, in terms of writing quality daydreamer-v3 is far from normal models.
I tend to think it is worth polishing daydreamer-v2 while trying to preserve its ability to follow the task.

Primary criteria for me:

  1. How well the model follows the task in each request iteration.
  2. Remembering and using the required character names (when writing literature) or variable names (when programming).
  3. Artistic style. The model should produce text that is easy to read and at the same time beautiful and pleasant to the ear: not too dry, yet not overloaded with frills, bows, and patterns.
  4. The creative part of the model: how originally it can rewrite text using the task as a skeleton. Many models quote the task's text one-to-one, or drop parts of the task into the output without even linking them to the main text.
  5. The residual creativity of the model: what remains after quantization to 4 bits? Really good models can withstand 4-bit quantization without significantly losing their positive qualities.
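Criterion 5 refers to 4-bit quantization. As a toy illustration only (not the actual quantizer used for these releases, which typically use more sophisticated schemes such as GPTQ or GGML k-quants), a naive per-group round-to-nearest 4-bit quantizer and the error it introduces can be sketched like this:

```python
import numpy as np

def quantize_4bit(w, group_size=32):
    """Naive symmetric round-to-nearest 4-bit quantization, per group of weights."""
    groups = w.reshape(-1, group_size)
    # One scale per group, mapping the largest magnitude to +/-7 (int4 range is -8..7)
    scale = np.abs(groups).max(axis=1, keepdims=True) / 7.0
    q = np.clip(np.round(groups / scale), -8, 7)
    return q, scale

def dequantize_4bit(q, scale):
    """Reconstruct approximate float weights from 4-bit codes and per-group scales."""
    return (q * scale).reshape(-1)

# Round-trip a vector of fake "weights" and measure the quantization error
rng = np.random.default_rng(0)
w = rng.normal(size=1024).astype(np.float32)
q, s = quantize_4bit(w)
w_hat = dequantize_4bit(q, s)
err = float(np.abs(w - w_hat).mean())
```

The point of the criterion is exactly this reconstruction error: a model whose behavior is robust to the `w_hat != w` perturbation keeps its qualities at 4 bits.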

Thanks for trying it. The early training runs for this model were flawed, and I've been trying to train over the mistakes instead of starting over. I'm still considering starting over, but I'll give it a few more training runs and see if I can get a significant improvement.

Definitely a fun and interesting model. I'm having some trouble getting it to write dialogue: it switches to explaining the situation instead of continuing with the next turn, pretty much always around 10 turns in. I've tried both chat and chat-instruct mode with the Alpaca prompt template.
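For anyone reproducing this, the standard Alpaca prompt template mentioned above can be built with a small helper (a sketch of the common Alpaca format; the exact wording your UI uses may differ slightly):

```python
def alpaca_prompt(instruction, user_input=None):
    """Build a prompt in the standard Alpaca instruction format."""
    if user_input:
        # Variant with an additional ### Input: section for context
        return (
            "Below is an instruction that describes a task, paired with an input "
            "that provides further context. Write a response that appropriately "
            "completes the request.\n\n"
            f"### Instruction:\n{instruction}\n\n"
            f"### Input:\n{user_input}\n\n"
            "### Response:\n"
        )
    # Instruction-only variant
    return (
        "Below is an instruction that describes a task. "
        "Write a response that appropriately completes the request.\n\n"
        f"### Instruction:\n{instruction}\n\n"
        "### Response:\n"
    )
```

In multi-turn chat use, front ends typically re-wrap each user turn in this template, which may be part of why the model drifts into narration after many turns.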

Looking forward to your future experiments!
