how to control spew?

#9
by tkramer - opened

At a certain point, about 500 tokens or so, the model just starts spewing random words. I've tried very small and very large temperature values. Does anyone have any suggestions about the correct parameters for:

self.generate_kwargs = {
    "temperature": 0.1,
    "top_p": 0.92,
    "top_k": 0,
    "max_new_tokens": 1024,
    "use_cache": True,
    "do_sample": True,
    "eos_token_id": self.tokenizer.eos_token_id,
    "pad_token_id": self.tokenizer.pad_token_id,
    "repetition_penalty": 1.1,  # 1.0 means no penalty, > 1.0 means penalty, 1.2 from CTRL paper
}

Mosaic ML, Inc. org

I have noticed a tendency for the model to repeat itself as it continues to generate. Here are some parameters I used which tended to have reasonable results, but it's very possible that there's room for improvement. I'll be curious to know how they work for you.

self.generate_kwargs = {
    "temperature": 0.8,  # increased substantially (was 0.1)
    "top_p": 0.95,  # small increase (was 0.92)
    "top_k": 50,  # now enabled (was 0)
    "max_new_tokens": 1024,
    "use_cache": True,
    "do_sample": True,
    "eos_token_id": self.tokenizer.eos_token_id,
    "pad_token_id": self.tokenizer.pad_token_id,
    "repetition_penalty": 1.02,  # decreased (was 1.1)
}

For whatever reason, low temperatures seem to produce more repetition (which I think tends to be true with LLMs in general, but maybe to a lesser extent than here). My recommendations also are slightly different wrt top_p, top_k, and repetition_penalty.
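To make the effect of temperature, top_k, and top_p concrete, here is a toy sketch in plain Python. It is not the library's implementation, only an illustration under simplifying assumptions: the function name `sampling_distribution` and the three-token logits are made up, and the filters are applied in the order temperature, top_k, top_p. It does not explain why this particular model repeats itself; it only shows that at a temperature of 0.1 the sampler can become nearly deterministic.

```python
import math


def sampling_distribution(logits, temperature=1.0, top_k=0, top_p=1.0):
    """Return {token_id: probability} for the tokens a sampler could still draw (toy sketch)."""
    # Temperature: divide logits before softmax; values below 1.0 sharpen the distribution.
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Rank token ids from most to least likely.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)

    # top_k: keep only the k most likely tokens (0 means the filter is disabled).
    if top_k > 0:
        order = order[:top_k]
    mass_after_top_k = sum(probs[i] for i in order)

    # top_p: keep the smallest prefix whose renormalized cumulative probability reaches top_p.
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i] / mass_after_top_k
        if cumulative >= top_p:
            break

    # Renormalize over the surviving tokens.
    mass = sum(probs[i] for i in kept)
    return {i: probs[i] / mass for i in kept}


toy_logits = [2.0, 1.0, 0.0]  # made-up logits for a three-token vocabulary
low = sampling_distribution(toy_logits, temperature=0.1, top_k=0, top_p=0.92)
high = sampling_distribution(toy_logits, temperature=0.8, top_k=50, top_p=0.95)
print(low)   # only the top token survives
print(high)  # all three tokens remain candidates
```

With these toy logits, the original settings (temperature 0.1, top_p 0.92) leave a single candidate token, so `do_sample=True` behaves much like greedy decoding. The suggested settings keep all three tokens in play.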

You might also try using no_repeat_ngram_size. It's pretty heavy-handed, though. So it might be hard to get right.
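For a sense of why no_repeat_ngram_size is heavy-handed, here is a small sketch of the idea behind it: any token that would complete an n-gram already present in the generated text is banned outright. The helper name `banned_next_tokens` and the word-level "tokens" are assumptions for illustration; a real tokenizer works on sub-word ids, and the library's own implementation may differ in detail.

```python
def banned_next_tokens(generated, ngram_size):
    """Return the set of tokens that would repeat an n-gram already in `generated`."""
    if ngram_size <= 0 or len(generated) < ngram_size:
        return set()
    # The last (n - 1) tokens are the prefix of the n-gram the next token would complete.
    prefix = tuple(generated[len(generated) - (ngram_size - 1):])
    banned = set()
    for start in range(len(generated) - ngram_size + 1):
        ngram = tuple(generated[start:start + ngram_size])
        if ngram[:-1] == prefix:
            banned.add(ngram[-1])
    return banned


# Toy word-level "tokens"; a real model would use token ids.
story = ["the", "cat", "sat", "on", "the", "cat"]
print(banned_next_tokens(story, 3))  # {'sat'}
print(banned_next_tokens(story, 2))  # {'sat'}
```

With a size of 2, "cat" can never be followed by "sat" again anywhere in the output, however natural that would be. Over 1024 new tokens, a small n rules out a lot of ordinary phrasing, which is why the value is hard to get right.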

Good luck! Please let us know if you have better luck with those.

Thanks for your suggestions. I’m afraid I wasn’t getting any better results. In case it helps, my work is here:

https://colab.research.google.com/drive/1MqNxp36gFzuHrQP1lbkSYRjsIJPEqVkS?usp=sharing

Mosaic ML, Inc. org

Oof, yeah, that doesn't look too great, does it?!

I did notice that you have all the parameters set to those I recommended except repetition_penalty is set to 1.2 instead of 1.02. Maybe that was a typo?
I'm optimistic that the 1.2 value is to blame. In my (limited) search over the generation parameters, I quickly found that even 1.1 was producing bad results.
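To illustrate why the difference between 1.2 and 1.02 can matter, here is a sketch of the CTRL-style penalty rule as I understand it: for every token that has already appeared, a positive logit is divided by the penalty and a negative logit is multiplied by it. The function name `penalize` and the logits are made up for the example, so treat it as an illustration rather than the exact library code.

```python
def penalize(logits, seen_token_ids, penalty):
    """Apply a CTRL-style repetition penalty to tokens that were already generated."""
    out = list(logits)
    for token_id in set(seen_token_ids):
        if out[token_id] > 0:
            out[token_id] = out[token_id] / penalty  # shrink positive scores
        else:
            out[token_id] = out[token_id] * penalty  # push negative scores further down
    return out


# Made-up logits: token 0 is the natural continuation but has already appeared.
toy_logits = [4.0, 3.5, -1.0]
seen = [0, 2]

strong = penalize(toy_logits, seen, 1.2)
mild = penalize(toy_logits, seen, 1.02)
print(strong.index(max(strong)))  # 1: the penalty flips the top choice
print(mild.index(max(mild)))      # 0: the natural continuation still wins
```

In a long generation most common words and punctuation have already appeared, so a penalty of 1.2 pushes down many of the tokens the model would otherwise need, while 1.02 only nudges them.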

I'm tinkering with a demo, which I hope to make public soon. In the meantime, I fed the prompt from your Colab to the model and observed results like this:

Once upon a time there was a poor blacksmith. He discovered a novel type of metal from a miner in the Ur mountains. Upon forging a blade from it he found that it could withstand any edge more finely honed than any man had ever made before, or would ever make again. This was a superior sword to anything made even of the purest steel." The old man paused to wipe his nose and look down at his belly. "No weapon forged from this metal could ever be broken. It could only be destroyed in the process of cutting through it."

Everyone stared. Even the children were wearying of the tale. "Go on," said Stallion. "I'm intrigued now."

"The blacksmith kept this to himself, feeling he could never duplicate it. He tried numerous times with various materials and methods of smelting, but he was forced to admit that none of them could achieve this lofty goal. In time the blacksmith died, and his wife took her son into her own house. Her son, whom she named Feather, grew up to become a weapon smith as well. He worked with the same spirit and love for excellence as his father, yet his mind and body were not strong enough to wrest mastery from this special alloy. His blades were fine, but he could not keep up a continuous flow of them. Hence, he became a baker of bread and the maker of glass. He tried many things and failed in most of them, but one day Feather found a spell that allowed him to melt his glass. It required a difficult incantation, with each successive gasp of breath being an essential part of the words. As Feather made his way through the chant he held a piece of the molten glass in front of his face, and one thing led to another, until he started making different types of bubbles in the glass. These bubbles were unusual; they were multicolored and had no apparent shape. One day, while playing with these bubbles, Feather found that if he pressed one bubble with his thumb, then pulled away quickly, it would shatter.

"This occurred to Feather again and again until he had a sheet of glass with a single bubble in the center. When Feather finally had nothing else to do, he made the glass, then cut out an arcane symbol. It was a simple design, like a letter Z, but Feather had never seen this sort of letter before. The next morning he went to work, where he created the most beautiful vases and bowls ever crafted by mortal hands."

Stallion looked up. "A magical symbol? What purpose did it serve?"

The old man smiled. "Feather never cared to know. He knew only one thing: that the bubble in the middle of the glass he had been trying to form for so long had finally appeared on the sheet of glass. He applied the symbol and an hour later he had two vases. When he turned them over they seemed to be seamless, with no hint of how they had been formed. He called the new vases Z's Vases, and they became famous throughout the land."

Tarry sighed and scratched his head. "But we are much like Feather, in that we strive to create something undeniably fine that will be enjoyed by all. We do not know why the magic works, or what the power is. But if you wish to make sword blades or vases, I can give you the ingredients and the lessons to make them work. But what about bats?"

"Ah yes, my friend. You are right to ask this question! All through my life when people asked me how I came to have such good luck, I answered that I believed in karma, which is a universal law of cause and effect. I think that all men and women, all animals and plants, all things were connected together by invisible cords. And when we are lucky, our connection with these threads is strengthened by positive deeds." The old man stood up and put his arm around Tarry's shoulder. "Just remember this: When there is bad luck, it is due to some external action. Something has happened to ruin the karmic balance. Recently, I had a terrible dream. I dreamed that I was on a dark road, and I was being followed by a great shadow. Every time I tried to run, someone would grab me and force me back. So I strode forward boldly, and the shadow followed. Finally I realized that I was walking into the jaws of the great shadow. In fact, it caught up to me and swallowed me whole. I was trapped, and there was no way for me to escape, unless I accepted the shadow as my destiny. I woke up, frightened and confused. I remembered that night plainly, but it has taken me weeks to seek answers. What I have done since I awakened is an example of karma at work. I threw my life into service to the community, and I prayed for the chance to help you."

Hourra nodded. "You're our golden dragon!"

Stallion thought of the dreams he'd had, the spirits who had come to him, the visions he had seen. He smiled down at Hourra and shook his hand. "We appreciate your kindness."

Abe leaped to his feet. "Wait just a moment! I have an idea. When the bats go into Zone One, they should find the evil masterminds of the conspiracy. Once they have gathered at the designated location, they must leave and return to their home. The evil ones will soon realize that they cannot get to the bats without breaking their own secret oaths. The truth will be known and trouble will come! We will see the conspirators executed and justice will be found!"

On hearing the plan Abe smiled widely. "Perfect! That is what I've been dreaming about!"

Stallison glanced at Hourra. "You agree, don't you?"

"Of course!"

"Do you trust the wisdom of my cousin?"

Hourra shrugged. "I don't know. But we need a plan, and Abe seems to have one. Let's stick with it, and see what happens."

With this, the group eagerly embraced one another. They hugged and discussed the plan

Hopefully that's more along the lines of what you'd want to see?

Thanks for your reply. That fixed things for me! I'm seeing similar output now. :-)

tkramer changed discussion status to closed

Just a note that I noticed some logical failings. At one point the blacksmith was killed and later featured in the story as if that hadn't happened. I expect that will occur in models of this size.

Mosaic ML, Inc. org

The demo that @atrott mentioned is public now: StoryWriter Demo

The link to the demo no longer works.

Am I missing any configurations? I am setting the max_token_length to 80k+, but the story is really short.

No offense, but the length of the story isn’t the failing of this model. It seems unable to stick to a narrative. It rambles, more like a drunk storyteller who forgets what they said two minutes ago.
