The model just repeats part of the input
I have tried many different prompts and settings, but whatever I do I can't get a long response from this base model without it repeating itself over and over. I thought it might have to do with the special tokens, but adding them in didn't seem to help. Any idea what is going wrong? I have tried the repetition and presence penalty parameters, but I can't seem to find a setting between "does nothing" and "makes the output descend into gibberish".
This is the base model, so it is only trained to predict the next token in a sequence. Base models generally know a lot about language and nothing about chatting. You will need to either fine-tune this model on instruction following or start from the instruction-tuned version of this model, meta-llama/Meta-Llama-3-8B-Instruct.
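If you switch to the instruct model, here's a minimal sketch of loading it with transformers (assuming you have access to the gated repo and `accelerate` installed for `device_map="auto"`; the prompt is just a placeholder):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Meta-Llama-3-8B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# The instruct model expects the Llama 3 chat format;
# apply_chat_template inserts the special tokens for you.
messages = [{"role": "user", "content": "Write a short story about a lighthouse."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=256)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```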
Thanks. I have used GPT-3 as a base model before, and this one seems much more prone to repetition than GPT-3 was. I have finally gotten it working okay, but only by turning the repetition penalty up above 1. Much higher than that and the penalty stops the model from being able to end sentences (because the `.` token gets penalized too), and the output soon loses all sense entirely.
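For anyone hitting the same thing, a rough sketch of where the penalty goes (these are the standard transformers `generate` arguments; the specific values are illustrative, not a recommendation):

```python
output = model.generate(
    input_ids,
    max_new_tokens=256,
    do_sample=True,
    temperature=0.8,
    repetition_penalty=1.15,  # 1.0 = no penalty; higher values down-weight tokens already in the context
)
```

Pushing the penalty much higher quickly starts hitting structural tokens like `.` and newlines, which is the failure mode I described above.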
Interesting. Typically the instruction-tuned models are trained to emit a stop token, which accomplishes what you are attempting to do with the repetition penalty.
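For Llama 3 specifically, a minimal sketch of what that looks like at generation time (the instruct checkpoint ends each turn with `<|eot_id|>` rather than the plain EOS token, so both ids get passed; a base model won't emit either reliably, which is why the repetition penalty was doing the stopping for you):

```python
# Instruction tuning teaches the model to emit an end-of-turn token,
# so generation halts on its own instead of needing a repetition penalty.
terminators = [
    tokenizer.eos_token_id,
    tokenizer.convert_tokens_to_ids("<|eot_id|>"),
]
output = model.generate(input_ids, max_new_tokens=256, eos_token_id=terminators)
```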