Model pads response with newlines up to max_length

#6
by borzunov

Hi @philschmid @osanseviero,

Thanks for all your work on making this model available on the Model Hub!

I'm running into an issue while using the model: quite often, it starts generating `\n` indefinitely, padding the response up to `max_length` instead of emitting `</s>` and stopping.

This happens with the standard generation params (`temperature=0.2`, `top_p=0.95`) and the prompt format and example prompt suggested in the official repository:


```
<s>[INST] <<SYS>>
A chat between a curious human and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions.
<</SYS>>

In Bash, how do I list all text files in the current directory (excluding subdirectories) that have been modified in the last month? [/INST]
```
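
For anyone who wants to reproduce, here's a minimal sketch with plain transformers. The model id (`codellama/CodeLlama-34b-Instruct-hf`) and the `max_new_tokens` value are my assumptions for illustration; any CodeLlama-Instruct checkpoint should behave the same way:

```python
# Minimal repro sketch; model id and max_new_tokens are assumptions.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "codellama/CodeLlama-34b-Instruct-hf"  # assumed checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# The tokenizer prepends <s> itself, so it is left out of the string.
prompt = (
    "[INST] <<SYS>>\n"
    "A chat between a curious human and an artificial intelligence assistant. "
    "The assistant gives helpful, detailed, and polite answers to the user's "
    "questions.\n<</SYS>>\n\n"
    "In Bash, how do I list all text files in the current directory "
    "(excluding subdirectories) that have been modified in the last month? "
    "[/INST]"
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    do_sample=True,
    temperature=0.2,
    top_p=0.95,
    max_new_tokens=512,  # the trailing newline run continues up to this limit
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```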

This issue was also reported on GitHub: https://github.com/facebookresearch/codellama/issues/26
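
In the meantime, a possible client-side mitigation (my own sketch, not an official fix, and it only masks the behavior rather than correcting it) is a custom `StoppingCriteria` that halts generation once several consecutive tokens decode to nothing but newlines. This reuses `model`, `tokenizer`, and `inputs` from the snippet above:

```python
import torch
from transformers import StoppingCriteria, StoppingCriteriaList

class StopOnRepeatedNewlines(StoppingCriteria):
    """Stop once the last `limit` generated tokens are a newline-only run."""

    def __init__(self, tokenizer, limit: int = 4):
        self.tokenizer = tokenizer
        self.limit = limit

    def __call__(self, input_ids: torch.LongTensor, scores, **kwargs) -> bool:
        # Decode the tail of the sequence and check for a run of newlines.
        tail = self.tokenizer.decode(input_ids[0, -self.limit:])
        return tail.strip() == "" and tail.count("\n") >= self.limit

outputs = model.generate(
    **inputs,
    do_sample=True,
    temperature=0.2,
    top_p=0.95,
    max_new_tokens=512,
    stopping_criteria=StoppingCriteriaList([StopOnRepeatedNewlines(tokenizer)]),
)
```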


+1. I'm also observing that the model doesn't seem to follow instructions and generates poor answers when served with HF text-generation-inference (TGI) 1.0.2.
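
For anyone checking this against a TGI deployment, here's a hedged sketch using the `text_generation` client; the endpoint URL and the stop sequence are assumptions, and passing a stop sequence only cuts off the newline run rather than fixing the underlying behavior:

```python
from text_generation import Client

client = Client("http://127.0.0.1:8080")  # assumed local TGI 1.0.2 endpoint

prompt = (
    "[INST] <<SYS>>\n"
    "A chat between a curious human and an artificial intelligence assistant. "
    "The assistant gives helpful, detailed, and polite answers to the user's "
    "questions.\n<</SYS>>\n\n"
    "In Bash, how do I list all text files in the current directory "
    "(excluding subdirectories) that have been modified in the last month? "
    "[/INST]"
)

response = client.generate(
    prompt,
    do_sample=True,
    temperature=0.2,
    top_p=0.95,
    max_new_tokens=512,
    stop_sequences=["\n\n\n\n"],  # crude guard against the newline padding
)
print(response.generated_text)
```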
