Experience

#18
by viktor-ferenczi - opened

After ~2k tokens it starts to ramble and loses context (see the attached screenshot).

There are two issues here: meta-conversation (self-referential messages like "do you still remember the original task?", which are not present in the training data) and the fact that the model does not extrapolate well beyond its training sequence length.

For meta-conversation: this is a data issue; I am thinking of ways to generate conversations that will help with tasks like this.

For length extrapolation: yes, unlike some models it doesn't error out when you go above the training sequence length, but quality degrades. I recommend deleting previous messages to make space, as in the sketch below.
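A minimal sketch of that truncation, assuming the EleutherAI/gpt-neox-20b tokenizer that the MPT models use and a 2048-token training context; the `role: content` formatting, the reply reserve, and the helper name are illustrative placeholders, not the model's actual chat template or an llm-foundry utility:

```python
from transformers import AutoTokenizer

# MPT models use the EleutherAI/gpt-neox-20b tokenizer; the 2048-token budget
# matches the training sequence length, and the reply reserve is an assumption.
tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-neox-20b")
MAX_CONTEXT = 2048
RESERVED_FOR_REPLY = 512

def truncate_history(messages, system_prompt):
    """Keep the system prompt plus as many of the most recent messages as fit."""
    budget = MAX_CONTEXT - RESERVED_FOR_REPLY - len(tokenizer.encode(system_prompt))
    kept = []
    # Walk backwards from the newest message and stop once the budget is spent.
    for msg in reversed(messages):
        cost = len(tokenizer.encode(f"{msg['role']}: {msg['content']}\n"))
        if cost > budget:
            break
        budget -= cost
        kept.append(msg)
    return list(reversed(kept))
```

Dropping from the oldest end keeps the most recent turns intact; restating the original task in the system prompt helps it survive the truncation.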

sam-mosaic changed discussion status to closed
