GPT4ALL: Nous Hermes Model consistently loses memory by fourth question ( GPT4-x-Vicuna-13b-4bit does not have problems)

#5
by boqsc - opened

It wasn't long before I sensed that something goes very wrong once you keep a conversation going with Nous Hermes.
Nous Hermes might produce its first and second responses faster and in richer detail than GPT4-x-Vicuna-13b-4bit.
However, once the conversation with Nous Hermes gets past a few messages, it completely forgets things and responds as if it had no awareness of its previous content.

GPT4-x-Vicuna-13b-4bit does not seem to have this problem, and its responses feel better.

Prompt template used while testing both Nous Hermes and GPT4-x-Vicuna-13b-4bit:

### Instruction:
%1
### Response:
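
For readers unfamiliar with the template, here is a minimal sketch of how it is presumably filled in, assuming `%1` is simply replaced by the user's message. The function name `fill_template` is hypothetical, and this says nothing about how GPT4All carries earlier turns forward.

```python
# Sketch only: assumes %1 is a plain placeholder for the user's message.
# TEMPLATE mirrors the prompt template quoted above; fill_template is a
# hypothetical helper, not part of GPT4All.
TEMPLATE = "### Instruction:\n%1\n### Response:\n"


def fill_template(user_message: str, template: str = TEMPLATE) -> str:
    """Return the template with %1 replaced by the user's message."""
    return template.replace("%1", user_message)


if __name__ == "__main__":
    print(fill_template("You are an elf"))
```

If each question is sent through the template on its own like this, the model would see no earlier turns at all, which is one possible explanation for the behavior described below.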

Settings while testing: any settings should do, but I include an image of mine here.
Note: the "Save chats to disk" option in the GPT4ALL app's Application tab is irrelevant here; I tested it and it has no effect on how the models perform.

image.png

The actual test for the problem, which should be reproducible every time:

Nous Hermes loses memory

I feel I've now tried and tested enough variety to conclude that, in my use cases, Nous Hermes loses memory after a few responses.
(That is despite its first two responses being rich in detail and performant.)

Screenshot (525).png

GPT4-x-Vicuna-13b-4bit continues to behave and act very well.

Screenshot (524).png

NousResearch org

I don't know what is going on behind the scenes with your UX. Like I said in another post, for my Discord roleplaying bots, I line up all past prompts and responses as a <user's name> conversation inside the instruction field. I don't know how the instructions get collected in this UX, so I can't tell what you would need to do.

NousResearch org

If it's simply adding together each string:

### Instruction:
You are an elf
### Response:
<some response>
### Instruction:
Understood
### Response:
<some response>

like this, that is not something I have tested.
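
A minimal sketch of that stacked layout, for illustration only: it assumes the UI concatenates one filled template per turn, which is not confirmed in this thread. The helper name `build_stacked_prompt` is hypothetical.

```python
# Sketch only: assumes each past turn is appended as its own
# "### Instruction: / ### Response:" block, as in the example above.
from typing import List, Tuple


def build_stacked_prompt(turns: List[Tuple[str, str]], new_instruction: str) -> str:
    """Concatenate past (instruction, response) pairs, then the open new turn."""
    parts = []
    for instruction, response in turns:
        parts.append(f"### Instruction:\n{instruction}\n### Response:\n{response}\n")
    # The final block is left open so the model writes the next response.
    parts.append(f"### Instruction:\n{new_instruction}\n### Response:\n")
    return "".join(parts)


if __name__ == "__main__":
    history = [("You are an elf", "<some response>")]
    print(build_stacked_prompt(history, "Understood"))
```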

It would be great if you could confirm whether it is reproducible in your environment and settings.

NousResearch org

I don't use GPT4All; I use GPTQ for GPU inference and a Discord bot for the UX.
My Discord bot keeps a constant preprompt for the roleplaying task plus a trailing N chat messages of chat history, and produces this kind of output:

image.png
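
Since the actual bot code is only shown as an image, here is a rough sketch of the idea described above: a fixed preprompt plus the last N chat messages, all placed inside a single instruction block. The class name `RollingChat`, the `speaker: text` line format, and the default of 6 messages are assumptions for illustration, not the real bot's implementation.

```python
# Sketch only: a pinned preprompt plus a trailing window of N messages.
# Names and the exact text layout are hypothetical.
from collections import deque


class RollingChat:
    def __init__(self, preprompt: str, max_messages: int = 6):
        self.preprompt = preprompt
        # deque(maxlen=N) silently drops the oldest message once full.
        self.history = deque(maxlen=max_messages)

    def add(self, speaker: str, text: str) -> None:
        self.history.append((speaker, text))

    def build_prompt(self, bot_name: str) -> str:
        """Put the preprompt first, then the recent transcript, in one instruction."""
        transcript = "\n".join(f"{speaker}: {text}" for speaker, text in self.history)
        return (
            f"### Instruction:\n{self.preprompt}\n\n{transcript}\n"
            f"### Response:\n{bot_name}:"
        )


if __name__ == "__main__":
    chat = RollingChat("You are an elf named Elwen. Stay in character.", max_messages=4)
    chat.add("boqsc", "Who are you?")
    chat.add("Elwen", "I am Elwen of the northern woods.")
    chat.add("boqsc", "What did you just tell me?")
    print(chat.build_prompt("Elwen"))
```

Because the preprompt is rebuilt into every prompt, it never scrolls out of context even when old messages do.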

NousResearch org

If you can, you should add a preprompt (something like a system prompt) that gives the AI a roleplay task and always stays at the top of its context, so no matter what, it will remember who it is.

NousResearch org

I don't have Discord bot code available for GGML, but this is how my inference code looks:

image.png

NousResearch org

It pulls all those data points from a character card, but I don't know whether these things are possible to set up in the GPT4All interface.

boqsc changed discussion title from Nous Hermes Model consistently loses memory by fourth question ( GPT4-x-Vicuna-13b-4bit does not have problems) to GPT4ALL: Nous Hermes Model consistently loses memory by fourth question ( GPT4-x-Vicuna-13b-4bit does not have problems)

@teknium
Maybe the answer to why this happens, and why Nous Hermes behaves like that, is that Hermes is not a chat model but an instruct model.
If that's the true cause, I would really like to see a chat-oriented model one day, so that it would work like GPT4-x-Vicuna-13b-4bit.

image.png image.png image.png

Oh, this explains why I can't ask any follow-up questions. The one-shot abilities are impressive though!

NousResearch org

Please see how I set up the prompt format; it can hold a conversation. I will work on a new format for the next version to try to improve this capability, though, as well as some more experimental workarounds.

NousResearch org

It can hold a conversation; the format is just different.

Multiple data types will be supported in the future so we can have the model retain both "instruct" and "chat" capabilities like our previous release.

karan4d changed discussion status to closed

Not sure about all that, but there is a chronos-hermes merge that works like Nous-gpt4-x-vicuna.
However, it is limited, and maybe even censored, by the chronos model and is not the same as pure Hermes.
https://huggingface.co/TheBloke/chronos-hermes-13B-GGML
image.png

boqsc changed discussion status to open
