GPT4ALL: Nous Hermes Model consistently loses memory by fourth question ( GPT4-x-Vicuna-13b-4bit does not have problems)

by boqsc - opened Jun 6, 2023

Jun 6, 2023 (edited Jun 6, 2023)

It wasn't long before I sensed that something goes very wrong once you keep a conversation going with Nous Hermes.
Nous Hermes may produce its first and second responses faster and in richer detail than GPT4-x-Vicuna-13b-4bit.
However, once the conversation with Nous Hermes gets past a few messages, it completely forgets things and responds as if it had no awareness of its previous content.

GPT4-x-Vicuna-13b-4bit does not seem to have this problem, and its responses feel better.

Prompt template used while testing both Nous Hermes and GPT4-x-Vicuna-13b-4bit:

### Instruction:
%1
### Response:

Settings while testing: any settings will do, but I include an image of my settings here.
Note: the "Save chats to disk" option in the Application tab of the GPT4All app is irrelevant here; I tested it and it has no effect on how the models perform.

The actual test for the problem, which should be reproducible every time:

Nous Hermes loses memory

At this point I feel I've tried and tested enough variations to conclude that, in my use cases, Nous Hermes loses memory after a few responses
(although its first two responses are rich in detail and fast).

GPT4-x-Vicuna-13b-4bit continues to behave and act very well.

teknium

NousResearch org Jun 6, 2023

I don't know what is going on behind the scenes with your UX. As I said in another post, for my Discord roleplaying bots I line up all past prompts and responses as a <user's name> conversation inside the instruction field. I don't know how the instructions get collected in this UX, so I can't say what you would need to do.
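To make the idea concrete, here is a minimal sketch of packing a whole conversation into a single instruction field. It is not teknium's actual bot code: the function name `build_single_instruction`, the `speaker: text` line format, and the example task are all assumptions chosen for illustration; only the `### Instruction:` / `### Response:` markers come from the template quoted in this thread.

```python
from typing import List, Tuple


def build_single_instruction(task: str, history: List[Tuple[str, str]]) -> str:
    """Pack the task and the full transcript into ONE Instruction block.

    `history` is a list of (speaker, text) pairs in chronological order.
    The "speaker: text" layout is an assumption, not a documented format.
    """
    transcript = "\n".join(f"{speaker}: {text}" for speaker, text in history)
    return (
        "### Instruction:\n"
        f"{task}\n\n"
        f"{transcript}\n"
        "### Response:\n"
    )


if __name__ == "__main__":
    # Hypothetical conversation used only to show the resulting layout.
    prompt = build_single_instruction(
        "Continue the conversation below, staying in character as an elf.",
        [("alice", "Who are you?"), ("elf", "I am an elf."), ("alice", "Where do you live?")],
    )
    print(prompt)
```

With this layout the model sees exactly one instruction per request, and the earlier turns travel along as part of that instruction rather than as separate instruction/response pairs.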

teknium

NousResearch org Jun 6, 2023

If it's simply adding together each string:

### Instruction:
You are an elf
### Response:
<some response>
### Instruction:
Understood
### Response:
<some response>

like this, it is not something I have tested.
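For reference, the string-appending scheme described above could be produced by something like the following sketch. Whether GPT4All actually builds its prompt this way is not confirmed anywhere in this thread; the function name `build_concatenated_prompt` and the turn structure are hypothetical.

```python
from typing import List, Tuple


def build_concatenated_prompt(turns: List[Tuple[str, str]], new_instruction: str) -> str:
    """Append every past (instruction, response) pair, then the new instruction.

    This mirrors the naive "add each string together" layout: one
    Instruction/Response block per past turn, ending with an open Response.
    """
    parts = []
    for instruction, response in turns:
        parts.append(f"### Instruction:\n{instruction}\n### Response:\n{response}\n")
    # The final block is left open so the model writes the next response.
    parts.append(f"### Instruction:\n{new_instruction}\n### Response:\n")
    return "".join(parts)


if __name__ == "__main__":
    past = [("You are an elf", "<some response>")]
    print(build_concatenated_prompt(past, "Understood"))
```

Printing the result makes it easy to compare against whatever a given front end really sends to the model.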

boqsc

Jun 6, 2023

It would be great if you could confirm whether it is reproducible in your environment and settings.

teknium

NousResearch org Jun 6, 2023

I don't use GPT4All; I use GPTQ for GPU inference and a Discord bot for the UX.
My Discord bot keeps a constant preprompt for the roleplaying task and a trailing N chat messages for chat history, and produces this kind of output:

teknium

NousResearch org Jun 6, 2023

If you can, you should add a preprompt (something like a system prompt) that gives the AI a roleplay task and that it always keeps at the top of its context, so no matter what, it will remember who it is.
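A rough sketch of that suggestion, combined with the "trailing N chat messages" idea mentioned earlier in the thread: the preprompt is stored separately and always written first, while only the most recent messages are kept. The class name `RollingChat`, its methods, and the `speaker: text` layout are assumptions for illustration, not code from any of the tools discussed here.

```python
from collections import deque


class RollingChat:
    """Keep a fixed preprompt plus only the last `max_messages` chat messages."""

    def __init__(self, preprompt: str, max_messages: int) -> None:
        self.preprompt = preprompt
        # A deque with maxlen silently drops the oldest entry when full.
        self.messages = deque(maxlen=max_messages)

    def add(self, speaker: str, text: str) -> None:
        self.messages.append((speaker, text))

    def build_prompt(self) -> str:
        # The preprompt is never part of the deque, so it can never be evicted.
        transcript = "\n".join(f"{s}: {t}" for s, t in self.messages)
        return f"### Instruction:\n{self.preprompt}\n\n{transcript}\n### Response:\n"


if __name__ == "__main__":
    chat = RollingChat("You are an elf. Stay in character.", max_messages=3)
    for i in range(5):
        chat.add("alice", f"message {i}")
    print(chat.build_prompt())
```

Because old messages fall off the end while the preprompt stays put, the model keeps its role even after the earliest turns have been dropped from the context.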

teknium

NousResearch org Jun 6, 2023

I don't have Discord bot code available for GGML, but this is how my inference code looks:

teknium

NousResearch org Jun 6, 2023

It pulls all those data points from a character card, but I don't know whether these things are possible to set up in the GPT4All interface.

boqsc changed discussion title from Nous Hermes Model consistently loses memory by fourth question ( GPT4-x-Vicuna-13b-4bit does not have problems) to GPT4ALL: Nous Hermes Model consistently loses memory by fourth question ( GPT4-x-Vicuna-13b-4bit does not have problems) Jun 6, 2023

boqsc

Jun 6, 2023 (edited Jun 6, 2023)

@teknium
The reason Nous Hermes behaves like this may be that Hermes is not a chat model but an instruct model.
If that's the true cause, I would really like to see a chat-oriented model one day, so that it would work like GPT4-x-Vicuna-13b-4bit.

rayyd

Jun 7, 2023

Oh, this explains why I can't ask any follow-up questions. The one-shot abilities are impressive, though!

teknium

NousResearch org Jun 7, 2023

Please see how I set up the prompt format; it can hold a conversation. I will work on a new format for the next version to try to improve this capability, though, as well as some more experimental workarounds.

karan4d

NousResearch org Jun 7, 2023

It can hold a conversation; the format is just different.

Multiple data types will be supported in the future so we can have the model retain both "instruct" and "chat" capabilities like our previous release.

karan4d changed discussion status to closed Jun 15, 2023

boqsc

Jun 18, 2023

Not sure about all that, but there is a chronos-hermes merge that works like Nous-gpt4-x-vicuna.
However, it is limited, and maybe even censored, by the chronos model, and it is not the same as pure Hermes.
https://huggingface.co/TheBloke/chronos-hermes-13B-GGML

boqsc changed discussion status to open Jun 18, 2023
