GPT Like and Very Good Reasoning

by orick96 - opened Feb 18, 2024

Feb 18, 2024

Very good, but resource usage is a concern: inference is slow, taking up to 47 seconds. This smaller model would be a worthy candidate for a mixture-of-experts model. Fine-tuned with a system message, it would be close to GPT-3.5.

Locutusque

Owner Feb 18, 2024

This is one of my older models, and does not perform as well as my newer models. You can check out https://huggingface.co/M4-ai/NeuralReyna-Mini-1.8B-v0.2 which is of similar size and performs excellently. It's one of the best models available that is under 2 billion parameters. It should also have faster inference because it uses optimizations such as flash attention and sliding windows.

Locutusque changed discussion status to closed Feb 18, 2024

orick96

Feb 18, 2024

•

edited Feb 18, 2024

That model appears to be set up for text completion. I also tried
<|USER|> Craft me a list of some nice places to visit around the world. <|ASSISTANT|>
and no text was generated.
If it uses the conversational template, a recent update broke it for now.
How would you recommend prompting the model?
Is it for a pipeline where you start the answer and let the model complete it?

On a second test, it appears to work as long as there is a line break after the user message:

<|USER|> Craft me a list of some nice places to visit around the world.
<|ASSISTANT|> Some cool places you might want to go:
- Paris, France
- Tokyo, Japan
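
The working format above can be sketched as a small helper. This is only an illustration based on the prompt that worked in this thread: the function name `build_prompt` and the optional `answer_prefix` parameter are hypothetical, not part of the model's documentation.

```python
USER_TAG = "<|USER|>"
ASSISTANT_TAG = "<|ASSISTANT|>"


def build_prompt(user_message: str, answer_prefix: str = "") -> str:
    """Format a single-turn prompt with a line break after the user message.

    `answer_prefix` (hypothetical) optionally starts the answer so the
    model continues it, as in the example from this thread.
    """
    # Strip stray whitespace so exactly one line break separates the turns.
    prompt = f"{USER_TAG} {user_message.strip()}\n{ASSISTANT_TAG}"
    if answer_prefix:
        prompt += f" {answer_prefix.strip()}"
    return prompt


if __name__ == "__main__":
    print(build_prompt(
        "Craft me a list of some nice places to visit around the world.",
        answer_prefix="Some cool places you might want to go:",
    ))
```

The resulting string can be passed to whatever text-completion interface is in use; the key detail is the `\n` between the user turn and the assistant tag.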

Locutusque

Owner Feb 18, 2024

orick96

Feb 18, 2024

•

edited Feb 18, 2024

Thanks for your time! Now is there any way to inject a system message or chat history for extra context?

Locutusque

Owner Feb 18, 2024

This model is indeed trained to take in system messages. If you want to add a system message, you can do something like this:
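
A minimal sketch of how a system message and chat history might be assembled, assuming a `<|SYSTEM|>` tag analogous to the `<|USER|>` and `<|ASSISTANT|>` tags used earlier in this thread. The tag name, the function `build_chat_prompt`, and the turn ordering are assumptions for illustration; check the model card for the exact format before relying on it.

```python
from typing import List, Tuple

SYSTEM_TAG = "<|SYSTEM|>"  # assumed tag name, not confirmed in this thread
USER_TAG = "<|USER|>"
ASSISTANT_TAG = "<|ASSISTANT|>"


def build_chat_prompt(
    system_message: str,
    history: List[Tuple[str, str]],
    user_message: str,
) -> str:
    """Join a system message, prior (user, assistant) turns, and a new
    user message into one prompt, one turn per line."""
    lines = []
    if system_message:
        lines.append(f"{SYSTEM_TAG} {system_message.strip()}")
    for user_turn, assistant_turn in history:
        lines.append(f"{USER_TAG} {user_turn.strip()}")
        lines.append(f"{ASSISTANT_TAG} {assistant_turn.strip()}")
    lines.append(f"{USER_TAG} {user_message.strip()}")
    # End on the assistant tag so the model writes the next reply.
    lines.append(ASSISTANT_TAG)
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_chat_prompt(
        "You are a concise travel assistant.",
        [("Hi", "Hello! Where would you like to go?")],
        "Craft me a list of some nice places to visit around the world.",
    ))
```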

orick96

Feb 18, 2024

I have this agent framework that engineers the LLM's system message with text vision, memories, time, and identity. I might experiment with that to see how self-aware the model can be. But I was wondering: what is the maximum context window for this model?

Locutusque

Owner Feb 18, 2024

The maximum context window is 1600 tokens.
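
With a 1600-token window, a long system message plus chat history can overflow quickly. One possible way to stay within the limit is to keep only the most recent turns that fit. The sketch below is an assumption-laden illustration: `trim_turns`, `rough_token_count`, and the `reserve_for_reply` default are hypothetical, and the whitespace-based counter only approximates real tokenizer counts, so pass the model's actual tokenizer-based counter in practice.

```python
from typing import Callable, List

MAX_CONTEXT_TOKENS = 1600  # context window stated in this thread


def rough_token_count(text: str) -> int:
    # Crude placeholder: counts whitespace-separated words.
    # Real tokenizers usually produce more tokens than this.
    return len(text.split())


def trim_turns(
    turns: List[str],
    reserve_for_reply: int = 256,
    count_tokens: Callable[[str], int] = rough_token_count,
    max_tokens: int = MAX_CONTEXT_TOKENS,
) -> List[str]:
    """Keep the newest turns whose combined size fits the budget."""
    budget = max_tokens - reserve_for_reply
    kept: List[str] = []
    used = 0
    # Walk from newest to oldest and stop at the first turn that does not fit.
    for turn in reversed(turns):
        cost = count_tokens(turn)
        if used + cost > budget:
            break
        kept.append(turn)
        used += cost
    kept.reverse()  # restore chronological order
    return kept
```

Dropping the oldest turns first keeps the most recent context intact; a system message would need to be budgeted separately if it must always be present.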

