Model answer ends in a repeating word

#1
by mrichardt - opened

E.g. in LM Studio (0.1.11):

curl http://localhost:1234/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [{"role": "user", "content": "Introduce yourself."}],
    "temperature": 0.7,
    "max_tokens": -1,
    "stream": false
  }'

I get the following response:

[2023-08-04 18:33:17.287] [INFO] Generated prediction: {
"id": "chatcmpl-anobk33c2ezhuggk932",
"object": "chat.completion",
"created": 1691166667,
"model": "/Users/martinrichardt/.cache/lm-studio/models/TheBloke/vicuna-13B-v1.5-16K-GGML/vicuna-13b-v1.5-16k.ggmlv3.q4_1.bin",
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "\nMy name is Anastasiya. I am a student of master's program in the field of marketing at the University of Economics in Varna, Bulgaria. I have always been interested in the world of business and how it can affect the economy. That's why why why why why why why why why why why why why why why why why why why why why why why why why why why why why why why why why why why why why why why why why why why why why why why why why why why why why why why why why why why why why why why why why why why why why why why why why why why why why why why why why why why why why why why why why why why why why why why why why why why why why why why why why why why why why why why why why why why why why why why why why why why why why why why why why why why why why why why why why why why why why why why why why why why why why why why why why why why why why why why why why why why why why why why why why why why why why why why why why why ...

I also tried llama.cpp with similar results.
Does anyone have a solution for this?

This happens when the RoPE settings aren't correct.

In llama.cpp try:
-c 16384 --rope-freq-base 10000 --rope-freq-scale 0.25 for 16K context, or:
-c 8192 --rope-freq-base 10000 --rope-freq-scale 0.5 for 8K context.
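
To make the relationship explicit: both settings work out to a native context of 4096 tokens, i.e. --rope-freq-scale is the native context divided by the target context. Here's a minimal sketch (the 4096 figure is inferred from the two values above, not taken from the model card):

```python
# Sketch: derive llama.cpp RoPE flags for a linearly scaled context.
# Assumes a native (pre-scaling) context of 4096 tokens, which is what
# the 0.5/8K and 0.25/16K values above both work out to.

NATIVE_CTX = 4096

def rope_flags(target_ctx: int, freq_base: int = 10000) -> str:
    """Build the llama.cpp flag string for the given target context."""
    freq_scale = NATIVE_CTX / target_ctx  # 8192 -> 0.5, 16384 -> 0.25
    return f"-c {target_ctx} --rope-freq-base {freq_base} --rope-freq-scale {freq_scale}"

print(rope_flags(16384))  # -c 16384 --rope-freq-base 10000 --rope-freq-scale 0.25
print(rope_flags(8192))   # -c 8192 --rope-freq-base 10000 --rope-freq-scale 0.5
```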

I don't know how this is applied in LM Studio; there might not be an option for it yet. Check the settings for anything mentioning rope frequency base and rope frequency scale.

Thank you, that worked out great!

I've been setting it to 4 or 8 for 16k and 32k. Thank you so much for this!
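
In case it helps anyone else: assuming the "4 or 8" here refers to a linear RoPE scaling factor (as some front ends expose it), it should just be the reciprocal of llama.cpp's --rope-freq-scale. A quick sketch of the mapping:

```python
# Sketch: map a linear RoPE scaling factor (an assumption about what
# "4 or 8" refers to) onto llama.cpp's --rope-freq-scale.
for factor, target_ctx in ((4, "16k"), (8, "32k")):
    print(f"factor {factor} ({target_ctx}): --rope-freq-scale {1 / factor}")
# factor 4 (16k): --rope-freq-scale 0.25
# factor 8 (32k): --rope-freq-scale 0.125
```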

Saving this to the notes by commenting.
