
repeats the same word in the output

#1
by eraldohug - opened

Running the model in oobabooga/text-generation-webui with the q4_K_M and q5 quants, the output is always constant repetition after just three or four words, even in chat or notebook mode.
Example:
Input context:
In photosynthetic bacteria, the proteins that gather light for photosynthesis are embedded in cell membranes. In its simplest form, this involves the membrane surrounding the cell itself.[19]
However, the membrane may be tightly folded into cylindrical sheets called thylakoids,[20] or bunched up into round vesicles called intracytoplasmic membranes.[21]
Model output:
In photosynthetic bacteria, the proteins that are involved in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in

You have to use the startup parameters --rope-freq-scale 0.25 and -c 16384:

-c 16384 because Llama 2's default context is 4096 (-c 4096) and this model supports 16K.
--rope-freq-scale 0.25 because the model has 4x the context (1/4 = 0.25).
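For reference, a minimal llama.cpp invocation along those lines might look like this (the model filename and prompt are placeholders, adjust them to your local setup):

```sh
# Load the 16K-context GGUF with 4x RoPE frequency scaling (placeholder filename)
./main -m ./models/llama-2-16k.Q4_K_M.gguf \
  -c 16384 \
  --rope-freq-scale 0.25 \
  -p "In photosynthetic bacteria, the proteins that gather light for photosynthesis"
```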

Yeah, this repeating-word behaviour happens when the RoPE frequency scale is set wrong. I mentioned this in the README under the llama.cpp command, but I guess it needs to be clearer.

I haven't checked text-generation-webui's llama.cpp loader recently, but presumably it has parameters for that.

But you set -c to the desired context, I believe? So -c 16384 to make use of the full 16K.

Oops ... you are right ... corrected ;D

Works now.
In oobabooga/text-generation-webui, the parameter compress_pos_emb must be set to 8.
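For anyone launching from the command line instead of the UI, a rough sketch of the equivalent startup flags would be something like the following (flag names can differ between text-generation-webui versions, and the model directory name is a placeholder):

```sh
# Placeholder model directory; compress_pos_emb 8 matches the setting mentioned above
python server.py --model llama-2-16k-GGUF \
  --loader llama.cpp \
  --n_ctx 16384 \
  --compress_pos_emb 8
```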
