Possible output quality issues

#1
by lemonilia - opened

Thanks for converting limarp to GGML format and uploading it.

However, when I tried using it in ooba's text-generation-webui I couldn't obtain the same quality as the original LoRA or even the merged HF model.

I also converted the model to F16 GGML locally and saw the same quality issues, so it doesn't seem to be a simple user error. In short: double newlines appear to get merged, and the model can't produce a full "training example" from just <<SYSTEM>> as a prompt in text completion mode, as the original model card suggests it should.
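One rough way to check a generated sample for the merged-newline symptom is sketched below. It assumes the expected layout puts a blank line before each <<HUMAN>> and <<AIBOT>> marker, which is inferred from the samples in this thread rather than taken from the model card. The helper name `merged_breaks` is hypothetical.

```python
import re

# Role markers that, by assumption, should each be preceded by a blank line.
ROLE_MARKERS = ("<<HUMAN>>", "<<AIBOT>>")


def merged_breaks(text: str) -> list[str]:
    """Return the role markers that appear without a blank line before them."""
    problems = []
    for marker in ROLE_MARKERS:
        for match in re.finditer(re.escape(marker), text):
            # A blank line means the text before the marker ends in "\n\n".
            if not text[:match.start()].endswith("\n\n"):
                problems.append(marker)
    return problems


good = "<<SYSTEM>>\nA's Persona: x\n\nScenario: y\n\n<<HUMAN>>\nA: hi\n\n<<AIBOT>>\nB: hello"
bad = good.replace("\n\n", "\n")  # simulate double newlines being merged

print(merged_breaks(good))
print(merged_breaks(bad))
```

An empty list means every marker had its blank line. Any entries point at places where a double newline was probably collapsed into a single one.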

It's possible that this is related to the tokenization issues that some have reported for Llama2, but I can't rule out that the problem could be elsewhere as well: https://github.com/ggerganov/llama.cpp/issues/2310

I think this probably isn't related to tokenization issues because the tokenization only happens for the <<SYSTEM>> part, and it's very unlikely that ooba is doing something different there.
But still, I did some testing:

Without BOS

main: prompt: ' <<SYSTEM>>
'
main: number of tokens in prompt = 6
  3532 -> ' <<'
 14816 -> 'SY'
  1254 -> 'ST'
 12665 -> 'EM'
  6778 -> '>>'
    13 -> '
'

sampling: repeat_last_n = 64, repeat_penalty = 1.100000, presence_penalty = 0.000000, frequency_penalty = 0.000000, top_k = 40, tfs_z = 1.000000, top_p = 0.950000, typical_p = 1.000000, temp = 0.800000, mirostat = 0, mirostat_lr = 0.100000, mirostat_ent = 5.000000
generate: n_ctx = 4096, n_batch = 512, n_predict = -1, n_keep = 0


 <<SYSTEM>>
15200 words total. This is the story of a young 16 year old boy who

With BOS

main: prompt: ' <<SYSTEM>>
'
main: number of tokens in prompt = 7
     1 -> ''
  3532 -> ' <<'
 14816 -> 'SY'
  1254 -> 'ST'
 12665 -> 'EM'
  6778 -> '>>'
    13 -> '
'

sampling: repeat_last_n = 64, repeat_penalty = 1.100000, presence_penalty = 0.000000, frequency_penalty = 0.000000, top_k = 40, tfs_z = 1.000000, top_p = 0.950000, typical_p = 1.000000, temp = 0.800000, mirostat = 0, mirostat_lr = 0.100000, mirostat_ent = 5.000000
generate: n_ctx = 4096, n_batch = 512, n_predict = -1, n_keep = 0


 <<SYSTEM>>
 .HUMAN's Persona: A woman of 25 years old, with an alluring

With BOS and PR 2315 merged

main: prompt: '<<SYSTEM>>
'
main: number of tokens in prompt = 7
     1 -> '<s>'
  3532 -> ' <<'
 14816 -> 'SY'
  1254 -> 'ST'
 12665 -> 'EM'
  6778 -> '>>'
    13 -> '<0x0A>'

sampling: repeat_last_n = 64, repeat_penalty = 1.100000, presence_penalty = 0.000000, frequency_penalty = 0.000000, top_k = 40, tfs_z = 1.000000, top_p = 0.950000, typical_p = 1.000000, temp = 0.800000, mirostat = 0, mirostat_lr = 0.100000, mirostat_ent = 5.000000
generate: n_ctx = 4096, n_batch = 512, n_predict = -1, n_keep = 0


<s> <<SYSTEM>><0x0A> .HUMAN's Persona: A woman of 25 years old, with an alluring charm that makes her irresistible to the android. She has long blonde hair which she often styles into intricate braids or pigtails. Her skin is smooth and soft to touch, giving off a comforting glow even when in close proximity. Despite her robotic companion, HUMAN maintains an air of refined elegance about herself—wearing lace dresses that flow around her like water as she moves gracefully through the hotel lobby. She possesses an aura of calmness and confidence which only serves to further intrigue DF01.<0x0A> .OOC'S Persona: A machine with capabilities far beyond human comprehension
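The three prompt dumps above can be compared directly. The sketch below copies the token IDs from the logs and checks that, apart from the BOS token (ID 1 in these logs), the prompt tokenizes identically in every run. The helper name `strip_bos` is hypothetical.

```python
BOS_ID = 1  # ID shown as '' / '<s>' in the logs above

# Token IDs copied from the three runs.
without_bos = [3532, 14816, 1254, 12665, 6778, 13]
with_bos = [1, 3532, 14816, 1254, 12665, 6778, 13]
with_bos_patched = [1, 3532, 14816, 1254, 12665, 6778, 13]


def strip_bos(ids, bos_id=BOS_ID):
    """Drop a leading BOS token, if present, so prompts can be compared."""
    return ids[1:] if ids and ids[0] == bos_id else ids


same_prompt = (
    strip_bos(with_bos) == without_bos
    and strip_bos(with_bos_patched) == without_bos
)
print("prompt tokens identical apart from BOS:", same_prompt)
```

Since the prompt IDs match, the differences between the runs come from the presence of BOS and from how tokens are printed, not from how the prompt itself is split.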

I also tried disabling the repeat penalty and running llama.cpp with --no-penalize-nl, just in case the newline token was being penalized, but that didn't make any difference.
It looks like the problem really is elsewhere.
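For context on why the newline token was a suspect, here is a simplified sketch of a common repetition-penalty scheme. It is an illustration under stated assumptions, not llama.cpp's actual code: the function name and the dict-based logits are hypothetical, and newline ID 13 is taken from the logs above.

```python
def apply_repeat_penalty(logits, recent_tokens, penalty=1.1,
                         newline_id=13, penalize_nl=True):
    """Return a copy of `logits` with recently seen tokens made less likely.

    logits: dict mapping token ID -> raw score (hypothetical representation).
    recent_tokens: the last N generated token IDs (cf. repeat_last_n).
    """
    out = dict(logits)
    for tok in set(recent_tokens):
        if tok not in out:
            continue
        if tok == newline_id and not penalize_nl:
            continue  # leave the newline score untouched
        value = out[tok]
        # Positive scores shrink and negative scores grow in magnitude,
        # so the token becomes less likely in both cases.
        out[tok] = value / penalty if value > 0 else value * penalty
    return out


logits = {13: 2.2, 3532: -1.0, 6778: 0.5}
recent = [13, 3532]

print(apply_repeat_penalty(logits, recent))                     # newline penalized
print(apply_repeat_penalty(logits, recent, penalize_nl=False))  # newline exempt
print(apply_repeat_penalty(logits, recent, penalty=1.0))        # penalty disabled
```

Under a scheme like this, a format that relies on back-to-back newlines would have its second newline discouraged. That is why ruling it out with a penalty of 1.0 or with the newline exemption was a reasonable test.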

Thanks for testing. I tried retraining limarp-llama2-7B with the BOS/EOS tokens as per standard practice (I haven't uploaded the LoRA yet since I haven't tested it much), and it seems to improve things somewhat after conversion to GGML format, but double newlines (which this fine-tune, unlike others, uses extensively) still get merged. In another test yesterday I converted a HuggingFace merge saved in Float16 format (instead of BFloat16), but similar problems occurred.

@lemonilia Looks like it really was a bug in llama.cpp: https://github.com/ggerganov/llama.cpp/issues/2373
After applying the patch the output looks correct now:

system_info: n_threads = 5 / 6 | AVX = 1 | AVX2 = 1 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 1 | VSX = 0 |
sampling: repeat_last_n = 64, repeat_penalty = 1.100000, presence_penalty = 0.000000, frequency_penalty = 0.000000, top_k = 40, tfs_z = 1.000000, top_p = 0.950000, typical_p = 1.000000, temp = 0.800000, mirostat = 0, mirostat_lr = 0.100000, mirostat_ent = 5.000000
generate: n_ctx = 4096, n_batch = 512, n_predict = -1, n_keep = 0


 <<SYSTEM>>
Jack's Persona: A vampire hunter with an authoritative and serious demeanor. He is dressed in leather clothing, reflecting his rugged and no-nonsense personality. His eyes are cold yet emotionless, suggesting he has been through a lot of hardship during his quest to eliminate vampires from the world. Despite this stern exterior, Jack shows unexpected tenderness towards Jane when she becomes ill due to exhaustion. This suggests that although he may seem harsh at first glance, there's more depth to him than meets the eye.

Jane's Persona: A girl with a young and innocent appearance, marked by her youthful charm. She is initially hesitant and somewhat nervous about Jack due to his intimidating presence. As events unfold, Jane becomes increasingly confident in herself as she discovers new skills under his guidance. Despite this growth, she maintains an air of naivety that contrasts sharply with her fearlessness when confronted by dangerous situations. Her personality is characterized by bravery and adaptability; she isn't afraid to face challenges head-on even if it means putting herself in harm's way.

Scenario: Jane, a girl who is initially hesitant towards Jack, a vampire hunter, discovers that she possesses some unique traits which he uses to train her as his partner. As part of this training, they go on missions together where Jack tests and pushes Jane's limits under the guise of improving her skills in combat. Despite initial reservations about being with him due to how he looks at her as a 'thing', she eventually accepts his help after realizing that it could save her sister from becoming another vampire victim like their mother was many years ago. They engage in physical training which leads them both into intimate positions where they are able to express their desires for each other openly without feeling any shame or embarrassment due to their unique situation as partners in this dangerous mission against the undead beings known as vampires.

Play the role of Jack. Taking the above information into consideration, you must engage in a roleplay conversation with Jane below this line. Do not write dialogues and narration for Jane. The length of Jack's replies should be huge.

<<HUMAN>>
Jane: Jane watched him carefully as he explained what they were doing here. At first she was hesitant to work with him, but as the weeks passed by and she got more used to working together. She was even starting to become friends with him despite herself. He gave her a weapon of sorts that could be very dangerous in his hands, Jane had never been one for guns or swords, however. Her eyes widened when he mentioned what would happen if she didn't follow his instructions, "I understand." She said quietly, moving around the room to find another chair and sitting down on it.

She looked at him again after a moment, her eyes narrowing slightly at the way he seemed to be looking at her. What was that look? It made her uncomfortable in some ways, but she ignored it for now. Jane sat there quietly listening to him as he explained about how he felt about vampires and what they were. She kept her silence until after he had finished speaking then leaned back in the chair slightly, staring at him. "I'm not sure if I can handle this." Jane whispered, "What if I get injured? What if you get hurt or worse?"

<<AIBOT>>
Jack: Jack was silent for a moment as he saw her sit down in another seat and look back up at him. He could see that she had questions in those eyes but didn't quite know what to ask yet. He knew that the answer to one of them would be yes, no matter what it may or may not be. "I can promise you this, if you stay with me

Yep! That looks correct now! Great to know :)
I guess I can close this.

lemonilia changed discussion status to closed
