Why does this model keep generating \n when loaded with text generation web UI?

#2
by fahadh4ilyas - opened

I'm trying to load the model using text generation web UI, but the output is always a repeated "\n" whenever I ask a short question like "Give me explanation about AI!". I load it with these parameters:

python server.py --verbose --model-menu --loader transformers --load-in-4bit --compute_dtype float16 --quant_type nf4 --use_double_quant --listen

Am I doing something wrong here?

Hi,

I think it would be better to ask your question using the prompt format that was used during supervised fine-tuning:
https://github.com/dvlab-research/LongLoRA/blob/5056749a37833c1303129ddff3fde6ee26dfe86f/demo.py#L161
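As a rough illustration of wrapping a question in a fine-tuning prompt format before sending it to the model, here is a minimal sketch. The template text below is an assumption (an Alpaca-style instruction template), and `PROMPT_NO_INPUT` and `build_prompt` are hypothetical names; the linked demo.py is the authoritative source for the exact wording.

```python
# ASSUMPTION: an Alpaca-style template, used here only for illustration.
# Check the linked demo.py for the exact template used in fine-tuning.
PROMPT_NO_INPUT = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n### Response:"
)


def build_prompt(instruction: str) -> str:
    """Wrap a raw user question in the (assumed) fine-tuning template."""
    return PROMPT_NO_INPUT.format(instruction=instruction.strip())


# The formatted string, not the bare question, is what gets sent to the model.
prompt = build_prompt("Give me explanation about AI!")
print(prompt)
```

In text generation web UI this would correspond to pasting the formatted prompt, or selecting a matching instruction template, instead of typing the bare question.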

In addition, this may be because the model was fine-tuned for long-context inputs, which can hurt its ability on short contexts. We use position interpolation (https://arxiv.org/abs/2306.15595) for the position embeddings, and its limitation on short text is a known issue.

Regards,
Yukang Chen

Oh, thank you for the explanation. So this model only works well with long inputs?

You also said that you use position interpolation to fine-tune your model, but your GitHub script doesn't show anything about position interpolation. Instead, it uses attention shifting. Or do you mean that attention shifting is a form of position interpolation?

Hi,

I think it will not be good at very short inputs. I have tested inputs with hundreds of tokens, and it gives reasonable answers. Shorter inputs, like your question, give bad results.

We do include position interpolation. Please refer to the line linked below. It was introduced in this work (https://arxiv.org/abs/2306.15595).

https://github.com/dvlab-research/LongLoRA/blob/5056749a37833c1303129ddff3fde6ee26dfe86f/demo.py#L220
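To make the idea concrete, here is a minimal sketch of linear position interpolation applied to rotary position angles. It is not the LongLoRA code: `rope_angles` is a hypothetical helper, and the context lengths (4096 and 32768) are assumed numbers for illustration only. The position index is divided by the scale factor so that positions in the extended window fall back inside the range seen during pre-training.

```python
import math  # stdlib only; no model or tensor library is needed for the sketch


def rope_angles(position, dim, base=10000.0, scale=1.0):
    """Rotary angles for one token position.

    scale > 1 applies linear position interpolation: the position index
    is divided by `scale` before the angles are computed.
    """
    pos = position / scale
    return [pos / (base ** (2 * i / dim)) for i in range(dim // 2)]


# ASSUMPTION: illustrative context lengths, not taken from the thread.
original_ctx = 4096
extended_ctx = 32768
scale = extended_ctx / original_ctx  # 8.0

# The last position of the extended window maps inside the original range.
last = rope_angles(extended_ctx - 1, dim=8, scale=scale)
print(last[0] < original_ctx)

# Side effect on short inputs: neighbouring tokens end up only 1/scale
# apart in position space, closer than anything seen in pre-training.
step = rope_angles(1, dim=8, scale=scale)[0] - rope_angles(0, dim=8, scale=scale)[0]
print(math.isclose(step, 1 / scale))
```

The second check shows the compression of adjacent positions, which is one plausible way to picture why very short prompts can behave differently from long ones.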

Regards,
Yukang Chen

Ah, maybe that is why the answer is always "\n". That makes sense. Okay, thank you for the clarification.

fahadh4ilyas changed discussion status to closed
