The model seems not to have general ability

#3
by yuansiwe - opened

Hi, Yukang:
Thanks for the amazing work you have done on this project as well as the paper. However, after deploying your model, I find it hard for the model to answer some general questions, such as the following:
[two screenshots showing the model failing to answer general questions]

I am looking forward to talking with you🙏

Hi,

Thanks for your interest in our work. It might be better to ask questions following the prompt format used during supervised fine-tuning:
https://github.com/dvlab-research/LongLoRA/blob/5056749a37833c1303129ddff3fde6ee26dfe86f/demo.py#L161

I think this is because the model has been fine-tuned to fit long-context inputs, so its capability on short contexts may be affected. We used position interpolation (https://arxiv.org/abs/2306.15595) for the position embeddings, and its limitation on short text is a known issue.
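To make that side effect concrete: position interpolation simply rescales the position indices by a fixed factor so that the extended context fits into the position range the base model was pretrained on. A minimal sketch of the idea (the function name, parameter names, and lengths are illustrative, not the actual LongLoRA code):

```python
import torch

def interpolated_positions(seq_len: int,
                           pretrained_max_len: int = 4096,
                           extended_max_len: int = 32768) -> torch.Tensor:
    """Position interpolation (https://arxiv.org/abs/2306.15595):
    squeeze the extended position range back into the range the base
    model was pretrained on by scaling every index by a fixed factor."""
    scale = pretrained_max_len / extended_max_len  # e.g. 4096 / 32768 = 0.125
    positions = torch.arange(seq_len, dtype=torch.float32)
    # The same factor is applied to short inputs too, which is one reason
    # short-context quality can drop after fine-tuning for long context.
    return positions * scale
```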

Regards,
Yukang Chen

@Yukang When you did the SFT, did you take the format of the system prompt into consideration? It's possible that your prompt format for the SFT is incorrect and that is why the chat model follows instructions poorly.

Hi,

I think this poor instruction-following issue results from the SFT training data for this model being too limited, both in amount and in prompt format. We have prepared a much better dataset with more data and a freer prompt format. We are training the models on this stronger dataset and will release both the dataset and the models next week. These models will be much better. Thanks for your patience.

Regards,
Yukang Chen

Thanks for the speedy reply.

As this checkpoint is an extension of llama2, I'd recommend sticking with its <s>[INST] <<SYS>> $system_prompt <</SYS>> $instruction [/INST] $response </s> format when experimenting with SFT on llama2.
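For reference, a small helper that assembles that single-turn llama2 chat prompt (a sketch; the <s>/</s> tokens are normally added by the tokenizer as BOS/EOS, so they are not written into the string here):

```python
def build_llama2_chat_prompt(system_prompt: str, instruction: str) -> str:
    # Single-turn Llama-2 chat layout; BOS/EOS (<s>/</s>) are left to the
    # tokenizer rather than being written into the string.
    return (
        "[INST] <<SYS>>\n"
        f"{system_prompt}\n"
        "<</SYS>>\n\n"
        f"{instruction} [/INST]"
    )

prompt = build_llama2_chat_prompt(
    "You are a helpful assistant.",
    "Summarize the main idea of position interpolation in two sentences.",
)
```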

Looking forward to the new updated model,
FT

Thanks for your suggestion. We are comparing different types of prompt formats, including the llama2 one, and will present the best one. Thanks.

Hi,

We have released our data for long instruction following, LongAlpaca-12k, and the updated models, LongAlpaca-7B/13B/70B. They are available at the following links. These models should be much better than the original SFT models. We use the Alpaca prompt format, as it is more general than what we used previously.

https://huggingface.co/datasets/Yukang/LongAlpaca-12k
https://huggingface.co/Yukang/LongAlpaca-7B
https://huggingface.co/Yukang/LongAlpaca-13B
https://huggingface.co/Yukang/LongAlpaca-70B-lora
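For anyone trying the new checkpoints, the standard Alpaca prompt looks roughly like this (a sketch; please check the LongAlpaca model cards for the exact template used during training):

```python
# Standard Alpaca instruction template (no-input variant); the exact wording
# used for LongAlpaca training may differ slightly, so treat this as a sketch.
ALPACA_PROMPT = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n### Response:"
)

prompt = ALPACA_PROMPT.format(
    instruction="Summarize the key findings of the attached paper."
)
```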

Regards,
Yukang Chen
