Any plans on expanding the context length with landmark attention?

#7
by itachiluan - opened

Hi ehartford,

First of all, thanks so much for bringing this model to us! I think it is by far the model best suited to my tasks.
One problem I've run into, and that I think a lot of other people are also running into, is the 2048-token context length limit.
I thought it wasn't possible to extend LLaMA models' context length until I found this:
https://github.com/epfml/landmark-attention

Looking through the Wizard-Vicuna dataset, I've found prompts that are well over 2048 tokens (the longest I found was 8225). My method of counting tokens could be wrong (I used the LLaMA tokenizer on the concatenation of the history and the prompt; a rough sketch is below), but I think we can still conclude that some of the Wizard-Vicuna data was truncated due to the token length limit.
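
This is roughly what I mean by counting tokens; a minimal sketch, assuming the dataset sits in a local JSON file (the file name and the "history"/"prompt" field names are placeholders for whatever format you use):

```python
# Rough token-count check; assumes a local "wizard_vicuna.json" file where
# each record has "history" and "prompt" fields (adjust to your layout).
import json
from transformers import LlamaTokenizer

tokenizer = LlamaTokenizer.from_pretrained("huggyllama/llama-7b")

with open("wizard_vicuna.json") as f:
    records = json.load(f)

longest = 0
over_limit = 0
for record in records:
    text = record["history"] + record["prompt"]
    n_tokens = len(tokenizer(text)["input_ids"])
    longest = max(longest, n_tokens)
    if n_tokens > 2048:
        over_limit += 1

print(f"longest example: {longest} tokens, {over_limit} examples over 2048")
```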

Do you think it's worth a shot to try landmark attention on the Wizard-Vicuna 13B model to see if we can expand its context length?
Thanks!

Cognitive Computations org

I'll look into it, sounds interesting!

I didn't even know that was possible, but that would be amazing!
Whenever I make quirky characters with example context / past chats, it always eats up quite a decent amount of the 2048 tokens. Extending that would be a dream come true!

Hi,

If you check TheBloke's page, he has published many models that have been merged with the SuperHOT 8k LoRA, which extends the context length to 8k+; worth giving it a go!
The slight catch for me was that the SuperHOT LoRA wasn't trained on the Wizard-Vicuna dataset, so the merged Wizard-Vicuna model is slightly less accurate for me (for Chinese generation), but it's definitely worth trying!
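
For anyone wanting to try one of those merges, this is roughly how the longer context can be loaded; a minimal sketch, assuming a transformers version that supports the rope_scaling config option (the model id is a placeholder, pick an actual SuperHOT merge from TheBloke's page):

```python
# Load a SuperHOT-merged model with an extended context window via linear
# RoPE scaling (SuperHOT interpolates positions by 4x: 2048 -> 8192).
# Placeholder model id; substitute a real SuperHOT 8k merge from TheBloke.
# device_map="auto" needs the accelerate package installed.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "TheBloke/your-chosen-superhot-8k-merge"  # placeholder

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    device_map="auto",
    rope_scaling={"type": "linear", "factor": 4.0},
    max_position_embeddings=8192,
)
```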
