How to increase the context?

#4
by burew - opened

Hello,

I'm enjoying the model! I would like more information on what you mean by "roping up to 8K context". Is it possible to increase the context size?

Thank you

Owner

This is a section from the wiki page for text-gen-webui; it explains it pretty well:

  • alpha_value: Used to extend the context length of a model with a minor loss in quality. I have measured 1.75 to be optimal for 1.5x context, and 2.5 for 2x context. That is, with alpha = 2.5 you can take a model with a 4096 context length to an 8192 context length.
  • rope_freq_base: Originally just another way to express "alpha_value", it ended up becoming a necessary parameter for some models like CodeLlama, which was fine-tuned with this set to 1000000 and hence needs to be loaded with it set to 1000000 as well.
  • compress_pos_emb: The first and original context-length extension method, discovered by kaiokendev. When set to 2, the context length is doubled; with 3 it's tripled, etc. It should only be used for models that have been fine-tuned with this parameter set to a value other than 1. For models that have not been tuned for a greater context length, alpha_value will lead to a smaller accuracy loss.
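To make the difference between the two methods concrete, here is a minimal sketch in plain Python. It assumes a Llama-style head dimension of 128 and the default rotary base of 10000; the function names are illustrative, not any loader's actual API. alpha_value works by raising the rotary base (the "NTK-aware" trick, commonly computed as base * alpha^(dim/(dim-2))), while compress_pos_emb simply divides token positions by the compression factor:

```python
import math

HEAD_DIM = 128        # typical Llama-style rotary dimension (assumption)
DEFAULT_BASE = 10000.0  # default RoPE frequency base (assumption)

def ntk_scaled_base(base: float, alpha: float, head_dim: int = HEAD_DIM) -> float:
    """alpha_value ("NTK-aware" scaling): raise the rotary base so high
    frequencies are barely touched while low frequencies stretch out."""
    return base * alpha ** (head_dim / (head_dim - 2))

def inv_freqs(base: float, head_dim: int = HEAD_DIM) -> list:
    """Standard RoPE inverse frequencies, one per pair of dimensions."""
    return [base ** (-2 * i / head_dim) for i in range(head_dim // 2)]

def compressed_positions(n_tokens: int, compress: float) -> list:
    """compress_pos_emb (linear scaling): divide positions instead of
    changing the base, squeezing more tokens into the trained range."""
    return [t / compress for t in range(n_tokens)]

# With alpha = 2.5 (the 2x-context setting above), the effective base
# grows from 10000 to roughly 25000, which stretches the low frequencies.
new_base = ntk_scaled_base(DEFAULT_BASE, 2.5)
```

This is only a sketch of the underlying math; in practice you just set alpha_value (or rope_freq_base / compress_pos_emb) in the loader's UI and it applies the scaling for you.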

It has to be set manually in text-gen-webui; it's automatic in koboldcpp.

If you can tell me which program you use to load your models i can explain better :3

Thanks for the reply. I looked more into it. I didn't know about the concept of RoPE scaling until now because I'm still a beginner :p. I use TGI. They have a guide on how to use it, and I figured it out from there: https://huggingface.co/docs/text-generation-inference/en/basic_tutorials/preparing_model
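For readers following the same path: the TGI guide linked above covers RoPE scaling, including a "dynamic" mode that only kicks in once the sequence grows past the trained context. A hedged, pure-Python sketch of that dynamic NTK idea (based on the formula used in the transformers library's dynamic rotary embedding; the function name and defaults here are illustrative, not TGI's API):

```python
def dynamic_ntk_base(base: float, seq_len: int, train_ctx: int,
                     factor: float, head_dim: int = 128) -> float:
    """Dynamic NTK scaling: leave the rotary base alone inside the
    trained context, and grow it smoothly once the sequence exceeds it.
    `factor` is the maximum context multiplier you want to support."""
    if seq_len <= train_ctx:
        return base  # within the trained range: no change at all
    # Scale grows with the current sequence length, then is applied to
    # the base with the same dim/(dim-2) exponent as static NTK scaling.
    scale = (factor * seq_len / train_ctx) - (factor - 1)
    return base * scale ** (head_dim / (head_dim - 2))

# Short prompts are untouched; an 8192-token sequence on a 4096-context
# model gets a larger base, stretching the positional frequencies.
short = dynamic_ntk_base(10000.0, 2048, 4096, factor=2.0)
long = dynamic_ntk_base(10000.0, 8192, 4096, factor=2.0)
```

The practical upside of the dynamic variant is that quality at short contexts is unaffected, since the scaling is only applied when it is actually needed.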

Thanks!

burew changed discussion status to closed
