Is max_position_embeddings really the parameter that needs to be changed? When I look at the config.json from longchat, the value of max_position_embeddings is still 2048, but max_sequence_length is set to 16384. I don't understand the difference between the two, and why your config.json doesn't contain max_sequence_length at all?
From what I understand, those values are just what the model needs to run. You can change the actual values for the model to generate text at 8k or 16k tokens inside text-generation-webui and not worry about the config files.
Yes, you don't need to touch config.json if you're using text-generation-webui with ExLlama, as it has UI parameters for sequence length and positional embedding compression (compress_pos_emb).
But you do need to change max_position_embeddings if you're using AutoGPTQ, as that's how it knows what sequence length to use. That's detailed in my README.
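If you do need to edit config.json by hand for AutoGPTQ, it's just a JSON field change. A minimal sketch (the dummy config below is hypothetical; a real config.json has many more fields, and 8192 is just an example target length):

```python
import json, os, tempfile

# Hypothetical minimal config for illustration only.
path = os.path.join(tempfile.mkdtemp(), "config.json")
with open(path, "w") as f:
    json.dump({"max_position_embeddings": 2048}, f)

# Bump the value AutoGPTQ reads to decide the sequence length.
with open(path) as f:
    cfg = json.load(f)
cfg["max_position_embeddings"] = 8192
with open(path, "w") as f:
    json.dump(cfg, f, indent=2)
```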
Yeah, after seeing how longchat loads its model, it seems they use a ratio parameter to change max_position_embeddings from 2048 to 16k. It's kinda weird that they didn't set max_position_embeddings to 16k from the beginning.
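The idea behind that ratio, as I understand it, is to condense the rotary position embeddings: position indices are divided by the ratio before the RoPE angles are computed, so positions up to ratio × 2048 land inside the range the model was trained on. A rough sketch (function name and dims are made up for illustration, not longchat's actual code):

```python
import math

def rope_angles(position, dim=8, base=10000.0, ratio=1.0):
    # Standard RoPE angle computation, but with the position index
    # divided by `ratio` first ("condensed" rotary embeddings).
    pos = position / ratio
    return [pos / base ** (2 * i / dim) for i in range(dim // 2)]

# With ratio=8, position 16384 yields the same angles that position
# 2048 did during pretraining, so the model never sees an
# out-of-distribution position.
assert rope_angles(16384, ratio=8.0) == rope_angles(2048, ratio=1.0)
```

So max_position_embeddings stays at the pretrained 2048 in their config, and the ratio is applied at load time.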