Does this finetune let you use 4096 context?

#2 · opened by Panchovix

Hi there, very impressive results!

I was wondering: checking the file config.json, the `max_position_embeddings` variable is set to 2048, while for llama-2 (https://huggingface.co/meta-llama/Llama-2-70b-hf/blob/main/config.json) this value is set to 4096.

Would this model be able to handle 4096 context, like llama-2-70b?

upstage org

Hi,

Yes, it would be possible to set the max_seq_len to 4096.

The reason `max_position_embeddings` is set to 2048 in our config is that we based our work on an earlier revision of the Llama2 model, as you can see in this link (https://huggingface.co/meta-llama/Llama-2-70b-hf/blob/de00c41a63fb46d805f85f92fe5418a9633bb97d/config.json).
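For anyone else wanting to do this, one option is simply editing the downloaded config.json before loading the model. A minimal sketch (the stand-in config written here is illustrative; in practice you would edit the file from the model repo):

```python
import json
from pathlib import Path

# Stand-in for the model repo's config.json, created just for illustration.
cfg_path = Path("config.json")
cfg_path.write_text(json.dumps({"max_position_embeddings": 2048}))

# Raise the limit to Llama-2-70b's current context length.
cfg = json.loads(cfg_path.read_text())
cfg["max_position_embeddings"] = 4096
cfg_path.write_text(json.dumps(cfg, indent=2))
```

Alternatively, `AutoConfig.from_pretrained` in transformers accepts keyword overrides, so you can pass `max_position_embeddings=4096` at load time without editing the file.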

Panchovix changed discussion status to closed
