max_position_embeddings = 2048?

#29 opened by zzzac

I saw in the config that max_position_embeddings is set to 2048, but the original Llama 2 model has a 4096 maximum input length. Is there a particular reason to reduce the input length of these quantized models?

Thanks for this great work!

No, sorry, that's just a mistake. Or rather, the original Llama 2 config.json files had that set to 2048, so that's what mine were set to. Then they updated theirs to 4096.

I did update mine too, but I see now I only did that for the main branch config.json, not for the alternative GPTQs in the other branches. I'll fix that now.
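For anyone who wants to check what a particular branch is currently serving, here's a minimal sketch using the Transformers AutoConfig API; the repo ID and branch name below are placeholders, not a statement of the actual repo layout.

```python
from transformers import AutoConfig

# Placeholder repo ID and branch names -- substitute the actual GPTQ repo/branches.
repo_id = "TheBloke/Llama-2-7B-GPTQ"
for revision in ["main", "gptq-4bit-32g-actorder_True"]:
    config = AutoConfig.from_pretrained(repo_id, revision=revision)
    print(revision, config.max_position_embeddings)
```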

To be honest, it doesn't matter for most clients, which set the context length independently. max_position_embeddings is more of a default than a hard maximum (see the sketch below). But anyway, I'll fix it.
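For completeness, if a client does rely on the config value, it can be overridden at load time rather than by editing config.json. A minimal sketch assuming the model is loaded through Transformers' AutoModelForCausalLM (loading a GPTQ repo this way needs the usual GPTQ dependencies installed; the repo ID is a placeholder):

```python
from transformers import AutoConfig, AutoModelForCausalLM

repo_id = "TheBloke/Llama-2-7B-GPTQ"  # placeholder repo ID

# Keyword arguments passed to from_pretrained override values from config.json,
# so the 4096 context length is used regardless of what the repo's config says.
config = AutoConfig.from_pretrained(repo_id, max_position_embeddings=4096)
model = AutoModelForCausalLM.from_pretrained(repo_id, config=config, device_map="auto")
```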
