Nice work! The config.json has "max_position_embeddings": 8192, but the model doesn't work well past 4096

#1
by Panchovix

Hi there, I noticed in the config.json:

"max_position_embeddings": 8192

But when I try to run inference with a context above 4k, I get gibberish. Are we supposed to use a RoPE scaling factor other than 1?

Thanks!
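For reference, here is roughly what I was trying: overriding rope_scaling in the config before loading the model with transformers. The model id is a placeholder, and I'm not at all sure that linear scaling with a factor of 2 is the intended setup here.

```python
import torch
from transformers import AutoConfig, AutoModelForCausalLM, AutoTokenizer

model_id = "your-org/your-model"  # placeholder, not the actual repo id

# Override RoPE scaling before loading the weights. "linear" with
# factor 2.0 compresses position ids so an 8192-token context maps
# into the 0..4096 range the model handles well -- just a guess.
config = AutoConfig.from_pretrained(model_id)
config.rope_scaling = {"type": "linear", "factor": 2.0}

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    config=config,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
```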

Allen Institute for AI org

Hi, we definitely trained with a sequence length of 8192, although I think most of the training data was <= 4k tokens long, which might be what's going on. Additionally, the DPO training data didn't go over 6k tokens. So some extra work is probably required to get these models to work well at such long contexts - we didn't really test long-context settings much.
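If you want to pin down where quality falls off, a quick perplexity sweep over increasing context lengths should show it. This is only a rough sketch - the model id and document path are placeholders:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "your-org/your-model"  # placeholder

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)
model.eval()

# Any long reference text works; perplexity should stay roughly flat
# up to the length the model was actually trained to handle.
long_text = open("long_document.txt").read()  # placeholder file
ids = tokenizer(long_text, return_tensors="pt").input_ids.to(model.device)

for length in (2048, 4096, 6144, 8192):
    chunk = ids[:, :length]
    with torch.no_grad():
        loss = model(input_ids=chunk, labels=chunk).loss
    # A sharp jump past 4k would match the gibberish you're seeing.
    print(f"context {length}: ppl = {loss.exp().item():.2f}")
```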
