512 max positional embeddings, but 8192 context length

#2
by Fizzarolli - opened

hi!! this is fantastic and i love that someone finally made a series of models like this and i love you all
However, the model card notes that it was annealed up to 8192 context length, which is great, but config.json specifies 512 for max_position_embeddings. Am I missing something obvious? Does RoPE need to be manually configured? I am unsure

Answer.AI org

@Fizzarolli Good catch. That was a mistake in porting the research code to Hugging Face Transformers, which I fixed in 5756c58 and f87846.
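
For anyone who wants to double-check their local copy after the fix, a minimal sketch (the model id below is an assumption for illustration; substitute whichever checkpoint you are loading):

```python
from transformers import AutoConfig

# Model id is assumed here for illustration purposes only.
config = AutoConfig.from_pretrained("answerdotai/ModernBERT-base")

# After the fix, this should report the annealed context length (8192),
# not the 512 value that was carried over by the porting mistake.
print(config.max_position_embeddings)
```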

bwarner changed discussion status to closed
