512 max positional embeddings, but 8192 context length
#2 · opened by Fizzarolli
hi!! this is fantastic and i love that someone finally made a series of models like this and i love you all
However, the model card notes that it was annealed up to 8192 context length, which is great -- but the config.json specifies 512 for max_position_embeddings. Am I missing something obvious? Does RoPE need to be manually configured? I am unsure
@Fizzarolli Good catch. That was a mistake from porting the research code to Hugging Face Transformers, which I fixed in 5756c58 and f87846.
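For anyone who pulled the checkpoint before the fix, a quick way to confirm what your local copy reports is to load the config and inspect max_position_embeddings directly. A minimal sketch (the repo id below is a placeholder for whichever checkpoint you're using):

```python
from transformers import AutoConfig

# Placeholder repo id; substitute the actual model checkpoint.
repo_id = "your-org/your-model"

config = AutoConfig.from_pretrained(repo_id)
print(config.max_position_embeddings)  # should report 8192 after the fix

# If an older revision still reports 512, the value can be overridden at load time,
# since from_pretrained kwargs matching config attributes update the loaded config:
config = AutoConfig.from_pretrained(repo_id, max_position_embeddings=8192)
print(config.max_position_embeddings)  # 8192
```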
bwarner changed discussion status to closed