Max Context length?

#5
by lazyDataScientist - opened

Just wondering what the max context length for this model is at the moment.

It doesn’t have a hard-coded max context length like a transformer. It works somewhat like an LSTM: you can just keep feeding it input and it will keep going. It “remembers” the past context selectively, so it doesn’t lose too much performance.
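To illustrate why there's no hard context limit, here's a toy sketch of a selective state-space recurrence (hypothetical shapes and gating, not the actual Mamba implementation): the hidden state is a fixed-size vector, so memory doesn't grow with sequence length, and input-dependent gates decide per token what to keep or forget.

```python
import numpy as np

rng = np.random.default_rng(0)
d_state = 16    # fixed state size, independent of sequence length
d_in = 8
A = -np.exp(rng.normal(size=d_state))  # stable (negative) state decay rates

def selective_step(h, x, W_dt, W_b, W_c):
    """One recurrent step. The gates (dt, B, C) are computed FROM the
    input x -- that is the 'selective' part: the model chooses per token
    how strongly to retain or overwrite the past state."""
    dt = np.log1p(np.exp(W_dt @ x))  # softplus: input-dependent step size
    B = W_b @ x                      # input-dependent write gate
    C = W_c @ x                      # input-dependent read gate
    h = np.exp(dt * A) * h + dt * B * x.mean()  # decay old state, write new info
    y = C @ h                        # read out a scalar from the state
    return h, y

# Stream an arbitrarily long input; the state stays d_state-sized throughout.
W_dt, W_b, W_c = (rng.normal(size=(d_state, d_in)) for _ in range(3))
h = np.zeros(d_state)
for x in rng.normal(size=(1000, d_in)):  # 1000 tokens, could be any length
    h, y = selective_step(h, x, W_dt, W_b, W_c)
print(h.shape)  # state size unchanged: (16,)
```

The key contrast with a transformer: attention keeps all past tokens around (memory grows with length), while here everything the model retains must fit in that fixed-size state.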

see: https://arxiv.org/pdf/2312.00752.pdf

They discuss this in the section on synthetic tasks.
