What is the maximum token length of the model?

I've run it so far with fairly long inputs, but what is the max token length of the model?

Hey @mstachow! The model uses relative positional embeddings rather than absolute positional embeddings, so the maximum length is not bounded by the architecture; in theory it is unlimited, given unlimited compute. In practice you'll be bound by memory, which scales quadratically with input length, so the practical limit depends on your hardware. Performance may also degrade for very long inputs, since the model loses prosody over such long sequences.
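
To make the quadratic memory scaling concrete, here's a rough back-of-the-envelope sketch. The head count and dtype below are illustrative assumptions, not the model's actual config, and it only counts the attention score matrices, which are the part that grows with the square of the sequence length:

```python
# Rough estimate of attention-score memory vs. sequence length.
# Assumptions (illustrative, not the model's real config): 16 attention heads,
# float32 activations, and counting only the (seq_len x seq_len) score matrix per head.

def attention_matrix_bytes(seq_len: int, num_heads: int = 16, bytes_per_elem: int = 4) -> int:
    """Memory for the attention score matrices alone: heads * seq_len^2 * bytes."""
    return num_heads * seq_len * seq_len * bytes_per_elem

for seq_len in (1_024, 4_096, 16_384):
    gib = attention_matrix_bytes(seq_len) / 2**30
    print(f"seq_len={seq_len:>6}: ~{gib:.2f} GiB just for attention scores")
```

Doubling the input length roughly quadruples this cost, which is why the hard limit in practice is your GPU memory rather than the positional embedding scheme.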
