What is the maximum input length for BLOOMZ and mT0?

#39
by Charm3link - opened

T5, BERT, and many other models have strict limits on their input lengths. I haven't seen this discussed for BLOOMZ or mT0, and they seem to handle very long inputs. Is there a hard limit like the other models I mentioned, or are they limited only by the hardware running them?

BigScience Workshop org

For BLOOMZ models the input length is unlimited in principle due to their ALiBi position embeddings, though I haven't done or seen an evaluation of how well they work for contexts > 2048 tokens.
For mT0 models, the maximum input length is 512 tokens, as for mT5 / T5.
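The reason ALiBi has no fixed maximum follows from its construction: instead of a learned position-embedding table of fixed size, ALiBi adds a static, linear distance penalty to the causal attention scores, which is defined for any sequence length. A minimal sketch (function names here are illustrative, not from any library):

```python
import math

def alibi_slopes(n_heads):
    # One slope per attention head, a geometric sequence
    # (this simple form assumes n_heads is a power of 2)
    start = 2 ** (-8 / n_heads)
    return [start ** (i + 1) for i in range(n_heads)]

def alibi_bias(seq_len, slope):
    # Bias added to attention scores: a linear penalty proportional
    # to the query-key distance, -inf above the causal diagonal.
    # Note seq_len is a free parameter -- nothing caps it.
    return [
        [-(i - j) * slope if j <= i else float("-inf")
         for j in range(seq_len)]
        for i in range(seq_len)
    ]
```

Because the bias is computed from token distances rather than looked up in a fixed-size table, the same formula applies at 2048 tokens or 100k tokens; whether the model's *quality* holds up beyond its 2048-token training context is the separate, unevaluated question above.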

Charm3link changed discussion status to closed