Does FLAN-T5 support masked language modeling (similar to how T5 does)?

#15
by ToasterLeavin - opened

With the standard T5 model I can replace words with <extra_id_0>, <extra_id_1>, etc. and then tokenize. Passing the tokens into model.generate with a T5 model gives me the word(s) that should replace each sentinel token.

When using Flan-T5 this no longer works, and I haven't found a prompt that works either.

Am I missing something, or does FLAN-T5 just not support this by default?

Google org

Hi @ToasterLeavin
Flan-T5 checkpoints have not been fine-tuned on that objective; they have been fine-tuned on a mix of instruction-based datasets, and therefore do not support MLM out of the box, unfortunately.
For T5 models trained on that objective, please consider using the T5 v1.1 model family: https://huggingface.co/google/t5-v1_1-xl
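If you use a span-corruption checkpoint like T5 v1.1, the decoded output interleaves sentinel tokens with the predicted spans (e.g. `<extra_id_0> fox <extra_id_1> white`). A small helper, sketched below (`parse_spans` is a hypothetical name, not a `transformers` API), can map each sentinel to its predicted fill:

```python
import re

def parse_spans(decoded: str) -> dict:
    # Split on sentinel tokens while keeping them (capture group), then
    # pair each sentinel with the text that immediately follows it.
    parts = re.split(r"(<extra_id_\d+>)", decoded)
    spans = {}
    current = None
    for part in parts:
        if re.fullmatch(r"<extra_id_\d+>", part):
            current = part
            spans[current] = ""
        elif current is not None:
            spans[current] = part.strip()
    return spans

print(parse_spans("<extra_id_0> fox <extra_id_1> white <extra_id_2>"))
# → {'<extra_id_0>': 'fox', '<extra_id_1>': 'white', '<extra_id_2>': ''}
```

The trailing empty sentinel is how T5 marks the end of the last predicted span, so an empty value there is expected.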
