Does FLAN-T5 support masked language modeling (similar to how t5 does)?
#15 · opened by ToasterLeavin
With the standard T5 model I can replace words with <extra_id_0>, <extra_id_1>, etc., tokenize the result, and pass the tokens to model.generate; the model then returns the word(s) that should replace each sentinel token.
When using FLAN-T5 this no longer works, and I haven't found a prompt that works either.
Am I missing something, or does FLAN-T5 just not support this by default?
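For context, with a span-corruption-trained T5 checkpoint the decoded generate() output interleaves the same sentinel tokens with the predicted fills. A small helper (hypothetical, not part of transformers) can map that output back to per-sentinel replacements:

```python
import re

def parse_t5_infill(decoded: str) -> dict:
    """Map T5's sentinel-delimited output to {sentinel: predicted fill}.

    Example input (hypothetical decode of a generate() result):
    '<pad> <extra_id_0> cute dog <extra_id_1> the <extra_id_2></s>'
    """
    # Split on sentinel tokens, keeping the sentinels themselves.
    parts = re.split(r"(<extra_id_\d+>)", decoded)
    fills = {}
    current = None
    for part in parts:
        if re.fullmatch(r"<extra_id_\d+>", part):
            current = part
            fills[current] = ""
        elif current is not None:
            # Strip padding / end-of-sequence markers from the span text.
            fills[current] = part.replace("<pad>", "").replace("</s>", "").strip()
    return fills

print(parse_t5_infill("<pad> <extra_id_0> cute dog <extra_id_1> the <extra_id_2></s>"))
```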
Hi
@ToasterLeavin
Flan-T5 checkpoints have not been fine-tuned on that objective; they were fine-tuned on a mix of instruction-based datasets, so unfortunately they do not support span-corruption MLM out of the box.
For T5 models trained on that objective, please consider the T5 v1.1 model family, e.g. https://huggingface.co/google/t5-v1_1-xl
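The suggestion above can be sketched as follows. This is a minimal, illustrative example assuming a T5 v1.1 checkpoint (here the smaller google/t5-v1_1-small sibling, purely to keep the download small); the generation settings are not prescriptive:

```python
# Sentinel-token infilling with a T5 v1.1 checkpoint, which was pretrained
# with the span-corruption objective (unlike the Flan-T5 checkpoints).
from transformers import AutoTokenizer, T5ForConditionalGeneration

name = "google/t5-v1_1-small"  # illustrative; t5-v1_1-xl works the same way
tokenizer = AutoTokenizer.from_pretrained(name)
model = T5ForConditionalGeneration.from_pretrained(name)

text = "The <extra_id_0> walks in <extra_id_1> park."
inputs = tokenizer(text, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)

# The decoded output interleaves the sentinels with the predicted spans,
# e.g. "<extra_id_0> ... <extra_id_1> ..."
print(tokenizer.decode(outputs[0], skip_special_tokens=False))
```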