UL2 Fine-tuned summarization model

#1
by jumerckx - opened

Hey @yhavinga ,
I was wondering whether you are planning to upload the UL2 models that were fine-tuned for summarization (mentioned here)? I see that the fine-tuned translation models are available, but I don't see the summarization models; am I missing something?
Either way, your models are tremendously helpful already; we're currently using t5-v1.1-base-dutch-cnn-test in a school project!
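
For context, here's a minimal sketch of how we use it via the transformers summarization pipeline. The full hub ID (with the yhavinga/ namespace) is my guess based on the model name:

```python
# Minimal sketch: summarizing Dutch text with the fine-tuned checkpoint.
# NOTE: the full hub ID (namespace/name) is an assumption on my part.
from transformers import pipeline

summarizer = pipeline("summarization", model="yhavinga/t5-v1.1-base-dutch-cnn-test")

article = (
    "Het kabinet presenteerde vandaag nieuwe plannen voor de woningmarkt, "
    "met extra geld voor de bouw van betaalbare huurwoningen."
)
result = summarizer(article, max_length=40, min_length=5)
print(result[0]["summary_text"])
```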

Hi @jumerckx - none of the evaluation fine-tuned models are released; they are not optimal and not intended for inference (for evaluating the pre-trained models I only ran a limited, fixed number of fine-tuning steps).

At some point in the past I became a bit obsessed with trying to improve en->nl Dutch translation, and I've released the best or most interesting of those models on the hub (e.g. ul2-large-en-nl, which I am quite happy with; things could still be improved, but I think it's better than Google Translate and on par with or better than DeepL). In the past I also did that for summarization, but I haven't done a recent summarization fine-tune of e.g. ul2-base.

I guess there are still some things to do there for Dutch, e.g. comparing BART with T5/UL2 models, and also trying to deal with longer input context lengths, as seen in e.g. the CNN dataset. Maybe the long-t5 models (which didn't do too well when I compared them to the other Dutch models) shine in the long-context arena? On the other hand, the ul2-dutch models have a new tokenizer that is about 30% more efficient than the old tokenizer I used, so I still expect UL2 to be a good competitor there as well.
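
If you want to eyeball that tokenizer difference yourself, here's a rough sketch; both hub IDs are from memory and may need checking:

```python
# Rough sketch (hub IDs are assumptions): count how many tokens the old and
# new Dutch tokenizers need for the same text. Fewer tokens per input means
# a more efficient tokenizer, and effectively a longer usable context window
# for long documents such as CNN articles.
from transformers import AutoTokenizer

old_tok = AutoTokenizer.from_pretrained("yhavinga/t5-v1.1-base-dutch-cased")  # assumed old tokenizer
new_tok = AutoTokenizer.from_pretrained("yhavinga/ul2-base-dutch")            # assumed new tokenizer

text = "Een lange Nederlandse nieuwstekst om de twee tokenizers mee te vergelijken."
n_old = len(old_tok(text).input_ids)
n_new = len(new_tok(text).input_ids)
print(f"old: {n_old} tokens, new: {n_new} tokens, ratio: {n_old / n_new:.2f}")
```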
Thanks for the kind words about my models, nice to get feedback that they're useful!
-- Yeb

yhavinga changed discussion status to closed
