
Dutch T5 models: UL2, T5, ByT5 and Long-T5 🇳🇱🇧🇪

TL;DR: Dutch T5 and UL2 models, pre-trained on the mC4 dataset with Google's TPU Research Cloud, perform strongly on Dutch NLP tasks. See below for model lists and comparisons.

During the Hugging Face Flax/JAX community week in the summer of 2021, I was granted access to Google's TPU Research Cloud (TRC), a cloud-based platform for machine learning research and development that provides access to Google's Tensor Processing Units (TPUs). My goal was to address the (then) shortage of T5 models for the Dutch language. (T5 is a state-of-the-art model architecture that handles both input and output as text, making it well suited to NLP tasks such as summarization, translation, and question answering.) Since then, with extended access to the TRC, I have been able to train a variety of T5 models for Dutch.
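
To make the text-to-text framing concrete, here is a minimal inference sketch using the Hugging Face transformers library. The checkpoint name and the Dutch summarization prompt prefix are illustrative assumptions; a fine-tuned checkpoint defines its own task format.

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Illustrative checkpoint name; substitute any of the Dutch T5 models listed below.
model_name = "yhavinga/t5-base-dutch"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

# T5 frames every task as text in, text out: the task is encoded in the prompt.
# The "samenvatting:" (summarization) prefix is an assumed convention and is only
# meaningful for a checkpoint fine-tuned with that prefix.
inputs = tokenizer("samenvatting: <Nederlandse tekst>", return_tensors="pt")
output_ids = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```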

Relevant papers are:

Background on Google's TPU VMs and on using the Hugging Face transformers library to pre-train models can be found at the following links: