t5-base-norwegian / README.md
This model pre-trains [T5-V1_1-base](https://huggingface.co/google/t5-v1_1-base) on the Norwegian subset of the [Oscar](https://huggingface.co/datasets/oscar) dataset. Note that the original configuration is slightly changed (dropout is set to 0).
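For illustration, the dropout override can be expressed with the `transformers` API as in the sketch below. This only shows the configuration change, not the full training setup (that is handled by the script described in the next paragraph); `dropout_rate` is the relevant attribute of `T5Config`.

```python
from transformers import T5Config, FlaxT5ForConditionalGeneration

# Load the T5-V1_1-base configuration and override dropout to 0,
# matching the change described above.
config = T5Config.from_pretrained("google/t5-v1_1-base", dropout_rate=0.0)

# Initialize a fresh, randomly initialized Flax model from that config,
# as would be done at the start of pre-training.
model = FlaxT5ForConditionalGeneration(config, seed=0)
```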
The official [run_t5_mlm_flax.py](https://github.com/huggingface/transformers/blob/master/examples/flax/language-modeling/run_t5_mlm_flax.py) script is copied into this repository and run with the hyperparameters defined in *run_t5.sh*.
Training loss can be seen directly on the model card. The full training run finished in roughly 4 hours and 30 minutes.
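Once training has finished, the resulting checkpoint can be loaded like any other Flax T5 model. Below is a minimal sketch, assuming the checkpoint is published under this repository's id (the repo id shown is an assumption inferred from the model card title). Since the model is only pre-trained with the span-corruption objective, it needs task-specific fine-tuning before it produces useful generations.

```python
from transformers import AutoTokenizer, FlaxT5ForConditionalGeneration

# Assumed repo id, taken from this model card's title.
model_id = "patrickvonplaten/t5-base-norwegian"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = FlaxT5ForConditionalGeneration.from_pretrained(model_id)
```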