This model is T5-V1_1-base trained on the Norwegian subset of the OSCAR dataset. Note that the original configuration is slightly changed (dropout is set to 0).
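As a rough illustration of the configuration change, the sketch below turns dropout off in a T5-style config dictionary. It assumes the change corresponds to the `dropout_rate` field used in Hugging Face T5 configs. The helper name `disable_dropout` and the trimmed-down `base_config` are hypothetical and are not taken from this repository.

```python
import json


def disable_dropout(config: dict) -> dict:
    """Return a copy of a T5-style config dict with dropout set to 0."""
    updated = dict(config)  # shallow copy so the input is left unchanged
    updated["dropout_rate"] = 0.0  # assumption: the field changed for this model
    return updated


# Illustrative subset of a config; a real config.json has many more fields.
base_config = {"model_type": "t5", "dropout_rate": 0.1}
no_dropout_config = disable_dropout(base_config)

print(json.dumps(no_dropout_config, indent=2))
```

In practice the same override would be applied to the full configuration file before training starts, leaving all other fields untouched.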

The official training script is copied into the repository and is run using the hyperparameters as defined in

The training loss can be seen directly on the model card. The full training run finished in ca. 4 hours and 30 minutes.


