---
license: apache-2.0
tags:
- generated_from_trainer
model-index:
- name: byt5-small-wikipron-eng-latn-au-broad
  results: []
---

# byt5-small-wikipron-eng-latn-au-broad

This model is a fine-tuned version of [google/byt5-small](https://huggingface.co/google/byt5-small). The training run did not record a dataset name (the generated card listed it as `None`).
It achieves the following results on the evaluation set:
- Loss: 0.1875
- Per: 0.3296
- Gen Len: 16.2507

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.0002
- train_batch_size: 128
- eval_batch_size: 32
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 10.0

### Training results

| Training Loss | Epoch | Step | Validation Loss | Per    | Gen Len |
|:-------------:|:-----:|:----:|:---------------:|:------:|:-------:|
| 2.4225        | 1.0   | 243  | 0.3885          | 0.5182 | 16.046  |
| 0.386         | 2.0   | 486  | 0.2610          | 0.4201 | 16.1472 |
| 0.2876        | 3.0   | 729  | 0.2242          | 0.3699 | 16.1972 |
| 0.2459        | 4.0   | 972  | 0.2073          | 0.3501 | 16.25   |
| 0.2236        | 5.0   | 1215 | 0.1966          | 0.3402 | 16.2254 |
| 0.207         | 6.0   | 1458 | 0.1953          | 0.337  | 16.2453 |
| 0.1971        | 7.0   | 1701 | 0.1879          | 0.3339 | 16.2523 |
| 0.1888        | 8.0   | 1944 | 0.1879          | 0.3319 | 16.2565 |
| 0.1829        | 9.0   | 2187 | 0.1869          | 0.3305 | 16.2509 |
| 0.1783        | 10.0  | 2430 | 0.1875          | 0.3296 | 16.2507 |

### Framework versions

- Transformers 4.28.0.dev0
- Pytorch 1.13.1+cu117
- Datasets 2.9.0
- Tokenizers 0.13.2
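
### Learning rate schedule (sketch)

The hyperparameters above specify a linear scheduler with a warmup ratio of 0.1 and a peak learning rate of 0.0002, and the results table shows 2430 optimizer steps in total. The following is a minimal sketch of what such a schedule looks like, assuming the usual "linear warmup, then linear decay to zero" shape. The function name `linear_schedule_lr` and the rounding of the warmup length are illustrative assumptions, not taken from the training code.

```python
def linear_schedule_lr(step, peak_lr=2e-4, total_steps=2430, warmup_ratio=0.1):
    """Return the learning rate at a given optimizer step.

    Assumed shape: linear ramp from 0 to peak_lr during warmup,
    then linear decay from peak_lr to 0 at total_steps.
    """
    # Assumption: warmup length is the ratio times the total step count,
    # rounded to the nearest integer (243 steps for the values above).
    warmup_steps = round(total_steps * warmup_ratio)
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    # Print the rate at the start, end of warmup, midpoint, and final step.
    for step in (0, 243, 1215, 2430):
        print(step, linear_schedule_lr(step))
```

Under these assumptions the warmup spans 243 steps, which matches the length of one epoch in the results table, so the learning rate would peak at the end of epoch 1 and decay over the remaining nine epochs.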