
byt5-base-es_hch

This model is a fine-tuned version of google/byt5-base on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 1.0999
  • Bleu: 8.9448
  • Gen Len: 96.522
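
The card does not include usage code. As a hedged sketch: assuming this checkpoint is published on the Hub (the repo id below is a placeholder) and that, as the name es_hch suggests, it translates Spanish (es) into Wixárika/Huichol (hch), inference with Transformers would look roughly like this:

```python
# Minimal inference sketch -- the repo id is a placeholder, not the real one.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "your-username/byt5-base-es_hch"  # hypothetical Hub id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# ByT5 operates directly on UTF-8 bytes, so the tokenizer needs no vocab files.
# Whether a task prefix was used during fine-tuning is not stated on the card;
# a plain Spanish sentence is assumed here.
inputs = tokenizer("Hola, ¿cómo estás?", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```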

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 65
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 100.0
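
These values map directly onto Seq2SeqTrainingArguments in the Transformers version listed under "Framework versions" below. A minimal sketch of that configuration follows; the output_dir, the per-epoch evaluation strategy, and predict_with_generate are assumptions (consistent with the per-epoch results table), since the card does not state them:

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="byt5-base-es_hch",       # assumption
    learning_rate=5e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=65,
    # Adam with betas=(0.9, 0.999) and epsilon=1e-08, as listed above:
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=100.0,
    evaluation_strategy="epoch",         # assumption: matches the per-epoch results table
    predict_with_generate=True,          # assumption: needed to report Bleu/Gen Len at eval time
)
```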

Training results

| Training Loss | Epoch | Step  | Validation Loss | Bleu   | Gen Len |
|:-------------:|:-----:|:-----:|:---------------:|:------:|:-------:|
| No log        | 1.0   | 398   | 1.1655          | 0.1048 | 19.0    |
| 1.5993        | 2.0   | 796   | 1.0294          | 0.0762 | 19.0    |
| 1.1714        | 3.0   | 1194  | 0.9575          | 0.0863 | 19.0    |
| 1.0539        | 4.0   | 1592  | 0.9043          | 0.0769 | 19.0    |
| 1.0539        | 5.0   | 1990  | 0.8519          | 0.0792 | 19.0    |
| 0.9762        | 6.0   | 2388  | 0.8147          | 0.0563 | 19.0    |
| 0.9072        | 7.0   | 2786  | 0.7833          | 0.0856 | 19.0    |
| 0.8502        | 8.0   | 3184  | 0.7526          | 0.091  | 19.0    |
| 0.8081        | 9.0   | 3582  | 0.7389          | 0.1344 | 19.0    |
| 0.8081        | 10.0  | 3980  | 0.7187          | 0.1271 | 19.0    |
| 0.7683        | 11.0  | 4378  | 0.7038          | 0.1299 | 19.0    |
| 0.7318        | 12.0  | 4776  | 0.6901          | 0.1213 | 19.0    |
| 0.6998        | 13.0  | 5174  | 0.6753          | 0.1583 | 19.0    |
| 0.6683        | 14.0  | 5572  | 0.6631          | 0.145  | 19.0    |
| 0.6683        | 15.0  | 5970  | 0.6530          | 0.1516 | 19.0    |
| 0.6406        | 16.0  | 6368  | 0.6454          | 0.1599 | 19.0    |
| 0.6128        | 17.0  | 6766  | 0.6383          | 0.1478 | 19.0    |
| 0.5911        | 18.0  | 7164  | 0.6369          | 0.1571 | 19.0    |
| 0.5721        | 19.0  | 7562  | 0.6339          | 0.1668 | 19.0    |
| 0.5721        | 20.0  | 7960  | 0.6295          | 0.1611 | 19.0    |
| 0.547         | 21.0  | 8358  | 0.6267          | 0.1722 | 19.0    |
| 0.529         | 22.0  | 8756  | 0.6275          | 0.1656 | 19.0    |
| 0.5115        | 23.0  | 9154  | 0.6285          | 0.1684 | 19.0    |
| 0.4934        | 24.0  | 9552  | 0.6269          | 0.1696 | 19.0    |
| 0.4934        | 25.0  | 9950  | 0.6358          | 0.182  | 19.0    |
| 0.4773        | 26.0  | 10348 | 0.6338          | 0.1699 | 19.0    |
| 0.4591        | 27.0  | 10746 | 0.6358          | 0.1855 | 19.0    |
| 0.4449        | 28.0  | 11144 | 0.6440          | 0.1759 | 19.0    |
| 0.4285        | 29.0  | 11542 | 0.6438          | 0.1786 | 19.0    |
| 0.4285        | 30.0  | 11940 | 0.6474          | 0.1874 | 19.0    |
| 0.4137        | 31.0  | 12338 | 0.6517          | 0.1968 | 19.0    |
| 0.4012        | 32.0  | 12736 | 0.6562          | 0.1735 | 19.0    |
| 0.3858        | 33.0  | 13134 | 0.6581          | 0.18   | 19.0    |
| 0.3753        | 34.0  | 13532 | 0.6714          | 0.1837 | 19.0    |
| 0.3753        | 35.0  | 13930 | 0.6750          | 0.177  | 19.0    |
| 0.3613        | 36.0  | 14328 | 0.6773          | 0.177  | 19.0    |
| 0.3493        | 37.0  | 14726 | 0.6915          | 0.1859 | 19.0    |
| 0.339         | 38.0  | 15124 | 0.7032          | 0.1756 | 19.0    |
| 0.3263        | 39.0  | 15522 | 0.7003          | 0.1844 | 19.0    |
| 0.3263        | 40.0  | 15920 | 0.7169          | 0.1795 | 19.0    |
| 0.3153        | 41.0  | 16318 | 0.7181          | 0.1903 | 19.0    |
| 0.3047        | 42.0  | 16716 | 0.7283          | 0.1864 | 19.0    |
| 0.2933        | 43.0  | 17114 | 0.7462          | 0.188  | 19.0    |
| 0.2888        | 44.0  | 17512 | 0.7420          | 0.1841 | 19.0    |
| 0.2888        | 45.0  | 17910 | 0.7574          | 0.1748 | 19.0    |
| 0.2762        | 46.0  | 18308 | 0.7617          | 0.1747 | 19.0    |
| 0.2671        | 47.0  | 18706 | 0.7678          | 0.1743 | 19.0    |
| 0.2585        | 48.0  | 19104 | 0.7697          | 0.1902 | 19.0    |
| 0.252         | 49.0  | 19502 | 0.7865          | 0.208  | 19.0    |
| 0.252         | 50.0  | 19900 | 0.8059          | 0.1777 | 19.0    |
| 0.2411        | 51.0  | 20298 | 0.7906          | 0.212  | 19.0    |
| 0.2358        | 52.0  | 20696 | 0.8143          | 0.1778 | 19.0    |
| 0.2273        | 53.0  | 21094 | 0.8184          | 0.218  | 19.0    |
| 0.2273        | 54.0  | 21492 | 0.8261          | 0.2243 | 19.0    |
| 0.223         | 55.0  | 21890 | 0.8429          | 0.2196 | 19.0    |
| 0.2131        | 56.0  | 22288 | 0.8475          | 0.2402 | 19.0    |
| 0.2083        | 57.0  | 22686 | 0.8618          | 0.2163 | 19.0    |
| 0.202         | 58.0  | 23084 | 0.8572          | 0.2164 | 19.0    |
| 0.202         | 59.0  | 23482 | 0.8736          | 0.217  | 19.0    |
| 0.1968        | 60.0  | 23880 | 0.8894          | 0.2166 | 19.0    |
| 0.1904        | 61.0  | 24278 | 0.8928          | 0.2241 | 19.0    |
| 0.1847        | 62.0  | 24676 | 0.9058          | 0.2219 | 19.0    |
| 0.1803        | 63.0  | 25074 | 0.9057          | 0.2336 | 19.0    |
| 0.1803        | 64.0  | 25472 | 0.9174          | 0.2156 | 19.0    |
| 0.1758        | 65.0  | 25870 | 0.9230          | 0.1951 | 19.0    |
| 0.1701        | 66.0  | 26268 | 0.9350          | 0.2249 | 19.0    |
| 0.1673        | 67.0  | 26666 | 0.9417          | 0.2224 | 19.0    |
| 0.1614        | 68.0  | 27064 | 0.9509          | 0.2161 | 19.0    |
| 0.1614        | 69.0  | 27462 | 0.9653          | 0.2183 | 19.0    |
| 0.1578        | 70.0  | 27860 | 0.9633          | 0.2113 | 19.0    |
| 0.1536        | 71.0  | 28258 | 0.9783          | 0.2177 | 19.0    |
| 0.1513        | 72.0  | 28656 | 0.9755          | 0.2179 | 19.0    |
| 0.147         | 73.0  | 29054 | 0.9911          | 0.2273 | 19.0    |
| 0.147         | 74.0  | 29452 | 0.9855          | 0.2157 | 19.0    |
| 0.1443        | 75.0  | 29850 | 0.9998          | 0.2169 | 19.0    |
| 0.1401        | 76.0  | 30248 | 1.0128          | 0.2124 | 19.0    |
| 0.1377        | 77.0  | 30646 | 1.0114          | 0.2159 | 19.0    |
| 0.1342        | 78.0  | 31044 | 1.0249          | 0.2152 | 19.0    |
| 0.1342        | 79.0  | 31442 | 1.0258          | 0.2233 | 19.0    |
| 0.1336        | 80.0  | 31840 | 1.0309          | 0.2194 | 19.0    |
| 0.1307        | 81.0  | 32238 | 1.0321          | 0.2122 | 19.0    |
| 0.1277        | 82.0  | 32636 | 1.0340          | 0.2191 | 19.0    |
| 0.1262        | 83.0  | 33034 | 1.0493          | 0.2123 | 19.0    |
| 0.1262        | 84.0  | 33432 | 1.0545          | 0.2273 | 19.0    |
| 0.1233        | 85.0  | 33830 | 1.0550          | 0.2184 | 19.0    |
| 0.1233        | 86.0  | 34228 | 1.0546          | 0.2241 | 19.0    |
| 0.1205        | 87.0  | 34626 | 1.0696          | 0.2246 | 19.0    |
| 0.1189        | 88.0  | 35024 | 1.0730          | 0.2237 | 19.0    |
| 0.1189        | 89.0  | 35422 | 1.0688          | 0.2308 | 19.0    |
| 0.1173        | 90.0  | 35820 | 1.0783          | 0.2267 | 19.0    |
| 0.1154        | 91.0  | 36218 | 1.0767          | 0.2262 | 19.0    |
| 0.115         | 92.0  | 36616 | 1.0835          | 0.2214 | 19.0    |
| 0.1136        | 93.0  | 37014 | 1.0788          | 0.2284 | 19.0    |
| 0.1136        | 94.0  | 37412 | 1.0876          | 0.2269 | 19.0    |
| 0.1126        | 95.0  | 37810 | 1.0936          | 0.2212 | 19.0    |
| 0.1118        | 96.0  | 38208 | 1.0918          | 0.2207 | 19.0    |
| 0.111         | 97.0  | 38606 | 1.0944          | 0.2217 | 19.0    |
| 0.1106        | 98.0  | 39004 | 1.0962          | 0.2203 | 19.0    |
| 0.1106        | 99.0  | 39402 | 1.0994          | 0.2182 | 19.0    |
| 0.1088        | 100.0 | 39800 | 1.0999          | 0.2193 | 19.0    |
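
The Bleu and Gen Len columns follow the conventions of the Transformers translation examples. A hedged sketch of how such metrics are typically computed with the evaluate library (the exact metric code behind this card is not stated):

```python
import evaluate  # requires: pip install evaluate sacrebleu

sacrebleu = evaluate.load("sacrebleu")

# Placeholder data: decoded model outputs and their reference translations.
predictions = ["hypothetical model output"]
references = [["hypothetical reference translation"]]

bleu = sacrebleu.compute(predictions=predictions, references=references)["score"]
# Gen Len is conventionally the mean length of the generated sequences in tokens;
# for ByT5 a token is a byte, approximated here by UTF-8 byte length.
gen_len = sum(len(p.encode("utf-8")) for p in predictions) / len(predictions)
print(f"Bleu: {bleu:.4f}  Gen Len: {gen_len:.3f}")
```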

Framework versions

  • Transformers 4.29.2
  • Pytorch 2.0.1+cu118
  • Datasets 2.12.0
  • Tokenizers 0.13.3