
latin_gpt2

This model is a fine-tuned version of an unspecified base model, trained on an undocumented dataset. It achieves the following results on the evaluation set:

  • Loss: 5.4614
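
Assuming this is the standard token-level cross-entropy (in nats) reported by the Hugging Face Trainer, it corresponds to a perplexity of exp(5.4614) ≈ 235, as this short check shows:

```python
import math

eval_loss = 5.4614  # final validation loss reported above (nats per token)
perplexity = math.exp(eval_loss)
print(f"perplexity ≈ {perplexity:.1f}")  # ≈ 235.4
```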

Model description

Not yet documented in detail. Judging by the model name and the checkpoint itself, this is a GPT-2-style causal language model for Latin with about 93.4M parameters, stored as F32 safetensors.

Intended uses & limitations

More information needed
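
Until the intended uses are documented, the model can presumably be loaded like any causal language model on the Hub. A minimal sketch, assuming the hypothetical repo id "latin_gpt2" (substitute the model's actual Hub path):

```python
from transformers import pipeline

# "latin_gpt2" is a hypothetical repo id; replace it with the real Hub path.
generator = pipeline("text-generation", model="latin_gpt2")
result = generator("Gallia est omnis divisa in partes tres", max_new_tokens=30)
print(result[0]["generated_text"])
```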

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training; a hedged TrainingArguments sketch follows the list:

  • learning_rate: 0.0005
  • train_batch_size: 512
  • eval_batch_size: 512
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_steps: 3000
  • num_epochs: 1
  • mixed_precision_training: Native AMP
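
As a sketch only, these settings map onto Hugging Face TrainingArguments roughly as follows. The output directory, the single-device batch-size reading (the card reports a total train batch size of 512), and the 500-step evaluation cadence (inferred from the results table below) are assumptions, since the card does not record them:

```python
from transformers import TrainingArguments

# Mirrors the hyperparameters listed above (Transformers 4.41.x API).
args = TrainingArguments(
    output_dir="latin_gpt2",            # hypothetical output directory
    learning_rate=5e-4,
    per_device_train_batch_size=512,    # assumes a single device
    per_device_eval_batch_size=512,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_steps=3000,
    num_train_epochs=1,
    fp16=True,                          # "Native AMP" mixed precision
    evaluation_strategy="steps",        # assumed from the 500-step eval log
    eval_steps=500,
    logging_steps=500,
)
```

These arguments would then be passed to a Trainer together with the model, tokenizer, and datasets, none of which the card documents.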

Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 7.172 | 0.0057 | 500 | 7.1975 |
| 6.7223 | 0.0115 | 1000 | 6.9021 |
| 6.4626 | 0.0172 | 1500 | 6.6753 |
| 6.2743 | 0.0230 | 2000 | 6.5041 |
| 6.1669 | 0.0287 | 2500 | 6.3793 |
| 6.1212 | 0.0344 | 3000 | 6.3379 |
| 6.0511 | 0.0402 | 3500 | 6.3103 |
| 5.9919 | 0.0459 | 4000 | 6.2384 |
| 5.9364 | 0.0516 | 4500 | 6.2122 |
| 5.887 | 0.0574 | 5000 | 6.1205 |
| 5.8564 | 0.0631 | 5500 | 6.1243 |
| 5.8264 | 0.0689 | 6000 | 6.0895 |
| 5.7942 | 0.0746 | 6500 | 6.0713 |
| 5.7826 | 0.0803 | 7000 | 6.0359 |
| 5.7475 | 0.0861 | 7500 | 6.0351 |
| 5.7329 | 0.0918 | 8000 | 6.0039 |
| 5.7127 | 0.0975 | 8500 | 5.9915 |
| 5.6971 | 0.1033 | 9000 | 5.9835 |
| 5.6825 | 0.1090 | 9500 | 5.9810 |
| 5.6695 | 0.1148 | 10000 | 5.9542 |
| 5.6537 | 0.1205 | 10500 | 5.9367 |
| 5.6455 | 0.1262 | 11000 | 5.9178 |
| 5.6254 | 0.1320 | 11500 | 5.9097 |
| 5.62 | 0.1377 | 12000 | 5.9092 |
| 5.608 | 0.1434 | 12500 | 5.8881 |
| 5.5993 | 0.1492 | 13000 | 5.8807 |
| 5.5891 | 0.1549 | 13500 | 5.8707 |
| 5.5791 | 0.1607 | 14000 | 5.8809 |
| 5.5701 | 0.1664 | 14500 | 5.8585 |
| 5.5651 | 0.1721 | 15000 | 5.8436 |
| 5.5595 | 0.1779 | 15500 | 5.8607 |
| 5.5509 | 0.1836 | 16000 | 5.8308 |
| 5.5401 | 0.1894 | 16500 | 5.8381 |
| 5.535 | 0.1951 | 17000 | 5.8749 |
| 5.5281 | 0.2008 | 17500 | 5.8331 |
| 5.5231 | 0.2066 | 18000 | 5.8139 |
| 5.5148 | 0.2123 | 18500 | 5.8078 |
| 5.5112 | 0.2180 | 19000 | 5.8016 |
| 5.5049 | 0.2238 | 19500 | 5.8034 |
| 5.5006 | 0.2295 | 20000 | 5.8025 |
| 5.4909 | 0.2353 | 20500 | 5.8017 |
| 5.4835 | 0.2410 | 21000 | 5.7782 |
| 5.4841 | 0.2467 | 21500 | 5.7862 |
| 5.4794 | 0.2525 | 22000 | 5.7690 |
| 5.476 | 0.2582 | 22500 | 5.7689 |
| 5.4668 | 0.2639 | 23000 | 5.7806 |
| 5.4585 | 0.2697 | 23500 | 5.7678 |
| 5.4573 | 0.2754 | 24000 | 5.7499 |
| 5.4551 | 0.2812 | 24500 | 5.7696 |
| 5.451 | 0.2869 | 25000 | 5.7564 |
| 5.4465 | 0.2926 | 25500 | 5.7508 |
| 5.4396 | 0.2984 | 26000 | 5.7414 |
| 5.4356 | 0.3041 | 26500 | 5.7354 |
| 5.4321 | 0.3098 | 27000 | 5.7471 |
| 5.427 | 0.3156 | 27500 | 5.7296 |
| 5.4242 | 0.3213 | 28000 | 5.7294 |
| 5.4192 | 0.3271 | 28500 | 5.7252 |
| 5.4168 | 0.3328 | 29000 | 5.7183 |
| 5.4135 | 0.3385 | 29500 | 5.7241 |
| 5.4077 | 0.3443 | 30000 | 5.7148 |
| 5.4051 | 0.3500 | 30500 | 5.7215 |
| 5.3994 | 0.3557 | 31000 | 5.7140 |
| 5.3992 | 0.3615 | 31500 | 5.7079 |
| 5.3902 | 0.3672 | 32000 | 5.7057 |
| 5.3848 | 0.3730 | 32500 | 5.7047 |
| 5.3865 | 0.3787 | 33000 | 5.6973 |
| 5.3824 | 0.3844 | 33500 | 5.6938 |
| 5.3769 | 0.3902 | 34000 | 5.6950 |
| 5.3733 | 0.3959 | 34500 | 5.6885 |
| 5.3694 | 0.4017 | 35000 | 5.6819 |
| 5.3638 | 0.4074 | 35500 | 5.6770 |
| 5.3611 | 0.4131 | 36000 | 5.6819 |
| 5.3615 | 0.4189 | 36500 | 5.6705 |
| 5.354 | 0.4246 | 37000 | 5.6757 |
| 5.3522 | 0.4303 | 37500 | 5.6718 |
| 5.3422 | 0.4361 | 38000 | 5.6679 |
| 5.343 | 0.4418 | 38500 | 5.6655 |
| 5.3434 | 0.4476 | 39000 | 5.6591 |
| 5.3385 | 0.4533 | 39500 | 5.6608 |
| 5.3325 | 0.4590 | 40000 | 5.6629 |
| 5.3315 | 0.4648 | 40500 | 5.6581 |
| 5.3317 | 0.4705 | 41000 | 5.6534 |
| 5.3275 | 0.4762 | 41500 | 5.6447 |
| 5.3202 | 0.4820 | 42000 | 5.6451 |
| 5.3149 | 0.4877 | 42500 | 5.6348 |
| 5.313 | 0.4935 | 43000 | 5.6366 |
| 5.3122 | 0.4992 | 43500 | 5.6384 |
| 5.3065 | 0.5049 | 44000 | 5.6326 |
| 5.299 | 0.5107 | 44500 | 5.6226 |
| 5.2997 | 0.5164 | 45000 | 5.6301 |
| 5.2959 | 0.5221 | 45500 | 5.6172 |
| 5.2907 | 0.5279 | 46000 | 5.6232 |
| 5.2889 | 0.5336 | 46500 | 5.6239 |
| 5.288 | 0.5394 | 47000 | 5.6069 |
| 5.277 | 0.5451 | 47500 | 5.6154 |
| 5.2765 | 0.5508 | 48000 | 5.6157 |
| 5.2739 | 0.5566 | 48500 | 5.6035 |
| 5.2693 | 0.5623 | 49000 | 5.6009 |
| 5.2635 | 0.5681 | 49500 | 5.5978 |
| 5.2682 | 0.5738 | 50000 | 5.5987 |
| 5.258 | 0.5795 | 50500 | 5.5971 |
| 5.26 | 0.5853 | 51000 | 5.5994 |
| 5.2568 | 0.5910 | 51500 | 5.5873 |
| 5.2446 | 0.5967 | 52000 | 5.5771 |
| 5.2469 | 0.6025 | 52500 | 5.5824 |
| 5.2459 | 0.6082 | 53000 | 5.5853 |
| 5.239 | 0.6140 | 53500 | 5.5781 |
| 5.2355 | 0.6197 | 54000 | 5.5729 |
| 5.2296 | 0.6254 | 54500 | 5.5737 |
| 5.2301 | 0.6312 | 55000 | 5.5656 |
| 5.2277 | 0.6369 | 55500 | 5.5716 |
| 5.2197 | 0.6426 | 56000 | 5.5583 |
| 5.2131 | 0.6484 | 56500 | 5.5639 |
| 5.2132 | 0.6541 | 57000 | 5.5537 |
| 5.2103 | 0.6599 | 57500 | 5.5656 |
| 5.2078 | 0.6656 | 58000 | 5.5524 |
| 5.203 | 0.6713 | 58500 | 5.5470 |
| 5.2035 | 0.6771 | 59000 | 5.5454 |
| 5.1963 | 0.6828 | 59500 | 5.5428 |
| 5.1932 | 0.6885 | 60000 | 5.5355 |
| 5.1906 | 0.6943 | 60500 | 5.5338 |
| 5.1864 | 0.7000 | 61000 | 5.5352 |
| 5.1823 | 0.7058 | 61500 | 5.5295 |
| 5.179 | 0.7115 | 62000 | 5.5296 |
| 5.1752 | 0.7172 | 62500 | 5.5259 |
| 5.1726 | 0.7230 | 63000 | 5.5291 |
| 5.1692 | 0.7287 | 63500 | 5.5183 |
| 5.1694 | 0.7345 | 64000 | 5.5173 |
| 5.1624 | 0.7402 | 64500 | 5.5162 |
| 5.161 | 0.7459 | 65000 | 5.5104 |
| 5.1588 | 0.7517 | 65500 | 5.5145 |
| 5.156 | 0.7574 | 66000 | 5.5057 |
| 5.1539 | 0.7631 | 66500 | 5.5036 |
| 5.1474 | 0.7689 | 67000 | 5.5093 |
| 5.1457 | 0.7746 | 67500 | 5.5052 |
| 5.1444 | 0.7804 | 68000 | 5.4979 |
| 5.1437 | 0.7861 | 68500 | 5.4979 |
| 5.1402 | 0.7918 | 69000 | 5.5009 |
| 5.1346 | 0.7976 | 69500 | 5.4907 |
| 5.1308 | 0.8033 | 70000 | 5.4905 |
| 5.1325 | 0.8090 | 70500 | 5.4880 |
| 5.1297 | 0.8148 | 71000 | 5.4866 |
| 5.1246 | 0.8205 | 71500 | 5.4871 |
| 5.1232 | 0.8263 | 72000 | 5.4846 |
| 5.1232 | 0.8320 | 72500 | 5.4840 |
| 5.1219 | 0.8377 | 73000 | 5.4811 |
| 5.1126 | 0.8435 | 73500 | 5.4791 |
| 5.1168 | 0.8492 | 74000 | 5.4782 |
| 5.1175 | 0.8549 | 74500 | 5.4763 |
| 5.1108 | 0.8607 | 75000 | 5.4771 |
| 5.1082 | 0.8664 | 75500 | 5.4742 |
| 5.1053 | 0.8722 | 76000 | 5.4738 |
| 5.107 | 0.8779 | 76500 | 5.4718 |
| 5.1074 | 0.8836 | 77000 | 5.4694 |
| 5.1074 | 0.8894 | 77500 | 5.4692 |
| 5.1038 | 0.8951 | 78000 | 5.4697 |
| 5.105 | 0.9008 | 78500 | 5.4684 |
| 5.1019 | 0.9066 | 79000 | 5.4665 |
| 5.1015 | 0.9123 | 79500 | 5.4663 |
| 5.101 | 0.9181 | 80000 | 5.4672 |
| 5.1022 | 0.9238 | 80500 | 5.4654 |
| 5.1 | 0.9295 | 81000 | 5.4632 |
| 5.0981 | 0.9353 | 81500 | 5.4637 |
| 5.0981 | 0.9410 | 82000 | 5.4630 |
| 5.0951 | 0.9468 | 82500 | 5.4619 |
| 5.0941 | 0.9525 | 83000 | 5.4628 |
| 5.0947 | 0.9582 | 83500 | 5.4621 |
| 5.0949 | 0.9640 | 84000 | 5.4625 |
| 5.0971 | 0.9697 | 84500 | 5.4618 |
| 5.0895 | 0.9754 | 85000 | 5.4619 |
| 5.0937 | 0.9812 | 85500 | 5.4616 |
| 5.0978 | 0.9869 | 86000 | 5.4615 |
| 5.0958 | 0.9927 | 86500 | 5.4614 |
| 5.095 | 0.9984 | 87000 | 5.4614 |

Framework versions

  • Transformers 4.41.2
  • PyTorch 2.2.0
  • Datasets 2.20.0
  • Tokenizers 0.19.1
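
To check that a local environment matches these pinned versions, a small hedged snippet (assuming all four packages are installed):

```python
import importlib.metadata as md

# Versions the card reports for training; compare against the local install.
pinned = {
    "transformers": "4.41.2",
    "torch": "2.2.0",
    "datasets": "2.20.0",
    "tokenizers": "0.19.1",
}
for pkg, want in pinned.items():
    have = md.version(pkg)
    status = "OK" if have == want else f"mismatch (card used {want})"
    print(f"{pkg}: {have} -> {status}")
```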