enlm-r
This model is a fine-tuned version of an unspecified base model on an unknown dataset. It achieves the following results on the evaluation set:
- Loss: 1.4837
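For language models, evaluation loss is often easier to read as perplexity. The sketch below assumes the reported loss is a mean token-level cross-entropy in nats; the card does not state this, so treat the result as illustrative only.

```python
import math

# Evaluation loss reported above.
eval_loss = 1.4837

# Assumption: the loss is mean cross-entropy in nats, so perplexity = exp(loss).
perplexity = math.exp(eval_loss)
print(f"perplexity ~ {perplexity:.2f}")  # about 4.41
```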
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.0006
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- distributed_type: multi-GPU
- num_devices: 4
- gradient_accumulation_steps: 128
- total_train_batch_size: 8192
- total_eval_batch_size: 64
- optimizer: Adam with betas=(0.9,0.98) and epsilon=1e-06
- lr_scheduler_type: polynomial
- lr_scheduler_warmup_steps: 24000
- num_epochs: 81
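The two total batch sizes follow from the per-device values, and the schedule can be sketched in a few lines. The snippet below is a minimal illustration in plain Python, not the training code itself: `polynomial_lr` is a hypothetical helper, and `total_steps`, `end_lr`, and `power` are assumptions because the card does not list them.

```python
# Values copied from the hyperparameter list above.
train_batch_size = 16            # per-device training batch size
eval_batch_size = 16             # per-device evaluation batch size
num_devices = 4
gradient_accumulation_steps = 128
learning_rate = 6e-4
warmup_steps = 24_000

# Gradient accumulation only affects training, not evaluation.
total_train_batch_size = train_batch_size * num_devices * gradient_accumulation_steps
total_eval_batch_size = eval_batch_size * num_devices


def polynomial_lr(step, total_steps, peak_lr=learning_rate,
                  warmup=warmup_steps, end_lr=1e-7, power=1.0):
    """Linear warmup to peak_lr, then polynomial decay towards end_lr.

    total_steps, end_lr and power are assumed values, not taken from the card.
    """
    if step < warmup:
        return peak_lr * step / max(1, warmup)
    if step >= total_steps:
        return end_lr
    remaining = 1 - (step - warmup) / (total_steps - warmup)
    return (peak_lr - end_lr) * remaining ** power + end_lr


print(total_train_batch_size, total_eval_batch_size)
```

With these numbers, 16 × 4 × 128 gives the listed 8192 and 16 × 4 gives 64. Under the assumed schedule, the learning rate reaches its peak of 0.0006 at step 24000 and decays afterwards.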
Training results
Training Loss | Epoch | Step | Validation Loss |
---|---|---|---|
6.4 | 0.33 | 160 | 10.7903 |
6.4 | 0.66 | 320 | 10.1431 |
6.4 | 0.99 | 480 | 9.8708 |
6.4 | 0.33 | 640 | 9.3884 |
6.4 | 0.66 | 800 | 8.7352 |
6.4 | 0.99 | 960 | 8.3341 |
6.4 | 1.33 | 1120 | 8.0614 |
6.4 | 1.66 | 1280 | 7.8582 |
4.2719 | 1.99 | 1440 | 7.4879 |
3.2 | 3.3 | 1600 | 7.2689 |
3.2 | 3.63 | 1760 | 7.1434 |
3.2 | 3.96 | 1920 | 7.0576 |
3.2 | 4.29 | 2080 | 7.0030 |
3.2 | 4.62 | 2240 | 6.9612 |
3.2 | 4.95 | 2400 | 6.9394 |
3.2 | 5.28 | 2560 | 6.9559 |
3.2 | 5.61 | 2720 | 6.8964 |
3.2 | 5.94 | 2880 | 6.8939 |
3.2 | 6.27 | 3040 | 6.8871 |
3.2 | 6.6 | 3200 | 6.8771 |
3.2 | 6.93 | 3360 | 6.8617 |
3.2 | 7.26 | 3520 | 6.8472 |
3.2 | 7.59 | 3680 | 6.8283 |
3.2 | 7.92 | 3840 | 6.8082 |
3.2 | 8.25 | 4000 | 6.8119 |
3.2 | 8.58 | 4160 | 6.7962 |
3.2 | 8.91 | 4320 | 6.7751 |
3.2 | 9.24 | 4480 | 6.7405 |
3.2 | 9.57 | 4640 | 6.7412 |
3.2 | 9.9 | 4800 | 6.7279 |
3.2 | 10.22 | 4960 | 6.7069 |
3.2 | 10.55 | 5120 | 6.6998 |
3.2 | 10.88 | 5280 | 6.6875 |
3.2 | 11.22 | 5440 | 6.6580 |
3.2 | 11.55 | 5600 | 6.6402 |
3.2 | 11.88 | 5760 | 6.6281 |
3.2 | 12.21 | 5920 | 6.6181 |
3.2 | 12.54 | 6080 | 6.5995 |
3.2 | 12.87 | 6240 | 6.5970 |
3.2 | 13.2 | 6400 | 6.5772 |
3.2 | 13.53 | 6560 | 6.5594 |
3.2 | 13.85 | 6720 | 6.5400 |
3.2 | 14.19 | 6880 | 6.5396 |
3.2 | 14.51 | 7040 | 6.5211 |
3.2 | 14.84 | 7200 | 6.5140 |
3.2 | 15.18 | 7360 | 6.4002 |
3.2 | 15.5 | 7520 | 6.3170 |
3.2 | 15.83 | 7680 | 6.2621 |
3.2 | 16.16 | 7840 | 6.2253 |
3.2 | 16.49 | 8000 | 6.1722 |
3.2 | 16.82 | 8160 | 6.1106 |
3.2 | 17.15 | 8320 | 6.1281 |
3.2 | 17.48 | 8480 | 6.0019 |
3.2 | 17.81 | 8640 | 5.9069 |
3.2 | 18.14 | 8800 | 5.7105 |
3.2 | 18.47 | 8960 | 5.2741 |
3.2 | 18.8 | 9120 | 5.0369 |
5.0352 | 19.13 | 9280 | 4.8148 |
4.5102 | 19.26 | 9440 | 4.3175 |
4.1247 | 19.59 | 9600 | 3.9518 |
3.8443 | 20.12 | 9760 | 3.6712 |
3.6334 | 20.45 | 9920 | 3.4654 |
3.4698 | 20.78 | 10080 | 3.2994 |
3.3267 | 21.11 | 10240 | 3.1638 |
3.2173 | 21.44 | 10400 | 3.0672 |
3.1255 | 21.77 | 10560 | 2.9687 |
3.0344 | 22.1 | 10720 | 2.8865 |
2.9645 | 22.43 | 10880 | 2.8104 |
2.9046 | 22.76 | 11040 | 2.7497 |
2.8707 | 23.09 | 11200 | 2.7040 |
2.7903 | 23.42 | 11360 | 2.6416 |
2.7339 | 23.75 | 11520 | 2.5891 |
2.6894 | 24.08 | 11680 | 2.5370 |
2.6461 | 24.41 | 11840 | 2.4960 |
2.5976 | 24.74 | 12000 | 2.4508 |
2.5592 | 25.07 | 12160 | 2.4194 |
2.5305 | 25.4 | 12320 | 2.3790 |
2.4993 | 25.73 | 12480 | 2.3509 |
2.465 | 26.06 | 12640 | 2.3173 |
2.4455 | 26.39 | 12800 | 2.2934 |
2.4107 | 26.72 | 12960 | 2.2701 |
2.3883 | 27.05 | 13120 | 2.2378 |
2.3568 | 27.38 | 13280 | 2.2079 |
2.3454 | 27.71 | 13440 | 2.1919 |
2.3207 | 28.04 | 13600 | 2.1671 |
2.2963 | 28.37 | 13760 | 2.1513 |
2.2738 | 28.7 | 13920 | 2.1326 |
2.2632 | 29.03 | 14080 | 2.1176 |
2.2413 | 29.36 | 14240 | 2.0913 |
2.2193 | 29.69 | 14400 | 2.0772 |
2.2169 | 30.02 | 14560 | 2.0692 |
2.1848 | 30.35 | 14720 | 2.0411 |
2.1693 | 30.68 | 14880 | 2.0290 |
2.1964 | 31.01 | 15040 | 2.0169 |
2.1467 | 31.34 | 15200 | 2.0016 |
2.1352 | 31.67 | 15360 | 1.9880 |
2.1152 | 32.0 | 15520 | 1.9727 |
2.1098 | 32.33 | 15680 | 1.9604 |
2.0888 | 32.66 | 15840 | 1.9521 |
2.0837 | 32.99 | 16000 | 1.9394 |
2.0761 | 33.32 | 16160 | 1.9366 |
2.0635 | 33.65 | 16320 | 1.9200 |
2.0631 | 33.98 | 16480 | 1.9147 |
2.0448 | 34.31 | 16640 | 1.9053 |
2.0452 | 34.64 | 16800 | 1.8937 |
2.0303 | 34.97 | 16960 | 1.8801 |
2.0184 | 35.3 | 17120 | 1.8752 |
2.0115 | 35.63 | 17280 | 1.8667 |
2.0042 | 35.96 | 17440 | 1.8626 |
2.002 | 36.29 | 17600 | 1.8565 |
1.9918 | 36.62 | 17760 | 1.8475 |
1.9868 | 36.95 | 17920 | 1.8420 |
1.9796 | 37.28 | 18080 | 1.8376 |
1.976 | 37.61 | 18240 | 1.8318 |
1.9647 | 37.94 | 18400 | 1.8225 |
1.9561 | 38.27 | 18560 | 1.8202 |
1.9544 | 38.6 | 18720 | 1.8084 |
1.9454 | 38.93 | 18880 | 1.8057 |
1.9333 | 39.26 | 19040 | 1.8030 |
1.9411 | 39.59 | 19200 | 1.7966 |
1.9289 | 39.92 | 19360 | 1.7865 |
1.9261 | 40.25 | 19520 | 1.7815 |
1.9207 | 40.58 | 19680 | 1.7881 |
1.9164 | 40.91 | 19840 | 1.7747 |
1.9152 | 41.24 | 20000 | 1.7786 |
1.914 | 41.57 | 20160 | 1.7664 |
1.901 | 41.9 | 20320 | 1.7586 |
1.8965 | 42.23 | 20480 | 1.7554 |
1.8982 | 42.56 | 20640 | 1.7524 |
1.8941 | 42.89 | 20800 | 1.7460 |
1.8834 | 43.22 | 20960 | 1.7488 |
1.8841 | 43.55 | 21120 | 1.7486 |
1.8846 | 43.88 | 21280 | 1.7424 |
1.8763 | 44.21 | 21440 | 1.7352 |
1.8688 | 44.54 | 21600 | 1.7349 |
1.8714 | 44.87 | 21760 | 1.7263 |
1.8653 | 45.2 | 21920 | 1.7282 |
1.8673 | 45.53 | 22080 | 1.7195 |
1.8682 | 45.85 | 22240 | 1.7266 |
1.8532 | 46.19 | 22400 | 1.7180 |
1.8553 | 46.51 | 22560 | 1.7137 |
1.8569 | 46.84 | 22720 | 1.7158 |
1.8469 | 47.18 | 22880 | 1.7117 |
1.845 | 47.5 | 23040 | 1.7031 |
1.8475 | 47.83 | 23200 | 1.7089 |
1.845 | 48.16 | 23360 | 1.7018 |
1.8391 | 48.49 | 23520 | 1.6945 |
1.8456 | 48.82 | 23680 | 1.7015 |
1.8305 | 49.15 | 23840 | 1.6964 |
1.8324 | 49.48 | 24000 | 1.6900 |
1.7763 | 49.81 | 24160 | 1.6449 |
1.7728 | 50.14 | 24320 | 1.6436 |
1.7576 | 50.47 | 24480 | 1.6268 |
1.7354 | 50.8 | 24640 | 1.6088 |
1.74 | 51.13 | 24800 | 1.6156 |
1.7251 | 51.06 | 24960 | 1.6041 |
1.719 | 51.39 | 25120 | 1.5938 |
1.7257 | 52.12 | 25280 | 1.5983 |
1.7184 | 52.45 | 25440 | 1.5919 |
1.7093 | 52.78 | 25600 | 1.5848 |
1.7114 | 53.11 | 25760 | 1.5922 |
1.7076 | 53.44 | 25920 | 1.5843 |
1.7 | 53.77 | 26080 | 1.5807 |
1.7027 | 54.1 | 26240 | 1.5811 |
1.704 | 54.43 | 26400 | 1.5766 |
1.6958 | 54.76 | 26560 | 1.5756 |
1.6976 | 55.09 | 26720 | 1.5773 |
1.6944 | 55.42 | 26880 | 1.5725 |
1.6891 | 55.75 | 27040 | 1.5685 |
1.6936 | 56.08 | 27200 | 1.5750 |
1.6893 | 56.41 | 27360 | 1.5696 |
1.6886 | 56.74 | 27520 | 1.5643 |
1.6936 | 57.07 | 27680 | 1.5691 |
1.6883 | 57.4 | 27840 | 1.5718 |
1.6832 | 57.73 | 28000 | 1.5660 |
1.9222 | 28.03 | 28160 | 1.7107 |
1.7838 | 28.19 | 28320 | 1.6345 |
1.7843 | 28.36 | 28480 | 1.6445 |
1.7809 | 28.52 | 28640 | 1.6461 |
1.783 | 28.69 | 28800 | 1.6505 |
1.7869 | 28.85 | 28960 | 1.6364 |
1.778 | 29.02 | 29120 | 1.6363 |
1.775 | 29.18 | 29280 | 1.6364 |
1.7697 | 29.34 | 29440 | 1.6345 |
1.7719 | 29.51 | 29600 | 1.6261 |
1.7454 | 61.16 | 29760 | 1.6099 |
1.741 | 61.49 | 29920 | 1.6006 |
1.7314 | 62.02 | 30080 | 1.6041 |
1.7314 | 62.35 | 30240 | 1.5914 |
1.7246 | 62.68 | 30400 | 1.5917 |
1.7642 | 63.01 | 30560 | 1.5923 |
1.7221 | 63.34 | 30720 | 1.5857 |
1.7185 | 63.67 | 30880 | 1.5836 |
1.7022 | 64.0 | 31040 | 1.5836 |
1.7107 | 64.33 | 31200 | 1.5739 |
1.7082 | 64.66 | 31360 | 1.5724 |
1.7055 | 64.99 | 31520 | 1.5734 |
1.7019 | 65.32 | 31680 | 1.5707 |
1.699 | 65.65 | 31840 | 1.5649 |
1.6963 | 65.98 | 32000 | 1.5685 |
1.6935 | 66.31 | 32160 | 1.5673 |
1.6899 | 66.64 | 32320 | 1.5648 |
1.6869 | 66.97 | 32480 | 1.5620 |
1.6867 | 67.3 | 32640 | 1.5564 |
1.6861 | 67.63 | 32800 | 1.5552 |
1.6831 | 67.96 | 32960 | 1.5496 |
1.6778 | 68.29 | 33120 | 1.5479 |
1.6742 | 68.62 | 33280 | 1.5501 |
1.6737 | 68.95 | 33440 | 1.5441 |
1.6725 | 69.28 | 33600 | 1.5399 |
1.6683 | 69.61 | 33760 | 1.5398 |
1.6689 | 69.94 | 33920 | 1.5374 |
1.6634 | 70.27 | 34080 | 1.5385 |
1.6638 | 70.6 | 34240 | 1.5332 |
1.6614 | 70.93 | 34400 | 1.5329 |
1.6544 | 71.26 | 34560 | 1.5292 |
1.6532 | 71.59 | 34720 | 1.5268 |
1.6511 | 71.92 | 34880 | 1.5225 |
1.6506 | 72.25 | 35040 | 1.5219 |
1.6496 | 72.58 | 35200 | 1.5202 |
1.6468 | 72.91 | 35360 | 1.5199 |
1.6424 | 73.24 | 35520 | 1.5220 |
1.642 | 73.57 | 35680 | 1.5145 |
1.6415 | 73.9 | 35840 | 1.5139 |
1.6419 | 74.23 | 36000 | 1.5120 |
1.633 | 74.56 | 36160 | 1.5113 |
1.6354 | 74.89 | 36320 | 1.5139 |
1.6312 | 75.22 | 36480 | 1.5068 |
1.6298 | 75.55 | 36640 | 1.5056 |
1.6268 | 75.88 | 36800 | 1.5000 |
1.6277 | 76.21 | 36960 | 1.5033 |
1.6198 | 76.54 | 37120 | 1.4988 |
1.6246 | 76.87 | 37280 | 1.4978 |
1.6184 | 77.2 | 37440 | 1.4966 |
1.6187 | 77.53 | 37600 | 1.4954 |
1.6192 | 77.85 | 37760 | 1.4951 |
1.6134 | 78.19 | 37920 | 1.4936 |
1.6176 | 78.51 | 38080 | 1.4908 |
1.6103 | 78.84 | 38240 | 1.4904 |
1.612 | 79.18 | 38400 | 1.4919 |
1.611 | 79.5 | 38560 | 1.4891 |
1.6082 | 79.83 | 38720 | 1.4837 |
1.6047 | 80.16 | 38880 | 1.4859 |
1.6058 | 80.49 | 39040 | 1.4814 |
1.602 | 80.82 | 39200 | 1.4837 |
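To pick a checkpoint from a log like this, it can help to parse the rows and look up the lowest validation loss. The following is a small sketch with a hypothetical `parse_rows` helper; it embeds only the last four rows of the table so that it stays self-contained.

```python
def parse_rows(table_text):
    """Parse pipe-delimited rows into (train_loss, epoch, step, val_loss) tuples."""
    rows = []
    for line in table_text.strip().splitlines():
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        try:
            train_loss, epoch, step, val_loss = cells
            rows.append((float(train_loss), float(epoch), int(step), float(val_loss)))
        except ValueError:
            continue  # skip the header, the separator, and malformed rows
    return rows


# Last four rows of the table above, copied verbatim.
sample = """
Training Loss | Epoch | Step | Validation Loss |
---|---|---|---|
1.6082 | 79.83 | 38720 | 1.4837 |
1.6047 | 80.16 | 38880 | 1.4859 |
1.6058 | 80.49 | 39040 | 1.4814 |
1.602 | 80.82 | 39200 | 1.4837 |
"""

rows = parse_rows(sample)
best = min(rows, key=lambda r: r[3])  # row with the lowest validation loss
print(f"best step: {best[2]}, validation loss: {best[3]}")
```

Among these rows, the lowest validation loss is 1.4814 at step 39040, slightly below the 1.4837 logged at the final step.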
Framework versions
- Transformers 4.20.1
- PyTorch 1.11.0
- Datasets 2.3.2
- Tokenizers 0.12.1