
enlm-r

This model is a fine-tuned version of an unspecified base model, trained on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 1.4837
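
If this loss is the standard cross-entropy (in nats) of a language-modeling objective — an assumption, since the card does not state the training objective — it corresponds to a perplexity of roughly exp(1.4837) ≈ 4.41:

```python
import math

# Perplexity implied by the reported evaluation loss, assuming the loss is a
# mean negative log-likelihood in nats (the card does not confirm this).
eval_loss = 1.4837
print(math.exp(eval_loss))  # ~4.41
```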

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch reproducing them appears after the list):

  • learning_rate: 0.0006
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 4
  • gradient_accumulation_steps: 128
  • total_train_batch_size: 8192
  • total_eval_batch_size: 64
  • optimizer: Adam with betas=(0.9,0.98) and epsilon=1e-06
  • lr_scheduler_type: polynomial
  • lr_scheduler_warmup_steps: 24000
  • num_epochs: 81
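
As a rough sketch, these settings map onto a Transformers `TrainingArguments` configuration as follows. This is a minimal sketch assuming the standard Hugging Face `Trainer` was used (the card does not say); the output directory is a placeholder, and the 4-device multi-GPU launch itself would come from a distributed launcher such as `torchrun --nproc_per_node 4`, not from these arguments.

```python
from transformers import TrainingArguments

# Sketch of the reported hyperparameters (hypothetical output_dir).
# Effective train batch size: 16 per device x 4 devices x 128 accumulation
# steps = 8192, matching the card's total_train_batch_size.
training_args = TrainingArguments(
    output_dir="enlm-r-checkpoints",  # placeholder path
    learning_rate=6e-4,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    gradient_accumulation_steps=128,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.98,
    adam_epsilon=1e-6,
    lr_scheduler_type="polynomial",
    warmup_steps=24000,
    num_train_epochs=81,
)
```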

Training results

Note: the epoch counter in the log below restarts or jumps in several places (for example around steps 640, 1600, 24960, 28160, and 29760), which is likely an artifact of training being resumed or restarted from checkpoints; the values are reproduced exactly as logged.

| Training Loss | Epoch | Step | Validation Loss |
|---------------|-------|------|-----------------|
| 6.4 | 0.33 | 160 | 10.7903 |
| 6.4 | 0.66 | 320 | 10.1431 |
| 6.4 | 0.99 | 480 | 9.8708 |
| 6.4 | 0.33 | 640 | 9.3884 |
| 6.4 | 0.66 | 800 | 8.7352 |
| 6.4 | 0.99 | 960 | 8.3341 |
| 6.4 | 1.33 | 1120 | 8.0614 |
| 6.4 | 1.66 | 1280 | 7.8582 |
| 4.2719 | 1.99 | 1440 | 7.4879 |
| 3.2 | 3.3 | 1600 | 7.2689 |
| 3.2 | 3.63 | 1760 | 7.1434 |
| 3.2 | 3.96 | 1920 | 7.0576 |
| 3.2 | 4.29 | 2080 | 7.0030 |
| 3.2 | 4.62 | 2240 | 6.9612 |
| 3.2 | 4.95 | 2400 | 6.9394 |
| 3.2 | 5.28 | 2560 | 6.9559 |
| 3.2 | 5.61 | 2720 | 6.8964 |
| 3.2 | 5.94 | 2880 | 6.8939 |
| 3.2 | 6.27 | 3040 | 6.8871 |
| 3.2 | 6.6 | 3200 | 6.8771 |
| 3.2 | 6.93 | 3360 | 6.8617 |
| 3.2 | 7.26 | 3520 | 6.8472 |
| 3.2 | 7.59 | 3680 | 6.8283 |
| 3.2 | 7.92 | 3840 | 6.8082 |
| 3.2 | 8.25 | 4000 | 6.8119 |
| 3.2 | 8.58 | 4160 | 6.7962 |
| 3.2 | 8.91 | 4320 | 6.7751 |
| 3.2 | 9.24 | 4480 | 6.7405 |
| 3.2 | 9.57 | 4640 | 6.7412 |
| 3.2 | 9.9 | 4800 | 6.7279 |
| 3.2 | 10.22 | 4960 | 6.7069 |
| 3.2 | 10.55 | 5120 | 6.6998 |
| 3.2 | 10.88 | 5280 | 6.6875 |
| 3.2 | 11.22 | 5440 | 6.6580 |
| 3.2 | 11.55 | 5600 | 6.6402 |
| 3.2 | 11.88 | 5760 | 6.6281 |
| 3.2 | 12.21 | 5920 | 6.6181 |
| 3.2 | 12.54 | 6080 | 6.5995 |
| 3.2 | 12.87 | 6240 | 6.5970 |
| 3.2 | 13.2 | 6400 | 6.5772 |
| 3.2 | 13.53 | 6560 | 6.5594 |
| 3.2 | 13.85 | 6720 | 6.5400 |
| 3.2 | 14.19 | 6880 | 6.5396 |
| 3.2 | 14.51 | 7040 | 6.5211 |
| 3.2 | 14.84 | 7200 | 6.5140 |
| 3.2 | 15.18 | 7360 | 6.4002 |
| 3.2 | 15.5 | 7520 | 6.3170 |
| 3.2 | 15.83 | 7680 | 6.2621 |
| 3.2 | 16.16 | 7840 | 6.2253 |
| 3.2 | 16.49 | 8000 | 6.1722 |
| 3.2 | 16.82 | 8160 | 6.1106 |
| 3.2 | 17.15 | 8320 | 6.1281 |
| 3.2 | 17.48 | 8480 | 6.0019 |
| 3.2 | 17.81 | 8640 | 5.9069 |
| 3.2 | 18.14 | 8800 | 5.7105 |
| 3.2 | 18.47 | 8960 | 5.2741 |
| 3.2 | 18.8 | 9120 | 5.0369 |
| 5.0352 | 19.13 | 9280 | 4.8148 |
| 4.5102 | 19.26 | 9440 | 4.3175 |
| 4.1247 | 19.59 | 9600 | 3.9518 |
| 3.8443 | 20.12 | 9760 | 3.6712 |
| 3.6334 | 20.45 | 9920 | 3.4654 |
| 3.4698 | 20.78 | 10080 | 3.2994 |
| 3.3267 | 21.11 | 10240 | 3.1638 |
| 3.2173 | 21.44 | 10400 | 3.0672 |
| 3.1255 | 21.77 | 10560 | 2.9687 |
| 3.0344 | 22.1 | 10720 | 2.8865 |
| 2.9645 | 22.43 | 10880 | 2.8104 |
| 2.9046 | 22.76 | 11040 | 2.7497 |
| 2.8707 | 23.09 | 11200 | 2.7040 |
| 2.7903 | 23.42 | 11360 | 2.6416 |
| 2.7339 | 23.75 | 11520 | 2.5891 |
| 2.6894 | 24.08 | 11680 | 2.5370 |
| 2.6461 | 24.41 | 11840 | 2.4960 |
| 2.5976 | 24.74 | 12000 | 2.4508 |
| 2.5592 | 25.07 | 12160 | 2.4194 |
| 2.5305 | 25.4 | 12320 | 2.3790 |
| 2.4993 | 25.73 | 12480 | 2.3509 |
| 2.465 | 26.06 | 12640 | 2.3173 |
| 2.4455 | 26.39 | 12800 | 2.2934 |
| 2.4107 | 26.72 | 12960 | 2.2701 |
| 2.3883 | 27.05 | 13120 | 2.2378 |
| 2.3568 | 27.38 | 13280 | 2.2079 |
| 2.3454 | 27.71 | 13440 | 2.1919 |
| 2.3207 | 28.04 | 13600 | 2.1671 |
| 2.2963 | 28.37 | 13760 | 2.1513 |
| 2.2738 | 28.7 | 13920 | 2.1326 |
| 2.2632 | 29.03 | 14080 | 2.1176 |
| 2.2413 | 29.36 | 14240 | 2.0913 |
| 2.2193 | 29.69 | 14400 | 2.0772 |
| 2.2169 | 30.02 | 14560 | 2.0692 |
| 2.1848 | 30.35 | 14720 | 2.0411 |
| 2.1693 | 30.68 | 14880 | 2.0290 |
| 2.1964 | 31.01 | 15040 | 2.0169 |
| 2.1467 | 31.34 | 15200 | 2.0016 |
| 2.1352 | 31.67 | 15360 | 1.9880 |
| 2.1152 | 32.0 | 15520 | 1.9727 |
| 2.1098 | 32.33 | 15680 | 1.9604 |
| 2.0888 | 32.66 | 15840 | 1.9521 |
| 2.0837 | 32.99 | 16000 | 1.9394 |
| 2.0761 | 33.32 | 16160 | 1.9366 |
| 2.0635 | 33.65 | 16320 | 1.9200 |
| 2.0631 | 33.98 | 16480 | 1.9147 |
| 2.0448 | 34.31 | 16640 | 1.9053 |
| 2.0452 | 34.64 | 16800 | 1.8937 |
| 2.0303 | 34.97 | 16960 | 1.8801 |
| 2.0184 | 35.3 | 17120 | 1.8752 |
| 2.0115 | 35.63 | 17280 | 1.8667 |
| 2.0042 | 35.96 | 17440 | 1.8626 |
| 2.002 | 36.29 | 17600 | 1.8565 |
| 1.9918 | 36.62 | 17760 | 1.8475 |
| 1.9868 | 36.95 | 17920 | 1.8420 |
| 1.9796 | 37.28 | 18080 | 1.8376 |
| 1.976 | 37.61 | 18240 | 1.8318 |
| 1.9647 | 37.94 | 18400 | 1.8225 |
| 1.9561 | 38.27 | 18560 | 1.8202 |
| 1.9544 | 38.6 | 18720 | 1.8084 |
| 1.9454 | 38.93 | 18880 | 1.8057 |
| 1.9333 | 39.26 | 19040 | 1.8030 |
| 1.9411 | 39.59 | 19200 | 1.7966 |
| 1.9289 | 39.92 | 19360 | 1.7865 |
| 1.9261 | 40.25 | 19520 | 1.7815 |
| 1.9207 | 40.58 | 19680 | 1.7881 |
| 1.9164 | 40.91 | 19840 | 1.7747 |
| 1.9152 | 41.24 | 20000 | 1.7786 |
| 1.914 | 41.57 | 20160 | 1.7664 |
| 1.901 | 41.9 | 20320 | 1.7586 |
| 1.8965 | 42.23 | 20480 | 1.7554 |
| 1.8982 | 42.56 | 20640 | 1.7524 |
| 1.8941 | 42.89 | 20800 | 1.7460 |
| 1.8834 | 43.22 | 20960 | 1.7488 |
| 1.8841 | 43.55 | 21120 | 1.7486 |
| 1.8846 | 43.88 | 21280 | 1.7424 |
| 1.8763 | 44.21 | 21440 | 1.7352 |
| 1.8688 | 44.54 | 21600 | 1.7349 |
| 1.8714 | 44.87 | 21760 | 1.7263 |
| 1.8653 | 45.2 | 21920 | 1.7282 |
| 1.8673 | 45.53 | 22080 | 1.7195 |
| 1.8682 | 45.85 | 22240 | 1.7266 |
| 1.8532 | 46.19 | 22400 | 1.7180 |
| 1.8553 | 46.51 | 22560 | 1.7137 |
| 1.8569 | 46.84 | 22720 | 1.7158 |
| 1.8469 | 47.18 | 22880 | 1.7117 |
| 1.845 | 47.5 | 23040 | 1.7031 |
| 1.8475 | 47.83 | 23200 | 1.7089 |
| 1.845 | 48.16 | 23360 | 1.7018 |
| 1.8391 | 48.49 | 23520 | 1.6945 |
| 1.8456 | 48.82 | 23680 | 1.7015 |
| 1.8305 | 49.15 | 23840 | 1.6964 |
| 1.8324 | 49.48 | 24000 | 1.6900 |
| 1.7763 | 49.81 | 24160 | 1.6449 |
| 1.7728 | 50.14 | 24320 | 1.6436 |
| 1.7576 | 50.47 | 24480 | 1.6268 |
| 1.7354 | 50.8 | 24640 | 1.6088 |
| 1.74 | 51.13 | 24800 | 1.6156 |
| 1.7251 | 51.06 | 24960 | 1.6041 |
| 1.719 | 51.39 | 25120 | 1.5938 |
| 1.7257 | 52.12 | 25280 | 1.5983 |
| 1.7184 | 52.45 | 25440 | 1.5919 |
| 1.7093 | 52.78 | 25600 | 1.5848 |
| 1.7114 | 53.11 | 25760 | 1.5922 |
| 1.7076 | 53.44 | 25920 | 1.5843 |
| 1.7 | 53.77 | 26080 | 1.5807 |
| 1.7027 | 54.1 | 26240 | 1.5811 |
| 1.704 | 54.43 | 26400 | 1.5766 |
| 1.6958 | 54.76 | 26560 | 1.5756 |
| 1.6976 | 55.09 | 26720 | 1.5773 |
| 1.6944 | 55.42 | 26880 | 1.5725 |
| 1.6891 | 55.75 | 27040 | 1.5685 |
| 1.6936 | 56.08 | 27200 | 1.5750 |
| 1.6893 | 56.41 | 27360 | 1.5696 |
| 1.6886 | 56.74 | 27520 | 1.5643 |
| 1.6936 | 57.07 | 27680 | 1.5691 |
| 1.6883 | 57.4 | 27840 | 1.5718 |
| 1.6832 | 57.73 | 28000 | 1.5660 |
| 1.9222 | 28.03 | 28160 | 1.7107 |
| 1.7838 | 28.19 | 28320 | 1.6345 |
| 1.7843 | 28.36 | 28480 | 1.6445 |
| 1.7809 | 28.52 | 28640 | 1.6461 |
| 1.783 | 28.69 | 28800 | 1.6505 |
| 1.7869 | 28.85 | 28960 | 1.6364 |
| 1.778 | 29.02 | 29120 | 1.6363 |
| 1.775 | 29.18 | 29280 | 1.6364 |
| 1.7697 | 29.34 | 29440 | 1.6345 |
| 1.7719 | 29.51 | 29600 | 1.6261 |
| 1.7454 | 61.16 | 29760 | 1.6099 |
| 1.741 | 61.49 | 29920 | 1.6006 |
| 1.7314 | 62.02 | 30080 | 1.6041 |
| 1.7314 | 62.35 | 30240 | 1.5914 |
| 1.7246 | 62.68 | 30400 | 1.5917 |
| 1.7642 | 63.01 | 30560 | 1.5923 |
| 1.7221 | 63.34 | 30720 | 1.5857 |
| 1.7185 | 63.67 | 30880 | 1.5836 |
| 1.7022 | 64.0 | 31040 | 1.5836 |
| 1.7107 | 64.33 | 31200 | 1.5739 |
| 1.7082 | 64.66 | 31360 | 1.5724 |
| 1.7055 | 64.99 | 31520 | 1.5734 |
| 1.7019 | 65.32 | 31680 | 1.5707 |
| 1.699 | 65.65 | 31840 | 1.5649 |
| 1.6963 | 65.98 | 32000 | 1.5685 |
| 1.6935 | 66.31 | 32160 | 1.5673 |
| 1.6899 | 66.64 | 32320 | 1.5648 |
| 1.6869 | 66.97 | 32480 | 1.5620 |
| 1.6867 | 67.3 | 32640 | 1.5564 |
| 1.6861 | 67.63 | 32800 | 1.5552 |
| 1.6831 | 67.96 | 32960 | 1.5496 |
| 1.6778 | 68.29 | 33120 | 1.5479 |
| 1.6742 | 68.62 | 33280 | 1.5501 |
| 1.6737 | 68.95 | 33440 | 1.5441 |
| 1.6725 | 69.28 | 33600 | 1.5399 |
| 1.6683 | 69.61 | 33760 | 1.5398 |
| 1.6689 | 69.94 | 33920 | 1.5374 |
| 1.6634 | 70.27 | 34080 | 1.5385 |
| 1.6638 | 70.6 | 34240 | 1.5332 |
| 1.6614 | 70.93 | 34400 | 1.5329 |
| 1.6544 | 71.26 | 34560 | 1.5292 |
| 1.6532 | 71.59 | 34720 | 1.5268 |
| 1.6511 | 71.92 | 34880 | 1.5225 |
| 1.6506 | 72.25 | 35040 | 1.5219 |
| 1.6496 | 72.58 | 35200 | 1.5202 |
| 1.6468 | 72.91 | 35360 | 1.5199 |
| 1.6424 | 73.24 | 35520 | 1.5220 |
| 1.642 | 73.57 | 35680 | 1.5145 |
| 1.6415 | 73.9 | 35840 | 1.5139 |
| 1.6419 | 74.23 | 36000 | 1.5120 |
| 1.633 | 74.56 | 36160 | 1.5113 |
| 1.6354 | 74.89 | 36320 | 1.5139 |
| 1.6312 | 75.22 | 36480 | 1.5068 |
| 1.6298 | 75.55 | 36640 | 1.5056 |
| 1.6268 | 75.88 | 36800 | 1.5000 |
| 1.6277 | 76.21 | 36960 | 1.5033 |
| 1.6198 | 76.54 | 37120 | 1.4988 |
| 1.6246 | 76.87 | 37280 | 1.4978 |
| 1.6184 | 77.2 | 37440 | 1.4966 |
| 1.6187 | 77.53 | 37600 | 1.4954 |
| 1.6192 | 77.85 | 37760 | 1.4951 |
| 1.6134 | 78.19 | 37920 | 1.4936 |
| 1.6176 | 78.51 | 38080 | 1.4908 |
| 1.6103 | 78.84 | 38240 | 1.4904 |
| 1.612 | 79.18 | 38400 | 1.4919 |
| 1.611 | 79.5 | 38560 | 1.4891 |
| 1.6082 | 79.83 | 38720 | 1.4837 |
| 1.6047 | 80.16 | 38880 | 1.4859 |
| 1.6058 | 80.49 | 39040 | 1.4814 |
| 1.602 | 80.82 | 39200 | 1.4837 |

Framework versions

  • Transformers 4.20.1
  • Pytorch 1.11.0
  • Datasets 2.3.2
  • Tokenizers 0.12.1
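
For reference, a minimal loading sketch under the versions above. The checkpoint identifier "enlm-r" is a placeholder for wherever this model is actually published (a local path or a Hub id), and `AutoModel` is used because the card does not state which task head, if any, the model carries:

```python
from transformers import AutoModel, AutoTokenizer

# "enlm-r" is a placeholder checkpoint path/id; substitute the actual
# location of this model. AutoModel loads the base architecture without
# assuming a particular task head (the card does not state the objective).
tokenizer = AutoTokenizer.from_pretrained("enlm-r")
model = AutoModel.from_pretrained("enlm-r")

inputs = tokenizer("Hello world", return_tensors="pt")
outputs = model(**inputs)
print(outputs.last_hidden_state.shape)
```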