
BERiT_2000_custom_architecture_relu_40_epochs

This model is a fine-tuned version of roberta-base on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 6.3968
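
Because this checkpoint descends from roberta-base, it should load as a masked language model through the standard Transformers API. A minimal sketch, assuming a masked-language-modeling head and a Hub repo id (the `your-username` namespace below is a placeholder, not part of this card):

```python
from transformers import AutoModelForMaskedLM, AutoTokenizer, pipeline

# Hypothetical repo id: replace "your-username" with the namespace hosting this model.
repo_id = "your-username/BERiT_2000_custom_architecture_relu_40_epochs"

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForMaskedLM.from_pretrained(repo_id)

# Fill-mask inference, assuming the RoBERTa-style <mask> token survives fine-tuning.
fill_mask = pipeline("fill-mask", model=model, tokenizer=tokenizer)
print(fill_mask(f"Hello {tokenizer.mask_token} world"))
```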

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a `TrainingArguments` sketch follows the list):

  • learning_rate: 0.0005
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 40
  • label_smoothing_factor: 0.2
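
The list above maps one-to-one onto Hugging Face `TrainingArguments`. A minimal sketch of the equivalent configuration (the output directory and the evaluation/logging cadence are assumptions inferred from the step-500 cadence in the results table, not stated in this card):

```python
from transformers import TrainingArguments

# Every value below mirrors the hyperparameter list above; the rest are assumptions.
training_args = TrainingArguments(
    output_dir="BERiT_2000_custom_architecture_relu_40_epochs",  # hypothetical path
    learning_rate=5e-4,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=40,
    label_smoothing_factor=0.2,
    evaluation_strategy="steps",  # assumption: the table reports validation loss every 500 steps
    eval_steps=500,
    logging_steps=500,
)
```

Note that with label_smoothing_factor=0.2 the reported losses include the smoothing penalty, so they are not directly comparable to plain cross-entropy values or perplexities.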

Training results

Training Loss Epoch Step Validation Loss
14.8756 0.19 500 8.3404
7.8098 0.39 1000 7.3110
7.2696 0.58 1500 7.1646
7.1277 0.77 2000 7.0953
7.0939 0.97 2500 7.0701
7.0621 1.16 3000 7.0003
7.0236 1.36 3500 6.9189
6.9898 1.55 4000 6.8536
6.9625 1.74 4500 6.8450
6.9125 1.94 5000 6.7799
6.9115 2.13 5500 6.8028
6.8954 2.32 6000 6.7288
6.8289 2.52 6500 6.7664
6.855 2.71 7000 6.7064
6.8134 2.9 7500 6.7155
6.7907 3.1 8000 6.7050
6.791 3.29 8500 6.6695
6.7659 3.49 9000 6.6815
6.775 3.68 9500 6.6449
6.7508 3.87 10000 6.6684
6.7627 4.07 10500 6.6397
6.7229 4.26 11000 6.6417
6.7336 4.45 11500 6.6824
6.7138 4.65 12000 6.6252
6.7123 4.84 12500 6.6374
6.703 5.03 13000 6.6400
6.7054 5.23 13500 6.6264
6.6978 5.42 14000 6.6094
6.6944 5.62 14500 6.6627
6.6857 5.81 15000 6.6363
6.694 6.0 15500 6.6026
6.6882 6.2 16000 6.6123
6.6694 6.39 16500 6.5918
6.6781 6.58 17000 6.6201
6.6605 6.78 17500 6.6316
6.6435 6.97 18000 6.5789
6.6658 7.16 18500 6.5999
6.6551 7.36 19000 6.5425
6.6603 7.55 19500 6.5790
6.6589 7.75 20000 6.5767
6.675 7.94 20500 6.6005
6.6362 8.13 21000 6.5962
6.6391 8.33 21500 6.5716
6.6379 8.52 22000 6.5830
6.6164 8.71 22500 6.6137
6.638 8.91 23000 6.5877
6.6255 9.1 23500 6.6197
6.6284 9.3 24000 6.5573
6.6198 9.49 24500 6.5717
6.6025 9.68 25000 6.5627
6.6334 9.88 25500 6.5902
6.6305 10.07 26000 6.5628
6.5797 10.26 26500 6.5625
6.5906 10.46 27000 6.5808
6.5904 10.65 27500 6.5690
6.5935 10.84 28000 6.5845
6.6231 11.04 28500 6.5282
6.5923 11.23 29000 6.6107
6.6136 11.43 29500 6.5475
6.5954 11.62 30000 6.5823
6.5821 11.81 30500 6.5721
6.5993 12.01 31000 6.5492
6.5584 12.2 31500 6.4938
6.5886 12.39 32000 6.6026
6.5625 12.59 32500 6.5902
6.572 12.78 33000 6.5436
6.5807 12.97 33500 6.5588
6.5853 13.17 34000 6.5555
6.5727 13.36 34500 6.5606
6.5456 13.56 35000 6.5386
6.5538 13.75 35500 6.5712
6.5456 13.94 36000 6.5582
6.5734 14.14 36500 6.4951
6.5639 14.33 37000 6.5323
6.5712 14.52 37500 6.5049
6.5739 14.72 38000 6.5523
6.5534 14.91 38500 6.5188
6.5401 15.1 39000 6.5968
6.5456 15.3 39500 6.5413
6.5555 15.49 40000 6.5347
6.538 15.69 40500 6.5180
6.537 15.88 41000 6.5372
6.537 16.07 41500 6.5514
6.5445 16.27 42000 6.5242
6.5285 16.46 42500 6.5071
6.5046 16.65 43000 6.5342
6.5609 16.85 43500 6.5329
6.527 17.04 44000 6.5569
6.5199 17.23 44500 6.5438
6.5328 17.43 45000 6.5380
6.5183 17.62 45500 6.5273
6.5349 17.82 46000 6.5209
6.5283 18.01 46500 6.4884
6.5111 18.2 47000 6.5036
6.4895 18.4 47500 6.5675
6.5308 18.59 48000 6.5378
6.5159 18.78 48500 6.4792
6.4875 18.98 49000 6.4846
6.5076 19.17 49500 6.5203
6.4991 19.36 50000 6.5007
6.5269 19.56 50500 6.4796
6.4887 19.75 51000 6.5197
6.4995 19.95 51500 6.5009
6.4762 20.14 52000 6.5049
6.4872 20.33 52500 6.4880
6.5117 20.53 53000 6.4917
6.5035 20.72 53500 6.4791
6.4784 20.91 54000 6.4771
6.4749 21.11 54500 6.5230
6.4867 21.3 55000 6.4954
6.4921 21.49 55500 6.5079
6.4587 21.69 56000 6.5309
6.4839 21.88 56500 6.4476
6.5011 22.08 57000 6.5025
6.471 22.27 57500 6.5122
6.4689 22.46 58000 6.4689
6.4764 22.66 58500 6.5073
6.4764 22.85 59000 6.4741
6.4751 23.04 59500 6.4978
6.4823 23.24 60000 6.4857
6.4594 23.43 60500 6.4817
6.4795 23.63 61000 6.5292
6.4565 23.82 61500 6.4684
6.4627 24.01 62000 6.4900
6.4542 24.21 62500 6.4373
6.4692 24.4 63000 6.4787
6.4772 24.59 63500 6.4553
6.4613 24.79 64000 6.4695
6.4673 24.98 64500 6.5077
6.466 25.17 65000 6.4919
6.4595 25.37 65500 6.4451
6.444 25.56 66000 6.4750
6.438 25.76 66500 6.4672
6.4499 25.95 67000 6.4358
6.4578 26.14 67500 6.4762
6.4701 26.34 68000 6.4462
6.4296 26.53 68500 6.4879
6.4305 26.72 69000 6.4519
6.443 26.92 69500 6.4530
6.4571 27.11 70000 6.4564
6.4477 27.3 70500 6.4557
6.443 27.5 71000 6.4862
6.4429 27.69 71500 6.4498
6.4374 27.89 72000 6.4225
6.4363 28.08 72500 6.4723
6.4127 28.27 73000 6.4733
6.4116 28.47 73500 6.4499
6.4312 28.66 74000 6.4600
6.4251 28.85 74500 6.4451
6.4318 29.05 75000 6.4337
6.4432 29.24 75500 6.4713
6.4183 29.43 76000 6.4699
6.4109 29.63 76500 6.4591
6.3939 29.82 77000 6.4768
6.4194 30.02 77500 6.4786
6.4262 30.21 78000 6.4407
6.4392 30.4 78500 6.4202
6.4311 30.6 79000 6.4361
6.3963 30.79 79500 6.4346
6.3872 30.98 80000 6.3810
6.4277 31.18 80500 6.4451
6.4112 31.37 81000 6.4243
6.4202 31.56 81500 6.4502
6.444 31.76 82000 6.4572
6.4066 31.95 82500 6.4033
6.4101 32.15 83000 6.4154
6.3985 32.34 83500 6.4377
6.4294 32.53 84000 6.4392
6.397 32.73 84500 6.4387
6.4217 32.92 85000 6.4305
6.4061 33.11 85500 6.4541
6.4014 33.31 86000 6.4173
6.4223 33.5 86500 6.4403
6.3953 33.69 87000 6.4333
6.4135 33.89 87500 6.4183
6.3955 34.08 88000 6.3958
6.4064 34.28 88500 6.3913
6.3997 34.47 89000 6.4330
6.4212 34.66 89500 6.3955
6.3957 34.86 90000 6.4438
6.3936 35.05 90500 6.4382
6.3927 35.24 91000 6.4055
6.3972 35.44 91500 6.4006
6.4137 35.63 92000 6.4245
6.3947 35.82 92500 6.4057
6.3798 36.02 93000 6.4006
6.4011 36.21 93500 6.3943
6.4012 36.41 94000 6.3766
6.3961 36.6 94500 6.4260
6.3819 36.79 95000 6.3801
6.3795 36.99 95500 6.4019
6.3954 37.18 96000 6.4387
6.3874 37.37 96500 6.4477
6.3844 37.57 97000 6.4177
6.3898 37.76 97500 6.4213
6.3855 37.96 98000 6.3838
6.3825 38.15 98500 6.4048
6.3615 38.34 99000 6.4636
6.392 38.54 99500 6.4197
6.3773 38.73 100000 6.4505
6.3834 38.92 100500 6.3889
6.3846 39.12 101000 6.4394
6.376 39.31 101500 6.3923
6.3699 39.5 102000 6.4025
6.3826 39.7 102500 6.3951
6.373 39.89 103000 6.3968

Framework versions

  • Transformers 4.24.0
  • Pytorch 1.12.1+cu113
  • Datasets 2.7.1
  • Tokenizers 0.13.2
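
To verify a local environment against these pins before loading the model, a quick check (a sketch; it only prints the installed versions):

```python
import transformers, torch, datasets, tokenizers

# Print installed versions to compare against the pins listed above.
for name, mod in [
    ("Transformers", transformers),
    ("Pytorch", torch),
    ("Datasets", datasets),
    ("Tokenizers", tokenizers),
]:
    print(f"{name}: {mod.__version__}")
```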