
bert-dp-4

This model was fine-tuned on the generator dataset (the base model is not named in this card). It achieves the following results on the evaluation set:

  • Loss: 2.4611
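
For reference, if this is a standard cross-entropy language-modelling loss, it corresponds to a perplexity of exp(2.4611) ≈ 11.7. Below is a minimal sketch of loading the checkpoint and scoring text; it assumes the model is a BERT-style masked LM and that `bert-dp-4` resolves to the correct hub path (both assumptions based only on the model name):

```python
import math

import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

# Hypothetical repo id: the card does not give the full hub path, and the
# name only suggests a BERT-style masked language model.
repo_id = "bert-dp-4"
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForMaskedLM.from_pretrained(repo_id)
model.eval()

text = "The quick brown fox jumps over the lazy dog."
inputs = tokenizer(text, return_tensors="pt")

# Note: this scores every token (labels = input ids), whereas the evaluation
# loss above was presumably computed with the training-time masking scheme.
with torch.no_grad():
    loss = model(**inputs, labels=inputs["input_ids"]).loss.item()
print(f"loss = {loss:.4f}, perplexity = {math.exp(loss):.2f}")
```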

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a hedged reconstruction as transformers TrainingArguments follows the list):

  • learning_rate: 0.0005
  • train_batch_size: 64
  • eval_batch_size: 64
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_steps: 1000
  • num_epochs: 180
  • mixed_precision_training: Native AMP
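
As a rough reconstruction, these settings map onto transformers `TrainingArguments` as sketched below; the actual training script is not provided, and the `output_dir` is an assumption:

```python
from transformers import TrainingArguments

# Hypothetical reconstruction of the configuration listed above; the real
# training script for bert-dp-4 is not part of this card.
training_args = TrainingArguments(
    output_dir="bert-dp-4",            # assumed output directory
    learning_rate=5e-4,
    per_device_train_batch_size=64,
    per_device_eval_batch_size=64,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_steps=1000,
    num_train_epochs=180,
    fp16=True,                         # "Native AMP" mixed precision
)
```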

Training results

| Training Loss | Epoch | Step | Validation Loss |
|:---:|:---:|:---:|:---:|
| 6.3492 | 1.89 | 1000 | 5.9327 |
| 5.8333 | 3.78 | 2000 | 5.8515 |
| 5.7604 | 5.67 | 3000 | 5.8483 |
| 5.7137 | 7.56 | 4000 | 5.7914 |
| 5.6597 | 9.45 | 5000 | 5.7672 |
| 5.6213 | 11.34 | 6000 | 5.7594 |
| 5.5798 | 13.23 | 7000 | 5.7352 |
| 5.5482 | 15.12 | 8000 | 5.7275 |
| 5.513 | 17.01 | 9000 | 5.7203 |
| 5.485 | 18.9 | 10000 | 5.7211 |
| 5.4498 | 20.79 | 11000 | 5.6947 |
| 5.4175 | 22.68 | 12000 | 5.6923 |
| 5.3877 | 24.57 | 13000 | 5.6879 |
| 5.3635 | 26.47 | 14000 | 5.6776 |
| 5.3389 | 28.36 | 15000 | 5.6757 |
| 5.3166 | 30.25 | 16000 | 5.6758 |
| 5.2951 | 32.14 | 17000 | 5.6676 |
| 5.2793 | 34.03 | 18000 | 5.6711 |
| 5.2684 | 35.92 | 19000 | 5.6687 |
| 5.2609 | 37.81 | 20000 | 5.6684 |
| 5.2606 | 39.7 | 21000 | 5.6719 |
| 5.2624 | 41.59 | 22000 | 5.6697 |
| 5.2551 | 43.48 | 23000 | 5.6718 |
| 5.2461 | 45.37 | 24000 | 5.6699 |
| 5.2431 | 47.26 | 25000 | 5.6692 |
| 5.2414 | 49.15 | 26000 | 5.6691 |
| 5.2856 | 51.04 | 27000 | 5.6823 |
| 5.2753 | 52.93 | 28000 | 5.6860 |
| 5.2549 | 54.82 | 29000 | 5.6877 |
| 5.2276 | 56.71 | 30000 | 5.6285 |
| 5.1674 | 58.6 | 31000 | 5.5439 |
| 5.0894 | 60.49 | 32000 | 5.4082 |
| 4.9508 | 62.38 | 33000 | 5.1598 |
| 4.7453 | 64.27 | 34000 | 4.9274 |
| 4.5898 | 66.16 | 35000 | 4.7884 |
| 4.4656 | 68.05 | 36000 | 4.6531 |
| 4.35 | 69.94 | 37000 | 4.5123 |
| 4.2378 | 71.83 | 38000 | 4.4012 |
| 4.1496 | 73.72 | 39000 | 4.3240 |
| 4.0891 | 75.61 | 40000 | 4.2763 |
| 4.0538 | 77.5 | 41000 | 4.2520 |
| 4.0448 | 79.4 | 42000 | 4.2485 |
| 3.9724 | 81.29 | 43000 | 3.9940 |
| 3.6527 | 83.18 | 44000 | 3.7442 |
| 3.4172 | 85.07 | 45000 | 3.5713 |
| 3.2446 | 86.96 | 46000 | 3.4403 |
| 3.4764 | 88.85 | 47000 | 3.3796 |
| 3.0543 | 90.74 | 48000 | 3.2884 |
| 2.9549 | 92.63 | 49000 | 3.2107 |
| 2.8785 | 94.52 | 50000 | 3.1466 |
| 2.8143 | 96.41 | 51000 | 3.0788 |
| 2.7605 | 98.3 | 52000 | 3.0230 |
| 2.7111 | 100.19 | 53000 | 2.9802 |
| 2.6727 | 102.08 | 54000 | 2.9414 |
| 2.6417 | 103.97 | 55000 | 2.9167 |
| 2.612 | 105.86 | 56000 | 2.8927 |
| 2.5918 | 107.75 | 57000 | 2.8769 |
| 2.5769 | 109.64 | 58000 | 2.8637 |
| 2.566 | 111.53 | 59000 | 2.8551 |
| 2.556 | 113.42 | 60000 | 2.8458 |
| 2.548 | 115.31 | 61000 | 2.8488 |
| 2.5468 | 117.2 | 62000 | 2.8412 |
| 2.5453 | 119.09 | 63000 | 2.8383 |
| 2.7567 | 120.98 | 64000 | 2.8857 |
| 2.6017 | 122.87 | 65000 | 2.8382 |
| 2.5416 | 124.76 | 66000 | 2.7862 |
| 2.484 | 126.65 | 67000 | 2.7415 |
| 2.4361 | 128.54 | 68000 | 2.7079 |
| 2.3925 | 130.43 | 69000 | 2.6771 |
| 2.3512 | 132.33 | 70000 | 2.6542 |
| 2.3146 | 134.22 | 71000 | 2.6327 |
| 2.2805 | 136.11 | 72000 | 2.6119 |
| 2.2494 | 138.0 | 73000 | 2.5903 |
| 2.2218 | 139.89 | 74000 | 2.5734 |
| 2.1955 | 141.78 | 75000 | 2.5584 |
| 2.1739 | 143.67 | 76000 | 2.5459 |
| 2.154 | 145.56 | 77000 | 2.5337 |
| 2.1324 | 147.45 | 78000 | 2.5260 |
| 2.1149 | 149.34 | 79000 | 2.5169 |
| 2.096 | 151.23 | 80000 | 2.5095 |
| 2.083 | 153.12 | 81000 | 2.5045 |
| 2.0666 | 155.01 | 82000 | 2.4911 |
| 2.0562 | 156.9 | 83000 | 2.4907 |
| 2.0437 | 158.79 | 84000 | 2.4808 |
| 2.0356 | 160.68 | 85000 | 2.4816 |
| 2.0317 | 162.57 | 86000 | 2.4758 |
| 2.0201 | 164.46 | 87000 | 2.4724 |
| 2.0138 | 166.35 | 88000 | 2.4723 |
| 2.0095 | 168.24 | 89000 | 2.4651 |
| 2.0056 | 170.13 | 90000 | 2.4651 |
| 2.0021 | 172.02 | 91000 | 2.4616 |
| 1.9974 | 173.91 | 92000 | 2.4611 |
| 1.9985 | 175.8 | 93000 | 2.4613 |
| 1.9954 | 177.69 | 94000 | 2.4579 |
| 1.9979 | 179.58 | 95000 | 2.4611 |

Framework versions

  • Transformers 4.26.1
  • Pytorch 1.11.0+cu113
  • Datasets 2.13.0
  • Tokenizers 0.13.3
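
For exact reproduction of the training run, it may help to confirm that the local environment matches these pins; a quick check:

```python
# Confirm the local environment matches the versions listed above.
import datasets
import tokenizers
import torch
import transformers

print(transformers.__version__)  # expected: 4.26.1
print(torch.__version__)         # expected: 1.11.0+cu113
print(datasets.__version__)      # expected: 2.13.0
print(tokenizers.__version__)    # expected: 0.13.3
```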