---
tags:
- generated_from_trainer
model-index:
- name: GQA_RoBERTa_legal_SQuAD_complete_augmented_100
  results: []
---

# GQA_RoBERTa_legal_SQuAD_complete_augmented_100

This model was trained from scratch on an unspecified dataset. It achieves the following results on the evaluation set:

- Loss: 3.7135

## Model description

More information needed

## Intended uses & limitations

More information needed
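The card leaves this section blank. As a generic illustration only (not documentation of this particular model), SQuAD-style extractive QA heads like the one implied by this model's name emit per-token start and end logits over the context, and the predicted answer is the highest-scoring valid span. A minimal sketch of that decoding step, with purely illustrative names and toy logits:

```python
# Illustrative sketch of extractive-QA span decoding (not from this model card).
# A SQuAD-style head outputs per-token start and end logits; the predicted
# answer is the highest-scoring span with start <= end.

def decode_span(start_logits, end_logits, max_answer_len=15):
    """Return (start, end) token indices of the best-scoring answer span."""
    best = (0, 0)
    best_score = float("-inf")
    for s, s_logit in enumerate(start_logits):
        # Only consider spans that begin at s and stay within max_answer_len.
        for e in range(s, min(s + max_answer_len, len(end_logits))):
            score = s_logit + end_logits[e]
            if score > best_score:
                best_score = score
                best = (s, e)
    return best

# Toy example: token 2 has the highest start logit, token 3 the highest end logit.
start_logits = [0.1, 0.2, 3.0, 0.5, 0.1]
end_logits = [0.0, 0.1, 0.4, 2.5, 0.2]
print(decode_span(start_logits, end_logits))  # -> (2, 3)
```

In practice a tokenizer maps the returned token indices back to a character span in the context; that bookkeeping is omitted here.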

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:

- learning_rate: 2e-05
- train_batch_size: 128
- eval_batch_size: 32
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 100
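These values match the fields of a Hugging Face `TrainingArguments` object, so the Trainer setup can be sketched as follows. This is a reconstruction, not the authors' script; the `output_dir` is a placeholder.

```python
from transformers import TrainingArguments

# Sketch reconstructing the hyperparameters above as TrainingArguments.
# output_dir is a placeholder; the remaining fields mirror the card's list.
args = TrainingArguments(
    output_dir="GQA_RoBERTa_legal_SQuAD_complete_augmented_100",  # placeholder
    learning_rate=2e-5,
    per_device_train_batch_size=128,
    per_device_eval_batch_size=32,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=100,
)
```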

### Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| No log | 1.0 | 4 | 5.0223 |
| No log | 2.0 | 8 | 2.8588 |
| No log | 3.0 | 12 | 3.7945 |
| No log | 4.0 | 16 | 2.6351 |
| No log | 5.0 | 20 | 3.0764 |
| No log | 6.0 | 24 | 2.4511 |
| No log | 7.0 | 28 | 2.5440 |
| No log | 8.0 | 32 | 2.8751 |
| No log | 9.0 | 36 | 2.6175 |
| No log | 10.0 | 40 | 2.9249 |
| No log | 11.0 | 44 | 2.3475 |
| No log | 12.0 | 48 | 2.8935 |
| No log | 13.0 | 52 | 2.7712 |
| No log | 14.0 | 56 | 2.5911 |
| No log | 15.0 | 60 | 3.1567 |
| No log | 16.0 | 64 | 2.7002 |
| No log | 17.0 | 68 | 3.0797 |
| No log | 18.0 | 72 | 2.9672 |
| No log | 19.0 | 76 | 3.2383 |
| No log | 20.0 | 80 | 3.1447 |
| No log | 21.0 | 84 | 3.1826 |
| No log | 22.0 | 88 | 3.4054 |
| No log | 23.0 | 92 | 3.1397 |
| No log | 24.0 | 96 | 3.3258 |
| No log | 25.0 | 100 | 3.2254 |
| No log | 26.0 | 104 | 3.1559 |
| No log | 27.0 | 108 | 3.5900 |
| No log | 28.0 | 112 | 3.3985 |
| No log | 29.0 | 116 | 3.7034 |
| No log | 30.0 | 120 | 3.6305 |
| No log | 31.0 | 124 | 3.4161 |
| No log | 32.0 | 128 | 3.6157 |
| No log | 33.0 | 132 | 3.3394 |
| No log | 34.0 | 136 | 3.5239 |
| No log | 35.0 | 140 | 3.4075 |
| No log | 36.0 | 144 | 3.3560 |
| No log | 37.0 | 148 | 3.6675 |
| No log | 38.0 | 152 | 3.3822 |
| No log | 39.0 | 156 | 3.5903 |
| No log | 40.0 | 160 | 3.7123 |
| No log | 41.0 | 164 | 3.5195 |
| No log | 42.0 | 168 | 3.6451 |
| No log | 43.0 | 172 | 3.4411 |
| No log | 44.0 | 176 | 3.6223 |
| No log | 45.0 | 180 | 3.4833 |
| No log | 46.0 | 184 | 3.7552 |
| No log | 47.0 | 188 | 3.5525 |
| No log | 48.0 | 192 | 3.7382 |
| No log | 49.0 | 196 | 3.6046 |
| No log | 50.0 | 200 | 3.6106 |
| No log | 51.0 | 204 | 3.6298 |
| No log | 52.0 | 208 | 3.7597 |
| No log | 53.0 | 212 | 3.5995 |
| No log | 54.0 | 216 | 3.6429 |
| No log | 55.0 | 220 | 3.6862 |
| No log | 56.0 | 224 | 3.6334 |
| No log | 57.0 | 228 | 4.0924 |
| No log | 58.0 | 232 | 3.6489 |
| No log | 59.0 | 236 | 3.6355 |
| No log | 60.0 | 240 | 3.8356 |
| No log | 61.0 | 244 | 3.5758 |
| No log | 62.0 | 248 | 3.5889 |
| No log | 63.0 | 252 | 3.7572 |
| No log | 64.0 | 256 | 3.7237 |
| No log | 65.0 | 260 | 3.6545 |
| No log | 66.0 | 264 | 3.7671 |
| No log | 67.0 | 268 | 3.6976 |
| No log | 68.0 | 272 | 3.6523 |
| No log | 69.0 | 276 | 3.7270 |
| No log | 70.0 | 280 | 3.7120 |
| No log | 71.0 | 284 | 3.6896 |
| No log | 72.0 | 288 | 3.6892 |
| No log | 73.0 | 292 | 3.6871 |
| No log | 74.0 | 296 | 3.7154 |
| No log | 75.0 | 300 | 3.6874 |
| No log | 76.0 | 304 | 3.7207 |
| No log | 77.0 | 308 | 3.7025 |
| No log | 78.0 | 312 | 3.6979 |
| No log | 79.0 | 316 | 3.7120 |
| No log | 80.0 | 320 | 3.7363 |
| No log | 81.0 | 324 | 3.7563 |
| No log | 82.0 | 328 | 3.7290 |
| No log | 83.0 | 332 | 3.6939 |
| No log | 84.0 | 336 | 3.6881 |
| No log | 85.0 | 340 | 3.6798 |
| No log | 86.0 | 344 | 3.6658 |
| No log | 87.0 | 348 | 3.6642 |
| No log | 88.0 | 352 | 3.6678 |
| No log | 89.0 | 356 | 3.6846 |
| No log | 90.0 | 360 | 3.6806 |
| No log | 91.0 | 364 | 3.6778 |
| No log | 92.0 | 368 | 3.6777 |
| No log | 93.0 | 372 | 3.6907 |
| No log | 94.0 | 376 | 3.7072 |
| No log | 95.0 | 380 | 3.7095 |
| No log | 96.0 | 384 | 3.7103 |
| No log | 97.0 | 388 | 3.7127 |
| No log | 98.0 | 392 | 3.7138 |
| No log | 99.0 | 396 | 3.7140 |
| No log | 100.0 | 400 | 3.7135 |

### Framework versions

- Transformers 4.36.2
- Pytorch 2.1.2+cu121
- Datasets 2.14.7
- Tokenizers 0.15.0