
StatementOfWork_Generator_Omega_BS_512_2

This model is a fine-tuned version of distilgpt2 on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.8120

Model description

More information needed

Intended uses & limitations

More information needed
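
No usage example is provided on the card. As a rough starting point, the following is a minimal sketch of loading the checkpoint with the transformers library and generating text; the repository id is the one listed in the model tree at the bottom of the card, while the prompt, input format, and generation settings are assumptions made purely for illustration.

```python
# Minimal sketch, assuming this is a standard causal-LM checkpoint that loads
# with AutoModelForCausalLM; the prompt and generation settings are illustrative.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "gjonesQ02/StatementOfWork_Generator_Omega_BS_512_2"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

prompt = "Statement of Work:"  # assumed prompt; the expected input format is not documented
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(
    **inputs,
    max_new_tokens=128,
    do_sample=True,
    top_p=0.95,
    pad_token_id=tokenizer.eos_token_id,  # GPT-2 tokenizers define no pad token by default
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```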

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a matching TrainingArguments sketch follows the list):

  • learning_rate: 2e-05
  • train_batch_size: 50
  • eval_batch_size: 50
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 150
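
These settings map onto the transformers Trainer API roughly as shown below. This is a hedged sketch, not the author's original training script: the output path and per-epoch evaluation cadence are assumptions, and the commented-out dataset wiring is a placeholder because the training data is not documented on this card.

```python
# Approximate reconstruction of the listed hyperparameters; output_dir and the
# evaluation cadence are assumptions, and the datasets are placeholders.
from transformers import AutoModelForCausalLM, TrainingArguments, Trainer

model = AutoModelForCausalLM.from_pretrained("distilgpt2")

training_args = TrainingArguments(
    output_dir="StatementOfWork_Generator_Omega_BS_512_2",  # assumed output path
    learning_rate=2e-5,
    per_device_train_batch_size=50,
    per_device_eval_batch_size=50,
    seed=42,
    num_train_epochs=150,
    lr_scheduler_type="linear",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    evaluation_strategy="epoch",  # assumption: the results table shows one eval per epoch
)

# trainer = Trainer(
#     model=model,
#     args=training_args,
#     train_dataset=train_dataset,  # the training/eval datasets are not documented
#     eval_dataset=eval_dataset,
# )
# trainer.train()
```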

Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| No log | 1.0 | 4 | 0.9839 |
| No log | 2.0 | 8 | 0.9786 |
| No log | 3.0 | 12 | 0.9767 |
| No log | 4.0 | 16 | 0.9757 |
| No log | 5.0 | 20 | 0.9716 |
| No log | 6.0 | 24 | 0.9670 |
| No log | 7.0 | 28 | 0.9663 |
| No log | 8.0 | 32 | 0.9627 |
| No log | 9.0 | 36 | 0.9571 |
| No log | 10.0 | 40 | 0.9573 |
| No log | 11.0 | 44 | 0.9520 |
| No log | 12.0 | 48 | 0.9511 |
| No log | 13.0 | 52 | 0.9486 |
| No log | 14.0 | 56 | 0.9425 |
| No log | 15.0 | 60 | 0.9440 |
| No log | 16.0 | 64 | 0.9392 |
| No log | 17.0 | 68 | 0.9357 |
| No log | 18.0 | 72 | 0.9368 |
| No log | 19.0 | 76 | 0.9333 |
| No log | 20.0 | 80 | 0.9284 |
| No log | 21.0 | 84 | 0.9260 |
| No log | 22.0 | 88 | 0.9244 |
| No log | 23.0 | 92 | 0.9228 |
| No log | 24.0 | 96 | 0.9192 |
| No log | 25.0 | 100 | 0.9163 |
| No log | 26.0 | 104 | 0.9164 |
| No log | 27.0 | 108 | 0.9135 |
| No log | 28.0 | 112 | 0.9107 |
| No log | 29.0 | 116 | 0.9105 |
| No log | 30.0 | 120 | 0.9068 |
| No log | 31.0 | 124 | 0.9050 |
| No log | 32.0 | 128 | 0.9034 |
| No log | 33.0 | 132 | 0.9012 |
| No log | 34.0 | 136 | 0.8966 |
| No log | 35.0 | 140 | 0.8968 |
| No log | 36.0 | 144 | 0.8953 |
| No log | 37.0 | 148 | 0.8920 |
| No log | 38.0 | 152 | 0.8920 |
| No log | 39.0 | 156 | 0.8912 |
| No log | 40.0 | 160 | 0.8877 |
| No log | 41.0 | 164 | 0.8871 |
| No log | 42.0 | 168 | 0.8857 |
| No log | 43.0 | 172 | 0.8800 |
| No log | 44.0 | 176 | 0.8789 |
| No log | 45.0 | 180 | 0.8831 |
| No log | 46.0 | 184 | 0.8794 |
| No log | 47.0 | 188 | 0.8757 |
| No log | 48.0 | 192 | 0.8760 |
| No log | 49.0 | 196 | 0.8730 |
| No log | 50.0 | 200 | 0.8726 |
| No log | 51.0 | 204 | 0.8719 |
| No log | 52.0 | 208 | 0.8689 |
| No log | 53.0 | 212 | 0.8691 |
| No log | 54.0 | 216 | 0.8679 |
| No log | 55.0 | 220 | 0.8633 |
| No log | 56.0 | 224 | 0.8623 |
| No log | 57.0 | 228 | 0.8624 |
| No log | 58.0 | 232 | 0.8610 |
| No log | 59.0 | 236 | 0.8601 |
| No log | 60.0 | 240 | 0.8586 |
| No log | 61.0 | 244 | 0.8583 |
| No log | 62.0 | 248 | 0.8564 |
| No log | 63.0 | 252 | 0.8552 |
| No log | 64.0 | 256 | 0.8545 |
| No log | 65.0 | 260 | 0.8526 |
| No log | 66.0 | 264 | 0.8513 |
| No log | 67.0 | 268 | 0.8508 |
| No log | 68.0 | 272 | 0.8501 |
| No log | 69.0 | 276 | 0.8484 |
| No log | 70.0 | 280 | 0.8479 |
| No log | 71.0 | 284 | 0.8465 |
| No log | 72.0 | 288 | 0.8464 |
| No log | 73.0 | 292 | 0.8452 |
| No log | 74.0 | 296 | 0.8442 |
| No log | 75.0 | 300 | 0.8443 |
| No log | 76.0 | 304 | 0.8425 |
| No log | 77.0 | 308 | 0.8410 |
| No log | 78.0 | 312 | 0.8402 |
| No log | 79.0 | 316 | 0.8394 |
| No log | 80.0 | 320 | 0.8385 |
| No log | 81.0 | 324 | 0.8380 |
| No log | 82.0 | 328 | 0.8380 |
| No log | 83.0 | 332 | 0.8369 |
| No log | 84.0 | 336 | 0.8356 |
| No log | 85.0 | 340 | 0.8351 |
| No log | 86.0 | 344 | 0.8343 |
| No log | 87.0 | 348 | 0.8326 |
| No log | 88.0 | 352 | 0.8331 |
| No log | 89.0 | 356 | 0.8328 |
| No log | 90.0 | 360 | 0.8306 |
| No log | 91.0 | 364 | 0.8310 |
| No log | 92.0 | 368 | 0.8314 |
| No log | 93.0 | 372 | 0.8295 |
| No log | 94.0 | 376 | 0.8287 |
| No log | 95.0 | 380 | 0.8286 |
| No log | 96.0 | 384 | 0.8276 |
| No log | 97.0 | 388 | 0.8270 |
| No log | 98.0 | 392 | 0.8262 |
| No log | 99.0 | 396 | 0.8251 |
| No log | 100.0 | 400 | 0.8241 |
| No log | 101.0 | 404 | 0.8231 |
| No log | 102.0 | 408 | 0.8225 |
| No log | 103.0 | 412 | 0.8235 |
| No log | 104.0 | 416 | 0.8234 |
| No log | 105.0 | 420 | 0.8225 |
| No log | 106.0 | 424 | 0.8219 |
| No log | 107.0 | 428 | 0.8209 |
| No log | 108.0 | 432 | 0.8204 |
| No log | 109.0 | 436 | 0.8195 |
| No log | 110.0 | 440 | 0.8191 |
| No log | 111.0 | 444 | 0.8191 |
| No log | 112.0 | 448 | 0.8193 |
| No log | 113.0 | 452 | 0.8197 |
| No log | 114.0 | 456 | 0.8191 |
| No log | 115.0 | 460 | 0.8179 |
| No log | 116.0 | 464 | 0.8176 |
| No log | 117.0 | 468 | 0.8173 |
| No log | 118.0 | 472 | 0.8172 |
| No log | 119.0 | 476 | 0.8174 |
| No log | 120.0 | 480 | 0.8171 |
| No log | 121.0 | 484 | 0.8169 |
| No log | 122.0 | 488 | 0.8168 |
| No log | 123.0 | 492 | 0.8162 |
| No log | 124.0 | 496 | 0.8161 |
| 0.3706 | 125.0 | 500 | 0.8160 |
| 0.3706 | 126.0 | 504 | 0.8156 |
| 0.3706 | 127.0 | 508 | 0.8145 |
| 0.3706 | 128.0 | 512 | 0.8143 |
| 0.3706 | 129.0 | 516 | 0.8143 |
| 0.3706 | 130.0 | 520 | 0.8145 |
| 0.3706 | 131.0 | 524 | 0.8147 |
| 0.3706 | 132.0 | 528 | 0.8142 |
| 0.3706 | 133.0 | 532 | 0.8136 |
| 0.3706 | 134.0 | 536 | 0.8136 |
| 0.3706 | 135.0 | 540 | 0.8138 |
| 0.3706 | 136.0 | 544 | 0.8139 |
| 0.3706 | 137.0 | 548 | 0.8140 |
| 0.3706 | 138.0 | 552 | 0.8138 |
| 0.3706 | 139.0 | 556 | 0.8134 |
| 0.3706 | 140.0 | 560 | 0.8130 |
| 0.3706 | 141.0 | 564 | 0.8128 |
| 0.3706 | 142.0 | 568 | 0.8127 |
| 0.3706 | 143.0 | 572 | 0.8126 |
| 0.3706 | 144.0 | 576 | 0.8124 |
| 0.3706 | 145.0 | 580 | 0.8123 |
| 0.3706 | 146.0 | 584 | 0.8121 |
| 0.3706 | 147.0 | 588 | 0.8120 |
| 0.3706 | 148.0 | 592 | 0.8120 |
| 0.3706 | 149.0 | 596 | 0.8120 |
| 0.3706 | 150.0 | 600 | 0.8120 |
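
Assuming the validation loss is the mean per-token cross-entropy that the Trainer reports for causal language models (an assumption, since the metric is not described on this card), the final value converts to a perplexity of roughly exp(0.8120) ≈ 2.25:

```python
import math

# If the eval loss is mean token-level cross-entropy, perplexity is its exponential.
final_eval_loss = 0.8120
print(math.exp(final_eval_loss))  # ≈ 2.25
```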

Framework versions

  • Transformers 4.38.2
  • PyTorch 2.2.1+cu121
  • Datasets 2.18.0
  • Tokenizers 0.15.2

Model size

  • 81.9M params (safetensors, F32)

Model tree for gjonesQ02/StatementOfWork_Generator_Omega_BS_512_2

  • Fine-tuned from distilgpt2