---
license: apache-2.0
tags:
  - generated_from_trainer
base_model: distilgpt2
model-index:
  - name: StatementOfWork_Generator_Omega_BS_1024_2
    results: []
---

# StatementOfWork_Generator_Omega_BS_1024_2

This model is a fine-tuned version of [distilgpt2](https://huggingface.co/distilgpt2) on an unknown dataset. It achieves the following results on the evaluation set:

- Loss: 0.8126
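
Since the card has no usage section yet, here is a minimal inference sketch. The hub repo id (`gjonesQ02/StatementOfWork_Generator_Omega_BS_1024_2`) and the prompt are assumptions, not documented by the card.

```python
# Minimal text-generation sketch; the repo id and prompt are assumptions.
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="gjonesQ02/StatementOfWork_Generator_Omega_BS_1024_2",  # assumed repo id
)

# Hypothetical prompt; the expected input format is not documented.
prompt = "Scope of Work:"
print(generator(prompt, max_new_tokens=100)[0]["generated_text"])
```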

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (see the sketch after this list):

- learning_rate: 2e-05
- train_batch_size: 30
- eval_batch_size: 10
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 200
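
A minimal sketch of how these settings might map onto `transformers.TrainingArguments`. The output directory and evaluation strategy are assumptions (the per-epoch table below suggests epoch-level evaluation), and the batch sizes are treated as per-device values.

```python
# Hedged reconstruction of the reported configuration; the output path and
# evaluation strategy are assumptions, not taken from the card.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="StatementOfWork_Generator_Omega_BS_1024_2",  # placeholder
    learning_rate=2e-05,
    per_device_train_batch_size=30,  # assumes single-device training
    per_device_eval_batch_size=10,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    lr_scheduler_type="linear",
    num_train_epochs=200,
    evaluation_strategy="epoch",  # assumption, consistent with the table below
)
```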

### Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| No log | 1.0 | 4 | 0.7936 |
| No log | 2.0 | 8 | 0.7958 |
| No log | 3.0 | 12 | 0.7937 |
| No log | 4.0 | 16 | 0.7964 |
| No log | 5.0 | 20 | 0.8001 |
| No log | 6.0 | 24 | 0.7974 |
| No log | 7.0 | 28 | 0.7948 |
| No log | 8.0 | 32 | 0.7964 |
| No log | 9.0 | 36 | 0.7985 |
| No log | 10.0 | 40 | 0.7984 |
| No log | 11.0 | 44 | 0.7970 |
| No log | 12.0 | 48 | 0.7975 |
| No log | 13.0 | 52 | 0.7970 |
| No log | 14.0 | 56 | 0.7976 |
| No log | 15.0 | 60 | 0.8000 |
| No log | 16.0 | 64 | 0.8007 |
| No log | 17.0 | 68 | 0.8006 |
| No log | 18.0 | 72 | 0.8015 |
| No log | 19.0 | 76 | 0.8022 |
| No log | 20.0 | 80 | 0.8030 |
| No log | 21.0 | 84 | 0.8051 |
| No log | 22.0 | 88 | 0.8027 |
| No log | 23.0 | 92 | 0.7989 |
| No log | 24.0 | 96 | 0.7981 |
| No log | 25.0 | 100 | 0.8008 |
| No log | 26.0 | 104 | 0.8019 |
| No log | 27.0 | 108 | 0.8020 |
| No log | 28.0 | 112 | 0.8014 |
| No log | 29.0 | 116 | 0.8018 |
| No log | 30.0 | 120 | 0.8037 |
| No log | 31.0 | 124 | 0.8030 |
| No log | 32.0 | 128 | 0.8034 |
| No log | 33.0 | 132 | 0.8037 |
| No log | 34.0 | 136 | 0.8037 |
| No log | 35.0 | 140 | 0.8040 |
| No log | 36.0 | 144 | 0.8032 |
| No log | 37.0 | 148 | 0.8050 |
| No log | 38.0 | 152 | 0.8069 |
| No log | 39.0 | 156 | 0.8066 |
| No log | 40.0 | 160 | 0.8056 |
| No log | 41.0 | 164 | 0.8050 |
| No log | 42.0 | 168 | 0.8038 |
| No log | 43.0 | 172 | 0.8016 |
| No log | 44.0 | 176 | 0.8031 |
| No log | 45.0 | 180 | 0.8071 |
| No log | 46.0 | 184 | 0.8096 |
| No log | 47.0 | 188 | 0.8066 |
| No log | 48.0 | 192 | 0.8033 |
| No log | 49.0 | 196 | 0.8041 |
| No log | 50.0 | 200 | 0.8059 |
| No log | 51.0 | 204 | 0.8043 |
| No log | 52.0 | 208 | 0.8054 |
| No log | 53.0 | 212 | 0.8073 |
| No log | 54.0 | 216 | 0.8068 |
| No log | 55.0 | 220 | 0.8050 |
| No log | 56.0 | 224 | 0.8025 |
| No log | 57.0 | 228 | 0.8036 |
| No log | 58.0 | 232 | 0.8057 |
| No log | 59.0 | 236 | 0.8054 |
| No log | 60.0 | 240 | 0.8028 |
| No log | 61.0 | 244 | 0.8033 |
| No log | 62.0 | 248 | 0.8040 |
| No log | 63.0 | 252 | 0.8051 |
| No log | 64.0 | 256 | 0.8059 |
| No log | 65.0 | 260 | 0.8057 |
| No log | 66.0 | 264 | 0.8052 |
| No log | 67.0 | 268 | 0.8066 |
| No log | 68.0 | 272 | 0.8091 |
| No log | 69.0 | 276 | 0.8108 |
| No log | 70.0 | 280 | 0.8093 |
| No log | 71.0 | 284 | 0.8076 |
| No log | 72.0 | 288 | 0.8063 |
| No log | 73.0 | 292 | 0.8070 |
| No log | 74.0 | 296 | 0.8084 |
| No log | 75.0 | 300 | 0.8083 |
| No log | 76.0 | 304 | 0.8072 |
| No log | 77.0 | 308 | 0.8061 |
| No log | 78.0 | 312 | 0.8064 |
| No log | 79.0 | 316 | 0.8091 |
| No log | 80.0 | 320 | 0.8103 |
| No log | 81.0 | 324 | 0.8094 |
| No log | 82.0 | 328 | 0.8083 |
| No log | 83.0 | 332 | 0.8090 |
| No log | 84.0 | 336 | 0.8077 |
| No log | 85.0 | 340 | 0.8071 |
| No log | 86.0 | 344 | 0.8090 |
| No log | 87.0 | 348 | 0.8115 |
| No log | 88.0 | 352 | 0.8115 |
| No log | 89.0 | 356 | 0.8086 |
| No log | 90.0 | 360 | 0.8069 |
| No log | 91.0 | 364 | 0.8080 |
| No log | 92.0 | 368 | 0.8089 |
| No log | 93.0 | 372 | 0.8084 |
| No log | 94.0 | 376 | 0.8078 |
| No log | 95.0 | 380 | 0.8097 |
| No log | 96.0 | 384 | 0.8121 |
| No log | 97.0 | 388 | 0.8112 |
| No log | 98.0 | 392 | 0.8102 |
| No log | 99.0 | 396 | 0.8099 |
| No log | 100.0 | 400 | 0.8102 |
| No log | 101.0 | 404 | 0.8117 |
| No log | 102.0 | 408 | 0.8124 |
| No log | 103.0 | 412 | 0.8136 |
| No log | 104.0 | 416 | 0.8132 |
| No log | 105.0 | 420 | 0.8114 |
| No log | 106.0 | 424 | 0.8110 |
| No log | 107.0 | 428 | 0.8119 |
| No log | 108.0 | 432 | 0.8128 |
| No log | 109.0 | 436 | 0.8131 |
| No log | 110.0 | 440 | 0.8135 |
| No log | 111.0 | 444 | 0.8126 |
| No log | 112.0 | 448 | 0.8110 |
| No log | 113.0 | 452 | 0.8106 |
| No log | 114.0 | 456 | 0.8113 |
| No log | 115.0 | 460 | 0.8113 |
| No log | 116.0 | 464 | 0.8115 |
| No log | 117.0 | 468 | 0.8131 |
| No log | 118.0 | 472 | 0.8141 |
| No log | 119.0 | 476 | 0.8145 |
| No log | 120.0 | 480 | 0.8141 |
| No log | 121.0 | 484 | 0.8124 |
| No log | 122.0 | 488 | 0.8115 |
| No log | 123.0 | 492 | 0.8112 |
| No log | 124.0 | 496 | 0.8110 |
| 0.0586 | 125.0 | 500 | 0.8112 |
| 0.0586 | 126.0 | 504 | 0.8111 |
| 0.0586 | 127.0 | 508 | 0.8106 |
| 0.0586 | 128.0 | 512 | 0.8105 |
| 0.0586 | 129.0 | 516 | 0.8111 |
| 0.0586 | 130.0 | 520 | 0.8115 |
| 0.0586 | 131.0 | 524 | 0.8118 |
| 0.0586 | 132.0 | 528 | 0.8110 |
| 0.0586 | 133.0 | 532 | 0.8108 |
| 0.0586 | 134.0 | 536 | 0.8118 |
| 0.0586 | 135.0 | 540 | 0.8121 |
| 0.0586 | 136.0 | 544 | 0.8117 |
| 0.0586 | 137.0 | 548 | 0.8120 |
| 0.0586 | 138.0 | 552 | 0.8126 |
| 0.0586 | 139.0 | 556 | 0.8139 |
| 0.0586 | 140.0 | 560 | 0.8143 |
| 0.0586 | 141.0 | 564 | 0.8137 |
| 0.0586 | 142.0 | 568 | 0.8133 |
| 0.0586 | 143.0 | 572 | 0.8132 |
| 0.0586 | 144.0 | 576 | 0.8134 |
| 0.0586 | 145.0 | 580 | 0.8133 |
| 0.0586 | 146.0 | 584 | 0.8132 |
| 0.0586 | 147.0 | 588 | 0.8129 |
| 0.0586 | 148.0 | 592 | 0.8133 |
| 0.0586 | 149.0 | 596 | 0.8141 |
| 0.0586 | 150.0 | 600 | 0.8145 |
| 0.0586 | 151.0 | 604 | 0.8146 |
| 0.0586 | 152.0 | 608 | 0.8137 |
| 0.0586 | 153.0 | 612 | 0.8127 |
| 0.0586 | 154.0 | 616 | 0.8123 |
| 0.0586 | 155.0 | 620 | 0.8123 |
| 0.0586 | 156.0 | 624 | 0.8128 |
| 0.0586 | 157.0 | 628 | 0.8136 |
| 0.0586 | 158.0 | 632 | 0.8142 |
| 0.0586 | 159.0 | 636 | 0.8144 |
| 0.0586 | 160.0 | 640 | 0.8144 |
| 0.0586 | 161.0 | 644 | 0.8143 |
| 0.0586 | 162.0 | 648 | 0.8139 |
| 0.0586 | 163.0 | 652 | 0.8134 |
| 0.0586 | 164.0 | 656 | 0.8130 |
| 0.0586 | 165.0 | 660 | 0.8126 |
| 0.0586 | 166.0 | 664 | 0.8126 |
| 0.0586 | 167.0 | 668 | 0.8129 |
| 0.0586 | 168.0 | 672 | 0.8134 |
| 0.0586 | 169.0 | 676 | 0.8136 |
| 0.0586 | 170.0 | 680 | 0.8138 |
| 0.0586 | 171.0 | 684 | 0.8138 |
| 0.0586 | 172.0 | 688 | 0.8138 |
| 0.0586 | 173.0 | 692 | 0.8138 |
| 0.0586 | 174.0 | 696 | 0.8135 |
| 0.0586 | 175.0 | 700 | 0.8129 |
| 0.0586 | 176.0 | 704 | 0.8126 |
| 0.0586 | 177.0 | 708 | 0.8126 |
| 0.0586 | 178.0 | 712 | 0.8127 |
| 0.0586 | 179.0 | 716 | 0.8129 |
| 0.0586 | 180.0 | 720 | 0.8132 |
| 0.0586 | 181.0 | 724 | 0.8135 |
| 0.0586 | 182.0 | 728 | 0.8135 |
| 0.0586 | 183.0 | 732 | 0.8135 |
| 0.0586 | 184.0 | 736 | 0.8132 |
| 0.0586 | 185.0 | 740 | 0.8129 |
| 0.0586 | 186.0 | 744 | 0.8127 |
| 0.0586 | 187.0 | 748 | 0.8126 |
| 0.0586 | 188.0 | 752 | 0.8125 |
| 0.0586 | 189.0 | 756 | 0.8125 |
| 0.0586 | 190.0 | 760 | 0.8125 |
| 0.0586 | 191.0 | 764 | 0.8125 |
| 0.0586 | 192.0 | 768 | 0.8126 |
| 0.0586 | 193.0 | 772 | 0.8127 |
| 0.0586 | 194.0 | 776 | 0.8126 |
| 0.0586 | 195.0 | 780 | 0.8126 |
| 0.0586 | 196.0 | 784 | 0.8126 |
| 0.0586 | 197.0 | 788 | 0.8126 |
| 0.0586 | 198.0 | 792 | 0.8126 |
| 0.0586 | 199.0 | 796 | 0.8126 |
| 0.0586 | 200.0 | 800 | 0.8126 |

### Framework versions

- Transformers 4.38.2
- Pytorch 2.2.1+cu121
- Datasets 2.18.0
- Tokenizers 0.15.2
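
A small optional sketch for checking a local environment against the versions listed above; it assumes all four packages are installed.

```python
# Prints installed versions for comparison with the list above.
import transformers, torch, datasets, tokenizers

for name, module in [
    ("Transformers", transformers),
    ("Pytorch", torch),
    ("Datasets", datasets),
    ("Tokenizers", tokenizers),
]:
    print(f"{name}: {module.__version__}")
```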