
47185774_0

This model is a fine-tuned version of openai-community/gpt2 on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.5103
  • Accuracy: 0.0000
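For a causal language model such as GPT-2, an evaluation loss reported as mean per-token cross-entropy corresponds to a perplexity of exp(loss). A minimal sketch, assuming the reported loss is indeed the per-token mean:

```python
import math

# Evaluation loss from the table above (assumed to be mean cross-entropy per token).
eval_loss = 0.5103

# Perplexity is the exponential of the mean cross-entropy.
perplexity = math.exp(eval_loss)
print(round(perplexity, 3))  # ≈ 1.666
```

A perplexity this close to 1 suggests the target distribution is highly predictable for the fine-tuned model; without knowing the dataset, no stronger conclusion can be drawn.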

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 1.41e-05
  • train_batch_size: 32
  • eval_batch_size: 4
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 2
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 256
  • total_eval_batch_size: 8
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • training_steps: 2000
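The two "total" batch sizes above are derived, not independent, settings: the effective train batch size is the per-device batch size times the number of devices times the gradient-accumulation steps, and the effective eval batch size is the per-device eval batch size times the number of devices. A quick check of that arithmetic:

```python
# Per-device and parallelism settings from the hyperparameter list above.
train_batch_size = 32
eval_batch_size = 4
num_devices = 2
gradient_accumulation_steps = 4

# Effective global batch sizes as the Trainer computes them.
total_train_batch_size = train_batch_size * num_devices * gradient_accumulation_steps
total_eval_batch_size = eval_batch_size * num_devices

print(total_train_batch_size)  # 256
print(total_eval_batch_size)   # 8
```

Both values match the reported totals (256 and 8), confirming the listed settings are internally consistent.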

Training results

Training Loss | Epoch | Step | Validation Loss | Accuracy
1.0647 1.67 25 0.9608 0.0001
0.8454 3.33 50 0.7926 0.0001
0.7865 5.0 75 0.7448 0.0001
0.757 6.67 100 0.7207 0.0001
0.7338 8.33 125 0.7058 0.0000
0.721 10.0 150 0.6932 0.0000
0.7216 11.67 175 0.6845 0.0000
0.709 13.33 200 0.6766 0.0000
0.6942 15.0 225 0.6694 0.0000
0.6921 16.67 250 0.6633 0.0000
0.6945 18.33 275 0.6582 0.0000
0.6826 20.0 300 0.6524 0.0000
0.6702 21.67 325 0.6475 0.0000
0.6637 23.33 350 0.6428 0.0000
0.6695 25.0 375 0.6384 0.0000
0.6615 26.67 400 0.6334 0.0000
0.6491 28.33 425 0.6299 0.0000
0.6553 30.0 450 0.6254 0.0000
0.6473 31.67 475 0.6216 0.0000
0.6345 33.33 500 0.6183 0.0000
0.6486 35.0 525 0.6145 0.0000
0.6365 36.67 550 0.6109 0.0000
0.6313 38.33 575 0.6079 0.0000
0.636 40.0 600 0.6048 0.0000
0.6268 41.67 625 0.6015 0.0000
0.6274 43.33 650 0.5981 0.0000
0.6135 45.0 675 0.5952 0.0000
0.6219 46.67 700 0.5919 0.0000
0.6166 48.33 725 0.5890 0.0000
0.6136 50.0 750 0.5863 0.0000
0.6053 51.67 775 0.5837 0.0000
0.5999 53.33 800 0.5806 0.0000
0.6141 55.0 825 0.5782 0.0000
0.6058 56.67 850 0.5753 0.0000
0.6026 58.33 875 0.5728 0.0000
0.5949 60.0 900 0.5702 0.0000
0.6007 61.67 925 0.5675 0.0000
0.5927 63.33 950 0.5647 0.0000
0.593 65.0 975 0.5625 0.0000
0.5831 66.67 1000 0.5597 0.0000
0.6008 68.33 1025 0.5574 0.0000
0.5877 70.0 1050 0.5551 0.0000
0.5847 71.67 1075 0.5524 0.0000
0.5794 73.33 1100 0.5500 0.0000
0.5952 75.0 1125 0.5474 0.0000
0.5857 76.67 1150 0.5455 0.0000
0.5787 78.33 1175 0.5433 0.0000
0.5963 80.0 1200 0.5416 0.0000
0.564 81.67 1225 0.5390 0.0000
0.5691 83.33 1250 0.5368 0.0000
0.5837 85.0 1275 0.5352 0.0000
0.5668 86.67 1300 0.5333 0.0000
0.5716 88.33 1325 0.5319 0.0000
0.557 90.0 1350 0.5300 0.0000
0.5539 91.67 1375 0.5286 0.0000
0.5683 93.33 1400 0.5273 0.0000
0.5671 95.0 1425 0.5255 0.0000
0.5723 96.67 1450 0.5244 0.0000
0.5627 98.33 1475 0.5229 0.0000
0.5627 100.0 1500 0.5215 0.0000
0.5606 101.67 1525 0.5205 0.0000
0.5651 103.33 1550 0.5192 0.0000
0.5703 105.0 1575 0.5184 0.0000
0.5533 106.67 1600 0.5176 0.0000
0.5627 108.33 1625 0.5165 0.0000
0.5563 110.0 1650 0.5158 0.0000
0.5526 111.67 1675 0.5151 0.0000
0.5432 113.33 1700 0.5143 0.0000
0.5224 115.0 1725 0.5136 0.0000
0.5622 116.67 1750 0.5130 0.0000
0.5494 118.33 1775 0.5125 0.0000
0.554 120.0 1800 0.5121 0.0000
0.5495 121.67 1825 0.5116 0.0000
0.5458 123.33 1850 0.5113 0.0000
0.552 125.0 1875 0.5110 0.0000
0.5475 126.67 1900 0.5108 0.0000
0.5493 128.33 1925 0.5105 0.0000
0.5296 130.0 1950 0.5105 0.0000
0.5562 131.67 1975 0.5104 0.0000
0.561 133.33 2000 0.5103 0.0000
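With `lr_scheduler_type: linear` and no warmup steps listed, the learning rate decays linearly from its initial value to 0 over the 2000 training steps. A minimal sketch of that schedule (assuming zero warmup, since none is reported):

```python
# Settings from the hyperparameter list above.
initial_lr = 1.41e-05
training_steps = 2000

def linear_lr(step):
    """Learning rate at a given optimizer step under a linear decay
    schedule with no warmup: initial_lr scaled by the fraction of
    training remaining, clamped at 0."""
    return initial_lr * max(0.0, 1.0 - step / training_steps)

print(linear_lr(0))     # initial value, 1.41e-05
print(linear_lr(1000))  # halfway through training, half the initial value
print(linear_lr(2000))  # fully decayed to 0.0
```

This explains why the validation loss flattens near the end of the table: by step 1900–2000 the learning rate has nearly reached zero, so the last evaluations change only in the fourth decimal place.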

Framework versions

  • Transformers 4.37.0
  • Pytorch 2.0.0+cu118
  • Datasets 2.16.1
  • Tokenizers 0.15.1
Model size

  • 124M parameters (F32, Safetensors)