
47163343_0

This model is a fine-tuned version of openai-community/gpt2 on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.6417
  • Accuracy: 0.0001

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 1.41e-05
  • train_batch_size: 32
  • eval_batch_size: 4
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 2
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 256
  • total_eval_batch_size: 8
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • training_steps: 2000
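
The batch-size figures above are internally consistent: the total train batch size is the per-device batch size times the gradient-accumulation steps times the number of devices, and the total eval batch size is the per-device eval batch size times the number of devices (no gradient accumulation at eval time). A minimal sketch of that arithmetic:

```python
# Reproduce the effective batch sizes listed in the hyperparameters.
train_batch_size = 32            # per device
eval_batch_size = 4              # per device
gradient_accumulation_steps = 4
num_devices = 2

total_train_batch_size = train_batch_size * gradient_accumulation_steps * num_devices
total_eval_batch_size = eval_batch_size * num_devices

print(total_train_batch_size)  # 256, matching total_train_batch_size above
print(total_eval_batch_size)   # 8, matching total_eval_batch_size above
```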

Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|:-------------:|:-----:|:----:|:---------------:|:--------:|
| 1.0667 | 0.48 | 25 | 0.9624 | 0.0001 |
| 0.854 | 0.97 | 50 | 0.8076 | 0.0001 |
| 0.7965 | 1.45 | 75 | 0.7675 | 0.0002 |
| 0.7817 | 1.93 | 100 | 0.7461 | 0.0001 |
| 0.7668 | 2.42 | 125 | 0.7326 | 0.0001 |
| 0.7533 | 2.9 | 150 | 0.7225 | 0.0001 |
| 0.7479 | 3.38 | 175 | 0.7150 | 0.0001 |
| 0.7325 | 3.86 | 200 | 0.7104 | 0.0001 |
| 0.7397 | 4.35 | 225 | 0.7055 | 0.0001 |
| 0.7255 | 4.83 | 250 | 0.7020 | 0.0001 |
| 0.7182 | 5.31 | 275 | 0.6983 | 0.0001 |
| 0.7161 | 5.8 | 300 | 0.6955 | 0.0001 |
| 0.7162 | 6.28 | 325 | 0.6925 | 0.0001 |
| 0.7043 | 6.76 | 350 | 0.6901 | 0.0001 |
| 0.7076 | 7.25 | 375 | 0.6868 | 0.0001 |
| 0.7042 | 7.73 | 400 | 0.6844 | 0.0001 |
| 0.6975 | 8.21 | 425 | 0.6820 | 0.0001 |
| 0.7021 | 8.7 | 450 | 0.6799 | 0.0001 |
| 0.6955 | 9.18 | 475 | 0.6778 | 0.0001 |
| 0.6868 | 9.66 | 500 | 0.6763 | 0.0001 |
| 0.6866 | 10.14 | 525 | 0.6744 | 0.0001 |
| 0.6903 | 10.63 | 550 | 0.6723 | 0.0001 |
| 0.6786 | 11.11 | 575 | 0.6709 | 0.0001 |
| 0.6843 | 11.59 | 600 | 0.6695 | 0.0001 |
| 0.6835 | 12.08 | 625 | 0.6680 | 0.0001 |
| 0.6819 | 12.56 | 650 | 0.6669 | 0.0001 |
| 0.6804 | 13.04 | 675 | 0.6656 | 0.0001 |
| 0.6748 | 13.53 | 700 | 0.6642 | 0.0001 |
| 0.674 | 14.01 | 725 | 0.6637 | 0.0001 |
| 0.6731 | 14.49 | 750 | 0.6624 | 0.0001 |
| 0.681 | 14.98 | 775 | 0.6611 | 0.0001 |
| 0.6763 | 15.46 | 800 | 0.6602 | 0.0001 |
| 0.677 | 15.94 | 825 | 0.6597 | 0.0001 |
| 0.6725 | 16.43 | 850 | 0.6583 | 0.0001 |
| 0.6669 | 16.91 | 875 | 0.6574 | 0.0001 |
| 0.6682 | 17.39 | 900 | 0.6567 | 0.0001 |
| 0.669 | 17.87 | 925 | 0.6559 | 0.0001 |
| 0.6647 | 18.36 | 950 | 0.6554 | 0.0001 |
| 0.664 | 18.84 | 975 | 0.6549 | 0.0001 |
| 0.6563 | 19.32 | 1000 | 0.6542 | 0.0001 |
| 0.6656 | 19.81 | 1025 | 0.6533 | 0.0001 |
| 0.6634 | 20.29 | 1050 | 0.6530 | 0.0001 |
| 0.6592 | 20.77 | 1075 | 0.6521 | 0.0001 |
| 0.6558 | 21.26 | 1100 | 0.6514 | 0.0001 |
| 0.6664 | 21.74 | 1125 | 0.6511 | 0.0001 |
| 0.6561 | 22.22 | 1150 | 0.6504 | 0.0001 |
| 0.6634 | 22.71 | 1175 | 0.6499 | 0.0001 |
| 0.6679 | 23.19 | 1200 | 0.6491 | 0.0001 |
| 0.6625 | 23.67 | 1225 | 0.6489 | 0.0001 |
| 0.6619 | 24.15 | 1250 | 0.6483 | 0.0001 |
| 0.6495 | 24.64 | 1275 | 0.6479 | 0.0001 |
| 0.6547 | 25.12 | 1300 | 0.6474 | 0.0001 |
| 0.6649 | 25.6 | 1325 | 0.6469 | 0.0001 |
| 0.6551 | 26.09 | 1350 | 0.6466 | 0.0001 |
| 0.6547 | 26.57 | 1375 | 0.6463 | 0.0001 |
| 0.6546 | 27.05 | 1400 | 0.6458 | 0.0001 |
| 0.6576 | 27.54 | 1425 | 0.6456 | 0.0001 |
| 0.6576 | 28.02 | 1450 | 0.6452 | 0.0001 |
| 0.6568 | 28.5 | 1475 | 0.6448 | 0.0001 |
| 0.6596 | 28.99 | 1500 | 0.6446 | 0.0001 |
| 0.6538 | 29.47 | 1525 | 0.6443 | 0.0001 |
| 0.6488 | 29.95 | 1550 | 0.6440 | 0.0001 |
| 0.6433 | 30.43 | 1575 | 0.6437 | 0.0001 |
| 0.6583 | 30.92 | 1600 | 0.6435 | 0.0001 |
| 0.6575 | 31.4 | 1625 | 0.6432 | 0.0001 |
| 0.6465 | 31.88 | 1650 | 0.6430 | 0.0001 |
| 0.6495 | 32.37 | 1675 | 0.6429 | 0.0001 |
| 0.6487 | 32.85 | 1700 | 0.6427 | 0.0001 |
| 0.6571 | 33.33 | 1725 | 0.6426 | 0.0001 |
| 0.6463 | 33.82 | 1750 | 0.6425 | 0.0001 |
| 0.648 | 34.3 | 1775 | 0.6423 | 0.0001 |
| 0.6537 | 34.78 | 1800 | 0.6422 | 0.0001 |
| 0.6564 | 35.27 | 1825 | 0.6420 | 0.0001 |
| 0.6491 | 35.75 | 1850 | 0.6420 | 0.0001 |
| 0.6549 | 36.23 | 1875 | 0.6419 | 0.0001 |
| 0.6524 | 36.71 | 1900 | 0.6418 | 0.0001 |
| 0.6522 | 37.2 | 1925 | 0.6418 | 0.0001 |
| 0.655 | 37.68 | 1950 | 0.6417 | 0.0001 |
| 0.6614 | 38.16 | 1975 | 0.6417 | 0.0001 |
| 0.6451 | 38.65 | 2000 | 0.6417 | 0.0001 |
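
With `lr_scheduler_type: linear` over 2000 training steps, the learning rate decays linearly from its initial value toward zero. A rough sketch of that schedule, assuming zero warmup steps (the card does not list a warmup setting, so this is an assumption):

```python
# Sketch of a linear LR schedule with no warmup (warmup not stated in the card).
base_lr = 1.41e-5
training_steps = 2000

def lr_at(step):
    """Learning rate after `step` optimizer steps under linear decay to zero."""
    return base_lr * max(0.0, 1.0 - step / training_steps)

print(lr_at(0))     # 1.41e-05 (initial learning rate)
print(lr_at(1000))  # 7.05e-06 (half of the base LR at the midpoint)
print(lr_at(2000))  # 0.0 at the final step
```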

Framework versions

  • Transformers 4.37.0
  • Pytorch 2.0.0+cu118
  • Datasets 2.16.1
  • Tokenizers 0.15.1