
gpt-imdb-alpha_0.5-beta_0.1

This model is a fine-tuned version of lvwerra/gpt2-imdb on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 5886.0698
  • Rewards/chosen: -0.0051
  • Rewards/rejected: -0.7543
  • Rewards/accuracies: 0.8125
  • Rewards/margins: 0.7492
  • Logps/rejected: -271.2288
  • Logps/chosen: -235.3164
  • Logits/rejected: -35.8752
  • Logits/chosen: -36.3770
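As a quick consistency check on the numbers above (a sketch, not part of the original card): the margins metric is expected to equal the chosen reward minus the rejected reward, a convention used by DPO-style trainers such as the one in TRL, which these metric names resemble.

```python
# Sanity check on the reported metrics: rewards/margins should equal
# rewards/chosen minus rewards/rejected, up to rounding in the card.
chosen = -0.0051
rejected = -0.7543
margin = chosen - rejected
print(round(margin, 4))  # 0.7492, matching the reported rewards/margins
```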

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 1e-05
  • train_batch_size: 24
  • eval_batch_size: 24
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.99) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_steps: 150
  • num_epochs: 3
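A minimal sketch of the learning-rate curve these settings imply, assuming the Hugging Face convention of linear warmup followed by cosine decay. The total step count is not stated in the card; roughly 7200 is assumed here from the training log (step 7000 at epoch 2.92, so about 2400 steps per epoch over 3 epochs).

```python
import math

# Sketch of cosine decay with linear warmup (lr=1e-5, 150 warmup steps).
# total_steps is an assumption, not stated in the card.
def lr_at(step, base_lr=1e-5, warmup_steps=150, total_steps=7200):
    if step < warmup_steps:
        return base_lr * (step / warmup_steps)  # linear warmup
    progress = (step - warmup_steps) / (total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))  # cosine decay

print(lr_at(75))    # halfway through warmup: 5e-06
print(lr_at(150))   # peak learning rate: 1e-05
print(lr_at(7200))  # end of training: 0.0
```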

Training results

| Training Loss | Epoch | Step | Validation Loss | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Logps/rejected | Logps/chosen | Logits/rejected | Logits/chosen |
|:---|:---|:---|:---|:---|:---|:---|:---|:---|:---|:---|:---|
| 0.4827 | 0.21 | 500 | 1.0040 | 0.1562 | -0.2070 | 0.7042 | 0.3632 | -265.7552 | -233.7028 | -33.0609 | -33.6065 |
| 14.1335 | 0.42 | 1000 | 3.0758 | 0.3762 | -0.1852 | 0.7438 | 0.5615 | -265.5375 | -231.5030 | -34.4930 | -35.0582 |
| 0.5469 | 0.63 | 1500 | 8.0814 | 0.4345 | -0.1070 | 0.7271 | 0.5415 | -264.7556 | -230.9207 | -32.8344 | -33.3794 |
| 1.032 | 0.83 | 2000 | 4.5711 | 0.5518 | -0.0259 | 0.7104 | 0.5777 | -263.9442 | -229.7469 | -33.4042 | -33.9772 |
| 0.3719 | 1.04 | 2500 | 459.9075 | 0.2914 | -0.4286 | 0.7813 | 0.7200 | -267.9716 | -232.3516 | -33.0798 | -33.6079 |
| 0.4085 | 1.25 | 3000 | 526.3080 | 0.4340 | -0.2325 | 0.7479 | 0.6666 | -266.0108 | -230.9248 | -35.2424 | -35.7675 |
| 2.1291 | 1.46 | 3500 | 630.5800 | 0.4165 | -0.3073 | 0.7604 | 0.7238 | -266.7581 | -231.0998 | -37.0077 | -37.6012 |
| 4.7118 | 1.67 | 4000 | 96.2745 | 0.3115 | -0.4479 | 0.7625 | 0.7593 | -268.1639 | -232.1506 | -37.1158 | -37.6120 |
| 0.5195 | 1.88 | 4500 | 342.8383 | 0.3188 | -0.4079 | 0.7688 | 0.7267 | -267.7646 | -232.0775 | -37.3006 | -37.8729 |
| 0.8474 | 2.08 | 5000 | 4552.9634 | -0.0725 | -0.8330 | 0.7896 | 0.7605 | -272.0149 | -235.9899 | -36.5234 | -37.0376 |
| 0.2874 | 2.29 | 5500 | 3540.6086 | 0.0246 | -0.7477 | 0.8083 | 0.7723 | -271.1625 | -235.0193 | -36.0173 | -36.5541 |
| 2.4701 | 2.5 | 6000 | 4522.3066 | -0.0217 | -0.7825 | 0.8042 | 0.7608 | -271.5105 | -235.4827 | -35.7649 | -36.2731 |
| 0.59 | 2.71 | 6500 | 4948.8481 | 0.0070 | -0.7472 | 0.8104 | 0.7542 | -271.1574 | -235.1950 | -35.8213 | -36.3258 |
| 0.3244 | 2.92 | 7000 | 5886.0698 | -0.0051 | -0.7543 | 0.8125 | 0.7492 | -271.2288 | -235.3164 | -35.8752 | -36.3770 |
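A hedged sketch (not from the card) of how DPO-style reward columns like those above are typically derived: each completion's reward is beta times the policy-vs-reference log-probability ratio, and the preference loss is the negative log-sigmoid of the margin. Here beta = 0.1 is assumed from the "beta_0.1" in the model name, and the log-probabilities are toy numbers of the same magnitude as the Logps columns.

```python
import math

# Toy illustration of DPO reward accounting. beta=0.1 is an assumption
# (taken from the model name), and the log-probs are illustrative only.
def dpo_rewards(policy_logp_c, ref_logp_c, policy_logp_r, ref_logp_r, beta=0.1):
    reward_chosen = beta * (policy_logp_c - ref_logp_c)
    reward_rejected = beta * (policy_logp_r - ref_logp_r)
    margin = reward_chosen - reward_rejected
    loss = -math.log(1.0 / (1.0 + math.exp(-margin)))  # -log sigmoid(margin)
    return reward_chosen, reward_rejected, margin, loss

rc, rr, margin, loss = dpo_rewards(-235.3, -235.2, -271.2, -263.7)
print(rc, rr, margin)  # chosen reward near zero, rejected clearly negative
```

A positive margin with loss below log 2 means the policy prefers the chosen completion more strongly than the reference does, which is the trend the accuracy column (0.8125 by the end) reflects.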

Framework versions

  • Transformers 4.35.2
  • Pytorch 2.1.1
  • Datasets 2.15.0
  • Tokenizers 0.15.0
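To reproduce this environment, the versions above can be pinned in a `requirements.txt` (a sketch; note the PyPI package name is `torch`, not `Pytorch`):

```
transformers==4.35.2
torch==2.1.1
datasets==2.15.0
tokenizers==0.15.0
```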
Model size

  • 124M parameters (F32 tensors, Safetensors format)