
gpt-imdb-hinge-beta_0.1

This model is a fine-tuned version of lvwerra/gpt2-imdb on an unspecified preference dataset. It achieves the following results on the evaluation set:

  • Step: 5500
  • Loss: 0.1682
  • Rewards/chosen: -2.5613
  • Rewards/rejected: -6.0913
  • Rewards/accuracies: 0.9312
  • Rewards/margins: 3.5300
  • Logps/rejected: -324.5987
  • Logps/chosen: -260.8782
  • Logits/rejected: -45.3410
  • Logits/chosen: -46.5522
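
These metrics follow TRL's DPO conventions: Rewards/chosen and Rewards/rejected are β-scaled log-probability ratios between the trained policy and the reference model, and Rewards/margins is their difference. The model name suggests the hinge variant of the DPO loss with β = 0.1 (an inference from the name, not confirmed elsewhere in this card), which per preference pair (x, y_w, y_l) would be:

$$
\mathcal{L}_{\text{hinge}} = \max\left(0,\; 1 - \beta \left[\log\frac{\pi_\theta(y_w \mid x)}{\pi_{\text{ref}}(y_w \mid x)} - \log\frac{\pi_\theta(y_l \mid x)}{\pi_{\text{ref}}(y_l \mid x)}\right]\right)
$$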

Model description

More information needed

Intended uses & limitations

More information needed
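
As a minimal usage sketch, the snippet below loads the checkpoint for text generation with transformers. The Hub repository id is a placeholder, since this card does not state the full path.

```python
# Minimal generation sketch. The model id below is a placeholder -- replace it
# with the actual Hub path of this checkpoint.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "your-username/gpt-imdb-hinge-beta_0.1"  # hypothetical repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

prompt = "This movie was"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(
    **inputs,
    max_new_tokens=40,
    do_sample=True,
    top_k=50,
    pad_token_id=tokenizer.eos_token_id,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```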

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a hedged sketch of a matching TRL DPOTrainer call follows the list):

  • learning_rate: 1e-05
  • train_batch_size: 24
  • eval_batch_size: 24
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_steps: 150
  • num_epochs: 3
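
The sketch below shows one way these hyperparameters could map onto a TRL DPOTrainer run (TRL ~0.7.x, contemporary with the listed Transformers 4.35.2). The preference dataset path is a placeholder, and beta and loss_type are inferred from the model name, not stated in the card.

```python
# Hedged reconstruction of a training setup consistent with this card's
# hyperparameters; not the author's confirmed script.
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer, TrainingArguments
from trl import DPOTrainer

base = "lvwerra/gpt2-imdb"
model = AutoModelForCausalLM.from_pretrained(base)
ref_model = AutoModelForCausalLM.from_pretrained(base)  # frozen reference policy
tokenizer = AutoTokenizer.from_pretrained(base)
tokenizer.pad_token = tokenizer.eos_token

# Placeholder: any dataset with "prompt", "chosen", and "rejected" columns.
train_dataset = load_dataset("path/to/preference-dataset", split="train")

args = TrainingArguments(
    output_dir="gpt-imdb-hinge-beta_0.1",
    learning_rate=1e-5,
    per_device_train_batch_size=24,
    per_device_eval_batch_size=24,
    num_train_epochs=3,
    lr_scheduler_type="cosine",
    warmup_steps=150,
    seed=42,
    remove_unused_columns=False,  # required by DPOTrainer's data collator
)

trainer = DPOTrainer(
    model,
    ref_model,
    args=args,
    beta=0.1,           # inferred from the "beta_0.1" suffix in the model name
    loss_type="hinge",  # inferred from the "hinge" tag in the model name
    train_dataset=train_dataset,
    tokenizer=tokenizer,
)
trainer.train()
```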

Training results

| Training Loss | Epoch | Step | Validation Loss | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Logps/rejected | Logps/chosen | Logits/rejected | Logits/chosen |
|:-------------:|:-----:|:----:|:---------------:|:--------------:|:----------------:|:------------------:|:---------------:|:--------------:|:------------:|:---------------:|:-------------:|
| 0.3746 | 0.21 | 500  | 0.3940 | -0.4768 | -1.9553 | 0.8562 | 1.4785 | -283.2387 | -240.0334 | -33.1236 | -34.2065 |
| 0.3627 | 0.42 | 1000 | 0.3395 | -1.0759 | -2.9896 | 0.8646 | 1.9137 | -293.5812 | -246.0238 | -41.8545 | -42.9940 |
| 0.2687 | 0.63 | 1500 | 0.3229 | -1.7235 | -4.1025 | 0.8729 | 2.3790 | -304.7103 | -252.5004 | -39.8423 | -41.2043 |
| 0.1878 | 0.83 | 2000 | 0.2360 | -1.6708 | -4.3940 | 0.9104 | 2.7231 | -307.6249 | -251.9736 | -41.4970 | -42.6933 |
| 0.1936 | 1.04 | 2500 | 0.2124 | -1.9623 | -4.8688 | 0.9250 | 2.9066 | -312.3736 | -254.8880 | -42.8807 | -43.9675 |
| 0.2302 | 1.25 | 3000 | 0.2062 | -2.1959 | -5.2559 | 0.9021 | 3.0600 | -316.2442 | -257.2241 | -45.2090 | -46.3997 |
| 0.2137 | 1.46 | 3500 | 0.2235 | -2.1054 | -5.4204 | 0.9208 | 3.3150 | -317.8889 | -256.3190 | -46.5366 | -47.7024 |
| 0.2231 | 1.67 | 4000 | 0.1884 | -2.3281 | -5.6096 | 0.9208 | 3.2815 | -319.7815 | -258.5467 | -45.7720 | -46.8600 |
| 0.2269 | 1.88 | 4500 | 0.1785 | -2.5145 | -6.0015 | 0.9292 | 3.4871 | -323.7006 | -260.4101 | -45.7220 | -46.8746 |
| 0.1831 | 2.08 | 5000 | 0.1727 | -2.6850 | -6.2801 | 0.9312 | 3.5951 | -326.4862 | -262.1152 | -45.0514 | -46.1610 |
| 0.0112 | 2.29 | 5500 | 0.1682 | -2.5613 | -6.0913 | 0.9312 | 3.5300 | -324.5987 | -260.8782 | -45.3410 | -46.5522 |
| 0.1894 | 2.5  | 6000 | 0.1706 | -2.7334 | -6.3632 | 0.9271 | 3.6298 | -327.3174 | -262.5995 | -45.2020 | -46.4449 |
| 0.13   | 2.71 | 6500 | 0.1685 | -2.7681 | -6.4203 | 0.9250 | 3.6522 | -327.8886 | -262.9462 | -45.5580 | -46.8017 |
| 0.2717 | 2.92 | 7000 | 0.1683 | -2.7548 | -6.4029 | 0.9271 | 3.6481 | -327.7139 | -262.8134 | -45.7026 | -46.9404 |

Framework versions

  • Transformers 4.35.2
  • Pytorch 2.1.1
  • Datasets 2.15.0
  • Tokenizers 0.15.0