
gpt-imdb-alpha_0.3-beta_0.1

This model is a fine-tuned version of lvwerra/gpt2-imdb on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 25.4567
  • Rewards/chosen: -0.2859
  • Rewards/rejected: -1.2893
  • Rewards/accuracies: 0.8458
  • Rewards/margins: 1.0034
  • Logps/rejected: -276.5780
  • Logps/chosen: -238.1245
  • Logits/rejected: -31.6823
  • Logits/chosen: -32.1973
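
These metric names match those logged by DPO-style preference training, in which the implicit reward of a completion y for a prompt x is r(x, y) = β · log(π_θ(y|x) / π_ref(y|x)) and Rewards/margins is the mean gap between the chosen and rejected rewards. The final figures above are internally consistent with that definition: -0.2859 - (-1.2893) = 1.0034.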

Model description

More information needed

Intended uses & limitations

More information needed
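
No intended-use statement was provided, but since the base model is GPT-2 (124M parameters), the checkpoint can be loaded with the standard transformers text-generation API. A minimal sketch, assuming a hypothetical Hub repo id for this checkpoint:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical repo id -- replace with the actual Hub path of this checkpoint.
model_id = "<user>/gpt-imdb-alpha_0.3-beta_0.1"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# The base model (lvwerra/gpt2-imdb) was trained on IMDB movie reviews,
# so review-style prompts are the natural input.
inputs = tokenizer("This movie was", return_tensors="pt")
outputs = model.generate(
    **inputs,
    max_new_tokens=40,
    do_sample=True,
    top_p=0.9,
    pad_token_id=tokenizer.eos_token_id,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```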

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a sketch of the corresponding training setup follows the list):

  • learning_rate: 1e-05
  • train_batch_size: 24
  • eval_batch_size: 24
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.99) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_steps: 150
  • num_epochs: 3
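
The reward-based metrics and the `beta_0.1` suffix in the model name suggest a DPO-style preference-tuning run. A minimal sketch of how these hyperparameters could map onto TRL's `DPOTrainer` (API as in TRL 0.7.x); the dataset, the `beta` value, and the meaning of `alpha` are assumptions, as none of them are recorded in this card:

```python
from datasets import Dataset
from transformers import AutoModelForCausalLM, AutoTokenizer, TrainingArguments
from trl import DPOTrainer

base = "lvwerra/gpt2-imdb"
model = AutoModelForCausalLM.from_pretrained(base)       # policy to be tuned
ref_model = AutoModelForCausalLM.from_pretrained(base)   # frozen reference
tokenizer = AutoTokenizer.from_pretrained(base)
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 has no pad token

# Dummy preference pairs just to make the sketch self-contained;
# the real training dataset is not recorded in this card.
train_dataset = Dataset.from_dict({
    "prompt": ["This movie was"],
    "chosen": [" a moving, beautifully acted drama."],
    "rejected": [" bad."],
})

# Hyperparameters taken from the list above.
args = TrainingArguments(
    output_dir="gpt-imdb-alpha_0.3-beta_0.1",
    learning_rate=1e-5,
    per_device_train_batch_size=24,
    per_device_eval_batch_size=24,
    num_train_epochs=3,
    lr_scheduler_type="cosine",
    warmup_steps=150,
    adam_beta1=0.9,
    adam_beta2=0.99,
    adam_epsilon=1e-8,
    seed=42,
)

trainer = DPOTrainer(
    model,
    ref_model,
    args=args,
    beta=0.1,  # assumed from the "beta_0.1" suffix in the model name
    train_dataset=train_dataset,
    tokenizer=tokenizer,
)
trainer.train()
```

Newer TRL releases replace `TrainingArguments` with a dedicated `DPOConfig`, so the exact keyword arguments depend on the TRL version actually used.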

Training results

| Training Loss | Epoch | Step | Validation Loss | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Logps/rejected | Logps/chosen | Logits/rejected | Logits/chosen |
|:-------------:|:-----:|:----:|:---------------:|:--------------:|:----------------:|:------------------:|:---------------:|:--------------:|:------------:|:---------------:|:-------------:|
| 0.3872        | 0.21  | 500  | 0.9032          | -0.0063        | -0.4921          | 0.7833             | 0.4858          | -268.6066      | -235.3286    | -32.2910        | -32.9554      |
| 0.937         | 0.42  | 1000 | 0.5782          | 0.3739         | -0.2273          | 0.7667             | 0.6012          | -265.9586      | -231.5264    | -33.2571        | -33.9060      |
| 1.6799        | 0.63  | 1500 | 3.1537          | 0.2527         | -0.4167          | 0.7729             | 0.6694          | -267.8524      | -232.7385    | -33.1089        | -33.5974      |
| 0.8141        | 0.83  | 2000 | 1.8978          | 0.1800         | -0.6646          | 0.7917             | 0.8446          | -270.3312      | -233.4657    | -32.3310        | -32.9275      |
| 0.4758        | 1.04  | 2500 | 7.5225          | 0.0635         | -0.8693          | 0.8188             | 0.9329          | -272.3785      | -234.6298    | -32.0571        | -32.5700      |
| 0.5184        | 1.25  | 3000 | 2.2710          | 0.3736         | -0.5136          | 0.8021             | 0.8872          | -268.8213      | -231.5289    | -33.9791        | -34.4883      |
| 0.3571        | 1.46  | 3500 | 12.0724         | 0.0389         | -0.9119          | 0.8125             | 0.9507          | -272.8040      | -234.8766    | -32.0986        | -32.6149      |
| 1.8478        | 1.67  | 4000 | 14.8072         | 0.0021         | -0.9754          | 0.8229             | 0.9775          | -273.4396      | -235.2442    | -32.4363        | -32.9745      |
| 0.6874        | 1.88  | 4500 | 5.9952          | 0.0487         | -0.9284          | 0.8167             | 0.9771          | -272.9694      | -234.7781    | -32.9101        | -33.4694      |
| 0.2233        | 2.08  | 5000 | 11.0797         | -0.2853        | -1.2611          | 0.8479             | 0.9758          | -276.2962      | -238.1182    | -31.8450        | -32.3602      |
| 0.1784        | 2.29  | 5500 | 7.9899          | -0.1567        | -1.1325          | 0.8375             | 0.9757          | -275.0099      | -236.8327    | -32.0292        | -32.5741      |
| 0.2919        | 2.5   | 6000 | 29.0523         | -0.3295        | -1.3283          | 0.8500             | 0.9988          | -276.9686      | -238.5604    | -31.4315        | -31.9371      |
| 2.011         | 2.71  | 6500 | 28.3221         | -0.2974        | -1.3018          | 0.8458             | 1.0044          | -276.7031      | -238.2393    | -31.6565        | -32.1763      |
| 1.7899        | 2.92  | 7000 | 25.4567         | -0.2859        | -1.2893          | 0.8458             | 1.0034          | -276.5780      | -238.1245    | -31.6823        | -32.1973      |

Framework versions

  • Transformers 4.35.2
  • Pytorch 2.1.1
  • Datasets 2.15.0
  • Tokenizers 0.15.0