---
base_model: lvwerra/gpt2-imdb
tags:
- generated_from_trainer
model-index:
- name: gpt-imdb-alpha_0.3-beta_0.1
  results: []
---

# gpt-imdb-alpha_0.3-beta_0.1

This model is a fine-tuned version of [lvwerra/gpt2-imdb](https://huggingface.co/lvwerra/gpt2-imdb) on an unknown dataset.
It achieves the following results on the evaluation set:
- Loss: 25.4567
- Rewards/chosen: -0.2859
- Rewards/rejected: -1.2893
- Rewards/accuracies: 0.8458
- Rewards/margins: 1.0034
- Logps/rejected: -276.5780
- Logps/chosen: -238.1245
- Logits/rejected: -31.6823
- Logits/chosen: -32.1973

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 1e-05
- train_batch_size: 24
- eval_batch_size: 24
- seed: 42
- optimizer: Adam with betas=(0.9,0.99) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_steps: 150
- num_epochs: 3

### Training results

| Training Loss | Epoch | Step | Validation Loss | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Logps/rejected | Logps/chosen | Logits/rejected | Logits/chosen |
|:-------------:|:-----:|:----:|:---------------:|:--------------:|:----------------:|:------------------:|:---------------:|:--------------:|:------------:|:---------------:|:-------------:|
| 0.3872        | 0.21  | 500  | 0.9032          | -0.0063        | -0.4921          | 0.7833             | 0.4858          | -268.6066      | -235.3286    | -32.2910        | -32.9554      |
| 0.937         | 0.42  | 1000 | 0.5782          | 0.3739         | -0.2273          | 0.7667             | 0.6012          | -265.9586      | -231.5264    | -33.2571        | -33.9060      |
| 1.6799        | 0.63  | 1500 | 3.1537          | 0.2527         | -0.4167          | 0.7729             | 0.6694          | -267.8524      | -232.7385    | -33.1089        | -33.5974      |
| 0.8141        | 0.83  | 2000 | 1.8978          | 0.1800         | -0.6646          | 0.7917             | 0.8446          | -270.3312      | -233.4657    | -32.3310        | -32.9275      |
| 0.4758        | 1.04  | 2500 | 7.5225          | 0.0635         | -0.8693          | 0.8188             | 0.9329          | -272.3785      | -234.6298    | -32.0571        | -32.5700      |
| 0.5184        | 1.25  | 3000 | 2.2710          | 0.3736         | -0.5136          | 0.8021             | 0.8872          | -268.8213      | -231.5289    | -33.9791        | -34.4883      |
| 0.3571        | 1.46  | 3500 | 12.0724         | 0.0389         | -0.9119          | 0.8125             | 0.9507          | -272.8040      | -234.8766    | -32.0986        | -32.6149      |
| 1.8478        | 1.67  | 4000 | 14.8072         | 0.0021         | -0.9754          | 0.8229             | 0.9775          | -273.4396      | -235.2442    | -32.4363        | -32.9745      |
| 0.6874        | 1.88  | 4500 | 5.9952          | 0.0487         | -0.9284          | 0.8167             | 0.9771          | -272.9694      | -234.7781    | -32.9101        | -33.4694      |
| 0.2233        | 2.08  | 5000 | 11.0797         | -0.2853        | -1.2611          | 0.8479             | 0.9758          | -276.2962      | -238.1182    | -31.8450        | -32.3602      |
| 0.1784        | 2.29  | 5500 | 7.9899          | -0.1567        | -1.1325          | 0.8375             | 0.9757          | -275.0099      | -236.8327    | -32.0292        | -32.5741      |
| 0.2919        | 2.5   | 6000 | 29.0523         | -0.3295        | -1.3283          | 0.8500             | 0.9988          | -276.9686      | -238.5604    | -31.4315        | -31.9371      |
| 2.011         | 2.71  | 6500 | 28.3221         | -0.2974        | -1.3018          | 0.8458             | 1.0044          | -276.7031      | -238.2393    | -31.6565        | -32.1763      |
| 1.7899        | 2.92  | 7000 | 25.4567         | -0.2859        | -1.2893          | 0.8458             | 1.0034          | -276.5780      | -238.1245    | -31.6823        | -32.1973      |

### Framework versions

- Transformers 4.35.2
- PyTorch 2.1.1
- Datasets 2.15.0
- Tokenizers 0.15.0
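
### Training setup sketch

The reward and log-probability metrics logged above (Rewards/chosen, Rewards/rejected, Logps/*, Logits/*) are the quantities reported by trl's `DPOTrainer`, and the `beta_0.1` suffix in the model name suggests a DPO beta of 0.1; neither is confirmed by this card. The following is therefore only a minimal sketch of the listed hyperparameters under those assumptions, using a trl release contemporary with the framework versions above (e.g., trl 0.7.x, where `DPOTrainer` still accepts `beta` and `tokenizer` directly). The tiny inline preference dataset is a placeholder, since the actual training data is not documented here.

```python
from datasets import Dataset
from transformers import AutoModelForCausalLM, AutoTokenizer, TrainingArguments
from trl import DPOTrainer

base = "lvwerra/gpt2-imdb"
model = AutoModelForCausalLM.from_pretrained(base)      # policy to fine-tune
ref_model = AutoModelForCausalLM.from_pretrained(base)  # frozen DPO reference
tokenizer = AutoTokenizer.from_pretrained(base)
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 defines no pad token

# Placeholder preference data (prompt / chosen / rejected columns);
# the real training set is unknown.
train_dataset = Dataset.from_dict({
    "prompt":   ["This movie was"],
    "chosen":   [" a moving, beautifully acted drama."],
    "rejected": [" dull and instantly forgettable."],
})

# Mirrors the hyperparameters listed above.
training_args = TrainingArguments(
    output_dir="gpt-imdb-alpha_0.3-beta_0.1",
    learning_rate=1e-5,
    per_device_train_batch_size=24,
    per_device_eval_batch_size=24,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.99,   # non-default; Adam's usual beta2 is 0.999
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_steps=150,
    num_train_epochs=3,
)

trainer = DPOTrainer(
    model=model,
    ref_model=ref_model,
    args=training_args,
    beta=0.1,  # assumption: inferred from "beta_0.1" in the model name
    train_dataset=train_dataset,
    tokenizer=tokenizer,
)
trainer.train()
```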
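
## How to use

A minimal inference sketch. The repo id below is a placeholder; substitute the actual Hub path (or local output directory) of this checkpoint.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "gpt-imdb-alpha_0.3-beta_0.1"  # placeholder: replace with the real repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# The base model was trained on IMDB text, so prompt with a movie review.
inputs = tokenizer("This movie was", return_tensors="pt")
outputs = model.generate(
    **inputs,
    max_new_tokens=50,
    do_sample=True,
    top_p=0.9,
    pad_token_id=tokenizer.eos_token_id,  # avoid GPT-2's missing-pad-token warning
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```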