---
base_model: lvwerra/gpt2-imdb
tags:
- generated_from_trainer
model-index:
- name: gpt-imdb-hinge-beta_0.1
  results: []
---


# gpt-imdb-hinge-beta_0.1

This model is a fine-tuned version of [lvwerra/gpt2-imdb](https://huggingface.co/lvwerra/gpt2-imdb). The training dataset was not recorded when this card was generated.
It achieves the following results on the evaluation set:
- Step: 5500
- Loss: 0.1682
- Rewards/chosen: -2.5613
- Rewards/rejected: -6.0913
- Rewards/accuracies: 0.9312
- Rewards/margins: 3.5300
- Logps/rejected: -324.5987
- Logps/chosen: -260.8782
- Logits/rejected: -45.3410
- Logits/chosen: -46.5522

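The reward metrics above are related to one another. Assuming they follow the convention used by preference-optimization trainers (an assumption, since this card does not name the trainer), the margin is the chosen reward minus the rejected reward. A minimal check against the reported values:

```python
# Values copied from the evaluation results listed above.
rewards_chosen = -2.5613
rewards_rejected = -6.0913
reported_margin = 3.5300

# ASSUMPTION: margin = chosen reward - rejected reward
# (the usual convention for preference-based training; not stated in this card).
computed_margin = rewards_chosen - rewards_rejected

# Compare up to the four-decimal rounding used in the card.
margin_matches = abs(computed_margin - reported_margin) < 1e-4
print(f"computed margin: {computed_margin:.4f}, matches reported: {margin_matches}")
```

The same relationship holds for the other rows of the training results table, for example step 500, where -0.4768 minus -1.9553 gives 1.4785.
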
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 1e-05
- train_batch_size: 24
- eval_batch_size: 24
- seed: 42
- optimizer: Adam with betas=(0.9,0.99) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_steps: 150
- num_epochs: 3

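To illustrate how `learning_rate`, `lr_scheduler_type: cosine`, and `lr_scheduler_warmup_steps` interact, here is a minimal sketch of a cosine schedule with linear warmup. The total step count is not stated in this card; `TOTAL_STEPS = 7200` is a rough estimate derived from the step and epoch columns of the training results table, and the function name `cosine_lr` is hypothetical. The exact schedule used in training may differ in detail.

```python
import math

BASE_LR = 1e-05      # learning_rate from the list above
WARMUP_STEPS = 150   # lr_scheduler_warmup_steps from the list above
TOTAL_STEPS = 7200   # ASSUMPTION: estimated from ~2400 steps per epoch over 3 epochs


def cosine_lr(step: int) -> float:
    """Learning rate at a given optimizer step (hypothetical helper)."""
    if step < WARMUP_STEPS:
        # Linear warmup from 0 up to BASE_LR.
        return BASE_LR * step / WARMUP_STEPS
    # Cosine decay from BASE_LR down to 0 over the remaining steps.
    progress = (step - WARMUP_STEPS) / max(1, TOTAL_STEPS - WARMUP_STEPS)
    return BASE_LR * 0.5 * (1.0 + math.cos(math.pi * progress))


for step in (0, 75, 150, 3675, 7200):
    print(step, f"{cosine_lr(step):.2e}")
```

Under these assumptions the rate climbs linearly for the first 150 steps, peaks at `1e-05`, and then decays smoothly toward zero by the end of the third epoch.
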
### Training results

| Training Loss | Epoch | Step | Validation Loss | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Logps/rejected | Logps/chosen | Logits/rejected | Logits/chosen |
|:-------------:|:-----:|:----:|:---------------:|:--------------:|:----------------:|:------------------:|:---------------:|:--------------:|:------------:|:---------------:|:-------------:|
| 0.3746        | 0.21  | 500  | 0.3940          | -0.4768        | -1.9553          | 0.8562             | 1.4785          | -283.2387      | -240.0334    | -33.1236        | -34.2065      |
| 0.3627        | 0.42  | 1000 | 0.3395          | -1.0759        | -2.9896          | 0.8646             | 1.9137          | -293.5812      | -246.0238    | -41.8545        | -42.9940      |
| 0.2687        | 0.63  | 1500 | 0.3229          | -1.7235        | -4.1025          | 0.8729             | 2.3790          | -304.7103      | -252.5004    | -39.8423        | -41.2043      |
| 0.1878        | 0.83  | 2000 | 0.2360          | -1.6708        | -4.3940          | 0.9104             | 2.7231          | -307.6249      | -251.9736    | -41.4970        | -42.6933      |
| 0.1936        | 1.04  | 2500 | 0.2124          | -1.9623        | -4.8688          | 0.9250             | 2.9066          | -312.3736      | -254.8880    | -42.8807        | -43.9675      |
| 0.2302        | 1.25  | 3000 | 0.2062          | -2.1959        | -5.2559          | 0.9021             | 3.0600          | -316.2442      | -257.2241    | -45.2090        | -46.3997      |
| 0.2137        | 1.46  | 3500 | 0.2235          | -2.1054        | -5.4204          | 0.9208             | 3.3150          | -317.8889      | -256.3190    | -46.5366        | -47.7024      |
| 0.2231        | 1.67  | 4000 | 0.1884          | -2.3281        | -5.6096          | 0.9208             | 3.2815          | -319.7815      | -258.5467    | -45.7720        | -46.8600      |
| 0.2269        | 1.88  | 4500 | 0.1785          | -2.5145        | -6.0015          | 0.9292             | 3.4871          | -323.7006      | -260.4101    | -45.7220        | -46.8746      |
| 0.1831        | 2.08  | 5000 | 0.1727          | -2.6850        | -6.2801          | 0.9312             | 3.5951          | -326.4862      | -262.1152    | -45.0514        | -46.1610      |
| 0.0112        | 2.29  | 5500 | **0.1682**      | -2.5613        | -6.0913          | 0.9312             | 3.5300          | -324.5987      | -260.8782    | -45.3410        | -46.5522      |
| 0.1894        | 2.50  | 6000 | 0.1706          | -2.7334        | -6.3632          | 0.9271             | 3.6298          | -327.3174      | -262.5995    | -45.2020        | -46.4449      |
| 0.1300        | 2.71  | 6500 | 0.1685          | -2.7681        | -6.4203          | 0.9250             | 3.6522          | -327.8886      | -262.9462    | -45.5580        | -46.8017      |
| 0.2717        | 2.92  | 7000 | 0.1683          | -2.7548        | -6.4029          | 0.9271             | 3.6481          | -327.7139      | -262.8134    | -45.7026        | -46.9404      |

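The headline evaluation results at the top of this card correspond to step 5500, which has the lowest validation loss in the table. The sketch below reproduces that selection from the logged values. Whether the checkpoint was actually chosen this way during training is an assumption; the card does not say.

```python
# (step, validation loss) pairs copied from the training results table.
eval_log = [
    (500, 0.3940), (1000, 0.3395), (1500, 0.3229), (2000, 0.2360),
    (2500, 0.2124), (3000, 0.2062), (3500, 0.2235), (4000, 0.1884),
    (4500, 0.1785), (5000, 0.1727), (5500, 0.1682), (6000, 0.1706),
    (6500, 0.1685), (7000, 0.1683),
]

# Pick the evaluation step with the lowest validation loss.
best_step, best_loss = min(eval_log, key=lambda row: row[1])
print(f"best step: {best_step}, validation loss: {best_loss:.4f}")
```

The later evaluations at steps 6500 and 7000 are within about 0.0003 of this minimum, so the difference between these checkpoints is small.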

### Framework versions

- Transformers 4.35.2
- Pytorch 2.1.1
- Datasets 2.15.0
- Tokenizers 0.15.0