---
license: mit
base_model: gpt2
tags:
- generated_from_trainer
model-index:
- name: lig_model_1
  results: []
---

# lig_model_1

This model is a fine-tuned version of [gpt2](https://huggingface.co/gpt2) on an unspecified dataset.
It achieves the following results on the evaluation set:
- Loss: 3.7688

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.0005
- train_batch_size: 32
- eval_batch_size: 32
- seed: 42
- gradient_accumulation_steps: 8
- total_train_batch_size: 256
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_steps: 10
- num_epochs: 15

### Training results

| Training Loss | Epoch   | Step | Validation Loss |
|:-------------:|:-------:|:----:|:---------------:|
| 6.8679        | 0.2310  | 8    | 6.6211          |
| 6.3894        | 0.4621  | 16   | 6.3666          |
| 6.2641        | 0.6931  | 24   | 6.2481          |
| 6.1285        | 0.9242  | 32   | 6.0829          |
| 5.9436        | 1.1552  | 40   | 5.8900          |
| 5.8073        | 1.3863  | 48   | 5.7490          |
| 5.7164        | 1.6173  | 56   | 5.6617          |
| 5.6019        | 1.8484  | 64   | 5.5778          |
| 5.5427        | 2.0794  | 72   | 5.4886          |
| 5.454         | 2.3105  | 80   | 5.3954          |
| 5.3546        | 2.5415  | 88   | 5.3066          |
| 5.3014        | 2.7726  | 96   | 5.2124          |
| 5.2448        | 3.0036  | 104  | 5.1365          |
| 5.1185        | 3.2347  | 112  | 5.0765          |
| 5.0938        | 3.4657  | 120  | 5.0071          |
| 5.0347        | 3.6968  | 128  | 4.9339          |
| 4.9681        | 3.9278  | 136  | 4.8552          |
| 4.8323        | 4.1588  | 144  | 4.7821          |
| 4.7912        | 4.3899  | 152  | 4.7215          |
| 4.7225        | 4.6209  | 160  | 4.6431          |
| 4.6433        | 4.8520  | 168  | 4.5701          |
| 4.5309        | 5.0830  | 176  | 4.5002          |
| 4.4506        | 5.3141  | 184  | 4.4442          |
| 4.4097        | 5.5451  | 192  | 4.3820          |
| 4.3871        | 5.7762  | 200  | 4.3290          |
| 4.3345        | 6.0072  | 208  | 4.2869          |
| 4.2004        | 6.2383  | 216  | 4.2412          |
| 4.1716        | 6.4693  | 224  | 4.1978          |
| 4.1536        | 6.7004  | 232  | 4.1607          |
| 4.0975        | 6.9314  | 240  | 4.1294          |
| 3.9743        | 7.1625  | 248  | 4.1014          |
| 3.922         | 7.3935  | 256  | 4.0654          |
| 3.939         | 7.6245  | 264  | 4.0378          |
| 3.9208        | 7.8556  | 272  | 4.0102          |
| 3.8083        | 8.0866  | 280  | 3.9812          |
| 3.7611        | 8.3177  | 288  | 3.9630          |
| 3.7668        | 8.5487  | 296  | 3.9407          |
| 3.7285        | 8.7798  | 304  | 3.9183          |
| 3.6996        | 9.0108  | 312  | 3.8958          |
| 3.5754        | 9.2419  | 320  | 3.8825          |
| 3.5708        | 9.4729  | 328  | 3.8702          |
| 3.5607        | 9.7040  | 336  | 3.8510          |
| 3.5688        | 9.9350  | 344  | 3.8387          |
| 3.4188        | 10.1661 | 352  | 3.8350          |
| 3.432         | 10.3971 | 360  | 3.8261          |
| 3.4236        | 10.6282 | 368  | 3.8131          |
| 3.3985        | 10.8592 | 376  | 3.8026          |
| 3.306         | 11.0903 | 384  | 3.7934          |
| 3.3196        | 11.3213 | 392  | 3.7919          |
| 3.3031        | 11.5523 | 400  | 3.7908          |
| 3.2851        | 11.7834 | 408  | 3.7817          |
| 3.2703        | 12.0144 | 416  | 3.7789          |
| 3.2132        | 12.2455 | 424  | 3.7818          |
| 3.1829        | 12.4765 | 432  | 3.7778          |
| 3.1968        | 12.7076 | 440  | 3.7749          |
| 3.2206        | 12.9386 | 448  | 3.7711          |
| 3.1521        | 13.1697 | 456  | 3.7694          |
| 3.1412        | 13.4007 | 464  | 3.7700          |
| 3.1415        | 13.6318 | 472  | 3.7709          |
| 3.1402        | 13.8628 | 480  | 3.7694          |
| 3.129         | 14.0939 | 488  | 3.7689          |
| 3.1221        | 14.3249 | 496  | 3.7687          |
| 3.1576        | 14.5560 | 504  | 3.7688          |

### Framework versions

- Transformers 4.41.0
- Pytorch 2.3.0+cu121
- Datasets 2.19.1
- Tokenizers 0.19.1
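As a rough guide, the hyperparameters listed above could be expressed as a `TrainingArguments` configuration for the Hugging Face `Trainer` API. This is a minimal sketch, not the training script actually used: the output path is hypothetical, and only the values mirror the "Training hyperparameters" section.

```python
from transformers import TrainingArguments

# Sketch of a configuration matching this card's listed hyperparameters.
# The output_dir is a hypothetical placeholder.
training_args = TrainingArguments(
    output_dir="lig_model_1",
    learning_rate=5e-4,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=32,
    seed=42,
    gradient_accumulation_steps=8,  # 32 x 8 = 256 total train batch size
    lr_scheduler_type="cosine",
    warmup_steps=10,
    num_train_epochs=15,
    # Adam with betas=(0.9, 0.999) and epsilon=1e-08 matches the
    # optimizer defaults, so no explicit optimizer settings are needed.
)
```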