metadata

library_name: transformers
license: apache-2.0
base_model: distilgpt2
tags:
  - generated_from_trainer
model-index:
  - name: distilgpt2-finetuned
    results: []

distilgpt2-finetuned

This model is a fine-tuned version of distilgpt2 on an unknown dataset. It achieves the following results on the evaluation set:

Loss: 3.6391
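
For language models, the loss is often easier to interpret as perplexity. The sketch below assumes the reported value is the mean per-token cross-entropy in nats, which is what the Trainer usually reports for causal language modeling. The card does not state this explicitly. The helper name perplexity is illustrative.

```python
import math


def perplexity(mean_cross_entropy: float) -> float:
    """Convert a mean per-token cross-entropy (in nats) to perplexity."""
    return math.exp(mean_cross_entropy)


# Evaluation loss reported on this card.
eval_loss = 3.6391

# Assumption: the loss is averaged per token and measured in nats.
eval_perplexity = perplexity(eval_loss)
print(f"eval loss {eval_loss:.4f} -> perplexity {eval_perplexity:.2f}")
```

Under that assumption the evaluation perplexity is roughly 38. Because the dataset is unknown, this figure is not comparable to perplexities reported on other corpora.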

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 5e-05
train_batch_size: 16
eval_batch_size: 16
seed: 42
gradient_accumulation_steps: 2
total_train_batch_size: 32
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 1
mixed_precision_training: Native AMP
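
The list above can be mapped onto keyword arguments for transformers.TrainingArguments, as sketched below. This is a reconstruction, not the original training script. The output_dir value and the evaluation and logging settings (eval_strategy, eval_steps, logging_steps) are assumptions inferred from the 50-step cadence in the results table. Reading "Native AMP" as fp16=True and treating the run as single-device are also assumptions. The optimizer is left at the Trainer default. The helpers effective_batch_size and linear_lr are illustrative names, and linear_lr assumes no warmup because none is listed.

```python
# Keyword arguments for transformers.TrainingArguments, reconstructed from the
# hyperparameter list. Entries marked "assumption" are not stated on the card.
TRAINING_KWARGS = {
    "output_dir": "distilgpt2-finetuned",  # assumption: taken from the model name
    "learning_rate": 5e-5,
    "per_device_train_batch_size": 16,
    "per_device_eval_batch_size": 16,
    "seed": 42,
    "gradient_accumulation_steps": 2,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 1,
    "fp16": True,  # assumption: "Native AMP" read as fp16; needs a CUDA device
    "eval_strategy": "steps",  # assumption: inferred from the results table
    "eval_steps": 50,  # assumption: inferred from the results table
    "logging_steps": 50,  # assumption: inferred from the results table
}


def build_training_args():
    """Create TrainingArguments; the import is lazy so the dict can be
    inspected without transformers installed."""
    from transformers import TrainingArguments

    return TrainingArguments(**TRAINING_KWARGS)


def effective_batch_size(per_device: int, accumulation_steps: int, num_devices: int = 1) -> int:
    """Number of sequences contributing to one optimizer step."""
    return per_device * accumulation_steps * num_devices


def linear_lr(step: int, total_steps: int, base_lr: float = 5e-5, warmup_steps: int = 0) -> float:
    """Linear warmup followed by linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# 16 sequences per device x 2 accumulation steps x 1 device (assumed).
print(effective_batch_size(16, 2))
# Learning rate at the start, middle, and end of a hypothetical 1000-step run.
print(linear_lr(0, 1000), linear_lr(500, 1000), linear_lr(1000, 1000))
```

The batch-size arithmetic reproduces total_train_batch_size: 32 only when a single device is assumed, which is why the sketch defaults num_devices to 1.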

Training results

Training Loss	Epoch	Step	Validation Loss
4.0748	0.0436	50	3.8923
3.8414	0.0871	100	3.8125
3.8957	0.1307	150	3.7769
3.8723	0.1743	200	3.7545
4.0205	0.2179	250	3.7336
3.7175	0.2614	300	3.7282
3.7778	0.3050	350	3.7111
3.7763	0.3486	400	3.6994
3.8142	0.3922	450	3.6945
3.7654	0.4357	500	3.6831
3.9636	0.4793	550	3.6773
3.7030	0.5229	600	3.6692
3.6114	0.5664	650	3.6647
3.6269	0.6100	700	3.6591
3.6930	0.6536	750	3.6564
3.7969	0.6972	800	3.6529
3.6011	0.7407	850	3.6491
3.4943	0.7843	900	3.6466
3.7543	0.8279	950	3.6440
3.8610	0.8715	1000	3.6406
3.5354	0.9150	1050	3.6401
3.6661	0.9586	1100	3.6396
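
The table can be summarized with a few lines of arithmetic. The sketch below copies the rows into Python, checks whether validation loss falls at every evaluation, and estimates the number of optimizer steps per epoch from the Step and Epoch columns. The variable names are illustrative, and the steps-per-epoch figure is only an estimate because the Epoch column is rounded to four decimals.

```python
# (training loss, epoch, step, validation loss), copied from the table above.
ROWS = [
    (4.0748, 0.0436, 50, 3.8923),
    (3.8414, 0.0871, 100, 3.8125),
    (3.8957, 0.1307, 150, 3.7769),
    (3.8723, 0.1743, 200, 3.7545),
    (4.0205, 0.2179, 250, 3.7336),
    (3.7175, 0.2614, 300, 3.7282),
    (3.7778, 0.3050, 350, 3.7111),
    (3.7763, 0.3486, 400, 3.6994),
    (3.8142, 0.3922, 450, 3.6945),
    (3.7654, 0.4357, 500, 3.6831),
    (3.9636, 0.4793, 550, 3.6773),
    (3.7030, 0.5229, 600, 3.6692),
    (3.6114, 0.5664, 650, 3.6647),
    (3.6269, 0.6100, 700, 3.6591),
    (3.6930, 0.6536, 750, 3.6564),
    (3.7969, 0.6972, 800, 3.6529),
    (3.6011, 0.7407, 850, 3.6491),
    (3.4943, 0.7843, 900, 3.6466),
    (3.7543, 0.8279, 950, 3.6440),
    (3.8610, 0.8715, 1000, 3.6406),
    (3.5354, 0.9150, 1050, 3.6401),
    (3.6661, 0.9586, 1100, 3.6396),
]

val_losses = [row[3] for row in ROWS]

# True when every evaluation improves on the previous one.
strictly_decreasing = all(later < earlier for earlier, later in zip(val_losses, val_losses[1:]))

# Improvement across the whole logged run and across the final 50 steps.
total_drop = val_losses[0] - val_losses[-1]
last_interval_drop = val_losses[-2] - val_losses[-1]

# Estimate of optimizer steps per epoch from the last logged row.
_, last_epoch, last_step, _ = ROWS[-1]
steps_per_epoch = last_step / last_epoch

print(f"validation loss strictly decreasing: {strictly_decreasing}")
print(f"total drop: {total_drop:.4f}, drop over last 50 steps: {last_interval_drop:.4f}")
print(f"estimated steps per epoch: {steps_per_epoch:.1f}")
```

Validation loss falls at every evaluation, by about 0.25 in total, while the training loss is noisier because each entry reflects only recent batches. The improvement over the last 50 steps is about 0.0005, which suggests the curve has largely flattened within the single epoch.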

Framework versions

Transformers 4.45.1
PyTorch 2.4.0
Datasets 3.0.1
Tokenizers 0.20.0
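
To compare a local environment against the versions listed above, a small check along the following lines may help. It uses only the standard library. The distribution names (in particular torch for PyTorch) and the helper name check_versions are assumptions, and local build suffixes after a "+" are ignored when comparing.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions listed on this card, keyed by the assumed pip distribution name.
EXPECTED = {
    "transformers": "4.45.1",
    "torch": "2.4.0",
    "datasets": "3.0.1",
    "tokenizers": "0.20.0",
}


def check_versions(expected):
    """Return {name: (wanted, installed_or_None, matches)} for each package."""
    report = {}
    for name, wanted in expected.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            installed = None
        # Drop any local build suffix (the part after "+") before comparing.
        matches = installed is not None and installed.split("+")[0] == wanted
        report[name] = (wanted, installed, matches)
    return report


for name, (wanted, installed, ok) in check_versions(EXPECTED).items():
    status = "ok" if ok else "differs"
    print(f"{name}: expected {wanted}, found {installed} ({status})")
```

A mismatch does not necessarily prevent the model from loading. The check only reports where the environment differs from the one recorded here.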