
# gpt2-finetuned-ar-gpt-20240408

This model is a fine-tuned version of [gpt2](https://huggingface.co/gpt2) on an unknown dataset. It achieves the following results on the evaluation set:

- Loss: 1.3609
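(Assuming this is the standard causal-LM cross-entropy in nats, a loss of 1.3609 corresponds to a perplexity of exp(1.3609) ≈ 3.90.)

The card does not yet include usage instructions, so here is a minimal inference sketch. The repo id is assumed to match the model name above (prepend the owning namespace if needed), and the prompt is purely illustrative.

```python
# Minimal inference sketch. Assumption: the model is published under the
# repo id below; adjust it to the actual namespace/model path.
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="gpt2-finetuned-ar-gpt-20240408",  # hypothetical repo id
)

print(generator("Once upon a time", max_new_tokens=50)[0]["generated_text"])
```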

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a hedged reproduction sketch follows the list):

- learning_rate: 0.0005
- train_batch_size: 32
- eval_batch_size: 32
- seed: 42
- gradient_accumulation_steps: 8
- total_train_batch_size: 256
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_steps: 1000
- num_epochs: 10
- mixed_precision_training: Native AMP
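
These settings map directly onto `transformers.TrainingArguments`. The sketch below mirrors the list; the dataset, preprocessing, and output paths are assumptions, since the card does not name the training data.

```python
# Hedged reproduction sketch: hyperparameters mirror the list above;
# datasets and output_dir are placeholders, not the author's actual setup.
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

model = AutoModelForCausalLM.from_pretrained("gpt2")
tokenizer = AutoTokenizer.from_pretrained("gpt2")

args = TrainingArguments(
    output_dir="gpt2-finetuned-ar-gpt-20240408",
    learning_rate=5e-4,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=32,
    seed=42,
    gradient_accumulation_steps=8,  # 32 * 8 = 256 effective batch size
    lr_scheduler_type="cosine",
    warmup_steps=1000,
    num_train_epochs=10,
    fp16=True,                      # "Native AMP" mixed precision
    evaluation_strategy="epoch",
)

# train_dataset / eval_dataset are unknown; plug in your own tokenized data.
# trainer = Trainer(model=model, args=args,
#                   train_dataset=..., eval_dataset=...)
# trainer.train()
```

Adam with betas=(0.9, 0.999) and epsilon=1e-08 is the `Trainer` default, so it needs no explicit argument.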

### Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| No log        | 1.0   | 25   | 1.6288          |
| No log        | 1.99  | 50   | 1.5562          |
| No log        | 2.99  | 75   | 1.5183          |
| No log        | 3.98  | 100  | 1.4925          |
| No log        | 4.98  | 125  | 1.4790          |
| No log        | 5.97  | 150  | 1.4629          |
| No log        | 6.97  | 175  | 1.4544          |
| No log        | 8.0   | 201  | 1.4106          |
| No log        | 9.0   | 226  | 1.3789          |
| No log        | 9.95  | 250  | 1.3609          |
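
With an effective batch size of 256 and roughly 25 optimizer steps per epoch, the training set appears to contain on the order of 25 × 256 ≈ 6,400 examples; the fractional epoch values (1.99, 2.99, …) are a rounding artifact of gradient accumulation. "No log" most likely means the training-loss logging interval (500 steps by default) exceeded the 250 total steps of the run.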

### Framework versions

- Transformers 4.39.3
- Pytorch 2.2.1+cu121
- Datasets 2.18.0
- Tokenizers 0.15.2