
mentalcaption-gen

This model is a fine-tuned version of gpt2 on an unspecified dataset. It achieves the following results on the evaluation set (a minimal loading sketch follows below):

  • Loss: 7.2071
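
Since the card does not state the model's hub path, the following is a minimal sketch of loading it for generation with the transformers pipeline; the repository id is a placeholder assumption:

```python
from transformers import pipeline

# Hypothetical repository id; replace with this model's actual hub path.
generator = pipeline("text-generation", model="your-username/mentalcaption-gen")

# Sample a short caption-style continuation from a prompt.
output = generator("A photo of", max_new_tokens=30, do_sample=True, top_p=0.9)
print(output[0]["generated_text"])
```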

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (see the sketch after this list):

  • learning_rate: 2e-05
  • train_batch_size: 32
  • eval_batch_size: 32
  • seed: 42
  • gradient_accumulation_steps: 8
  • total_train_batch_size: 256
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_steps: 1000
  • num_epochs: 20
  • mixed_precision_training: Native AMP
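
For reference, a sketch of how the listed values map onto transformers.TrainingArguments; the output_dir is a placeholder, and "Native AMP" is expressed here as fp16=True (an assumption, since bf16 is also possible):

```python
from transformers import TrainingArguments

# Sketch only: reconstructs the hyperparameters listed above.
training_args = TrainingArguments(
    output_dir="mentalcaption-gen",   # placeholder
    learning_rate=2e-5,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=32,
    seed=42,
    gradient_accumulation_steps=8,    # effective batch size 32 * 8 = 256
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_steps=1000,
    num_train_epochs=20,
    fp16=True,                        # Native AMP mixed-precision training
)
```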

Training results

| Training Loss | Epoch   | Step | Validation Loss |
|---------------|---------|------|-----------------|
| No log        | 0.9915  | 102  | 15.0712         |
| No log        | 1.9927  | 205  | 13.1284         |
| No log        | 2.9939  | 308  | 11.8544         |
| No log        | 3.9951  | 411  | 10.6004         |
| No log        | 4.9964  | 514  | 9.4891          |
| No log        | 5.9976  | 617  | 8.7946          |
| No log        | 6.9988  | 720  | 8.4227          |
| No log        | 8.0     | 823  | 8.1604          |
| No log        | 8.9915  | 925  | 7.9719          |
| No log        | 9.9927  | 1028 | 7.8098          |
| No log        | 10.9939 | 1131 | 7.6744          |
| No log        | 11.9951 | 1234 | 7.5685          |
| No log        | 12.9964 | 1337 | 7.4601          |
| No log        | 13.9976 | 1440 | 7.3741          |
| No log        | 14.9988 | 1543 | 7.3002          |
| No log        | 16.0    | 1646 | 7.2581          |
| No log        | 16.9915 | 1748 | 7.2293          |
| No log        | 17.9927 | 1851 | 7.2151          |
| No log        | 18.9939 | 1954 | 7.2077          |
| No log        | 19.8299 | 2040 | 7.2071          |
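
Because the Trainer reports mean cross-entropy loss in nats, validation loss converts to perplexity via exp(loss); a quick check of the final value:

```python
import math

# Perplexity is exp(cross-entropy); 7.2071 is the final validation loss above.
final_loss = 7.2071
perplexity = math.exp(final_loss)
print(f"perplexity ≈ {perplexity:.0f}")  # ≈ 1349
```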

Framework versions

  • Transformers 4.41.0
  • Pytorch 2.3.0+cu121
  • Datasets 2.19.1
  • Tokenizers 0.19.1
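
A quick way to check that a local environment matches these versions (a sketch; newer versions will usually still load the checkpoint):

```python
import transformers, torch, datasets, tokenizers

# Compare installed versions against those listed on this card.
for name, mod, expected in [
    ("Transformers", transformers, "4.41.0"),
    ("PyTorch", torch, "2.3.0+cu121"),
    ("Datasets", datasets, "2.19.1"),
    ("Tokenizers", tokenizers, "0.19.1"),
]:
    print(f"{name}: installed {mod.__version__}, card lists {expected}")
```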