
xsum-gpt2-domain-bart

This model is a fine-tuned version of gpt2 on an unknown dataset. It achieves the following results on the evaluation set (a hedged sketch of how such metrics are typically computed follows the list):

  • Loss: 2.9826
  • Perplexity (Ppl): 1278.8025
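
The card does not state how the Ppl value is derived; note that it is not simply exp(Loss) (exp(2.9826) ≈ 19.7), so a different normalization, for example word-level rather than sub-token-level perplexity, was presumably used. For reference, the snippet below is a minimal sketch of the standard token-level loss and perplexity computation with Transformers, assuming the checkpoint loads as an ordinary causal LM under the repo id given on this card.

```python
# Minimal sketch: token-level evaluation loss and perplexity for a causal LM.
# The exact evaluation protocol and dataset for this model are undocumented,
# so treat this purely as an illustration.
import math

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "jw00oo/xsum-gpt2-domain-bart"  # repo id taken from this card

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)
model.eval()

text = "Example evaluation text goes here."
inputs = tokenizer(text, return_tensors="pt")

with torch.no_grad():
    # Passing the input ids as labels makes the model return the mean
    # cross-entropy over the predicted tokens (shifting is handled internally).
    outputs = model(**inputs, labels=inputs["input_ids"])

loss = outputs.loss.item()
print(f"loss: {loss:.4f}, token-level perplexity: {math.exp(loss):.4f}")
```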

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a hedged TrainingArguments sketch follows the list):

  • learning_rate: 0.0005
  • train_batch_size: 2
  • eval_batch_size: 2
  • seed: 22554
  • distributed_type: multi-GPU
  • num_devices: 2
  • gradient_accumulation_steps: 8
  • total_train_batch_size: 32
  • total_eval_batch_size: 4
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_steps: 2000
  • num_epochs: 15
  • mixed_precision_training: Native AMP
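
The training script itself is not published. The sketch below merely restates the hyperparameters above as Transformers TrainingArguments; field names follow the standard Trainer API, and the output directory and launch command are assumptions. The per-device batch size of 2, 2 GPUs, and 8 gradient-accumulation steps account for the total train batch size of 32 (and the total eval batch size of 4).

```python
# Hedged reconstruction of the listed hyperparameters as TrainingArguments.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="xsum-gpt2-domain-bart",  # assumed output directory
    learning_rate=5e-4,
    per_device_train_batch_size=2,   # 2 per device x 2 GPUs x 8 accumulation = 32 total
    per_device_eval_batch_size=2,    # 2 per device x 2 GPUs = 4 total
    gradient_accumulation_steps=8,
    num_train_epochs=15,
    lr_scheduler_type="cosine",
    warmup_steps=2000,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    seed=22554,
    fp16=True,                       # Native AMP mixed precision
)

# Multi-GPU training on 2 devices would be launched with something like:
#   torchrun --nproc_per_node=2 train.py
```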

Training results

| Training Loss | Epoch   | Step  | Validation Loss | Ppl       |
|---------------|---------|-------|-----------------|-----------|
| 3.7059        | 1.2207  | 4000  | 3.5548          | 4819.9339 |
| 3.3851        | 2.4414  | 8000  | 3.3082          | 2703.9901 |
| 3.263         | 3.6620  | 12000 | 3.2025          | 2115.4674 |
| 3.1917        | 4.8827  | 16000 | 3.1383          | 1821.0007 |
| 3.1184        | 6.1034  | 20000 | 3.0980          | 1660.1856 |
| 3.0568        | 7.3241  | 24000 | 3.0679          | 1552.5692 |
| 3.0286        | 8.5447  | 28000 | 3.0380          | 1447.4888 |
| 3.0013        | 9.7654  | 32000 | 3.0155          | 1375.3106 |
| 2.9727        | 10.9861 | 36000 | 2.9992          | 1326.5168 |
| 2.9351        | 12.2068 | 40000 | 2.9897          | 1298.5084 |
| 2.9263        | 13.4274 | 44000 | 2.9843          | 1283.6903 |
| 2.9208        | 14.6481 | 48000 | 2.9826          | 1278.8025 |

Framework versions

  • Transformers 4.41.1
  • Pytorch 1.13.1+cu117
  • Datasets 2.19.1
  • Tokenizers 0.19.1