
TGL-3

This model is a fine-tuned version of t5-small, trained on an abstract-summary dataset of 23,000 examples collected from openreview.net. It achieves the following results on the evaluation set:

  • Loss: 2.4435
  • Rouge1: 36.4998
  • Rouge2: 17.8322
  • Rougel: 31.8632
  • Rougelsum: 31.8341
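To make the Rouge1 figure above easier to interpret, here is a minimal sketch of a ROUGE-1 F-measure computed from unigram overlap. It is a simplified illustration, not the evaluation code behind this card: it assumes whitespace tokenization and lowercasing, and the function name `rouge1_f` and the example strings are made up for this sketch. The scores in the card appear to be on a 0-100 scale, so the result is multiplied by 100 when printed.

```python
from collections import Counter


def rouge1_f(reference: str, candidate: str) -> float:
    """Simplified ROUGE-1 F-measure: unigram overlap between two texts.

    Assumption: plain whitespace tokenization and lowercasing, with no
    stemming. Library implementations may tokenize differently.
    """
    ref_counts = Counter(reference.lower().split())
    cand_counts = Counter(candidate.lower().split())

    # Clipped overlap: each unigram counts at most as often as it
    # appears in both texts.
    overlap = sum((ref_counts & cand_counts).values())
    if overlap == 0:
        return 0.0

    precision = overlap / sum(cand_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)


# Hypothetical reference summary and model output.
reference = "the model summarizes paper abstracts"
candidate = "the model summarizes abstracts"
print(round(100 * rouge1_f(reference, candidate), 4))
```

Rouge2 follows the same idea with bigrams instead of unigrams, and Rougel is based on the longest common subsequence rather than n-gram counts.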

Model description

The base model, T5, is described in the paper at https://arxiv.org/abs/1910.10683
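A minimal sketch of how a checkpoint like this could be loaded for summarization with the Transformers `pipeline` API. The Hub ID `your-namespace/TGL-3` is a placeholder, since the card does not state the full repository name, and `summarize` and its `max_length` default are illustrative choices rather than settings taken from the card.

```python
# Placeholder: replace with the actual Hub repository ID of this model.
MODEL_ID = "your-namespace/TGL-3"


def summarize(abstract: str, model_id: str = MODEL_ID, max_length: int = 64) -> str:
    """Return a generated summary for one abstract.

    Assumptions: the checkpoint is reachable under `model_id`, and
    `max_length` (in generated tokens) is an illustrative value.
    """
    # Imported lazily so the module can be loaded without transformers.
    from transformers import pipeline

    summarizer = pipeline("summarization", model=model_id)
    # truncation=True cuts inputs that exceed the model's maximum length.
    result = summarizer(abstract, max_length=max_length, truncation=True)
    return result[0]["summary_text"]
```

Calling `summarize("...")` with the text of an abstract downloads the checkpoint on first use. For repeated calls it would be better to build the pipeline once and reuse it.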

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5.6e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 8
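The values above can be sketched as arguments to `Seq2SeqTrainingArguments`, together with the learning rate that a linear schedule would produce at a given step. This is an approximation under stated assumptions, not the original training script: the card's batch sizes are mapped to per-device batch sizes (which assumes a single device), no warmup is assumed because none is listed, and the total of 9920 steps is taken from the last row of the training results. The names `TRAINING_KWARGS`, `build_training_args`, `linear_lr`, and the output directory are hypothetical.

```python
# Hyperparameters from the card, keyed by Transformers argument names.
TRAINING_KWARGS = {
    "learning_rate": 5.6e-05,
    "per_device_train_batch_size": 8,  # assumes a single device
    "per_device_eval_batch_size": 8,   # assumes a single device
    "seed": 42,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-08,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 8,
}

TOTAL_STEPS = 9920  # final step in the training results


def build_training_args(output_dir: str = "tgl-3-run"):
    """Build training arguments; output_dir is a hypothetical path."""
    # Imported lazily so the rest of the sketch runs without transformers.
    from transformers import Seq2SeqTrainingArguments

    return Seq2SeqTrainingArguments(output_dir=output_dir, **TRAINING_KWARGS)


def linear_lr(step: int,
              base_lr: float = TRAINING_KWARGS["learning_rate"],
              total_steps: int = TOTAL_STEPS) -> float:
    """Learning rate under linear decay to zero, assuming no warmup."""
    remaining = max(0, total_steps - step)
    return base_lr * remaining / total_steps


# Learning rate at the end of epochs 0, 4, and 8 (steps 0, 4960, 9920).
for step in (0, 4960, 9920):
    print(step, linear_lr(step))
```

Under these assumptions the learning rate is halved by the end of epoch 4 and reaches zero at the final step.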

Training results

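| Training Loss | Epoch | Step | Validation Loss | Rouge1  | Rouge2  | Rougel  | Rougelsum |
|---------------|-------|------|-----------------|---------|---------|---------|-----------|
| 2.9096        | 1.0   | 1240 | 2.5721          | 36.234  | 17.8214 | 31.5514 | 31.5549   |
| 2.7259        | 2.0   | 2480 | 2.5258          | 36.2572 | 17.9912 | 31.6249 | 31.6441   |
| 2.6434        | 3.0   | 3720 | 2.4957          | 36.4623 | 17.9657 | 31.7693 | 31.7542   |
| 2.5896        | 4.0   | 4960 | 2.4663          | 36.3692 | 17.8372 | 31.5909 | 31.6089   |
| 2.5491        | 5.0   | 6200 | 2.4511          | 36.4775 | 17.8094 | 31.8102 | 31.8003   |
| 2.5183        | 6.0   | 7440 | 2.4440          | 36.5892 | 17.906  | 31.9058 | 31.8985   |
| 2.4997        | 7.0   | 8680 | 2.4438          | 36.3747 | 17.8309 | 31.7314 | 31.7178   |
| 2.4863        | 8.0   | 9920 | 2.4435          | 36.4998 | 17.8322 | 31.8632 | 31.8341   |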
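The per-epoch results can be compared programmatically, for example to see which epoch is best under each metric. The sketch below copies the validation loss and Rouge1 columns from the training results; the variable names are illustrative.

```python
# (epoch, validation_loss, rouge1) copied from the training results.
results = [
    (1, 2.5721, 36.234),
    (2, 2.5258, 36.2572),
    (3, 2.4957, 36.4623),
    (4, 2.4663, 36.3692),
    (5, 2.4511, 36.4775),
    (6, 2.4440, 36.5892),
    (7, 2.4438, 36.3747),
    (8, 2.4435, 36.4998),
]

# Epoch with the lowest validation loss.
best_loss_epoch = min(results, key=lambda row: row[1])[0]

# Epoch with the highest Rouge1 score.
best_rouge1_epoch = max(results, key=lambda row: row[2])[0]

print("lowest validation loss at epoch", best_loss_epoch)
print("highest Rouge1 at epoch", best_rouge1_epoch)
```

The lowest validation loss is reached at epoch 8, which matches the reported evaluation results, while the highest Rouge1 is at epoch 6. The differences between the later epochs are small.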

Framework versions

  • Transformers 4.21.1
  • PyTorch 1.12.0+cu113
  • Datasets 2.4.0
  • Tokenizers 0.12.1