summarization_all

This model is a fine-tuned version of KETI-AIR/long-ke-t5-base on the jsonl_dataset_sum.py dataset. It achieves the following results on the evaluation set after the final (10th) epoch:

  • Loss: 1.1442
  • Rouge1: 21.9857
  • Rouge2: 10.2876
  • Rougel: 21.4026
  • Rougelsum: 21.4278
  • Gen Len: 86.2560
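
For readers who want to try the checkpoint, a minimal loading sketch follows. It is an illustration under stated assumptions rather than the authors' documented usage: the repository ID is the one shown on this model page, the helper name summarize and the max_new_tokens=128 limit are hypothetical choices, and the input is assumed to need no task prefix, which the card does not confirm.

```python
# Repository ID as shown on this model page.
MODEL_ID = "KETI-AIR-Downstream/long-ke-t5-base-summarization_e10"


def summarize(text, max_new_tokens=128, model_id=MODEL_ID):
    """Generate a summary for `text` (hypothetical helper, not part of the card)."""
    # Imported lazily so this module can be loaded without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

    # Assumption: the raw document is passed as-is, with no task prefix.
    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    # Assumption: 128 new tokens is enough; the card reports an average
    # generated length of about 86 tokens on the evaluation set.
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


# Example call (downloads the checkpoint on first use):
#     summary = summarize(document_text)
```

The first call downloads the weights, so it needs network access and PyTorch alongside Transformers.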

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.001
  • train_batch_size: 1
  • eval_batch_size: 1
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 8
  • total_train_batch_size: 8
  • total_eval_batch_size: 8
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 10.0
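
The relationship between these values can be checked with a little arithmetic. The sketch below assumes no gradient accumulation and no warmup, since the card lists neither, and takes the per-epoch step count (184670) from the training results table. The function linear_lr is a hypothetical re-implementation of a linear decay schedule, not a call into the Transformers library.

```python
# Values copied from the hyperparameter list and the training results table.
LEARNING_RATE = 0.001
TRAIN_BATCH_SIZE = 1       # per device
NUM_DEVICES = 8
NUM_EPOCHS = 10
STEPS_PER_EPOCH = 184670   # step count logged at epoch 1.0
TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS

# Assumption: no gradient accumulation (the card lists none).
total_train_batch_size = TRAIN_BATCH_SIZE * NUM_DEVICES

# Rough number of training examples seen per epoch; the true dataset size
# may differ slightly if the last batch was padded or dropped.
examples_per_epoch = STEPS_PER_EPOCH * total_train_batch_size


def linear_lr(step, base_lr=LEARNING_RATE, total_steps=TOTAL_STEPS, warmup_steps=0):
    """Linear warmup followed by linear decay to zero (hypothetical helper)."""
    if step < warmup_steps:
        return base_lr * (step / max(1, warmup_steps))
    remaining = max(0, total_steps - step)
    return base_lr * (remaining / max(1, total_steps - warmup_steps))


print(total_train_batch_size, TOTAL_STEPS, examples_per_epoch)
print(linear_lr(0), linear_lr(TOTAL_STEPS // 2), linear_lr(TOTAL_STEPS))
```

Under these assumptions the learning rate starts at 0.001, falls to about half that at the end of epoch 5, and reaches zero at the final step.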

Training results

Training Loss  Epoch  Step     Validation Loss  Rouge1   Rouge2   Rougel   Rougelsum  Gen Len
1.2503         1.0    184670   1.2439           20.2525  9.1467   19.7454  19.771     87.1766
1.1629         2.0    369340   1.1773           21.0068  9.6691   20.4565  20.4888    89.6074
1.1087         3.0    554010   1.1431           21.0216  9.6545   20.489   20.5108    85.5895
1.056          4.0    738680   1.1247           21.6776  10.1424  21.09    21.1168    89.6576
1.0199         5.0    923350   1.1179           21.6563  10.0965  21.0814  21.1056    89.2454
0.9652         6.0    1108020  1.1122           21.6209  10.0725  21.0623  21.0864    86.7079
0.92           7.0    1292690  1.1136           21.9396  10.2734  21.3465  21.3745    86.5547
0.8804         8.0    1477360  1.1228           21.8457  10.1858  21.2552  21.278     87.6413
0.8447         9.0    1662030  1.1327           21.92    10.2635  21.3415  21.3633    86.4453
0.7678         10.0   1846700  1.1442           21.9857  10.2876  21.4026  21.4278    86.2560
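
The per-epoch rows can be compared programmatically, for example to see which epoch scores best under each criterion. The sketch below copies the epoch, validation loss, and Rouge1 columns from the table above; the variable names are illustrative.

```python
# (epoch, validation_loss, rouge1) copied from the training results table.
RESULTS = [
    (1, 1.2439, 20.2525),
    (2, 1.1773, 21.0068),
    (3, 1.1431, 21.0216),
    (4, 1.1247, 21.6776),
    (5, 1.1179, 21.6563),
    (6, 1.1122, 21.6209),
    (7, 1.1136, 21.9396),
    (8, 1.1228, 21.8457),
    (9, 1.1327, 21.9200),
    (10, 1.1442, 21.9857),
]

# Lower validation loss is better; higher Rouge1 is better.
best_by_loss = min(RESULTS, key=lambda row: row[1])
best_by_rouge1 = max(RESULTS, key=lambda row: row[2])

print(f"Lowest validation loss: epoch {best_by_loss[0]} ({best_by_loss[1]})")
print(f"Highest Rouge1: epoch {best_by_rouge1[0]} ({best_by_rouge1[2]})")
```

On these numbers, validation loss is lowest at epoch 6 (1.1122) and rises afterwards, while Rouge1 peaks at the final epoch (21.9857), which is the checkpoint whose results are reported at the top of this card.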

Framework versions

  • Transformers 4.25.1
  • Pytorch 1.12.0
  • Datasets 2.8.0
  • Tokenizers 0.13.2

Model size: 297M params (Safetensors, tensor type F32)

Model ID: KETI-AIR-Downstream/long-ke-t5-base-summarization_e10
