metadata
license:
  - bsd-3-clause
  - apache-2.0
tags:
  - generated_from_trainer
datasets:
  - pszemraj/scientific_lay_summarisation-elife-norm
metrics:
  - rouge
model-index:
  - name: >-
      long-t5-tglobal-xl-16384-book-summary-scientific_lay_summarisation-elife-norm-16384-summ-v1
    results:
      - task:
          name: Summarization
          type: summarization
        dataset:
          name: pszemraj/scientific_lay_summarisation-elife-norm
          type: pszemraj/scientific_lay_summarisation-elife-norm
          split: validation
        metrics:
          - name: Rouge1
            type: rouge
            value: 47.4591
language:
  - en
library_name: transformers
inference: false

long-t5-tglobal-xl-16384-booksci-summary-v1

This model is a fine-tuned version of pszemraj/long-t5-tglobal-xl-16384-book-summary on the pszemraj/scientific_lay_summarisation-elife-norm dataset. It achieves the following results on the evaluation set:

  • Loss: 1.7518
  • Rouge1: 47.4591
  • Rouge2: 12.7287
  • Rougel: 21.5549
  • Rougelsum: 44.8709
  • Gen Len: 384.39

Model description

This is an experiment in further fine-tuning a BookSum summarization model on a different dataset. Compare its output with either the starting checkpoint (linked above) or the variant fine-tuned only on the scientific lay summaries.

Intended uses & limitations

More information needed

Training and evaluation data

The pszemraj/scientific_lay_summarisation-elife-norm dataset. Inputs are truncated to 16384 tokens and outputs (target summaries) are truncated to 1024 tokens.
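
The truncation described above can be sketched in plain Python. This is a minimal illustration, not the preprocessing code actually used for this model: `truncate_example` is a hypothetical helper, the token ids are stand-ins, and a real tokenizer would also manage special tokens such as end-of-sequence, which this sketch ignores.

```python
from typing import Dict, List

MAX_SOURCE_TOKENS = 16384  # input limit stated in this card
MAX_TARGET_TOKENS = 1024   # output limit stated in this card


def truncate_example(source_ids: List[int], target_ids: List[int]) -> Dict[str, List[int]]:
    """Keep only the leading tokens that fit each limit (hypothetical helper)."""
    return {
        "input_ids": source_ids[:MAX_SOURCE_TOKENS],
        "labels": target_ids[:MAX_TARGET_TOKENS],
    }


# Stand-in token ids; a real run would get these from the model's tokenizer.
example = truncate_example(list(range(20000)), list(range(1500)))
print(len(example["input_ids"]), len(example["labels"]))
```

Anything past the limits is simply dropped, so articles longer than 16384 tokens are summarized from their opening portion only.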

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 3e-05
  • train_batch_size: 1
  • eval_batch_size: 1
  • seed: 878
  • gradient_accumulation_steps: 8
  • total_train_batch_size: 8
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.02
  • num_epochs: 2.0
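
As a rough sketch of how these values interact, the snippet below derives the effective batch size and a cosine learning-rate curve with linear warmup. It assumes a single device, takes the total of 1086 optimizer steps from the training results in this card, and assumes the warmup step count is rounded up; `lr_at` is a hypothetical helper written for illustration, not the scheduler code used in training.

```python
import math

LEARNING_RATE = 3e-05
TRAIN_BATCH_SIZE = 1
GRAD_ACCUM_STEPS = 8
WARMUP_RATIO = 0.02
TOTAL_STEPS = 1086  # final optimizer step reported in the training results

# Effective batch size per optimizer step (assumption: one device).
total_train_batch_size = TRAIN_BATCH_SIZE * GRAD_ACCUM_STEPS

# Assumption: the warmup length is the ratio times total steps, rounded up.
warmup_steps = math.ceil(WARMUP_RATIO * TOTAL_STEPS)


def lr_at(step: int) -> float:
    """Learning rate at an optimizer step: linear warmup, then cosine decay to zero."""
    if step < warmup_steps:
        return LEARNING_RATE * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, TOTAL_STEPS - warmup_steps)
    return LEARNING_RATE * 0.5 * (1.0 + math.cos(math.pi * progress))


print(total_train_batch_size, warmup_steps)
for step in (0, warmup_steps, TOTAL_STEPS // 2, TOTAL_STEPS):
    print(step, lr_at(step))
```

Under these assumptions the learning rate peaks at 3e-05 once warmup ends and decays to zero by the final step.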

Training results

Training Loss  Epoch  Step  Validation Loss  Rouge1   Rouge2   Rougel   Rougelsum  Gen Len
1.9629         1.0    543   1.7637           46.6926  12.4769  21.4364  44.4329    381.23
1.8555         2.0    1086  1.7518           47.4591  12.7287  21.5549  44.8709    384.39
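
To make the epoch-to-epoch change easier to read, the short sketch below computes the differences between the two rows above. The numbers are copied from the table; the estimate of examples per epoch assumes each optimizer step consumed a full effective batch of 8, so it is approximate.

```python
# Rows copied from the training results table in this card.
epoch_1 = {"step": 543, "val_loss": 1.7637, "rouge1": 46.6926, "rouge2": 12.4769,
           "rougeL": 21.4364, "rougeLsum": 44.4329, "gen_len": 381.23}
epoch_2 = {"step": 1086, "val_loss": 1.7518, "rouge1": 47.4591, "rouge2": 12.7287,
           "rougeL": 21.5549, "rougeLsum": 44.8709, "gen_len": 384.39}

# Change from epoch 1 to epoch 2 for every reported metric.
deltas = {
    key: round(epoch_2[key] - epoch_1[key], 4)
    for key in epoch_1
    if key != "step"
}

# Assumption: every optimizer step used a full effective batch of 8 examples.
steps_per_epoch = epoch_2["step"] - epoch_1["step"]
approx_examples_per_epoch = steps_per_epoch * 8

for key, value in deltas.items():
    print(f"{key}: {value:+.4f}")
print("approx. examples per epoch:", approx_examples_per_epoch)
```

Validation loss falls slightly and every ROUGE score rises slightly in the second epoch, so the gains from the extra epoch are modest.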