long-t5-tglobal-base-sci-simplify
Exploring how well long-document models trained on "lay summaries" of scientific papers generalize.
A lay summary is a summary of a research paper or scientific study that is written in plain language, without the use of technical jargon, and is designed to be easily understood by non-experts.
Model description
This model is a fine-tuned version of google/long-t5-tglobal-base on the pszemraj/scientific_lay_summarisation-plos-norm
dataset for two epochs.
- The variant trained on the ELIFE subset can be found here
Usage
Beam search decoding is recommended for this model. The textsum utility package abstracts most of the
setup for you:
Install with pip:
pip install -U textsum
Use in Python:
from textsum.summarize import Summarizer
summarizer = Summarizer('pszemraj/long-t5-tglobal-base-sci-simplify')
text = "put the text you don't want to read here"
summary = summarizer.summarize_string(text)
print(summary)
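If you would rather call transformers directly, the beam search recommendation above can be sketched roughly as follows. The specific values (`num_beams=4`, `no_repeat_ngram_size=3`, `max_new_tokens=512`, and the 16384-token input cap) are illustrative assumptions, not settings confirmed by this card; `beam_search_kwargs` and `summarize` are hypothetical helper names.

```python
MODEL_NAME = "pszemraj/long-t5-tglobal-base-sci-simplify"


def beam_search_kwargs(num_beams=4, max_new_tokens=512):
    """Build generation settings for beam search (values are assumptions)."""
    return {
        "num_beams": num_beams,       # >1 switches generate() to beam search
        "do_sample": False,           # deterministic decoding, no sampling
        "early_stopping": True,       # stop once enough finished beams exist
        "no_repeat_ngram_size": 3,    # assumed value to limit repetition
        "max_new_tokens": max_new_tokens,
    }


def summarize(text, model_name=MODEL_NAME):
    """Summarize one string; downloads the model on first use."""
    # Imported lazily so the settings helper works without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_name)
    # Assumed input cap; long inputs beyond it are truncated.
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=16384)
    output_ids = model.generate(**inputs, **beam_search_kwargs())
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


if __name__ == "__main__":
    print(summarize("put the text you don't want to read here"))
```

The evaluation summaries reported below average about 400 generated tokens, so a 512-token budget should usually be enough; raise `max_new_tokens` if outputs look cut off.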
Intended uses & limitations
- The model's ability to generalize outside the dataset domain (PubMed/bioscience-type papers) still needs to be evaluated.
Training procedure
Eval results
The model achieves the following results on the evaluation set:
- Loss: 1.6778
- Rouge1: 49.1475
- Rouge2: 18.9281
- Rougel: 26.9893
- Rougelsum: 45.0973
- Gen Len: 399.4125
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.0004
- train_batch_size: 4
- eval_batch_size: 2
- seed: 42
- distributed_type: multi-GPU
- gradient_accumulation_steps: 16
- total_train_batch_size: 64
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.01
- num_epochs: 2.0
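To make the scheduler settings above concrete, here is a minimal sketch of a cosine learning-rate schedule with linear warmup, using the listed peak rate (0.0004) and warmup ratio (0.01). The total number of optimizer steps is not stated in this card; the value 775 below is a rough estimate from the training results table (step 600 at epoch 1.55) and should be treated as an assumption. `cosine_lr` is a hypothetical helper written for illustration, not the training script's code.

```python
import math


def cosine_lr(step, total_steps, peak_lr=4e-4, warmup_ratio=0.01):
    """Learning rate at `step`: linear warmup, then cosine decay to zero."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Linear ramp from 0 up to the peak learning rate.
        return peak_lr * step / max(1, warmup_steps)
    # Fraction of the post-warmup phase completed, in [0, 1].
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    total_steps = 775  # assumption: estimated from the results table
    for step in (0, 4, 8, 200, 400, 600, total_steps):
        print(step, f"{cosine_lr(step, total_steps):.6f}")
```

With a warmup ratio this small, the rate reaches its peak within the first few steps and then decays for nearly the whole run.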
Training results
Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len |
---|---|---|---|---|---|---|---|---|
1.966 | 0.52 | 200 | 1.7171 | 48.6521 | 18.427 | 26.7726 | 44.3947 | 376.335 |
1.877 | 1.03 | 400 | 1.6909 | 49.3263 | 18.7945 | 27.0741 | 45.1737 | 382.205 |
1.9007 | 1.55 | 600 | 1.6778 | 49.1475 | 18.9281 | 26.9893 | 45.0973 | 399.4125 |