
led-large-book-summary: continued

Fine-tuned further to explore whether additional training improves on the default checkpoint.

Details

This model is a version of pszemraj/led-large-book-summary further fine-tuned for two epochs.

Usage

Beam search decoding is recommended for this model. If you prefer, the textsum utility package abstracts most of the setup away:

pip install -U textsum

from textsum.summarize import Summarizer

model_name = "pszemraj/led-large-book-summary-continued"
summarizer = Summarizer(model_name) # GPU auto-detected
text = "put the text you don't want to read here"
summary = summarizer.summarize_string(text)
print(summary)
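
For readers who prefer to call the model directly with beam search, the following is a minimal sketch rather than the author's exact setup. It assumes a transformers release that still ships the "summarization" pipeline. The decoding values in BEAM_SEARCH_KWARGS and the helper names load_summarizer and summarize are illustrative assumptions, not settings taken from this card.

```python
# Illustrative beam-search settings (assumptions, not values from this card).
BEAM_SEARCH_KWARGS = {
    "num_beams": 4,             # beams kept at each decoding step
    "no_repeat_ngram_size": 3,  # block repeated trigrams in the output
    "early_stopping": True,     # stop once enough finished beams exist
    "max_length": 256,          # upper bound on generated tokens
    "min_length": 8,            # lower bound on generated tokens
}


def load_summarizer(model_name="pszemraj/led-large-book-summary-continued"):
    """Build a summarization pipeline (hypothetical helper).

    The import is done lazily because loading downloads the model weights.
    """
    from transformers import pipeline

    return pipeline("summarization", model=model_name)


def summarize(text, summarizer, **overrides):
    """Summarize `text` with beam search; keyword overrides replace the defaults."""
    kwargs = {**BEAM_SEARCH_KWARGS, **overrides}
    result = summarizer(text, **kwargs)
    return result[0]["summary_text"]
```

A call would then look like `summarize(long_text, load_summarizer())`, and a single setting can be changed per call, for example `summarize(long_text, summarizer, num_beams=2)`.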

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 3e-05
train_batch_size: 4
eval_batch_size: 2
seed: 8191
gradient_accumulation_steps: 16
total_train_batch_size: 64
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: cosine
lr_scheduler_warmup_ratio: 0.01
num_epochs: 2.0
mixed_precision_training: Native AMP
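
Two of the values above can be checked with a little arithmetic. The sketch below shows how a per-device batch of 4 with 16 accumulation steps gives the listed total of 64 (which is consistent with a single device, an assumption here), and how a cosine schedule with a 0.01 warmup ratio shapes the learning rate. The function names and the exact rounding of the warmup step count are assumptions; the training library's own scheduler may differ in small details.

```python
import math


def effective_batch_size(per_device, grad_accum_steps, num_devices=1):
    """Number of samples that contribute to each optimizer update."""
    return per_device * grad_accum_steps * num_devices


def cosine_lr(step, total_steps, base_lr=3e-5, warmup_ratio=0.01):
    """Linear warmup followed by cosine decay to zero (simplified sketch)."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Ramp linearly from 0 up to base_lr during warmup.
        return base_lr * step / max(1, warmup_steps)
    # Fraction of the post-warmup schedule completed so far, in [0, 1].
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


print(effective_batch_size(4, 16))  # 4 * 16 * 1
for s in (0, 1, 50, 100):
    print(s, cosine_lr(s, total_steps=100))
```

With such a short warmup, the learning rate reaches its 3e-05 peak almost immediately and spends nearly the whole run on the cosine decay.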


Model size: 460M params (Safetensors, tensor type F32)



Evaluation results

Results on the kmfoda/booksum test set (verified):

ROUGE-1: 31.237
ROUGE-2: 5.015
ROUGE-L: 15.772
ROUGE-LSUM: 28.494
loss: 4.777
gen_len: 154.191
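
As a rough guide to reading the ROUGE figures above, ROUGE-1 measures unigram overlap between a generated summary and a reference. The sketch below is a deliberately simplified version (whitespace tokenization, no stemming) with a hypothetical function name. It will not reproduce the verified scores, which are assumed to come from a standard ROUGE implementation and to be reported on a 0 to 100 scale.

```python
from collections import Counter


def rouge1_f1(reference, candidate):
    """Simplified ROUGE-1 F1: unigram overlap on lowercased whitespace tokens."""
    ref_counts = Counter(reference.lower().split())
    cand_counts = Counter(candidate.lower().split())
    # Clipped overlap: each token counts at most as often as it appears in both.
    overlap = sum((ref_counts & cand_counts).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)


# Multiply by 100 to compare against scores reported on a 0-100 scale.
score = 100 * rouge1_f1("the cat sat on the mat", "the cat")
print(round(score, 3))
```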
