t5-small-finetuned-summarization-xsum

This model is a fine-tuned version of t5-small on the xsum dataset. It is small and fast: it typically summarizes a text in under one second, which makes it a good fit for low-resource environments.

Model Demo:

https://huggingface.co/spaces/Rahmat82/RHM-text-summarizer-light

It achieves the following results on the evaluation set:

  • Loss: 2.2425
  • Rouge1: 31.3222
  • Rouge2: 10.0614
  • Rougel: 25.0513
  • Rougelsum: 25.0446
  • Gen Len: 18.8044
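
To make the ROUGE numbers above easier to interpret, here is a minimal pure-Python sketch of how ROUGE-1, ROUGE-2 and ROUGE-L F1 scores are defined. It is a simplification, not the scorer used to produce the figures in this card: it assumes lowercased whitespace tokenization with no stemming, and the function names are introduced here for illustration only. The scores in the card appear to be on a 0-100 scale, while this sketch returns values between 0 and 1.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def f1(overlap, pred_total, ref_total):
    """Harmonic mean of precision and recall; 0.0 when there is no overlap."""
    if overlap == 0 or pred_total == 0 or ref_total == 0:
        return 0.0
    precision = overlap / pred_total
    recall = overlap / ref_total
    return 2 * precision * recall / (precision + recall)


def rouge_n(prediction, reference, n):
    """ROUGE-N F1 based on clipped n-gram overlap."""
    pred = ngrams(prediction.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    overlap = sum((pred & ref).values())  # Counter "&" keeps the minimum counts
    return f1(overlap, sum(pred.values()), sum(ref.values()))


def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l(prediction, reference):
    """ROUGE-L F1 based on the longest common subsequence."""
    p = prediction.lower().split()
    r = reference.lower().split()
    return f1(lcs_length(p, r), len(p), len(r))
```

Multiplying a result by 100 puts it on the same scale as the values listed above, although scores from this simplified tokenizer will not match a full ROUGE implementation exactly.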

Model description

This model is lightweight and fast. On either GPU or CPU, it summarizes a typical input in under one second. Running it through optimum may make it faster still.
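
The sub-second claim is easy to check on your own hardware. The sketch below is one way to do that, not the author's benchmark: `measure_latency` and `fake_summarizer` are hypothetical names introduced here, and the stand-in callable exists only so the snippet runs without downloading the model.

```python
import statistics
import time


def measure_latency(fn, text, warmup=1, runs=5):
    """Return the median wall-clock time in seconds of fn(text) over `runs` calls."""
    for _ in range(warmup):
        fn(text)  # warm-up calls are not timed (first calls are often slower)
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        fn(text)
        timings.append(time.perf_counter() - start)
    return statistics.median(timings)


# Hypothetical stand-in; replace it with a real summarization callable.
def fake_summarizer(text):
    return text[:50]


latency = measure_latency(fake_summarizer, "some text " * 100)
print(f"median latency: {latency:.6f} s")
```

With the `summarizer` pipeline from the usage section, the call would be `measure_latency(summarizer, text_to_summarize)`. Results depend on input length, hardware and generation settings.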


Use the model:

from transformers import AutoModelForSeq2SeqLM, AutoTokenizer, pipeline

model_id = "Rahmat82/t5-small-finetuned-summarization-xsum"

# Load the fine-tuned checkpoint and its tokenizer from the Hugging Face Hub.
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id, use_fast=True)
# Wrap both in a summarization pipeline.
summarizer = pipeline("summarization", model=model, tokenizer=tokenizer)

text_to_summarize = """
The koala is regarded as the epitome of cuddliness. However, animal lovers
will be saddened to hear that this lovable marsupial has been moved to the
endangered species list. The Australian Koala Foundation estimates there are
somewhere between 43,000-100,000 koalas left in the wild. Their numbers have
been dwindling rapidly due to disease, loss of habitat, bushfires, being hit
by cars, and other threats. Stuart Blanch from the World Wildlife Fund in
Australia said: "Koalas have gone from no listing to vulnerable to endangered
within a decade. That is a shockingly fast decline." He added that koalas risk
"sliding toward extinction" 
"""


print(summarizer(text_to_summarize)[0]["summary_text"])

Use the model with optimum/onnxruntime for faster inference:

#!pip install -q transformers accelerate optimum onnxruntime onnx

from transformers import AutoTokenizer
from optimum.onnxruntime import ORTModelForSeq2SeqLM
from optimum.pipelines import pipeline
import accelerate

model_name = "Rahmat82/t5-small-finetuned-summarization-xsum"

# export=True converts the PyTorch checkpoint to ONNX when it is loaded.
model = ORTModelForSeq2SeqLM.from_pretrained(model_name, export=True)
tokenizer = AutoTokenizer.from_pretrained(model_name, use_fast=True)
summarizer = pipeline("summarization", model=model, tokenizer=tokenizer,
                      device_map="auto", batch_size=12)

text_to_summarize = """
The koala is regarded as the epitome of cuddliness. However, animal lovers
will be saddened to hear that this lovable marsupial has been moved to the
endangered species list. The Australian Koala Foundation estimates there are
somewhere between 43,000-100,000 koalas left in the wild. Their numbers have
been dwindling rapidly due to disease, loss of habitat, bushfires, being hit
by cars, and other threats. Stuart Blanch from the World Wildlife Fund in
Australia said: "Koalas have gone from no listing to vulnerable to endangered
within a decade. That is a shockingly fast decline." He added that koalas risk
"sliding toward extinction" 
"""

print(summarizer(text_to_summarize)[0]["summary_text"])

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0002
  • train_batch_size: 28
  • eval_batch_size: 28
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 2
  • mixed_precision_training: Native AMP
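
As a rough illustration of what `lr_scheduler_type: linear` means for these settings, the sketch below computes the learning rate at a given optimizer step. It assumes no warmup (none is listed above) and takes the total of 14576 steps from the training results; `linear_lr` is a name introduced here, not a library function.

```python
def linear_lr(step, base_lr=2e-4, total_steps=14576, warmup_steps=0):
    """Learning rate under a linear schedule with optional linear warmup.

    Defaults mirror the hyperparameters in this card; warmup_steps=0 is an
    assumption, since the card does not list a warmup setting.
    """
    if step < warmup_steps:
        # Ramp up linearly from 0 to base_lr during warmup.
        return base_lr * step / max(1, warmup_steps)
    # Decay linearly from base_lr to 0 over the remaining steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


for step in (0, 7288, 14576):
    print(step, linear_lr(step))
```

Under these assumptions the rate starts at 0.0002, is about half that at the end of the first epoch, and reaches zero at the final step.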

Training results

Training Loss  Epoch  Step   Validation Loss  Rouge1   Rouge2   Rougel   Rougelsum  Gen Len
2.5078         1.0    7288   2.2860           30.9087  9.7673   24.6951  24.6927    18.7973
2.4245         2.0    14576  2.2425           31.3222  10.0614  25.0513  25.0446    18.8044
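
The step counts in the table can be sanity-checked against the batch size. The arithmetic below is a sketch under stated assumptions: the XSum training split is taken to have 204,045 examples (a figure not given in this card), and training is assumed to run on a single device without gradient accumulation.

```python
import math

train_examples = 204_045  # assumption: size of the XSum training split
batch_size = 28           # train_batch_size from the hyperparameters
num_epochs = 2            # num_epochs from the hyperparameters

# The last batch of each epoch is partial, so round up.
steps_per_epoch = math.ceil(train_examples / batch_size)
total_steps = steps_per_epoch * num_epochs

print(steps_per_epoch, total_steps)
```

Under those assumptions the result agrees with the Step column: 7288 steps after the first epoch and 14576 after the second.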

Framework versions

  • Transformers 4.37.0
  • Pytorch 2.1.2
  • Datasets 2.1.0
  • Tokenizers 0.15.1

Model size

  • Parameters: 60.5M
  • Tensor type: F32
  • Format: Safetensors