metadata
license: apache-2.0
tags:
  - summarization
  - generated_from_trainer
metrics:
  - rouge
model-index:
  - name: etsummerizer_v2
    results: []
datasets:
  - EasyTerms/Manuel_dataset
language:
  - en
library_name: transformers
pipeline_tag: summarization

etsummerizer_v2

This model is a fine-tuned version of sshleifer/distilbart-cnn-12-6 on EasyTerms/Manuel_dataset. It achieves the following results on the evaluation set:

  • Loss: 0.3484
  • Rouge1: 0.5448
  • Rouge2: 0.3092
  • Rougel: 0.4363
  • Rougelsum: 0.4370
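To make the ROUGE figures above easier to interpret, here is a minimal sketch of a ROUGE-N F1 score computed from n-gram overlap. It is a simplification for illustration only: it assumes lowercase whitespace tokenization with no stemming, so it will not reproduce the exact numbers reported by the metric implementation used during training. The function names `ngrams` and `rouge_n_f1` and the example sentences are hypothetical.

```python
from collections import Counter


def ngrams(text: str, n: int) -> Counter:
    """Count lowercase, whitespace-split n-grams in `text`."""
    tokens = text.lower().split()
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_f1(reference: str, candidate: str, n: int = 1) -> float:
    """Simplified ROUGE-N: F1 of clipped n-gram overlap (no stemming)."""
    ref, cand = ngrams(reference, n), ngrams(candidate, n)
    # Counter intersection keeps the minimum count of each shared n-gram.
    overlap = sum((ref & cand).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


# Illustrative sentences (not taken from the dataset).
reference = "you may cancel at any time"
candidate = "you can cancel any time"
print(round(rouge_n_f1(reference, candidate, n=1), 4))  # 0.7273
print(round(rouge_n_f1(reference, candidate, n=2), 4))  # 0.2222
```

Read this way, a Rouge1 of about 0.54 means that roughly half of the unigrams are shared between generated and reference summaries once precision and recall are balanced, while the lower Rouge2 reflects how much harder it is to match consecutive word pairs.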

Model description

This model was fine-tuned on legal text extracted from various terms and conditions documents. Its objective is to summarize such text efficiently and present the result in a simplified form, free of legal jargon.

Intended uses & limitations

As the second version of this model, it summarizes legal text effectively; however, further training is required to improve its performance on the simplification task.
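A minimal usage sketch with the Transformers `pipeline` API follows. Several details are assumptions rather than facts from this card: the repository id `EasyTerms/etsummerizer_v2` is inferred from the dataset namespace, the 400-word chunk size is an arbitrary choice meant to keep inputs within the base model's input limit, and the helper names `split_into_chunks` and `summarize` are hypothetical.

```python
from typing import List

# Assumed repository id; the card does not state the full hub path.
MODEL_ID = "EasyTerms/etsummerizer_v2"


def split_into_chunks(text: str, max_words: int = 400) -> List[str]:
    """Split long text into consecutive chunks of at most `max_words` words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def summarize(text: str, model_id: str = MODEL_ID,
              max_length: int = 128, min_length: int = 20) -> str:
    """Summarize each chunk and join the partial summaries."""
    # Imported here so the helpers above can be used without transformers installed.
    from transformers import pipeline

    summarizer = pipeline("summarization", model=model_id)
    outputs = summarizer(
        split_into_chunks(text),
        max_length=max_length,
        min_length=min_length,
        truncation=True,  # truncate any chunk that still exceeds the model limit
    )
    return " ".join(item["summary_text"] for item in outputs)


if __name__ == "__main__":
    terms = (
        "The licensee shall not sublicense, sell, or otherwise transfer the "
        "software to any third party without prior written consent."
    )
    print(summarize(terms))
```

Given the limitation noted above, the output may still contain legal phrasing, so it is worth reviewing summaries before presenting them to end users.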

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 3
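As a rough sketch of what these settings imply, the listed values can be written as keyword arguments in the naming used by `Seq2SeqTrainingArguments`, and the linear schedule can be worked out by hand. The mapping of `train_batch_size` to `per_device_train_batch_size` assumes a single device, zero warmup steps is assumed because the card lists none, and the total of 90 steps is taken from the training results table. The names `TRAINING_KWARGS` and `linear_lr` are hypothetical.

```python
# Hyperparameters from the card, in Seq2SeqTrainingArguments naming.
# Assumes a single device, so the per-device batch size equals the listed one.
TRAINING_KWARGS = {
    "learning_rate": 5e-5,
    "per_device_train_batch_size": 8,
    "per_device_eval_batch_size": 8,
    "seed": 42,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 3,
}


def linear_lr(step: int, total_steps: int = 90,
              base_lr: float = 5e-5, warmup_steps: int = 0) -> float:
    """Learning rate at `step` under linear warmup followed by linear decay."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Learning rate at the start and at each epoch boundary (30 steps per epoch).
for step in (0, 30, 60, 90):
    print(step, f"{linear_lr(step):.2e}")
# 0 5.00e-05
# 30 3.33e-05
# 60 1.67e-05
# 90 0.00e+00
```

With a batch size of 8 and 30 optimizer steps per epoch, the training split would contain at most 240 examples, assuming no gradient accumulation.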

Training results

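| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum |
|---------------|-------|------|-----------------|--------|--------|--------|-----------|
| 3.5           | 1.0   | 30   | 0.5565          | 0.5111 | 0.2863 | 0.4092 | 0.4093    |
| 0.3056        | 2.0   | 60   | 0.3612          | 0.5267 | 0.3021 | 0.4277 | 0.4286    |
| 0.1716        | 3.0   | 90   | 0.3484          | 0.5448 | 0.3092 | 0.4363 | 0.4370    |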

Framework versions

  • Transformers 4.30.2
  • PyTorch 2.0.0+cpu
  • Datasets 2.1.0
  • Tokenizers 0.13.3