license: mit
base_model: facebook/bart-large-cnn
tags:
  - generated_from_trainer
metrics:
  - rouge
model-index:
  - name: conversation-summ
    results: []

conversation-summ

This model is a fine-tuned version of facebook/bart-large-cnn. The fine-tuning dataset is not recorded in this card. The final checkpoint (epoch 3, step 1800) achieves the following results on the evaluation set:

  • Loss: 0.1562
  • Rouge1: 54.3238
  • Rouge2: 34.2678
  • Rougel: 46.5847
  • Rougelsum: 51.2214
  • Gen Len: 77.04
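
The ROUGE figures above are reported on a 0-100 scale. As a rough illustration of what Rouge1 measures, the sketch below computes a unigram-overlap F1 score in plain Python. It is a simplification: the whitespace tokenizer and the function name `rouge1_f1` are assumptions made for illustration, and it is not expected to reproduce the numbers above, which come from the evaluation pipeline used during training.

```python
from collections import Counter


def rouge1_f1(prediction: str, reference: str) -> float:
    """Unigram-overlap F1 between two strings, scaled to 0-100.

    Simplified sketch: lowercases and splits on whitespace only.
    """
    pred_counts = Counter(prediction.lower().split())
    ref_counts = Counter(reference.lower().split())
    # Clipped overlap: a word counts at most as often as it occurs in both texts.
    overlap = sum((pred_counts & ref_counts).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(pred_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 100.0 * 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    print(rouge1_f1("the cat sat on the mat", "the cat lay on the mat"))
```

Rouge2 applies the same idea to bigrams, and Rougel is based on the longest common subsequence rather than fixed-size n-grams.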

Model description

More information needed

Intended uses & limitations

More information needed
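
Although intended uses are not documented here, checkpoints derived from facebook/bart-large-cnn can generally be loaded through the Transformers summarization pipeline. The following is a minimal sketch under stated assumptions, not the author's documented usage: `MODEL_ID` is set to the base checkpoint because this card does not state the repository id of the fine-tuned model, and the "Speaker: text" input format produced by the hypothetical helper `format_dialogue` is a guess, since the training data format is not recorded.

```python
from typing import List, Tuple

# Assumption: the base checkpoint named in this card. Replace it with the
# repository id of the fine-tuned model, which this card does not state.
MODEL_ID = "facebook/bart-large-cnn"


def format_dialogue(turns: List[Tuple[str, str]]) -> str:
    """Join (speaker, utterance) pairs into one 'Speaker: text' line per turn.

    Hypothetical input format; the format used for fine-tuning is unknown.
    """
    return "\n".join(f"{speaker}: {text}" for speaker, text in turns)


def summarize(text: str, model_id: str = MODEL_ID, max_length: int = 128) -> str:
    """Summarize text with a Hugging Face summarization pipeline."""
    # Imported lazily so format_dialogue works without transformers installed.
    from transformers import pipeline

    summarizer = pipeline("summarization", model=model_id)
    result = summarizer(text, max_length=max_length, truncation=True)
    return result[0]["summary_text"]


if __name__ == "__main__":
    dialogue = format_dialogue(
        [
            ("Ann", "Are we still meeting at three?"),
            ("Bob", "Yes, but in the small room. The big one is booked."),
            ("Ann", "Fine, I will bring the slides."),
        ]
    )
    print(summarize(dialogue))
```

Running the script downloads the model weights on first use, so it needs network access and enough memory for a BART-large model.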

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 1
  • eval_batch_size: 1
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 2
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 3
  • mixed_precision_training: Native AMP
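
The hyperparameters above can be sketched as keyword arguments for `transformers.Seq2SeqTrainingArguments`. This is a reconstruction, not the author's training script: the mapping from the card's labels to argument names is an assumption, as is reading "Native AMP" as `fp16=True`, a single training device, and zero warmup steps (none are listed). The total of 1800 optimizer steps is taken from the training results table. The helper names `effective_batch_size` and `linear_lr` are hypothetical.

```python
# Keyword names follow transformers.Seq2SeqTrainingArguments; the mapping
# from the card's labels to these names is an assumption.
TRAINING_KWARGS = {
    "learning_rate": 2e-05,
    "per_device_train_batch_size": 1,
    "per_device_eval_batch_size": 1,
    "seed": 42,
    "gradient_accumulation_steps": 2,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-08,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 3,
    "fp16": True,  # assumed meaning of "mixed_precision_training: Native AMP"
}


def effective_batch_size(kwargs: dict, num_devices: int = 1) -> int:
    """Examples consumed per optimizer step (total_train_batch_size)."""
    return (
        kwargs["per_device_train_batch_size"]
        * kwargs["gradient_accumulation_steps"]
        * num_devices
    )


def linear_lr(step: int, total_steps: int, base_lr: float, warmup_steps: int = 0) -> float:
    """Learning rate under linear warmup followed by linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    print("effective batch size:", effective_batch_size(TRAINING_KWARGS))
    for step in (0, 600, 1201, 1800):
        lr = linear_lr(step, 1800, TRAINING_KWARGS["learning_rate"])
        print(f"step {step}: lr = {lr:.3e}")
```

With these values, 1 example per device times 2 accumulation steps gives the listed total_train_batch_size of 2. To use the dictionary, one would pass it as `Seq2SeqTrainingArguments(output_dir=..., **TRAINING_KWARGS)`; note that `fp16=True` requires a supported GPU.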

Training results

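| Training Loss | Epoch | Step | Validation Loss | Rouge1  | Rouge2  | Rougel  | Rougelsum | Gen Len |
|---------------|-------|------|-----------------|---------|---------|---------|-----------|---------|
| 0.4426        | 1.0   | 600  | 0.1588          | 52.8864 | 33.253  | 44.9089 | 50.5072   | 69.38   |
| 0.1137        | 2.0   | 1201 | 0.1517          | 56.8499 | 35.309  | 48.2171 | 53.6983   | 72.74   |
| 0.0796        | 3.0   | 1800 | 0.1562          | 54.3238 | 34.2678 | 46.5847 | 51.2214   | 77.04   |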
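
The per-epoch figures can be compared programmatically. The sketch below copies the three rows above into Python and picks the epoch with the lowest validation loss and the highest Rouge1; the `EvalRow` type and variable names are hypothetical. On these numbers, epoch 2 is best on both criteria, while the headline results in this card match epoch 3. The card does not state which checkpoint the published weights correspond to.

```python
from typing import NamedTuple


class EvalRow(NamedTuple):
    epoch: float
    step: int
    train_loss: float
    val_loss: float
    rouge1: float
    rouge2: float
    rougel: float
    rougelsum: float
    gen_len: float


# Values copied verbatim from the training results table.
ROWS = [
    EvalRow(1.0, 600, 0.4426, 0.1588, 52.8864, 33.253, 44.9089, 50.5072, 69.38),
    EvalRow(2.0, 1201, 0.1137, 0.1517, 56.8499, 35.309, 48.2171, 53.6983, 72.74),
    EvalRow(3.0, 1800, 0.0796, 0.1562, 54.3238, 34.2678, 46.5847, 51.2214, 77.04),
]

# Lower validation loss is better; higher ROUGE is better.
best_by_loss = min(ROWS, key=lambda row: row.val_loss)
best_by_rouge1 = max(ROWS, key=lambda row: row.rouge1)

if __name__ == "__main__":
    print("lowest validation loss at epoch", best_by_loss.epoch)
    print("highest Rouge1 at epoch", best_by_rouge1.epoch)
```

Training loss keeps falling through epoch 3 while validation loss rises slightly, which is consistent with mild overfitting in the last epoch.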

Framework versions

  • Transformers 4.39.2
  • PyTorch 2.2.1+cu121
  • Datasets 2.18.0
  • Tokenizers 0.15.2
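
To check whether a local environment matches the versions listed above, a small standard-library sketch such as the following can help. The pip distribution names in `EXPECTED` (for example `torch` for PyTorch) and the helper names are assumptions for illustration; a mismatch does not necessarily mean the model will fail to load.

```python
from importlib import metadata
from typing import Dict, Iterable, Optional, Tuple

# Versions listed in this card, keyed by assumed pip distribution name.
EXPECTED = {
    "transformers": "4.39.2",
    "torch": "2.2.1+cu121",
    "datasets": "2.18.0",
    "tokenizers": "0.15.2",
}


def installed_versions(packages: Iterable[str]) -> Dict[str, Optional[str]]:
    """Return the installed version of each package, or None if it is missing."""
    found: Dict[str, Optional[str]] = {}
    for name in packages:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            found[name] = None
    return found


def version_mismatches(
    expected: Dict[str, str], installed: Dict[str, Optional[str]]
) -> Dict[str, Tuple[str, Optional[str]]]:
    """Map each differing package to a (expected, installed) pair."""
    return {
        name: (want, installed.get(name))
        for name, want in expected.items()
        if installed.get(name) != want
    }


if __name__ == "__main__":
    for name, (want, have) in version_mismatches(
        EXPECTED, installed_versions(EXPECTED)
    ).items():
        print(f"{name}: card lists {want}, found {have}")
```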