Quantization made by Richard Erkhov.

bart-base-finetuned-xsum - bnb 8bits

Original model description:

license: apache-2.0
base_model: facebook/bart-base
tags:
- generated_from_trainer
datasets:
- xsum
metrics:
- rouge
model-index:
- name: bart-base-finetuned-xsum
  results:
  - task:
      name: Sequence-to-sequence Language Modeling
      type: text2text-generation
    dataset:
      name: xsum
      type: xsum
      config: default
      split: train[:10%]
      args: default
    metrics:
    - name: Rouge1
      type: rouge
      value: 35.8214
pipeline_tag: summarization

bart-base-finetuned-xsum

This model is a fine-tuned version of facebook/bart-base on the xsum dataset. After the final epoch (epoch 5, step 5740), it achieves the following results on the evaluation set:

  • Loss: 1.9356
  • Rouge1: 35.8214
  • Rouge2: 14.7565
  • RougeL: 29.4566
  • RougeLsum: 29.4496
  • Gen Len: 19.562
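
For orientation, here is a minimal sketch of how a pre-quantized bitsandbytes checkpoint such as this one is typically loaded and used for summarization with Transformers. The repository ID is a hypothetical placeholder (this card does not state it), and the beam count and length limits are illustrative assumptions, not the settings behind the scores reported above. Loading 8-bit weights assumes that `bitsandbytes`, `accelerate`, and a CUDA GPU are available.

```python
# Hypothetical placeholder: replace with the actual Hub repository ID.
REPO_ID = "<namespace>/<bart-base-finetuned-xsum-8bits>"


def load_summarizer(repo_id):
    """Load tokenizer and model; the 8-bit config is read from the checkpoint."""
    # Imported lazily so this module can be imported without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(repo_id)
    # device_map="auto" places the quantized weights on the available GPU.
    model = AutoModelForSeq2SeqLM.from_pretrained(repo_id, device_map="auto")
    return tokenizer, model


def summarize(text, tokenizer, model, max_new_tokens=32):
    """Generate a short summary for a single document."""
    inputs = tokenizer(
        text,
        return_tensors="pt",
        truncation=True,
        max_length=1024,  # BART's maximum input length
    ).to(model.device)
    # Beam search with 4 beams is an assumption, not a setting from this card.
    output_ids = model.generate(**inputs, num_beams=4, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


# Usage (requires network access and a GPU):
# tokenizer, model = load_summarizer(REPO_ID)
# print(summarize("Some long news article ...", tokenizer, model))
```

The imports sit inside `load_summarizer` so that the helper functions can be imported and inspected on a machine without the GPU stack installed.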

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 5
  • mixed_precision_training: Native AMP
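
As a rough illustration, the hyperparameters above can be written as keyword arguments using the parameter names of Transformers' `Seq2SeqTrainingArguments`, together with the learning rate that a linear schedule would produce at a given step. This is a sketch under stated assumptions, not the author's training script: it assumes a single device (so `train_batch_size` equals the per-device batch size), zero warmup steps (none are listed), and that `predict_with_generate` was enabled to obtain the ROUGE and Gen Len columns. The total of 5740 steps is taken from the training results table.

```python
# Hyperparameters from the list above, keyed by Seq2SeqTrainingArguments names.
TRAINING_KWARGS = {
    "learning_rate": 2e-5,
    "per_device_train_batch_size": 16,  # assumes a single device
    "per_device_eval_batch_size": 16,
    "seed": 42,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 5,
    "fp16": True,  # "Native AMP" mixed precision
    "predict_with_generate": True,  # assumption: needed for ROUGE / Gen Len
}

TOTAL_STEPS = 5740  # last step in the training results table


def linear_lr(step, base_lr=2e-5, total_steps=TOTAL_STEPS, warmup_steps=0):
    """Learning rate under linear warmup followed by linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    # Learning rate at the end of each epoch (1148 steps per epoch).
    for epoch in range(1, 6):
        print(epoch, linear_lr(epoch * 1148))
```

Under these assumptions the learning rate falls from 2e-05 at step 0 to zero at step 5740, so the later epochs train with progressively smaller updates.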

Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | RougeL | RougeLsum | Gen Len |
|---|---|---|---|---|---|---|---|---|
| 2.301 | 1.0 | 1148 | 1.9684 | 34.4715 | 13.6638 | 28.1147 | 28.1204 | 19.5816 |
| 2.1197 | 2.0 | 2296 | 1.9442 | 35.2502 | 14.284 | 28.8462 | 28.8384 | 19.5546 |
| 1.9804 | 3.0 | 3444 | 1.9406 | 35.7799 | 14.7422 | 29.3669 | 29.3742 | 19.5326 |
| 1.8891 | 4.0 | 4592 | 1.9349 | 35.5151 | 14.4668 | 29.0359 | 29.0484 | 19.5492 |
| 1.827 | 5.0 | 5740 | 1.9356 | 35.8214 | 14.7565 | 29.4566 | 29.4496 | 19.562 |
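
The table above can be read programmatically. The short sketch below finds which epoch is best under each metric and derives the implied size of the training set from the step counts. The size estimate assumes a single device and no gradient accumulation, neither of which is stated in this card.

```python
# Rows copied from the training results table:
# (epoch, step, train_loss, val_loss, rouge1, rouge2, rougeL, rougeLsum, gen_len)
ROWS = [
    (1, 1148, 2.301, 1.9684, 34.4715, 13.6638, 28.1147, 28.1204, 19.5816),
    (2, 2296, 2.1197, 1.9442, 35.2502, 14.284, 28.8462, 28.8384, 19.5546),
    (3, 3444, 1.9804, 1.9406, 35.7799, 14.7422, 29.3669, 29.3742, 19.5326),
    (4, 4592, 1.8891, 1.9349, 35.5151, 14.4668, 29.0359, 29.0484, 19.5492),
    (5, 5740, 1.827, 1.9356, 35.8214, 14.7565, 29.4566, 29.4496, 19.562),
]
VAL_LOSS, ROUGE1 = 3, 4  # column indices into each row
BATCH_SIZE = 16

# Lowest validation loss and highest Rouge1 do not fall on the same epoch.
best_by_val_loss = min(ROWS, key=lambda r: r[VAL_LOSS])[0]
best_by_rouge1 = max(ROWS, key=lambda r: r[ROUGE1])[0]

# Steps per epoch, then the range of dataset sizes N with ceil(N / 16) == steps.
steps_per_epoch = ROWS[0][1]
min_examples = (steps_per_epoch - 1) * BATCH_SIZE + 1
max_examples = steps_per_epoch * BATCH_SIZE

if __name__ == "__main__":
    print("best epoch by validation loss:", best_by_val_loss)
    print("best epoch by Rouge1:", best_by_rouge1)
    print("training examples per epoch:", min_examples, "to", max_examples)
```

Validation loss is lowest at epoch 4 (1.9349), while every ROUGE score peaks at epoch 5, so the choice of checkpoint depends on which metric is used for selection.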

Framework versions

  • Transformers 4.40.1
  • Pytorch 1.13.1+cu117
  • Datasets 2.19.0
  • Tokenizers 0.19.1

Model size (Safetensors): 96.2M params. Tensor types: F32, FP16, I8.