bart-large-dialogue-summarization

This model is a fine-tuned version of facebook/bart-large for English dialogue-to-summary generation.
It generates concise summaries of multi-speaker English dialogues.

Model description

This model takes an English dialogue as input and generates an English abstractive summary.
It was fine-tuned for a dialogue summarization task using paired examples with the following structure:

Input: dialogue
Target: summary

The model is intended for research and educational use, especially for experiments on dialogue summarization with BART.

Intended use

This model is suitable for:

English dialogue summarization
Research experiments on seq2seq fine-tuning
Educational demonstrations of BART fine-tuning for summarization

This model is not intended for high-stakes or production use without further evaluation.

Training data

The model was fine-tuned on JSON files containing dialogue-summary pairs:

train.json
val.json
test.json

Only the following fields were used during training:

dialogue
summary

Other fields such as multilingual summaries were excluded during preprocessing.

After cleaning, the final dataset sizes were:

Train: 14,730
Validation: 818
Test: 819

Preprocessing

The following preprocessing steps were applied:

Kept only dialogue and summary
Removed unused fields such as summary_zh and summary_de
Removed placeholder tokens such as <file_gif>
Normalized whitespace
Removed empty rows
Removed duplicate rows
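
The steps above can be sketched in plain Python. This is a minimal illustration, not the script used to build the dataset: the placeholder pattern `<file_\w+>`, the choice to keep one line per dialogue turn, and the helper names `clean_text` and `clean_rows` are all assumptions.

```python
import re

# Assumption: placeholders look like <file_gif>, <file_photo>, etc.
PLACEHOLDER = re.compile(r"<file_\w+>")


def clean_text(text):
    """Drop placeholder tokens and normalize whitespace, keeping one line per turn."""
    text = PLACEHOLDER.sub(" ", text)
    lines = (" ".join(line.split()) for line in text.splitlines())
    return "\n".join(line for line in lines if line)


def clean_rows(rows):
    """Keep only dialogue/summary, then drop empty and duplicate rows."""
    seen = set()
    cleaned = []
    for row in rows:
        dialogue = clean_text(row.get("dialogue") or "")
        summary = clean_text(row.get("summary") or "")
        key = (dialogue, summary)
        if not dialogue or not summary or key in seen:
            continue
        seen.add(key)
        # Extra fields such as summary_zh or summary_de are not copied over.
        cleaned.append({"dialogue": dialogue, "summary": summary})
    return cleaned
```

Calling `clean_rows` on a list of dictionaries loaded from one of the JSON files returns a new list containing only the cleaned `dialogue` and `summary` fields.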

Training setup

The model was fine-tuned from facebook/bart-large using the Hugging Face transformers library.

Main configuration

Base model: facebook/bart-large
Task: English dialogue summarization
Max source length: 768
Max target length: 64
Learning rate: 2e-5
Optimizer: Adafactor
Beam size for generation: 5
Epochs: 5
Per-device train batch size: 4
Per-device eval batch size: 4
Gradient accumulation steps: 6
Effective train batch size: 24 (4 per device × 6 accumulation steps on one device)
Best model selection metric: rougeLsum

Note: Some settings varied across experiments. This model card reflects the main fine-tuning configuration used for the uploaded checkpoint.
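
As a rough sketch, the configuration above could be expressed with `Seq2SeqTrainingArguments` from transformers. The mapping of each setting to an argument name is an assumption, as are the `output_dir` default and the helper names; the original training script is not part of this card. The maximum source length of 768 is not a training argument and would be applied when tokenizing the dialogues.

```python
# Hyperparameters from the list above, mapped onto Seq2SeqTrainingArguments names.
TRAINING_KWARGS = {
    "learning_rate": 2e-5,
    "optim": "adafactor",
    "num_train_epochs": 5,
    "per_device_train_batch_size": 4,
    "per_device_eval_batch_size": 4,
    "gradient_accumulation_steps": 6,
    "predict_with_generate": True,      # needed to compute ROUGE during evaluation
    "generation_num_beams": 5,
    "generation_max_length": 64,        # max target length
    "metric_for_best_model": "rougeLsum",
}


def effective_batch_size(kwargs, num_devices=1):
    """Examples consumed per optimizer step."""
    return (
        kwargs["per_device_train_batch_size"]
        * kwargs["gradient_accumulation_steps"]
        * num_devices
    )


def build_training_args(output_dir="bart-large-dialogue-summarization"):
    """Hypothetical helper: build the arguments object (requires transformers)."""
    from transformers import Seq2SeqTrainingArguments

    return Seq2SeqTrainingArguments(output_dir=output_dir, **TRAINING_KWARGS)
```

With one device, `effective_batch_size(TRAINING_KWARGS)` gives 4 × 6 = 24, matching the effective train batch size listed above.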

Evaluation

The model was evaluated using ROUGE on the validation and test sets. ROUGE scores are reported on a 0-100 scale.

Validation Results

Metric	Score
ROUGE-1	54.4213
ROUGE-2	30.4178
ROUGE-L	45.5119
ROUGE-Lsum	50.3035
Loss	1.3063

Test Results

Metric	Score
ROUGE-1	53.0642
ROUGE-2	28.3399
ROUGE-L	44.3812
ROUGE-Lsum	48.8576
Loss	1.3370
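
To make the ROUGE-1 and ROUGE-2 numbers easier to interpret, here is a simplified sketch of an n-gram overlap F1 score on a 0-100 scale. It is not the scorer used for the tables above: it assumes lowercased whitespace tokenization and no stemming, so its values will differ from those of a standard ROUGE implementation. The function names are illustrative.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_f1(prediction, reference, n=1):
    """Simplified ROUGE-N F1 (0-100) from clipped n-gram overlap."""
    pred = ngrams(prediction.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    overlap = sum((pred & ref).values())  # clipped counts shared by both
    if overlap == 0:
        return 0.0
    precision = overlap / sum(pred.values())
    recall = overlap / sum(ref.values())
    return 100 * 2 * precision * recall / (precision + recall)
```

For example, `rouge_n_f1(generated_summary, reference_summary, n=2)` would give a rough ROUGE-2 style score for a single pair; corpus-level results are typically averages over all pairs.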

Example

Input

Hannah: Hey, do you have Betty's number?
Amanda: Lemme check
Hannah: <file_gif>
Amanda: Sorry, can't find it.
Amanda: Ask Larry
Amanda: He called her last time we were at the park together
Hannah: I don't know him well
Hannah: <file_gif>
Amanda: Don't be shy, he's very nice
Hannah: If you say so..
Hannah: I'd rather you texted him
Amanda: Just text him 🙂
Hannah: Urgh.. Alright
Hannah: Bye
Amanda: Bye bye

Generated Summary

Hannah is looking for Betty's number. Amanda suggests Hannah to ask Larry, who called Betty last time they were at the park.
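
A summary like the one above could be produced with a sketch along these lines, using the transformers library. The generation settings mirror the training configuration (beam size 5, max target length 64, max source length 768); whether the original example used exactly these settings, and whether placeholder tokens were stripped from the input first, is an assumption. The `summarize` helper is hypothetical and downloads the checkpoint on first use.

```python
MODEL_ID = "yunu919/bart-large-dialogue-summarization"
MAX_SOURCE_LENGTH = 768
GENERATION_KWARGS = {"num_beams": 5, "max_length": 64}


def summarize(dialogue, model_id=MODEL_ID):
    """Generate a summary for one dialogue string (requires transformers and torch)."""
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

    # Truncate long dialogues to the source length used during fine-tuning.
    inputs = tokenizer(
        dialogue,
        max_length=MAX_SOURCE_LENGTH,
        truncation=True,
        return_tensors="pt",
    )
    output_ids = model.generate(**inputs, **GENERATION_KWARGS)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Usage: pass the dialogue as a single string with one turn per line, for example `summarize("Hannah: Hey, do you have Betty's number?\nAmanda: Lemme check")`.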

Limitations

The model was fine-tuned only for English dialogue summarization.
It may hallucinate details or omit important context in long conversations.
Performance may degrade on domains very different from the training data.
This model should not be used in safety-critical or high-stakes settings without additional evaluation.

Citation

If you use this model, please also cite the original BART paper:

@article{lewis2019bart,
  title={BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension},
  author={Lewis, Mike and Liu, Yinhan and Goyal, Naman and Ghazvininejad, Marjan and Mohamed, Abdelrahman and Levy, Omer and Stoyanov, Veselin and Zettlemoyer, Luke},
  journal={arXiv preprint arXiv:1910.13461},
  year={2019}
}
