How to use yunu919/bart-large-dialogue-summarization with Transformers:

```python
# Use a pipeline as a high-level helper
# Warning: Pipeline type "summarization" is no longer supported in transformers v5.
# You must load the model directly (see below) or downgrade to v4.x with:
#   pip install "transformers<5.0.0"
from transformers import pipeline

pipe = pipeline("summarization", model="yunu919/bart-large-dialogue-summarization")
```

```python
# Load model directly
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("yunu919/bart-large-dialogue-summarization")
model = AutoModelForSeq2SeqLM.from_pretrained("yunu919/bart-large-dialogue-summarization")
```
bart-large-dialogue-summarization
This model is a fine-tuned version of facebook/bart-large for English dialogue-to-summary generation.
It aims to produce concise summaries of multi-speaker dialogues.
Model description
This model takes an English dialogue as input and generates an English abstractive summary.
It was fine-tuned for a dialogue summarization task using paired examples with the following structure:
- Input: `dialogue`
- Target: `summary`
The model is intended for research and educational use, especially for experiments on dialogue summarization with BART.
Intended use
This model is suitable for:
- English dialogue summarization
- Research experiments on seq2seq fine-tuning
- Educational demonstrations of BART fine-tuning for summarization
This model is not intended for high-stakes or production use without further evaluation.
Training data
The model was fine-tuned on JSON files containing dialogue-summary pairs:
- `train.json`
- `val.json`
- `test.json`
Only the following fields were used during training:
- `dialogue`
- `summary`
Other fields such as multilingual summaries were excluded during preprocessing.
After cleaning, the final dataset sizes were:
- Train: 14,730
- Validation: 818
- Test: 819
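As a quick check on these sizes, the short calculation below derives the share of each split. It uses only the counts listed above; the variable names are illustrative.

```python
# Split sizes as reported in this model card.
splits = {"train": 14_730, "validation": 818, "test": 819}

total = sum(splits.values())
# Percentage of all cleaned examples that falls into each split.
shares = {name: round(100 * count / total, 1) for name, count in splits.items()}

print(total)   # 16367
print(shares)  # {'train': 90.0, 'validation': 5.0, 'test': 5.0}
```

The cleaned data therefore follows roughly a 90/5/5 train/validation/test split.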
Preprocessing
The following preprocessing steps were applied:
- Kept only `dialogue` and `summary`
- Removed unused fields such as `summary_zh` and `summary_de`
- Removed placeholder tokens such as `<file_gif>`
- Normalized whitespace
- Removed empty rows
- Removed duplicate rows
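The steps above can be sketched in plain Python as follows. This is an illustrative reconstruction, not the script used for this checkpoint. The function names, the placeholder regex (the card only names `<file_gif>`), and the choice to keep one dialogue turn per line while collapsing other whitespace are all assumptions.

```python
import re

# Assumption: placeholders share the form <file_xxx>; only <file_gif> is
# named in this card.
PLACEHOLDER = re.compile(r"<file_\w+>")


def clean_text(text: str) -> str:
    """Remove placeholder tokens and normalize whitespace line by line."""
    text = PLACEHOLDER.sub("", text)
    # Collapse runs of spaces/tabs inside each line and drop empty lines,
    # so each speaker turn stays on its own line (an assumption).
    lines = (" ".join(line.split()) for line in text.splitlines())
    return "\n".join(line for line in lines if line)


def preprocess(records: list[dict]) -> list[dict]:
    """Keep only dialogue/summary, drop empty rows and exact duplicates."""
    seen = set()
    cleaned = []
    for record in records:
        dialogue = clean_text(record.get("dialogue", ""))
        summary = clean_text(record.get("summary", ""))
        if not dialogue or not summary:
            continue  # empty row after cleaning
        key = (dialogue, summary)
        if key in seen:
            continue  # duplicate row
        seen.add(key)
        # Extra fields such as summary_zh / summary_de are not copied over.
        cleaned.append({"dialogue": dialogue, "summary": summary})
    return cleaned
```

Applied to the parsed contents of a file such as `train.json`, `preprocess` would return a list of records containing only the two fields used for training.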
Training setup
The model was fine-tuned from facebook/bart-large using the Hugging Face transformers library.
Main configuration
- Base model: `facebook/bart-large`
- Task: English dialogue summarization
- Max source length: 768
- Max target length: 64
- Learning rate: 2e-5
- Optimizer: Adafactor
- Beam size for generation: 5
- Epochs: 5
- Per-device train batch size: 4
- Per-device eval batch size: 4
- Gradient accumulation steps: 6
- Effective train batch size: 24
- Best model selection metric: `rougeLsum`
Note: Some settings may have been adjusted across experiments. This model card reflects the main fine-tuning configuration used for the uploaded checkpoint.
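The batch-size figures in the configuration can be related with a little arithmetic. The sketch below assumes training on a single device, which is what 4 × 6 = 24 implies but which the card does not state. The per-epoch update count is an estimate, since the exact number depends on how the training loop handles the final partial batch.

```python
import math

# Values taken from the configuration listed in this card.
per_device_train_batch_size = 4
gradient_accumulation_steps = 6
num_epochs = 5
train_examples = 14_730

# Assumption: one training device (not stated in the card).
num_devices = 1

effective_batch_size = (
    per_device_train_batch_size * gradient_accumulation_steps * num_devices
)
# Rough number of optimizer updates, counting a final partial batch.
updates_per_epoch = math.ceil(train_examples / effective_batch_size)
total_updates = updates_per_epoch * num_epochs

print(effective_batch_size)  # 24
print(updates_per_epoch)     # 614
print(total_updates)         # 3070
```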
Evaluation
The model was evaluated using ROUGE on the validation and test sets.
Validation Results
| Metric | Score |
|---|---|
| ROUGE-1 | 54.4213 |
| ROUGE-2 | 30.4178 |
| ROUGE-L | 45.5119 |
| ROUGE-Lsum | 50.3035 |
| Loss | 1.3063 |
Test Results
| Metric | Score |
|---|---|
| ROUGE-1 | 53.0642 |
| ROUGE-2 | 28.3399 |
| ROUGE-L | 44.3812 |
| ROUGE-Lsum | 48.8576 |
| Loss | 1.3370 |
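To give a feel for what the ROUGE-1 rows measure, here is a simplified unigram-overlap F1 in plain Python, scaled to 0-100 like the tables above. It is an illustration only: the card does not say which ROUGE implementation was used, and standard implementations apply their own tokenization and optional stemming, so this function will not reproduce the reported scores. The function name and the example strings are hypothetical.

```python
from collections import Counter


def rouge1_f1(prediction: str, reference: str) -> float:
    """Simplified ROUGE-1 F1 (0-100) using lowercase whitespace tokens."""
    pred_counts = Counter(prediction.lower().split())
    ref_counts = Counter(reference.lower().split())
    # Clipped unigram overlap: each token counts at most as often as it
    # appears in both texts.
    overlap = sum((pred_counts & ref_counts).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(pred_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 100 * 2 * precision * recall / (precision + recall)


# Hypothetical prediction/reference pair, not taken from the dataset.
score = rouge1_f1("hannah asks larry", "hannah asks amanda for betty's number")
print(round(score, 2))  # 44.44
```

ROUGE-2 applies the same idea to bigrams, while ROUGE-L and ROUGE-Lsum are based on the longest common subsequence rather than n-gram counts.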
Example
Input
Hannah: Hey, do you have Betty's number?
Amanda: Lemme check
Hannah: <file_gif>
Amanda: Sorry, can't find it.
Amanda: Ask Larry
Amanda: He called her last time we were at the park together
Hannah: I don't know him well
Hannah: <file_gif>
Amanda: Don't be shy, he's very nice
Hannah: If you say so..
Hannah: I'd rather you texted him
Amanda: Just text him 🙂
Hannah: Urgh.. Alright
Hannah: Bye
Amanda: Bye bye
Generated Summary
Hannah is looking for Betty's number. Amanda suggests Hannah to ask Larry, who called Betty last time they were at the park.
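A summary like this could be generated with a sketch along the following lines. It assumes `transformers` and PyTorch are installed, and it reuses the beam size (5) and length limits (768 source tokens, 64 summary tokens) from the training configuration as generation settings. Applying the target length as the generation limit is an assumption, and `summarize` is an illustrative helper name.

```python
MODEL_ID = "yunu919/bart-large-dialogue-summarization"
MAX_SOURCE_LENGTH = 768  # max source length from the training setup
MAX_SUMMARY_LENGTH = 64  # assumption: reuse the max target length at inference
NUM_BEAMS = 5            # beam size listed in the training setup


def summarize(dialogue: str) -> str:
    """Generate an abstractive summary for one dialogue."""
    # Imported here so the settings above can be reused without loading
    # the heavy dependencies.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_ID)

    # Truncate long dialogues to the source length used during fine-tuning.
    inputs = tokenizer(
        dialogue,
        max_length=MAX_SOURCE_LENGTH,
        truncation=True,
        return_tensors="pt",
    )
    output_ids = model.generate(
        **inputs,
        num_beams=NUM_BEAMS,
        max_length=MAX_SUMMARY_LENGTH,
    )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `summarize(dialogue)` with the dialogue as a single newline-separated string downloads the checkpoint on first use. The output may differ from the summary shown here depending on library versions and generation settings.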
Limitations
- The model was fine-tuned only for English dialogue summarization.
- It may hallucinate details or omit important context in long conversations.
- Performance may degrade on domains very different from the training data.
- This model should not be used in safety-critical or high-stakes settings without additional evaluation.
Citation
If you use this model, please also cite the original BART paper:
@article{lewis2019bart,
title={BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension},
author={Lewis, Mike and Liu, Yinhan and Goyal, Naman and Ghazvininejad, Marjan and Mohamed, Abdelrahman and Levy, Omer and Stoyanov, Veselin and Zettlemoyer, Luke},
journal={arXiv preprint arXiv:1910.13461},
year={2019}
}