metadata
language:
  - en
tags:
  - summarization
license: mit
datasets:
  - wiki_lingua
metrics:
  - rouge

Pre-trained BART Model Fine-tuned on the WikiLingua Dataset

This repository contains a BART model (pre-trained checkpoint by sshleifer) fine-tuned on the English portion of the wiki_lingua dataset.

Purpose: Examine the performance of a fine-tuned model for research purposes.

Observations:

  • The pre-trained model was trained on the XSum dataset, which pairs relatively short documents with one-sentence summaries.
  • Fine-tuning this model on WikiLingua is appropriate, since the summaries in that dataset are also short.
  • In the end, however, the fine-tuned model does not capture the key points much more clearly; instead, it mostly extracts the opening sentence of the input.
  • The data pre-processing and the model's hyperparameters also need to be tuned more carefully.
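
The opening-sentence behaviour noted above can be checked roughly by measuring how much a generated summary overlaps with the first sentence of its source document. The following is a minimal sketch and not part of this repository's code: the helper names (`tokenize`, `first_sentence`, `unigram_f1`, `lead_overlap`) and the sample texts are hypothetical, the sentence split is naive, and the score is only a ROUGE-1-style unigram F1 rather than an official ROUGE implementation.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def first_sentence(document):
    """Return the text up to the first '.', '!' or '?' (naive sentence split)."""
    return re.split(r"(?<=[.!?])\s+", document.strip(), maxsplit=1)[0]


def unigram_f1(candidate, reference):
    """ROUGE-1-style F1 based on clipped unigram overlap between two strings."""
    cand = Counter(tokenize(candidate))
    ref = Counter(tokenize(reference))
    overlap = sum((cand & ref).values())  # multiset intersection clips counts
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


def lead_overlap(document, summary):
    """Score how closely a summary matches the document's opening sentence."""
    return unigram_f1(summary, first_sentence(document))


# Hypothetical example texts, for illustration only.
doc = "Preheat the oven to 180 degrees. Mix the flour and sugar. Bake for 20 minutes."
copied = "Preheat the oven to 180 degrees."
abstractive = "Mix dry ingredients, then bake for 20 minutes."

print(lead_overlap(doc, copied))       # summary identical to the lead sentence
print(lead_overlap(doc, abstractive))  # summary drawn from later sentences
```

A score near 1.0 across many generated summaries would suggest the model is copying the lead sentence, while lower scores would suggest it draws on the rest of the document.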