metadata
license: cc-by-4.0
language:
  - en
tags:
  - summarization
  - text-generation
  - NLP
  - transformers
datasets:
  - your-dataset-name

BART Fine-Tuned Summarization Model

This repository hosts a BART-based model fine-tuned for text summarization on a custom dataset of articles and highlights. The model is suitable for generating concise summaries from long-form text.


Model Overview

  • Base Model: facebook/bart-large-cnn
  • Task: Text Summarization
  • Fine-Tuning Dataset: Custom CSV dataset containing document and summary columns
  • Dataset Size: Varies depending on your CSV file
  • Framework: Hugging Face Transformers
  • Language: English

Dataset Preparation

  1. Load your CSV dataset, which should contain the columns article and highlights, and rename them to document and summary.
  2. Clean the dataset by removing missing or non-string entries.
  3. Split the dataset into train and validation sets (80/20 split).
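Steps 1 and 2 could be sketched with pandas roughly as follows. This is a minimal illustration, not the exact code used for this model: the helper name `load_and_clean` and the in-memory sample CSV are hypothetical, and in practice you would pass the path to your own CSV file.

```python
import io

import pandas as pd


def load_and_clean(csv_source):
    """Load a CSV with article/highlights columns and return a cleaned DataFrame.

    Hypothetical helper for illustration; csv_source may be a file path
    or a file-like object.
    """
    df = pd.read_csv(csv_source)

    # Step 1: rename the raw columns to the names used later in the pipeline.
    df = df.rename(columns={"article": "document", "highlights": "summary"})
    df = df[["document", "summary"]]

    # Step 2: drop rows with missing values, then rows whose fields are not strings.
    df = df.dropna(subset=["document", "summary"])
    is_text = df["document"].map(lambda v: isinstance(v, str)) & df["summary"].map(
        lambda v: isinstance(v, str)
    )

    # Reset the index so the filtered rows are numbered from 0 again.
    return df[is_text].reset_index(drop=True)


# Tiny in-memory CSV standing in for the real file (made-up rows).
sample_csv = io.StringIO(
    "article,highlights\n"
    "First long article text.,First summary.\n"
    "Second article with no summary.,\n"
    "Third long article text.,Third summary.\n"
)

df = load_and_clean(sample_csv)
print(df.shape)
```

The resulting `df` is the DataFrame that the split in step 3 expects. Resetting the index after filtering should keep the old pandas index from being carried into the Hugging Face dataset as an extra column.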
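from datasets import Dataset

# df is the cleaned pandas DataFrame with document and summary columns
dataset = Dataset.from_pandas(df)
# 80/20 split; the held-out 20% is stored under the "test" key and serves as the validation set
dataset = dataset.train_test_split(test_size=0.2, seed=42)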