Dataset Card for Custom Text Dataset

Dataset Name

Custom CNN/DailyMail Text Summarization Dataset

Overview

This dataset is a custom subset and extension of the CNN/DailyMail dataset, consisting of news articles and their corresponding summaries.

Composition

Train Dataset: A custom train dataset consisting of one long news article with its manually written summary.

Test Dataset: A test dataset sampled from the original CNN/DailyMail dataset, consisting of 100 articles and their corresponding highlights.

Collection Process

The custom train dataset was crafted using news articles from the CNN/DailyMail dataset.

Preprocessing

The input text was tokenized.
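The card does not say which tokenizer was used, so the following is only an illustrative sketch of what a tokenization step looks like. The function name simple_tokenize and the regex-based splitting are assumptions made for this example, not the preprocessing actually applied to the dataset.

```python
import re


def simple_tokenize(text: str) -> list[str]:
    """Hypothetical stand-in tokenizer (not the one used for this dataset).

    Lowercases the text, then splits it into runs of word characters and
    individual punctuation marks. Whitespace is discarded.
    """
    return re.findall(r"\w+|[^\w\s]", text.lower())


if __name__ == "__main__":
    print(simple_tokenize("Hello, world!"))
```

In practice, a summarization model is normally paired with its own subword tokenizer, and that tokenizer should be used instead of a generic splitter like this one.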

How to Use

from datasets import load_from_disk

# Load the custom dataset
train_dataset = load_from_disk("./results/custom_dataset/train")
test_dataset = load_from_disk("./results/custom_dataset/test")

Evaluation

Summaries generated by a model on this dataset can be scored against the reference summaries using metrics such as ROUGE or BLEU.
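To make the ROUGE idea concrete, here is a minimal sketch of a ROUGE-1 F1 score (unigram overlap) in plain Python. The function name rouge1_f1 and the whitespace tokenization are assumptions made for illustration. Established ROUGE implementations add stemming and other normalization, so their scores will differ from this simplified version.

```python
from collections import Counter


def rouge1_f1(reference: str, candidate: str) -> float:
    """Simplified ROUGE-1 F1: unigram overlap between two texts.

    Assumes lowercase whitespace tokenization, with no stemming or
    punctuation handling.
    """
    ref_counts = Counter(reference.lower().split())
    cand_counts = Counter(candidate.lower().split())

    # Clipped overlap: each token counts at most as often as it
    # appears in both the reference and the candidate.
    overlap = sum((ref_counts & cand_counts).values())
    if overlap == 0:
        return 0.0

    precision = overlap / sum(cand_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    reference = "the cat sat on the mat"
    candidate = "the cat"
    print(rouge1_f1(reference, candidate))
```

For reported results, a maintained ROUGE implementation is preferable to this sketch, since scores are only comparable when the same tokenization and normalization are used.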

Limitations

The train dataset consists of only one example.

Ethical Considerations

The data originates from news sources, which may contain sensitive or politically biased content.