	---
	library_name: transformers
	license: apache-2.0
	base_model: t5-small
	tags:
	- generated_from_trainer
	metrics:
	- rouge
	model-index:
	- name: cnn_news_summary_model_trained_on_reduced_data
	  results: []
	datasets:
	- abisee/cnn_dailymail
	---

	# cnn_news_summary_model_trained_on_reduced_data

	This model is a fine-tuned version of [t5-small](https://huggingface.co/t5-small), trained on a reduced subset of the [cnn_dailymail](https://huggingface.co/datasets/abisee/cnn_dailymail) dataset.
	It achieves the following results on the evaluation set:
	- Loss: 1.6597
	- Rouge_1: 0.2162
	- Rouge_2: 0.0943
	- Rouge_l: 0.1834
	- Rouge_lsum: 0.1834
	- Generated_Length: 19.0
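
For readers unfamiliar with the metric, the sketch below shows roughly what a ROUGE-1 F-score measures: unigram overlap between a generated summary and a reference. It is a simplified illustration that assumes lowercased whitespace tokenization. The scores listed above came from the training run's own metric function, which is not shown in this card and probably used different tokenization, so this code will not reproduce them exactly. The name `rouge1_f1` is made up for this example.

```python
from collections import Counter


def rouge1_f1(prediction: str, reference: str) -> float:
    """Simplified ROUGE-1 F1: clipped unigram overlap between two texts."""
    pred_counts = Counter(prediction.lower().split())
    ref_counts = Counter(reference.lower().split())
    # Counter intersection keeps the minimum count of each shared word,
    # so a repeated word is credited at most as often as the reference has it.
    overlap = sum((pred_counts & ref_counts).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(pred_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)


# Four of six words overlap in each direction, so precision = recall = 4/6.
score = rouge1_f1("the cat sat on the mat", "the cat lay on a mat")
print(round(score, 4))
```

ROUGE-2 applies the same idea to word pairs, and ROUGE-L scores the longest common subsequence instead of independent n-grams.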

	## Model description

	The base model is t5-small, a small variant of the T5 (Text-to-Text Transfer Transformer) model developed by Google.

	The model is suited to quickly producing short summaries of news-style text. On the evaluation set, generated summaries averaged 19 tokens.

	## Intended uses & limitations

	### Intended use

	The model is designed for text summarization: condensing long pieces of text into shorter summaries. Specific use cases include:

	* News summarization: quickly summarizing news articles to give readers the main points.
	* Document summarization: condensing lengthy reports or research papers into brief overviews.
	* Content curation: helping content creators and curators generate summaries for newsletters, blogs, or social media posts.
	* Educational tools: assisting students and educators by summarizing academic texts and articles.
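
A minimal inference sketch for the use cases above, using the standard `transformers` seq2seq API. Several details are assumptions rather than facts from this card: the `summarize: ` task prefix (the usual T5 convention, not confirmed for this fine-tune), the 512-token input limit, the `max_new_tokens` value, and the helper names `build_t5_input` and `summarize`. The repository namespace is not stated here, so the model id in the example call is a placeholder.

```python
def build_t5_input(article: str, prefix: str = "summarize: ") -> str:
    """Collapse whitespace and prepend the task prefix (assumed convention)."""
    return prefix + " ".join(article.split())


def summarize(article: str, model_id: str, max_new_tokens: int = 60) -> str:
    """Generate a summary for one article with a fine-tuned T5 checkpoint."""
    # Imported lazily so build_t5_input works without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)
    # Assumption: inputs longer than 512 tokens are truncated.
    inputs = tokenizer(
        build_t5_input(article),
        return_tensors="pt",
        truncation=True,
        max_length=512,
    )
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


# Example call (downloads the model; replace <namespace> with the real owner):
# print(summarize(article_text, "<namespace>/cnn_news_summary_model_trained_on_reduced_data"))
```

Loading the tokenizer and model inside the function keeps the sketch short; for repeated calls it would be better to load them once and reuse them.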

	### Limitations

	The model has several limitations:

	* Accuracy: The generated summaries might not capture all key points accurately, especially for complex or nuanced texts.
	* Bias: The model can inherit biases present in the training data, which might affect the quality and neutrality of the summaries.
	* Context understanding: It might struggle to take in the full context of very long documents, leading to incomplete or misleading summaries.
	* Language and style: The model’s output might not match the desired tone or style and can require further editing.
	* Data dependency: Performance varies with the quality and nature of the input. The model performs best on text similar to its training data (news articles).

	## Training and evaluation data

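	The model was trained and evaluated on the [abisee/cnn_dailymail](https://huggingface.co/datasets/abisee/cnn_dailymail) dataset. Training used the Adam optimizer with a learning rate of 2e-05 for 2 epochs.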

	## Training procedure

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 2e-05
	- train_batch_size: 16
	- eval_batch_size: 16
	- seed: 42
	- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
	- lr_scheduler_type: linear
	- num_epochs: 2
	- mixed_precision_training: Native AMP
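
As a rough guide for reproducing the run, the sketch below maps the listed values onto `Seq2SeqTrainingArguments` keyword names. The original training script is not part of this card, so this is a reconstruction under assumptions: a single device (so `train_batch_size` equals the per-device batch size), evaluation once per epoch, and `predict_with_generate=True` for ROUGE scoring. `TRAINING_KWARGS` and `make_training_args` are illustrative names.

```python
# Values copied from the hyperparameter list above; the keys are the
# corresponding Seq2SeqTrainingArguments keyword names.
TRAINING_KWARGS = {
    "learning_rate": 2e-5,
    "per_device_train_batch_size": 16,  # assumes a single device
    "per_device_eval_batch_size": 16,
    "seed": 42,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 2,
}


def make_training_args(output_dir: str, use_fp16: bool = True):
    """Build Seq2SeqTrainingArguments from the values above."""
    # Imported lazily so the dictionary can be inspected without transformers.
    from transformers import Seq2SeqTrainingArguments

    return Seq2SeqTrainingArguments(
        output_dir=output_dir,
        eval_strategy="epoch",       # assumption: one evaluation per epoch
        predict_with_generate=True,  # assumption: generate text for ROUGE
        fp16=use_fp16,               # "Native AMP"; needs a CUDA GPU
        **TRAINING_KWARGS,
    )
```

Pass `use_fp16=False` when running on CPU, since mixed-precision training with `fp16` requires a GPU.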

	### Training results

	| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Generated Length |
	|:-------------:|:-----:|:----:|:---------------:|:------:|:------:|:------:|:---------:|:----------------:|
	| No log        | 1.0   | 288  | 1.6727          | 0.217  | 0.0949 | 0.1841 | 0.1839    | 19.0             |
	| 1.9118        | 2.0   | 576  | 1.6597          | 0.2162 | 0.0943 | 0.1834 | 0.1834    | 19.0             |
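
The step counts in the table give a rough sense of how small the training subset was and how the learning rate evolved. The arithmetic below assumes a single device, no gradient accumulation, and no warmup steps (none are listed among the hyperparameters); `linear_lr` is an illustrative helper, not code from the training run.

```python
STEPS_PER_EPOCH = 288  # Step column after epoch 1
TOTAL_STEPS = 576      # Step column after epoch 2
BATCH_SIZE = 16        # train_batch_size
BASE_LR = 2e-5         # learning_rate

# Upper bound on training examples seen per epoch (the last batch may be partial).
max_examples_per_epoch = STEPS_PER_EPOCH * BATCH_SIZE


def linear_lr(step: int, base_lr: float = BASE_LR, total_steps: int = TOTAL_STEPS) -> float:
    """Learning rate under linear decay to zero, assuming no warmup."""
    return base_lr * max(0.0, (total_steps - step) / total_steps)


print(max_examples_per_epoch)      # 4608
print(linear_lr(STEPS_PER_EPOCH))  # half the base rate at the end of epoch 1
```

Under these assumptions each epoch covered at most 4,608 articles, which is consistent with the "reduced data" in the model name.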


	### Framework versions

	- Transformers 4.44.2
	- PyTorch 2.4.1+cu121
	- Datasets 3.0.0
	- Tokenizers 0.19.1