---
license: apache-2.0
base_model: google/flan-t5-base
tags:
- generated_from_trainer
model-index:
- name: Fake-news-gen
  results: []
---

# Model Card: Fake-news-generator

## Model Purpose

This model is a fine-tuned version of [google/flan-t5-base](https://huggingface.co/google/flan-t5-base) on the XSum dataset of BBC news articles and their single-sentence summaries. Its primary purpose is to serve as a tool for research, education, and testing in the domain of AI-generated fake news.

## Summary

The model is a conditional text generation system fine-tuned to produce full news-style articles from short text summaries. It is intended as a demonstration of the capabilities, and the associated risks, of AI systems that can synthesize false or misleading news content from limited input information.
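
For reference, here is a minimal inference sketch using the Hugging Face Transformers `text2text-generation` pipeline. The repository ID is a placeholder, since the card does not state the published namespace, and the example summary is invented.

```python
# Minimal inference sketch, assuming the fine-tuned checkpoint is published as
# "your-username/Fake-news-gen" (a placeholder ID) or available locally.
from transformers import pipeline

generator = pipeline(
    "text2text-generation",
    model="your-username/Fake-news-gen",  # placeholder: substitute the real repo ID or local path
)

# Example input: a short, invented summary used as the generation prompt.
summary = "A small coastal town reports an unexpected surge in winter tourism."
article = generator(summary, max_new_tokens=256)[0]["generated_text"]
print(article)
```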

## Intended Uses

1. **Research on AI Fake News Generation:**
   - Understanding the capabilities and limitations of AI models in generating deceptive content.
   - Exploring potential mitigation strategies and ethical considerations.

2. **Educational Purposes:**
   - Increasing awareness of the challenges posed by AI-generated fake content.
   - Promoting responsible AI development and usage.

3. **Testing Fake News Detection Systems:**
   - Evaluating the effectiveness of automatic fake news detection systems against AI-generated content.

## Factors

- **Training Data:**
  - Fine-tuning data comes from the XSum BBC news summarization dataset (articles paired with single-sentence summaries).
  - The model is fine-tuned end-to-end to generate full articles from short text summaries.

- **Generation Process:**
  - Content is generated token by token, conditioned on the provided summary prompt.
  - No ground-truth real/fake labels or classifier are included in the training data.

- **Output Characteristics:**
  - Outputs are raw model decodes with no post-processing applied (see the sketch below).
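
The sketch below illustrates these factors with an explicit `generate()` call: the article is decoded token by token, conditioned only on the summary prompt, and the raw decode is returned. The checkpoint ID, sampling settings, and example summary are illustrative assumptions, not values fixed by this card.

```python
# Sketch of the generation process described above, using the same placeholder
# checkpoint ID as in the usage example.
import torch
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_id = "your-username/Fake-news-gen"  # placeholder repo ID
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

summary = "Scientists report finding a new beetle species in a city park."
inputs = tokenizer(summary, return_tensors="pt")

with torch.no_grad():
    output_ids = model.generate(
        **inputs,
        max_new_tokens=256,  # upper bound on article length; not fixed by the card
        do_sample=True,      # illustrative sampling settings, not the card's defaults
        top_p=0.95,
        temperature=0.9,
    )

# Raw decode: only special tokens are stripped, no other post-processing.
article = tokenizer.decode(output_ids[0], skip_special_tokens=True)
print(article)
```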

## Caveats and Recommendations

- **Not Intended for Malicious Uses:**
  - This model is explicitly not intended for creating or disseminating malicious or harmful content.

- **Ethical Considerations:**
  - Users are strongly advised to exercise caution and ethical responsibility when using or sharing outputs from this model.

- **Limitation on Real/Fake Labels:**
  - The model lacks ground-truth labels for distinguishing between real and fake news.

- **Limited Post-Processing:**
  - Generated outputs are presented without additional post-processing to emphasize raw model capabilities.


## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):
- learning_rate: 5e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 10
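
As a convenience, the same hyperparameters can be expressed as Transformers `Seq2SeqTrainingArguments`. The sketch below is a reconstruction under assumptions: the output directory is a placeholder, and anything not listed above is left at the Trainer defaults.

```python
# Sketch of the listed hyperparameters expressed as Seq2SeqTrainingArguments.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="fake-news-gen",        # placeholder path
    learning_rate=5e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    lr_scheduler_type="linear",
    num_train_epochs=10,
    adam_beta1=0.9,                    # Adam settings as reported above
    adam_beta2=0.999,
    adam_epsilon=1e-8,
)
```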

### Training results



### Framework versions

- Transformers 4.36.0
- Pytorch 2.1.2+cu121
- Datasets 2.16.1
- Tokenizers 0.15.0