---
license: mit
tags:
- image-classification
- pytorch
- ViT
- transformers
- real-fake-detection
- deep-fake
- ai-detect
- ai-image-detection
metrics:
- accuracy
model-index:
- name: AI Image Detect Distilled
  results:
  - task:
      type: image-classification
      name: Image Classification
    metrics:
    - type: accuracy
      value: 0.74
pipeline_tag: image-classification
library_name: transformers
---
# AI Detection Model
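The metadata declares `library_name: transformers` and `pipeline_tag: image-classification`, so the model should be loadable through the standard `transformers` image-classification pipeline. The following is a minimal sketch under that assumption. `MODEL_ID` is a placeholder rather than the real repository id, and `top_label` and `classify` are hypothetical helper names. The label strings the model returns are not documented here, so the sketch does not assume them.

```python
from typing import Dict, List

# Placeholder: replace with this model's actual Hub repository id.
MODEL_ID = "your-namespace/your-model-id"


def top_label(predictions: List[Dict]) -> Dict:
    """Return the highest-scoring entry from an image-classification result."""
    return max(predictions, key=lambda p: p["score"])


def classify(image_path: str, model_id: str = MODEL_ID) -> Dict:
    """Run the detector on one image and return its top prediction."""
    # Imported lazily so top_label() stays usable without transformers installed.
    from transformers import pipeline

    detector = pipeline("image-classification", model=model_id)
    # For a single image the pipeline returns a list of {"label", "score"} dicts.
    return top_label(detector(image_path))


# Example call (requires transformers, a backend such as PyTorch, and a real model id):
# result = classify("photo.jpg")
# print(result["label"], round(result["score"], 3))
```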
## Model Architecture and Training
Three separate models were initially trained:
1. Midjourney vs. Real Images
2. Stable Diffusion vs. Real Images
3. Stable Diffusion Fine-tunings vs. Real Images
Data preparation process:
- Used Google's Open Image Dataset for real images
- Described real images using BLIP (Bootstrapping Language-Image Pre-training)
- Generated Stable Diffusion images using BLIP descriptions
- Found similar Midjourney images based on BLIP descriptions
This pairing kept real and AI-generated images as close as possible in content, so that they differ mainly in their origin.
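The card does not say how "similar" Midjourney images were found. One simple possibility, shown purely as an illustration, is to compare each BLIP caption against the Midjourney prompts by word overlap (Jaccard similarity) and keep the best match above a threshold. The function names, the threshold, and the sample strings are all hypothetical.

```python
import re
from typing import List, Optional, Set, Tuple


def tokens(text: str) -> Set[str]:
    """Lowercase a string and split it into a set of word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a: str, b: str) -> float:
    """Word-overlap similarity in [0, 1]: shared words divided by all words."""
    ta, tb = tokens(a), tokens(b)
    if not ta and not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def best_match(
    caption: str, prompts: List[str], min_score: float = 0.2
) -> Optional[Tuple[int, float]]:
    """Return (index, score) of the most similar prompt, or None if too dissimilar."""
    if not prompts:
        return None
    score, index = max((jaccard(caption, p), i) for i, p in enumerate(prompts))
    return (index, score) if score >= min_score else None


# Hypothetical example data, not taken from the datasets used for this model.
caption = "a dog running on the beach"
prompts = [
    "portrait of an astronaut, studio lighting",
    "a dog running on a sandy beach at sunset",
]
print(best_match(caption, prompts))
```

A production pipeline would more likely use sentence embeddings than raw word overlap, but the matching step has the same shape: score every candidate, keep the closest one, and discard weak matches.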
The three models were then distilled into a single small ViT model with 11.8 million parameters, combining their learned features into one more efficient detector.
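The card does not describe the distillation recipe. A common multi-teacher setup, sketched here as an assumption, averages the teachers' temperature-softened class probabilities and trains the student to match that average with a KL-divergence loss. The temperature and the example logits are hypothetical, and the sketch uses plain Python rather than a training framework.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float], temperature: float = 1.0) -> List[float]:
    """Temperature-scaled softmax, shifted by the max for numerical stability."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def ensemble_target(
    teacher_logits: List[Sequence[float]], temperature: float = 2.0
) -> List[float]:
    """Average the teachers' softened probabilities into one soft target."""
    probs = [softmax(logits, temperature) for logits in teacher_logits]
    n_classes = len(probs[0])
    return [sum(p[c] for p in probs) / len(probs) for c in range(n_classes)]


def kl_divergence(target: Sequence[float], predicted: Sequence[float]) -> float:
    """KL(target || predicted); zero when the two distributions match."""
    return sum(t * math.log(t / p) for t, p in zip(target, predicted) if t > 0)


def distillation_loss(
    student_logits: Sequence[float],
    teacher_logits: List[Sequence[float]],
    temperature: float = 2.0,
) -> float:
    """Loss that pulls the student toward the averaged teacher distribution."""
    target = ensemble_target(teacher_logits, temperature)
    student = softmax(student_logits, temperature)
    return kl_divergence(target, student)


# Hypothetical [real, AI-generated] logits from three teachers for one image.
teachers = [[0.2, 1.5], [-0.3, 2.0], [0.5, 0.9]]
print(distillation_loss([0.0, 1.0], teachers))
```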
## Data Sources
- Google's Open Image Dataset: [link](https://storage.googleapis.com/openimages/web/index.html)
- Ivan Sivkov's Midjourney Dataset: [link](https://www.kaggle.com/datasets/ivansivkovenin/midjourney-prompts-image-part8)
- TANREI(NAMA)'s Stable Diffusion Prompts Dataset: [link](https://www.kaggle.com/datasets/tanreinama/900k-diffusion-prompts-dataset)
## Performance
- Validation Set: 74% accuracy
  - Held out from training data to assess generalization
- Custom Real-World Set: 72% accuracy
  - Composed of self-captured images and online-sourced images
  - Designed to be more representative of internet-based images
- Comparative Analysis:
  - Outperformed other popular AI detection models by 5 percentage points on both sets
  - Other models achieved 69% and 67% on the validation and real-world sets respectively
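For reference, accuracy here means the fraction of images classified correctly, and gaps between sets are expressed in percentage points. A minimal sketch follows. The label lists are hypothetical, and only the 74% and 72% figures come from the results above.

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of predictions that equal the ground-truth label."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and the same length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def point_gap(acc_a_percent: float, acc_b_percent: float) -> float:
    """Difference between two accuracies, in percentage points."""
    return acc_a_percent - acc_b_percent


# Hypothetical labels: 1 = AI-generated, 0 = real.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
print(accuracy(y_true, y_pred))  # 6 of the 8 predictions are correct

# Reported figures: 74% on the validation set, 72% on the real-world set.
print(point_gap(74, 72))
```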
## Key Insights
1. Strong generalization on validation data (74% accuracy)
2. Good adaptability to diverse, real-world images (72% accuracy)
3. Consistent outperformance of other popular models
4. 2-point accuracy drop from validation to real-world set indicates room for improvement
5. Comprehensive training on multiple AI generation techniques contributes to model versatility
6. Focus on subtle differences in image generation rather than content disparities
## Future Directions
- Expand dataset with more diverse, real-world examples to bridge the performance gap
- Improve generalization to internet-sourced images
- Conduct error analysis on misclassified samples to identify patterns
- Integrate new AI image generation techniques as they emerge
- Consider fine-tuning for specific domains where detection accuracy is critical