---
license: mit
tags:
- image-classification
- pytorch
- ViT
- transformers
- real-fake-detection
- deep-fake
- ai-detect
- ai-image-detection
metrics:
- accuracy
model-index:
- name: AI Image Detect Distilled
  results:
  - task:
      type: image-classification
      name: Image Classification
    metrics:
    - type: accuracy
      value: 0.74
pipeline_tag: image-classification
library_name: transformers
---

# AI Detection Model

## Model Architecture and Training

Three separate models were initially trained:

1. Midjourney vs. Real Images
2. Stable Diffusion vs. Real Images
3. Fine-tuned Stable Diffusion models vs. Real Images

Data preparation process:

- Used Google's Open Images Dataset for the real images
- Captioned the real images with BLIP (Bootstrapping Language-Image Pre-training)
- Generated Stable Diffusion images from the BLIP captions
- Retrieved similar Midjourney images based on the BLIP captions

This approach ensured that the real and AI-generated images were as similar as possible, differing only in their origin.

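The card does not ship the data-preparation code, but the pairing step described above can be sketched as follows. This is a minimal illustration, not the exact pipeline: the BLIP and Stable Diffusion checkpoints (`Salesforce/blip-image-captioning-base`, `runwayml/stable-diffusion-v1-5`) and the sample file path are assumptions chosen for the example.

```python
# Minimal sketch of the caption-then-generate pairing step.
# Checkpoints and the input path are illustrative assumptions, not the exact ones used.
import torch
from PIL import Image
from transformers import BlipProcessor, BlipForConditionalGeneration
from diffusers import StableDiffusionPipeline

device = "cuda" if torch.cuda.is_available() else "cpu"

# BLIP captioner used to describe a real image
blip_processor = BlipProcessor.from_pretrained("Salesforce/blip-image-captioning-base")
blip_model = BlipForConditionalGeneration.from_pretrained(
    "Salesforce/blip-image-captioning-base"
).to(device)

# Stable Diffusion pipeline used to generate the synthetic counterpart
sd_pipe = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5").to(device)

def make_real_fake_pair(real_image: Image.Image):
    # 1. Caption the real image with BLIP
    inputs = blip_processor(images=real_image, return_tensors="pt").to(device)
    caption_ids = blip_model.generate(**inputs, max_new_tokens=30)
    caption = blip_processor.decode(caption_ids[0], skip_special_tokens=True)

    # 2. Generate an AI image from the same caption, so the pair differs only in origin
    fake_image = sd_pipe(caption).images[0]
    return real_image, fake_image, caption

real = Image.open("open_images_sample.jpg").convert("RGB")  # placeholder path
real_img, fake_img, prompt = make_real_fake_pair(real)
print(prompt)
```

For the Midjourney split, the same BLIP captions would instead be matched against existing Midjourney prompts rather than fed to a generator.
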
The three models were then distilled into a small ViT model with 11.8 million parameters, combining their learned features for more efficient detection.

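The exact distillation objective is not documented in this card. The sketch below assumes a standard soft-label setup: the three teachers' temperature-scaled probabilities are averaged, and the small ViT student is trained with a KL term plus the usual cross-entropy on the real/fake labels. The ViT configuration values and hyperparameters are placeholders, not the actual 11.8M-parameter recipe.

```python
# Hedged sketch of distilling three real-vs-fake teachers into one small ViT student.
# The averaging scheme, temperature, and student configuration are assumptions.
import torch
import torch.nn.functional as F
from transformers import ViTConfig, ViTForImageClassification

# Compact student configuration (illustrative; not the exact 11.8M-parameter setup)
student_config = ViTConfig(
    hidden_size=256,
    num_hidden_layers=12,
    num_attention_heads=4,
    intermediate_size=1024,
    num_labels=2,  # 0 = real, 1 = AI-generated
)
student = ViTForImageClassification(student_config)

def distillation_loss(pixel_values, labels, teachers, temperature=2.0, alpha=0.5):
    """KL to the averaged teacher distribution plus cross-entropy on the hard labels."""
    with torch.no_grad():
        teacher_probs = torch.stack(
            [F.softmax(t(pixel_values).logits / temperature, dim=-1) for t in teachers]
        ).mean(dim=0)

    student_logits = student(pixel_values).logits
    soft = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        teacher_probs,
        reduction="batchmean",
    ) * temperature**2
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard
```
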
## Data Sources

- Google's Open Images Dataset: [link](https://storage.googleapis.com/openimages/web/index.html)
- Ivan Sivkov's Midjourney Dataset: [link](https://www.kaggle.com/datasets/ivansivkovenin/midjourney-prompts-image-part8)
- TANREI(NAMA)'s Stable Diffusion Prompts Dataset: [link](https://www.kaggle.com/datasets/tanreinama/900k-diffusion-prompts-dataset)

## Performance

- Validation Set: 74% accuracy
  - Held out from the training data to assess generalization
- Custom Real-World Set: 72% accuracy
  - Composed of self-captured images and online-sourced images
  - Designed to be more representative of internet-based images
- Comparative Analysis:
  - Outperformed other popular AI detection models by 5 percentage points on both sets
  - Other models reached roughly 69% on the validation set and 67% on the real-world set
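Accuracy figures like the ones above can be reproduced with the standard `transformers` image-classification pipeline. In this sketch the repository id, label names, and folder layout are placeholders rather than values taken from the card.

```python
# Sketch of a top-1 accuracy evaluation with the image-classification pipeline.
# Repository id, label names, and folder layout are placeholders.
from pathlib import Path
from PIL import Image
from transformers import pipeline

detector = pipeline("image-classification", model="your-username/ai-image-detect-distilled")

def accuracy(folder: str, true_label: str) -> float:
    """Fraction of images in `folder` whose top prediction matches `true_label`."""
    paths = sorted(Path(folder).glob("*.jpg"))
    correct = sum(
        detector(Image.open(p).convert("RGB"))[0]["label"] == true_label for p in paths
    )
    return correct / max(len(paths), 1)

print("real:", accuracy("validation/real", "real"))
print("fake:", accuracy("validation/fake", "fake"))
```
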
## Key Insights

1. Strong generalization on the validation data (74% accuracy)
2. Good adaptability to diverse, real-world images (72% accuracy)
3. Consistent outperformance of other popular detection models
4. The 2-point accuracy drop from the validation set to the real-world set indicates room for improvement
5. Comprehensive training on multiple AI generation techniques contributes to model versatility
6. The paired data preparation focuses the model on subtle differences in image generation rather than content disparities

## Future Directions

- Expand the dataset with more diverse, real-world examples to bridge the performance gap
- Improve generalization to internet-sourced images
- Conduct error analysis on misclassified samples to identify patterns
- Integrate new AI image generation techniques as they emerge
- Consider fine-tuning for specific domains where detection accuracy is critical