Bot or Not — Denoising Trajectory Detector
Logistic-regression classifier on top of denoising-trajectory features extracted with CLIP ViT-L/14 + Stable Diffusion v1.5. Reproduces the method from Liang et al., "Denoising Trajectory Biases for Zero-Shot AI-Generated Image Detection" (NeurIPS 2025).
How it works
For each input image:
- Encode to SD v1.5 latent space (VAE).
- Add DDPM noise at timesteps (50, 150, 300, 500, 800).
- Run one UNet denoising step per timestep with an empty-prompt embedding.
- Decode each denoised latent back to image space.
- Compute CLIP-cosine similarity between the original and each reconstruction.
This yields a 6-D feature vector [sim_mean, sim_t50, sim_t150, sim_t300, sim_t500, sim_t800],
which a logistic regression (class_weight='balanced', solver='lbfgs') classifies as AI / Real.
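The numeric core of the steps above can be sketched without loading any model. This is a minimal sketch, not the repository's feature_extractor.py. It assumes the standard DDPM forward process with a scaled-linear beta schedule (0.00085 to 0.012 over 1,000 steps, as in the SD v1.5 scheduler config), that "one denoising step" means a one-shot clean-latent estimate from the predicted noise, and that sim_mean is the arithmetic mean of the five per-timestep similarities. The callables predict_noise and embed are hypothetical stand-ins for the UNet and for VAE decoding followed by the CLIP image encoder.

```python
import numpy as np

TIMESTEPS = (50, 150, 300, 500, 800)

def alphas_cumprod(num_steps=1000, beta_start=0.00085, beta_end=0.012):
    """Cumulative alpha products for a scaled-linear beta schedule (assumed)."""
    betas = np.linspace(beta_start ** 0.5, beta_end ** 0.5, num_steps) ** 2
    return np.cumprod(1.0 - betas)

def add_noise(x0, noise, t, acp):
    """DDPM forward process: x_t = sqrt(a_t) * x0 + sqrt(1 - a_t) * noise."""
    return np.sqrt(acp[t]) * x0 + np.sqrt(1.0 - acp[t]) * noise

def estimate_x0(xt, eps_hat, t, acp):
    """One-shot clean estimate from a noise prediction (inverse of add_noise)."""
    return (xt - np.sqrt(1.0 - acp[t]) * eps_hat) / np.sqrt(acp[t])

def cosine(a, b):
    """Cosine similarity between two 1-D embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def trajectory_features(latent, predict_noise, embed, rng):
    """Return [sim_mean, sim_t50, sim_t150, sim_t300, sim_t500, sim_t800]."""
    acp = alphas_cumprod()
    ref = embed(latent)  # embedding of the unperturbed input
    sims = []
    for t in TIMESTEPS:
        noise = rng.standard_normal(latent.shape)
        xt = add_noise(latent, noise, t, acp)
        x0_hat = estimate_x0(xt, predict_noise(xt, t), t, acp)
        sims.append(cosine(ref, embed(x0_hat)))
    return np.array([np.mean(sims), *sims])
```

With a noise predictor that returns the true noise, estimate_x0 recovers the input exactly, so every similarity is 1. The features carry signal only because the real UNet's prediction is imperfect, and, per the method, imperfect in different ways for real and generated images.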
Training data
- AI images: 2,500 images generated by diffusion models (1024×1024).
- Real images: 2,500 images sampled from COCO 2017 train2017.
- 80/20 stratified split, random_state=42.
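The split and classifier configuration can be sketched as follows, run on synthetic stand-in features rather than real denoising-trajectory features. The feature values, the label encoding (0 = Real, 1 = AI), and the variable names are illustrative assumptions. Only the 2,500/2,500 class sizes, the 80/20 stratified split, random_state=42, the StandardScaler, and the LogisticRegression settings come from this card.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic placeholder features: 2,500 "Real" rows then 2,500 "AI" rows,
# each with 6 similarity-like values. These are NOT real model features.
X = np.vstack([
    rng.normal(0.80, 0.05, size=(2500, 6)),
    rng.normal(0.85, 0.05, size=(2500, 6)),
])
y = np.array([0] * 2500 + [1] * 2500)  # assumed encoding: 0 = Real, 1 = AI

# 80/20 stratified split keeps the 50/50 class balance in both partitions.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# Fit the scaler on training features only, then train the classifier.
scaler = StandardScaler().fit(X_train)
clf = LogisticRegression(class_weight="balanced", solver="lbfgs")
clf.fit(scaler.transform(X_train), y_train)

# Probability of the AI class for each held-out row.
ai_prob = clf.predict_proba(scaler.transform(X_test))[:, 1]
```

With 5,000 images, this split leaves 1,000 held-out rows, 500 per class.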
Test metrics
Held-out test set: 1,000 images (500 Real, 500 AI), random_state=42.
| Metric | Value |
|---|---|
| Accuracy | 0.7940 |
| ROC AUC | 0.8679 |
| F1 | 0.7876 |
Per-class breakdown:
| Class | Precision | Recall | F1 | Support |
|---|---|---|---|---|
| Real | 0.78 | 0.82 | 0.80 | 500 |
| AI | 0.81 | 0.76 | 0.79 | 500 |
Confusion matrix (rows = true, cols = predicted):
| | Pred Real | Pred AI |
|---|---|---|
| True Real | 412 | 88 |
| True AI | 118 | 382 |
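The reported figures can be rechecked from the confusion matrix alone. The short sketch below recomputes them in plain Python; the variable and function names are illustrative. The headline F1 of 0.7876 matches the AI-class F1, which suggests AI was treated as the positive class.

```python
# Counts from the confusion matrix above (rows = true, cols = predicted).
tn, fp = 412, 88    # true Real -> predicted Real / predicted AI
fn, tp = 118, 382   # true AI   -> predicted Real / predicted AI

accuracy = (tp + tn) / (tp + tn + fp + fn)

def prf(hit, false_pos, false_neg):
    """Precision, recall, and F1 for one class."""
    precision = hit / (hit + false_pos)
    recall = hit / (hit + false_neg)
    return precision, recall, 2 * precision * recall / (precision + recall)

ai_p, ai_r, ai_f1 = prf(tp, fp, fn)        # AI as the positive class
real_p, real_r, real_f1 = prf(tn, fn, fp)  # Real as the positive class

print(f"accuracy={accuracy:.4f}  AI F1={ai_f1:.4f}")
print(f"Real: P={real_p:.2f} R={real_r:.2f} F1={real_f1:.2f}")
print(f"AI:   P={ai_p:.2f} R={ai_r:.2f} F1={ai_f1:.2f}")
```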
Usage
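from huggingface_hub import hf_hub_download
import joblib, json

# Download the artifacts listed under Files.
# Only load pickle/joblib files from sources you trust.
clf = joblib.load(hf_hub_download("bezand/BoN1", "classifier.joblib"))
scaler = joblib.load(hf_hub_download("bezand/BoN1", "scaler.joblib"))
with open(hf_hub_download("bezand/BoN1", "config.json")) as f:
    config = json.load(f)

# Or use the bundled inference module:
# from inference import BotOrNotDetector
# detector = BotOrNotDetector.from_pretrained("bezand/BoN1")
# detector.predict("image.jpg")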
A CUDA GPU is required for practical inference (~30s/image on a T4; CPU inference is impractical because each prediction runs five SD denoising steps).
Files
- classifier.joblib: trained sklearn.linear_model.LogisticRegression.
- scaler.joblib: StandardScaler fit on training features.
- config.json: feature-extractor config (timesteps, CLIP and SD model IDs).
- inference.py, feature_extractor.py: inference wrappers.
Limitations and biases
- Trained on a single AI-image source at fixed 1024×1024 resolution. Real images (COCO) vary in size and content, which may bias the classifier toward resolution/aspect-ratio cues rather than denoising-trajectory artefacts.
- Single-step denoising with an empty prompt; full multi-step trajectories may give cleaner signal but were not used in training.
- Only tested against SD-family generators. Performance on other generators (Midjourney, FLUX, autoregressive models) is unknown.
License
The trained classifier weights and StandardScaler are released under CC-BY-NC-4.0.
Inference also requires Stable Diffusion v1.5 (CreativeML Open RAIL-M) and
CLIP ViT-L/14, each governed by its own license.
Citation
@inproceedings{liang2025denoising,
title = {Denoising Trajectory Biases for Zero-Shot AI-Generated Image Detection},
author = {Liang et al.},
booktitle = {NeurIPS},
year = {2025}
}