AnomalyThink-Qwen2.5-VL-7B (Arm-C — best model)

The single best detector in the thesis (DS-MVTec 82.80 / VisA 72.07), SFT only.

A Qwen2.5-VL-7B model fine-tuned for explainable industrial anomaly detection (IAD). Given a product image it produces a structured reasoning trace (<think>), a defect <location> and <type> (for anomalies), and a binary <answer>. Research artefact from the MSc thesis Reasoning-Enhanced Vision-Language Models for Explainable Industrial Anomaly Detection (TU Delft, 2026).

Results (MMAD subsets, balanced accuracy)

Benchmark Balanced accuracy
DS-MVTec (1,670) 82.80%
VisA (2,141) 72.07%

Evaluated under a single common harness on the MMAD DS-MVTec and VisA subsets. It exceeds the released IAD-R1 checkpoint on both benchmarks under this harness, using supervised fine-tuning alone.

Training

Supervised fine-tuning from the Qwen2.5-VL-7B base on the curated Arm-C corpus (STaR with external critique-and-revise self-distillation, 6,000 balanced traces, vision encoder frozen). No reinforcement-learning stage is used.

The "AnomalyThink" reasoning traces were distilled from Gemini 2.5-Flash on Real-IAD images. Training data: aacudad/AnomalyThink.

Usage

import torch
from transformers import AutoProcessor, Qwen2_5_VLForConditionalGeneration
model = Qwen2_5_VLForConditionalGeneration.from_pretrained("aacudad/AnomalyThink-Qwen2.5-VL-7B", torch_dtype=torch.bfloat16, device_map="auto")
processor = AutoProcessor.from_pretrained("aacudad/AnomalyThink-Qwen2.5-VL-7B")
# Build the structured single-image IAD prompt + image, then generate.

Intended use and limitations

Research on explainable IAD. Known limitations: the model can confidently hallucinate a defect on a normal part (false positive), and GRPO-lineage variants can over-predict the "Missing Parts" type. As the public DS-MVTec/VisA images may appear in VLM pretraining, absolute numbers should be read with that caveat.

Citation

@mastersthesis{acudad2026anomalythink,
  title  = {Reasoning-Enhanced Vision-Language Models for Explainable Industrial Anomaly Detection},
  author = {Acudad, Adnane},
  school = {Delft University of Technology},
  year   = {2026}
}

License

Apache-2.0 (inherits the Qwen2.5-VL-7B base). Trained on Real-IAD (cite Real-IAD separately; images are not redistributed) with traces distilled from Gemini 2.5-Flash.

Downloads last month
20
Safetensors
Model size
8B params
Tensor type
BF16
·
Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for aacudad/AnomalyThink-Qwen2.5-VL-7B

Finetuned
(1128)
this model
Quantizations
1 model