DocuGuard: ViT checkpoints
Trained model weights for the DocuGuard identity-document forgery detection
pipeline (Roy Boker, B.Sc. final project, 2026). These weights power the live
demo at royboker.github.io via the Royboker/docuguard-demo HF Space.
Files
| File | Architecture | Task | Classes | Training samples |
|---|---|---|---|---|
| `vit_document_classifier_9k.pth` | `vit_tiny_patch16_224` | Document type | ID Card / Passport / Driver License | 9,000 |
| `vit_passport_binary_20k.pth` | `vit_small_patch16_224` | Forgery (binary) | Real / Fake | 20,000 |
| `vit_passport_fraud_type_20k.pth` | `vit_small_patch16_224` | Fraud type | face_morphing / face_replacement | 20,000 |
| `vit_id_card_binary_20k.pth` | `vit_small_patch16_224` | Forgery (binary) | Real / Fake | 20,000 |
| `vit_id_card_fraud_type_20k.pth` | `vit_small_patch16_224` | Fraud type | face_morphing / face_replacement | 20,000 |
| `vit_drivers_license_binary_15k.pth` | `vit_small_patch16_224` | Forgery (binary) | Real / Fake | 15,000 |
| `vit_drivers_license_fraud_type_15k.pth` | `vit_small_patch16_224` | Fraud type | face_morphing / face_replacement | 15,000 |
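
The file names above follow a per-document-type pattern, so picking the stage 2 and stage 3 checkpoints for a given stage 1 prediction can be written as a simple lookup. The sketch below is illustrative only: the function name `select_checkpoints` and the dictionary layout are assumptions and are not taken from the demo code; only the file names and class labels come from the table.

```python
# Hypothetical lookup from the stage 1 class label to the stage 2/3 checkpoint files.
# File names and labels are copied from the table; the structure is an assumption.
STAGE1_CHECKPOINT = "vit_document_classifier_9k.pth"

CHECKPOINTS = {
    "Passport": {
        "binary": "vit_passport_binary_20k.pth",
        "fraud_type": "vit_passport_fraud_type_20k.pth",
    },
    "ID Card": {
        "binary": "vit_id_card_binary_20k.pth",
        "fraud_type": "vit_id_card_fraud_type_20k.pth",
    },
    "Driver License": {
        "binary": "vit_drivers_license_binary_15k.pth",
        "fraud_type": "vit_drivers_license_fraud_type_15k.pth",
    },
}


def select_checkpoints(document_type: str) -> dict:
    """Return the binary and fraud-type checkpoint names for a document type."""
    try:
        return CHECKPOINTS[document_type]
    except KeyError:
        known = ", ".join(sorted(CHECKPOINTS))
        raise ValueError(
            f"Unknown document type {document_type!r}; expected one of: {known}"
        ) from None


if __name__ == "__main__":
    print(select_checkpoints("Passport")["binary"])
```

A caller would typically load `STAGE1_CHECKPOINT` first, then use the predicted label to fetch the remaining two files.
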
Architecture notes
- Stage 1 (`vit_document_classifier_9k.pth`): `ViTTinyClassifier`, a timm backbone + `Sequential(Dropout(0.2), Linear → 3)`.
- Stages 2 & 3 (all `_binary_` / `_fraud_type_` files): `ViTBinaryClassifier` / `ViTFraudTypeClassifier`, a timm backbone + `Sequential(LayerNorm, Linear → d/2, GELU, Dropout(0.1), Linear → 2)`.
- Image size: 224 × 224, ImageNet normalization, RGB.
- Inference applies 4-view Test-Time Augmentation on stages 2 & 3 (base / scale-down / scale-up / brightness); see `docuguard-demo/model_loader.py`.
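
To make the Test-Time Augmentation step concrete, here is a minimal, framework-free sketch of how predictions from several views can be combined. It assumes the views are merged by averaging softmax probabilities. That aggregation rule, the function names `softmax` and `tta_predict`, and the toy model and views are illustrative assumptions; the actual behavior is defined in `model_loader.py`.

```python
import math
from typing import Callable, List, Sequence


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def tta_predict(
    image,
    model: Callable[[object], Sequence[float]],
    views: Sequence[Callable[[object], object]],
) -> List[float]:
    """Average class probabilities over all augmented views of one image.

    Assumption: views are combined by a plain mean of softmax outputs.
    """
    per_view = [softmax(model(view(image))) for view in views]
    n_classes = len(per_view[0])
    return [sum(p[i] for p in per_view) / len(per_view) for i in range(n_classes)]


if __name__ == "__main__":
    # Toy stand-ins: the "image" is a single float and the "model" returns two logits.
    toy_model = lambda x: [x, -x]
    toy_views = [
        lambda x: x,         # base
        lambda x: x * 0.9,   # placeholder for scale-down
        lambda x: x * 1.1,   # placeholder for scale-up
        lambda x: x + 0.1,   # placeholder for brightness
    ]
    print(tta_predict(1.0, toy_model, toy_views))
```

In a real setup each view would be an image transform followed by the 224 × 224 resize and ImageNet normalization, and `model` would be one of the stage 2 or stage 3 classifiers.
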
Dataset
Trained on the IDNet identity-document analysis dataset (~290k images, 9 countries).