Object Detection
English
signature-detection
legal
document-ai
rf-detr
contract

legaldocuman-rfdetr

RF-DETR-Base fine-tuned for handwritten signature detection in legal documents.
This model is the computer vision component of LegalDocuMan, a commercial document intelligence API for contract processing.


Model Details

| Field | Value |
|---|---|
| Base architecture | RF-DETR-Base |
| Parameters | 31.9M |
| Base checkpoint | Roboflow RF-DETR-Base (COCO pretrained) |
| Task | Single-class object detection |
| Class | signature |
| Input resolution | 560px (square) |
| License | Apache 2.0 |

Training Data

Three open-source datasets were merged, deduplicated, and cleaned before training.
Cross-split contamination was checked with perceptual hashing, which confirmed zero overlap between the train, validation, and test splits.
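A cross-split check of this kind can be sketched as follows. This is a minimal illustration, not the script used for this model: the card does not say which perceptual hash was used, so a simple average hash is assumed, and the function names and the `max_distance` parameter are hypothetical. Each image is represented here as a small grayscale grid; in practice such a grid could come from downscaling the image, for example with Pillow's `Image.convert("L").resize((8, 8))`.

```python
from itertools import product


def average_hash(pixels):
    """Hash a small grayscale grid (list of rows, values 0-255) into a bit string.

    Assumption: average hash stands in for whichever perceptual hash was really used.
    """
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    return "".join("1" if value > mean else "0" for value in flat)


def hamming(a, b):
    """Count the positions where two equal-length bit strings differ."""
    return sum(x != y for x, y in zip(a, b))


def cross_split_overlap(splits, max_distance=0):
    """Return (split_a, id_a, split_b, id_b) for near-duplicate images in different splits.

    splits maps a split name to a dict of image id -> grayscale grid.
    """
    hashes = {
        name: {image_id: average_hash(grid) for image_id, grid in images.items()}
        for name, images in splits.items()
    }
    names = sorted(hashes)
    overlaps = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            for (id_a, h_a), (id_b, h_b) in product(hashes[a].items(), hashes[b].items()):
                if hamming(h_a, h_b) <= max_distance:
                    overlaps.append((a, id_a, b, id_b))
    return overlaps
```

An empty result corresponds to the "zero overlap" outcome reported above. Raising `max_distance` makes the check stricter by also catching near-duplicates such as re-compressed copies.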

| Dataset | Images | License |
|---|---|---|
| signatures-xc8up (Roboflow 100) | ~2,800 | CC BY 4.0 |
| Signature Detector (TrueSign) | 772 | MIT |
| Signature Detection v3 (home) | 2,145 | CC BY 4.0 |

Final dataset size after cleaning: 3,996 images


Training Configuration

| Parameter | Value |
|---|---|
| Augmentation preset | Document (9 transforms) |
| Augmentations | Perspective distortion, grid distortion, JPEG compression artifacts, Gaussian blur, Gaussian noise, random brightness/contrast, CLAHE, sharpening, horizontal flip |
| Input resolution | 560px |
| Early stopping patience | 20 epochs |
| Stopped at | Epoch 16 |
| Best val mAP@50 | 0.9123 (epoch 11) |
| Framework | rfdetr · PyTorch · PyTorch Lightning |
| Hardware | NVIDIA RTX 4070 Ti Super 16GB |

Evaluation

Evaluated on 168 held-out test images with zero contamination from training or validation sets.

Overall

| Metric | Score |
|---|---|
| mAP@50 | 0.7924 |
| mAP@50:95 | 0.820 |
| Precision@0.50 | 0.9490 |
| Recall@0.50 | 0.8020 |
| F1@0.50 | 0.8696 |
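F1 is the harmonic mean of precision and recall. The short check below uses only the precision and recall figures reported above; the helper name `f1_score` is illustrative.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


f1 = f1_score(0.9490, 0.8020)
print(round(f1, 3))  # 0.869
```

The result is close to the reported F1@0.50 of 0.8696.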

By Signature Size

| Size | AP | AR |
|---|---|---|
| Small | 0.252 | 0.500 |
| Medium | 0.834 | 0.855 |
| Large | 0.828 | 0.845 |

Note on small signatures: Signatures smaller than ~32×32px after resizing have significantly lower recall.
For documents with initials or compact date-field signatures, human review is recommended.
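One way to route such cases to a reviewer is to flag detected boxes that fall under the ~32×32px size once scaled to the model's input resolution. The sketch below is an assumption-laden illustration: the function names are hypothetical, boxes are assumed to be `(x_min, y_min, x_max, y_max)` tuples in original-image pixels, and the page is assumed to be resized directly to a 560×560 square. It only catches small signatures that were detected; missed ones still need a separate review step.

```python
SMALL_AREA_PX = 32 * 32  # assumed cutoff, taken from the ~32x32px note above


def resized_area(box, image_size, model_res=560):
    """Area of a box after scaling the page to a model_res x model_res square."""
    x0, y0, x1, y1 = box
    width, height = image_size
    scale_x = model_res / width
    scale_y = model_res / height
    return max(0.0, (x1 - x0) * scale_x) * max(0.0, (y1 - y0) * scale_y)


def small_box_indices(boxes, image_size, min_area=SMALL_AREA_PX):
    """Return the indices of boxes that are small enough to warrant human review."""
    return [
        index
        for index, box in enumerate(boxes)
        if resized_area(box, image_size) < min_area
    ]
```

For a 1700×2200 page, a full signature of 400×200px stays well above the cutoff, while an 80×60px set of initials is flagged.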


Usage

```python
from rfdetr import RFDETRBase
from PIL import Image

# Load the fine-tuned signature detection weights
model = RFDETRBase(pretrain_weights="Mo-Awadalla/legaldocuman-rfdetr")

# Run detection on a single contract page
image = Image.open("contract_page.jpg")
detections = model.predict(image, threshold=0.45)
print(detections)
```

For PDF inputs, convert pages to images first using pdf2image:

```python
from pdf2image import convert_from_path

# Render each PDF page to a PIL image, then reuse `model` from the example above
pages = convert_from_path("contract.pdf", dpi=200)
for i, page in enumerate(pages):
    detections = model.predict(page, threshold=0.45)
    print(f"Page {i+1}: {detections}")
```

Intended Use

Designed for:

  • Detecting the presence and location of handwritten ink signatures in legal contract pages
  • Document intake pipelines processing PDF, DOCX, and scanned image inputs
  • Execution status classification (executed vs. draft) as part of a broader document intelligence pipeline

Out of scope:

  • Signature verification or authenticity determination
  • Forgery detection
  • Digital or electronic signature detection
  • Handwriting recognition or transcription

Limitations

  • Small signatures (initials, compact date-field signatures) are detected far less reliably (AP 0.252, AR 0.500)
  • Performance may degrade on scans below 150 DPI
  • Not trained on non-Latin document layouts
  • Should not be used as the sole decision-maker for high-stakes legal determinations without human review in the loop
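The 150 DPI limitation can be checked before inference when the physical page size is known. The following is a rough sketch under stated assumptions: pages are assumed to be US Letter (8.5 × 11 in) unless told otherwise, and both function names are hypothetical. Scans of other paper sizes need their own dimensions passed in.

```python
def effective_dpi(pixel_width, pixel_height, page_width_in=8.5, page_height_in=11.0):
    """Estimate scan resolution from pixel size and an assumed physical page size."""
    return min(pixel_width / page_width_in, pixel_height / page_height_in)


def below_recommended_dpi(pixel_width, pixel_height, min_dpi=150):
    """True when the estimated resolution falls under the 150 DPI guidance above."""
    return effective_dpi(pixel_width, pixel_height) < min_dpi
```

A 1700×2200 image of a Letter page works out to 200 DPI and passes; a 1000×1294 image of the same page is roughly 118 DPI and would be flagged.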

Part of LegalDocuMan

This model is one component of the LegalDocuMan pipeline:

Upload PDF/DOCX/Image
        ↓
Text extraction (pdfplumber · python-docx · Tesseract OCR)
        ↓
Document type classification (MSA · SOW · NDA · PO · Amendment · License · Contract)
        ↓
Execution status (Regex NLP + RF-DETR visual signature detection)
        ↓
Vendor · Dates · Retention extraction
        ↓
Structured filename + PostgreSQL persistence

GitHub: Mo-Awadalla/LegalDocuMan
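
The execution-status step combines a text signal with the visual detector. The sketch below shows one plausible way to merge the two; it is not LegalDocuMan's actual logic. The regex, the `"executed"` / `"draft"` / `"unknown"` labels, and the decision rule are all assumptions made for illustration. Detector output is represented as a list of confidence scores per page.

```python
import re

# Hypothetical pattern for signature-block text; the real pipeline's regexes are not published here.
SIGNATURE_BLOCK = re.compile(r"\b(signature|signed by|by:)", re.IGNORECASE)


def execution_status(page_texts, page_confidences, threshold=0.45):
    """Combine text and visual signals into a coarse execution status.

    page_texts: extracted text per page.
    page_confidences: detector confidence scores per page (one list per page).
    """
    has_signature_block = any(SIGNATURE_BLOCK.search(text) for text in page_texts)
    has_ink_signature = any(
        score >= threshold for scores in page_confidences for score in scores
    )
    if has_ink_signature:
        return "executed"
    if has_signature_block:
        return "draft"  # a signature block exists but nothing was detected in it
    return "unknown"
```

Under this rule, a detected ink signature takes priority, and a signature block with no detection is treated as an unsigned draft. Given the lower recall on small signatures, a "draft" result is a reasonable candidate for human review.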


Attribution

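Training data includes datasets licensed under CC BY 4.0.
Per license requirements, the CC BY 4.0 sources are signatures-xc8up (Roboflow 100) and Signature Detection v3 (home), as listed under Training Data.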
