---
license: mit
datasets:
- purplehaze1/CrowdHuman
- Hakureirm/citypersons
pipeline_tag: object-detection
---

# PairDETR: face_body_detection_and_association

This card contains the official weights of PairDETR, a method for joint detection and association of human bodies and faces, presented at **CVPR 2024**.

<img src="./teaser.jpg" width="1024" height="600"></img>

To reproduce our training experiments and evaluation results, please use our GitHub repository: <a href="https://github.com/mts-ai/pairdetr">PairDETR</a>.

## System architecture:

<img src="./sys.jpg" width="1024" height="600"></img>

PairDETR extracts image embeddings with a ResNet-50 backbone, followed by a transformer that predicts body-face pairs. During training, the predicted pairs are matched with the ground truth and corrected using an approximated matching loss.

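The matching step described above can be illustrated with a toy example. This is only a sketch of DETR-style one-to-one matching, not PairDETR's actual procedure: the L1 box cost, the function names `l1_cost` and `match_pairs`, and the brute-force search are all assumptions made for illustration. The real cost terms and the approximated matching loss are defined in the GitHub repository.

```python
from itertools import permutations


def l1_cost(pred_box, gt_box):
    """Sum of absolute coordinate differences between two boxes."""
    return sum(abs(p - g) for p, g in zip(pred_box, gt_box))


def match_pairs(pred_pairs, gt_pairs):
    """Assign one distinct prediction to every ground-truth pair at minimum total cost.

    Each pair is a (body_box, face_box) tuple. Returns (assignment, cost), where
    assignment[i] is the index of the prediction matched to ground-truth pair i.
    Brute force over permutations: fine for a toy example, too slow for training.
    """
    if len(pred_pairs) < len(gt_pairs):
        raise ValueError("need at least as many predictions as ground-truth pairs")

    # Hypothetical pair cost: body box distance plus face box distance.
    def pair_cost(pred, gt):
        return l1_cost(pred[0], gt[0]) + l1_cost(pred[1], gt[1])

    best, best_cost = (), float("inf")
    for perm in permutations(range(len(pred_pairs)), len(gt_pairs)):
        cost = sum(pair_cost(pred_pairs[p], gt_pairs[g]) for g, p in enumerate(perm))
        if cost < best_cost:
            best, best_cost = perm, cost
    return list(best), best_cost


# Three predicted (body, face) pairs and two ground-truth pairs, boxes as (cx, cy, w, h).
preds = [
    ((0.5, 0.5, 0.2, 0.6), (0.5, 0.3, 0.05, 0.05)),
    ((0.1, 0.5, 0.2, 0.6), (0.1, 0.3, 0.05, 0.05)),
    ((0.9, 0.9, 0.1, 0.1), (0.9, 0.9, 0.02, 0.02)),
]
gts = [
    ((0.1, 0.5, 0.2, 0.6), (0.1, 0.3, 0.05, 0.05)),
    ((0.5, 0.5, 0.2, 0.6), (0.5, 0.3, 0.05, 0.05)),
]
assignment, total_cost = match_pairs(preds, gts)
print(assignment, total_cost)
```

Here the first ground-truth pair is matched to prediction 1 and the second to prediction 0, while prediction 2 stays unmatched. In DETR-style training, unmatched queries are typically supervised as background.
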
## Inference example with transformers:

```python
import shutil

import requests
import torch
from PIL import Image
from transformers import AutoImageProcessor, DeformableDetrConfig, DeformableDetrForObjectDetection

# Helper module used by this card; it must be importable from the Python path
from hf_utils import PairDetr, forward


# Download the weights (or fetch them manually from the model page)
def get_weights():
    # "resolve" serves the raw file; a "blob" URL would return the HTML page instead
    url = "https://huggingface.co/MTSAIR/PairDETR/resolve/main/pytorch_model.bin"
    response = requests.get(url, stream=True)
    response.raise_for_status()
    with open("full_weights.pth", "wb") as out_file:
        shutil.copyfileobj(response.raw, out_file)


# Load the model
configuration = DeformableDetrConfig("SenseTime/deformable-detr")
processor = AutoImageProcessor.from_pretrained("MTSAIR/PairDETR")
model = DeformableDetrForObjectDetection(configuration)
model = PairDetr(model, 1500, 3)
get_weights()
checkpoint = torch.load("full_weights.pth", map_location="cpu")
model.load_state_dict(checkpoint, strict=False)
model.eval()

# Run inference
path = "./test.jpg"
image = Image.open(path).convert("RGB")
inputs = processor(images=image, return_tensors="pt")
with torch.no_grad():
    outputs = forward(model, inputs["pixel_values"])
```

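Deformable DETR models in transformers predict boxes as normalized `(center_x, center_y, width, height)` values in `[0, 1]`. Assuming the boxes returned by `forward` follow the same convention (this is an assumption, check `hf_utils` for the actual output format), a minimal sketch for converting a single box to pixel corner coordinates could look like this. The function name `cxcywh_to_xyxy_pixels` is hypothetical.

```python
def cxcywh_to_xyxy_pixels(box, image_width, image_height):
    """Convert a normalized (cx, cy, w, h) box to pixel (x_min, y_min, x_max, y_max).

    Assumes the DETR-style normalized center format; coordinates are clamped
    to the image bounds.
    """
    cx, cy, w, h = box

    def clamp(value, limit):
        return max(0.0, min(float(value), float(limit)))

    x_min = clamp((cx - w / 2) * image_width, image_width)
    y_min = clamp((cy - h / 2) * image_height, image_height)
    x_max = clamp((cx + w / 2) * image_width, image_width)
    y_max = clamp((cy + h / 2) * image_height, image_height)
    return (x_min, y_min, x_max, y_max)


# A box centered in a 200x100 image that covers half of each dimension
print(cxcywh_to_xyxy_pixels((0.5, 0.5, 0.5, 0.5), 200, 100))
```

With a PIL image, `image.size` gives `(width, height)`, which can be passed directly as the last two arguments.
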
## Results

Comparison of PairDETR with other methods on the CrowdHuman dataset, measured by the miss matching rate mMR-2 (lower is better):

| **Model** | **Reasonable** | **Bare** | **Partial** | **Heavy** | **Hard** | **Average** | **Checkpoints** |
|-----------|:--------------:|:--------:|:-----------:|:---------:|:--------:|:-----------:|:---------------:|
| **POS** | 55.49 | 48.20 | 62.00 | 80.98 | 84.58 | 66.4 | <a href="https://drive.google.com/file/d/1GFnIXqc9aG0eXSQFI4Pe4XfO-8hAZmKV/view">weights</a> |
| **BFJ** | 42.96 | 37.96 | 48.20 | 67.31 | 71.44 | 52.5 | <a href="https://drive.google.com/file/d/1E8MQf3pfOyjbVvxZeBLdYBFUiJA6bdgr/view">weights</a> |
| **BPJ** | - | - | - | - | - | 50.1 | <a href="https://github.com/hnuzhy/BPJDet">weights</a> |
| **PBADET** | - | - | - | - | - | 50.8 | none |
| **Ours** | 35.25 | 30.38 | 38.12 | 52.47 | 55.75 | 42.9 | <a href="https://huggingface.co/MTSAIR/PairDETR/blob/main/pytorch_model.bin">weights</a> |

## References and useful links

### Papers

* <a href='https://arxiv.org/abs/2005.12872'>End-to-End Object Detection with Transformers</a>
* <a href='https://arxiv.org/abs/1805.00123'>CrowdHuman: A Benchmark for Detecting Human in a Crowd</a>
* <a href='https://openaccess.thecvf.com/content/ICCV2021/html/Wan_Body-Face_Joint_Detection_via_Embedding_and_Head_Hook_ICCV_2021_paper.html'>Body-Face Joint Detection via Embedding and Head Hook</a>
* <a href='https://arxiv.org/abs/2010.04159'>Deformable DETR: Deformable Transformers for End-to-End Object Detection</a>
* <a href='https://arxiv.org/abs/2012.06785'>DETR for Crowd Pedestrian Detection</a>
* <a href='https://arxiv.org/abs/2204.07962'>An Extendable, Efficient and Effective Transformer-based Object Detector</a>

### This work is implemented on top of:

* <a href='https://github.com/facebookresearch/detr/tree/3af9fa878e73b6894ce3596450a8d9b89d918ca9'>DETR</a>
* <a href='https://github.com/fundamentalvision/Deformable-DETR'>Deformable-DETR</a>
* <a href='https://github.com/AibeeDetect/BFJDet/tree/main'>BFJDet</a>
* <a href='https://huggingface.co/docs/transformers/en/index'>Hugging Face Transformers</a>