metadata

license: mit
datasets:
  - purplehaze1/CrowdHuman
  - Hakureirm/citypersons
pipeline_tag: object-detection

PairDETR: face_body_detection_and_association

This card contains the official weights of PairDETR, a method for joint detection and association of human bodies and faces (CVPR 2024).

To reproduce our training experiments and evaluation results, please use our GitHub repo, PairDETR.

System architecture:

PairDETR extracts image embeddings with a ResNet-50 backbone, followed by a transformer that predicts body-face pairs. During training, the predicted pairs are matched with the ground truth and corrected using an approximated matching loss.
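To make the matching step more concrete, here is a minimal sketch of assigning predicted pairs to ground-truth pairs by minimum total cost, in the spirit of DETR-style set prediction. It is an illustration only: the L1 cost, the 8-value pair layout (body box followed by face box), and the helper names `l1_cost` and `match_pairs` are assumptions, and the brute-force search stands in for a proper assignment solver. It does not reproduce the approximated matching loss used by PairDETR.

```python
from itertools import permutations


def l1_cost(pred, gt):
    """L1 distance between two pairs, each a flat 8-tuple (body box + face box)."""
    return sum(abs(p - g) for p, g in zip(pred, gt))


def match_pairs(preds, gts):
    """Return (assignment, cost); assignment[j] is the prediction index matched to gts[j].

    Brute force over permutations, so only suitable for toy inputs.
    Assumes len(preds) >= len(gts).
    """
    best_cost, best = float("inf"), None
    for perm in permutations(range(len(preds)), len(gts)):
        cost = sum(l1_cost(preds[i], gts[j]) for j, i in enumerate(perm))
        if cost < best_cost:
            best_cost, best = cost, perm
    return list(best), best_cost


# Toy data (hypothetical): normalized (cx, cy, w, h) for the body, then for the face.
gts = [
    (0.30, 0.50, 0.20, 0.60, 0.30, 0.25, 0.05, 0.08),
    (0.70, 0.50, 0.20, 0.60, 0.70, 0.25, 0.05, 0.08),
]
preds = [
    (0.71, 0.52, 0.21, 0.58, 0.70, 0.26, 0.05, 0.08),  # close to gts[1]
    (0.10, 0.10, 0.05, 0.05, 0.10, 0.10, 0.02, 0.02),  # matches nothing well
    (0.29, 0.49, 0.20, 0.61, 0.31, 0.25, 0.05, 0.07),  # close to gts[0]
]
assignment, total_cost = match_pairs(preds, gts)
print(assignment, round(total_cost, 4))
```

Predictions left unmatched (index 1 here) would typically be supervised as background in a set-prediction setup.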

Inference example with transformers:

from transformers import DeformableDetrForObjectDetection, DeformableDetrConfig, AutoImageProcessor
import torch
from PIL import Image
import shutil
import requests
from hf_utils import PairDetr, inverse_sigmoid, forward

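## Download the weights (or download them manually from the model page)
def get_weights():
    # "resolve" serves the raw file; a "blob" URL returns the HTML page instead
    url = "https://huggingface.co/MTSAIR/PairDETR/resolve/main/pytorch_model.bin"
    response = requests.get(url, stream=True)
    response.raise_for_status()  # fail early on a bad download
    with open('full_weights.pth', 'wb') as out_file:
        shutil.copyfileobj(response.raw, out_file)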

## loading the model
configuration = DeformableDetrConfig("SenseTime/deformable-detr")
processor = AutoImageProcessor.from_pretrained("MTSAIR/PairDETR")
model = DeformableDetrForObjectDetection(configuration)
model = PairDetr(model, 1500, 3)
get_weights()
checkpoint = torch.load("full_weights.pth", map_location="cpu")
model.load_state_dict(checkpoint, strict=False)

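## run inference
path = "./test.jpg"
image = Image.open(path).convert("RGB")  # make sure the image has 3 channels
inputs = processor(images=image, return_tensors="pt")
with torch.no_grad():  # gradients are not needed at inference time
    outputs = forward(model, inputs["pixel_values"])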
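The structure of `outputs` is defined by `forward` in `hf_utils` and is not documented here. If the boxes follow the Deformable DETR convention of normalized (cx, cy, w, h) values, which is an assumption, they can be mapped to pixel corner coordinates roughly as follows. The helper name `cxcywh_to_xyxy` and the example numbers are hypothetical.

```python
def cxcywh_to_xyxy(box, image_width, image_height):
    """Convert a normalized (cx, cy, w, h) box to pixel (x0, y0, x1, y1)."""
    cx, cy, w, h = box
    x0 = (cx - w / 2) * image_width
    y0 = (cy - h / 2) * image_height
    x1 = (cx + w / 2) * image_width
    y1 = (cy + h / 2) * image_height
    return (x0, y0, x1, y1)


# Hypothetical body box on a 640x480 image.
body_xyxy = cxcywh_to_xyxy((0.5, 0.5, 0.2, 0.4), image_width=640, image_height=480)
print([round(v, 1) for v in body_xyxy])
```

With a PIL image, the width and height would come from `image.size`.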

Results

Comparison between PairDETR and other methods in terms of the miss matching rate (mMR-2, lower is better) on the CrowdHuman dataset:

Model	Reasonable	Bare	Partial	Heavy	Hard	Average	Checkpoints
POS	55.49	48.20	62.00	80.98	84.58	66.4	weights
BFJ	42.96	37.96	48.20	67.31	71.44	52.5	weights
BPJ	-	-	-	-	-	50.1	weights
PBADET	-	-	-	-	-	50.8	none
Ours	35.25	30.38	38.12	52.47	55.75	42.9	weights
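For readers unfamiliar with the metric, mMR-2 is usually described as a log-average of the mismatch rate sampled at false-positives-per-image (FPPI) values between 1e-2 and 1. The sketch below shows one common way to compute such a log-average from a curve. It is not the official evaluation code: the nine reference points, the choice of the first curve point at or above each reference, and the function names are assumptions.

```python
import math


def reference_fppi(num_points=9, low=1e-2, high=1.0):
    """FPPI reference values evenly spaced in log space between low and high."""
    step = (math.log10(high) - math.log10(low)) / (num_points - 1)
    return [10 ** (math.log10(low) + i * step) for i in range(num_points)]


def log_average_rate(fppi, rate):
    """Geometric mean of the rate sampled at the reference FPPI values.

    fppi must be sorted in ascending order; rate[i] is the miss (or mismatch)
    rate at fppi[i]. If the curve ends before a reference value, the last
    point of the curve is used.
    """
    sampled = []
    for ref in reference_fppi():
        idx = next((i for i, f in enumerate(fppi) if f >= ref), len(fppi) - 1)
        sampled.append(max(rate[idx], 1e-10))  # avoid log(0)
    return math.exp(sum(math.log(r) for r in sampled) / len(sampled))


# Hypothetical curve with a constant mismatch rate of 0.5.
flat_curve = log_average_rate([0.001, 0.01, 0.1, 1.0], [0.5, 0.5, 0.5, 0.5])
print(round(flat_curve, 4))
```

Because the average is taken in log space, improvements at low FPPI count as much as improvements at high FPPI.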

References and useful links

Papers

This work is implemented on top of: