---
license: mit
datasets:
- purplehaze1/CrowdHuman
- Hakureirm/citypersons
pipeline_tag: object-detection
---
# PairDETR: face_body_detection_and_association
This card hosts the official weights of PairDETR, a method for joint detection and association of human bodies and faces, published at **CVPR 2024**.
<img src="./teaser.jpg" width="1024" height="600"></img>
To reproduce our training experiments and evaluation results, please use our GitHub repository <a href="https://github.com/mts-ai/pairdetr">PairDETR</a>.
## System architecture:
<img src="./sys.jpg" width="1024" height="600"></img>
PairDETR extracts image features with a ResNet-50 backbone and passes them to a transformer that directly predicts body-face pairs. During training, the predicted pairs are matched to ground-truth pairs and supervised with an approximated matching loss.
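For intuition, below is a minimal sketch of DETR-style bipartite matching between predicted body-face pairs and ground-truth pairs. The function name, cost weights, and box format (normalized cx, cy, w, h) are illustrative assumptions, not the exact loss used in the paper; see the GitHub repository for the real implementation.

```python
# Illustrative sketch of DETR-style matching for body-face pairs (assumptions:
# cost weights, box format, and class-cost definition are for illustration only).
import torch
from scipy.optimize import linear_sum_assignment

def match_pairs(pred_logits, pred_body_boxes, pred_face_boxes,
                gt_labels, gt_body_boxes, gt_face_boxes,
                cls_w=1.0, box_w=5.0):
    # Classification cost: negative probability of the ground-truth class.
    prob = pred_logits.softmax(-1)                 # [num_queries, num_classes]
    cost_cls = -prob[:, gt_labels]                 # [num_queries, num_gt]
    # Box cost: L1 distance for both members of each pair.
    cost_body = torch.cdist(pred_body_boxes, gt_body_boxes, p=1)
    cost_face = torch.cdist(pred_face_boxes, gt_face_boxes, p=1)
    cost = cls_w * cost_cls + box_w * (cost_body + cost_face)
    # Hungarian algorithm: one-to-one assignment of queries to ground-truth pairs.
    row, col = linear_sum_assignment(cost.detach().cpu().numpy())
    return row, col
```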
## Inference example with transformers:
```python
import shutil

import requests
import torch
from PIL import Image
from transformers import (
    AutoImageProcessor,
    DeformableDetrConfig,
    DeformableDetrForObjectDetection,
)

from hf_utils import PairDetr, inverse_sigmoid, forward

def get_weights():
    ## Download the checkpoint (use resolve/, not blob/, to get the raw file),
    ## or download pytorch_model.bin manually from the model page.
    url = "https://huggingface.co/MTSAIR/PairDETR/resolve/main/pytorch_model.bin"
    response = requests.get(url, stream=True)
    with open("full_weights.pth", "wb") as out_file:
        shutil.copyfileobj(response.raw, out_file)

## loading the model
configuration = DeformableDetrConfig.from_pretrained("SenseTime/deformable-detr")
processor = AutoImageProcessor.from_pretrained("MTSAIR/PairDETR")
model = DeformableDetrForObjectDetection(configuration)
model = PairDetr(model, 1500, 3)  # wrap the base detector with the pair-prediction head

## loading the weights
get_weights()
checkpoint = torch.load("full_weights.pth", map_location="cpu")
model.load_state_dict(checkpoint, strict=False)
model.eval()

## run inference
path = "./test.jpg"
image = Image.open(path)
inputs = processor(images=image, return_tensors="pt")
with torch.no_grad():
    outputs = forward(model, inputs["pixel_values"])
```
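After inference you typically want to turn the raw predictions into scored boxes. The sketch below assumes DETR-style output keys (`pred_logits`, `pred_boxes`) and normalized (cx, cy, w, h) boxes; the actual field names and layout come from `hf_utils` and may differ, so check the repository before relying on this.

```python
# Minimal post-processing sketch (output keys and box format are assumptions
# based on the DETR convention; adjust to the actual outputs of hf_utils.forward).
import torch

def postprocess(outputs, image_size, score_thresh=0.5):
    logits = outputs["pred_logits"][0]          # [num_queries, num_classes]
    boxes = outputs["pred_boxes"][0]            # [num_queries, 4], normalized cxcywh
    scores = logits.sigmoid().max(-1).values    # per-query confidence
    keep = scores > score_thresh

    # Convert normalized (cx, cy, w, h) to absolute (x0, y0, x1, y1) pixels.
    w, h = image_size                           # PIL's image.size is (width, height)
    cx, cy, bw, bh = boxes[keep].unbind(-1)
    xyxy = torch.stack([(cx - bw / 2) * w, (cy - bh / 2) * h,
                        (cx + bw / 2) * w, (cy + bh / 2) * h], dim=-1)
    return scores[keep], xyxy

scores, pair_boxes = postprocess(outputs, image.size)
```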
## Results
Comparison of PairDETR with other methods in miss matching rate (mMR-2, lower is better) on the CrowdHuman dataset:
| **Model** | **Reasonable** | **Bare** | **Partial** | **Heavy** | **Hard** | **Average** | **Checkpoints** |
|-----------|:--------------:|:--------:|:-----------:|:---------:|:--------:|:-----------:|:---------------:|
| **POS** | 55.49 | 48.20 | 62.00 | 80.98 | 84.58 | 66.4 | <a href="https://drive.google.com/file/d/1GFnIXqc9aG0eXSQFI4Pe4XfO-8hAZmKV/view">weights</a> |
| **BFJ** | 42.96 | 37.96 | 48.20 | 67.31 | 71.44 | 52.5 | <a href="https://drive.google.com/file/d/1E8MQf3pfOyjbVvxZeBLdYBFUiJA6bdgr/view">weights</a> |
| **BPJ** | - | - | - | - | - | 50.1 |<a href="https://github.com/hnuzhy/BPJDet">weights</a> |
| **PBADET** | - | - | - | - | - | 50.8 | - |
| **Ours** | 35.25 | 30.38 | 38.12 | 52.47 | 55.75 | 42.9 | <a href="https://huggingface.co/MTSAIR/PairDETR">weights</a> |
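Like the standard MR-2 metric for pedestrian detection, mMR-2 is reported as a log-average of the (pair-level) miss rate over false positives per image (FPPI) in [10^-2, 10^0]. The sketch below shows only that log-averaging step, assuming a monotone miss-rate/FPPI curve is already available; the pair-level matching rule that defines a "miss" follows BFJDet and is implemented in the evaluation code of the repository.

```python
# Sketch of the log-average miss rate used by MR-2 / mMR-2 style metrics
# (assumes fppi and miss_rate describe a precomputed curve, sorted by FPPI).
import numpy as np

def log_average_miss_rate(fppi, miss_rate):
    ref = np.logspace(-2.0, 0.0, num=9)   # 9 reference points in [1e-2, 1e0]
    mrs = []
    for r in ref:
        # miss rate at the largest FPPI not exceeding the reference point
        idx = np.where(np.asarray(fppi) <= r)[0]
        mrs.append(miss_rate[idx[-1]] if len(idx) else 1.0)
    return float(np.exp(np.mean(np.log(np.maximum(mrs, 1e-10)))))
```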
## References and useful links
### Papers
* <a href='https://arxiv.org/abs/2005.12872'>End-to-End Object Detection with Transformers</a>
* <a href='https://arxiv.org/abs/1805.00123'>CrowdHuman: A Benchmark for Detecting Human in a Crowd</a>
* <a href='https://openaccess.thecvf.com/content/ICCV2021/html/Wan_Body-Face_Joint_Detection_via_Embedding_and_Head_Hook_ICCV_2021_paper.html'>Body-Face Joint Detection via Embedding and Head Hook</a>
* <a href='https://arxiv.org/abs/2010.04159'>Deformable DETR: Deformable Transformers for End-to-End Object Detection</a>
* <a href='https://arxiv.org/abs/2012.06785'>DETR for Crowd Pedestrian Detection</a>
* <a href='https://arxiv.org/abs/2204.07962'>An Extendable, Efficient and Effective Transformer-based Object Detector</a>
### This work is implemented on top of:
* <a href='https://github.com/facebookresearch/detr/tree/3af9fa878e73b6894ce3596450a8d9b89d918ca9'>DETR</a>
* <a href='https://github.com/fundamentalvision/Deformable-DETR'>Deformable-DETR</a>
* <a href='https://github.com/AibeeDetect/BFJDet/tree/main'>BFJDet</a>
* <a href='https://huggingface.co/docs/transformers/en/index'>Hugging Face Transformers</a>