metadata

license: mit
language:
  - en
base_model:
  - google/vit-base-patch16-224
pipeline_tag: image-feature-extraction

Model Card for HAP (Hugging Face ViT format)

Overview

This is a checkpoint for HAP, a human-centric vision backbone introduced in the NeurIPS 2023 paper "HAP: Structure-Aware Masked Image Modeling for Human-Centric Perception". The weights have been converted from the authors' original format to the Hugging Face Vision Transformer (ViT) format. Note: This model is not my own work; it is a format conversion of the original authors' checkpoint.
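Because the checkpoint is in ViT format, it can probably be loaded with the standard `transformers` ViT classes. The following is a minimal sketch, not a tested recipe for this repository. The `CHECKPOINT` value is a hypothetical placeholder. The sketch also assumes that the preprocessing of the listed base model (`google/vit-base-patch16-224`) is acceptable; the original HAP pipeline may use a different input resolution or normalization, so check the original repository before relying on the features.

```python
# Hypothetical placeholder: replace with this repository's Hub ID or a local path.
CHECKPOINT = "your-username/hap-vit-base"


def token_count(height: int, width: int, patch_size: int = 16) -> int:
    """Number of output tokens for one image: one per patch plus the [CLS] token."""
    return (height // patch_size) * (width // patch_size) + 1


def extract_features(image, checkpoint: str = CHECKPOINT):
    """Return per-token features of shape (tokens, hidden_size) for a PIL image."""
    # Imported lazily so token_count() can be used without these packages installed.
    import torch
    from transformers import ViTImageProcessor, ViTModel

    # Assumption: reuse the base model's preprocessing (224x224 resize and its
    # normalization). This may differ from what HAP was trained with.
    processor = ViTImageProcessor.from_pretrained("google/vit-base-patch16-224")
    model = ViTModel.from_pretrained(checkpoint)
    model.eval()

    inputs = processor(images=image, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs)

    # Drop the batch dimension; index 0 along the token axis is the [CLS] token.
    return outputs.last_hidden_state[0]
```

Under these assumptions, a 224x224 input with 16x16 patches yields `token_count(224, 224)` tokens, that is, 196 patch tokens plus one `[CLS]` token. If the converted weights expect a different resolution, the processor settings would need to be adjusted accordingly.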

Source Information

Paper: HAP: Structure-Aware Masked Image Modeling for Human-Centric Perception
Project Page: https://zhangxinyu-xyz.github.io/hap.github.io/
GitHub Repository: https://github.com/junkunyuan/HAP

Citation

If you use this model, please cite the original work as follows:

@article{yuan2023hap,
  title={HAP: Structure-Aware Masked Image Modeling for Human-Centric Perception},
  author={Yuan, Junkun and Zhang, Xinyu and Zhou, Hao and Wang, Jian and Qiu, Zhongwei and Shao, Zhiyin and Zhang, Shaofeng and Long, Sifan and Kuang, Kun and Yao, Kun and others},
  journal={Advances in Neural Information Processing Systems},
  volume={36},
  year={2023}
}