Human Detector

Age and gender recognition in the wild is a challenging task: in addition to variable environmental conditions, complex poses, and differences in image quality, the face may be partially or completely occluded. MiVOLO (Multi-Input VOLO) is a straightforward approach to age and gender estimation built on a state-of-the-art vision transformer. The method integrates the two tasks into a single dual input/output model that uses not only the face crop but also the image of the whole person. This improves the model's ability to generalize, so it can return satisfactory results even when the face is not visible in the image. The model was evaluated on four popular benchmark datasets, where it achieved state-of-the-art performance while running in real time. A new benchmark was also introduced, based on images from the Open Images dataset. Its ground truth annotations were produced by human annotators, with high accuracy ensured by a smart aggregation of their votes. The model's age recognition performance was further compared with human-level accuracy and shown to significantly outperform humans across most age ranges. Finally, the models were made publicly available, along with the code for validation and inference and additional annotations for the datasets used.
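As a rough illustration of the dual-input idea described above, the sketch below prepares a (face, body) input pair from one image and substitutes a blank placeholder when the face crop is unavailable. It is not the MiVOLO code or API: the helper names `crop`, `blank`, and `build_inputs`, the nested-list image type, the box format, and the choice of an all-zero placeholder are assumptions made for this example only.

```python
from typing import List, Optional, Tuple

# Hypothetical stand-in for an image: a grayscale grid of H rows x W columns.
Image = List[List[float]]
# Hypothetical box format: (top, left, bottom, right), end-exclusive.
Box = Tuple[int, int, int, int]


def crop(image: Image, box: Box) -> Image:
    """Cut the region described by box out of image."""
    top, left, bottom, right = box
    return [row[left:right] for row in image[top:bottom]]


def blank(height: int, width: int) -> Image:
    """All-zero placeholder used when a crop is missing (an assumption of this sketch)."""
    return [[0.0] * width for _ in range(height)]


def build_inputs(
    image: Image,
    face_box: Optional[Box] = None,
    body_box: Optional[Box] = None,
    blank_size: Tuple[int, int] = (2, 2),
) -> Tuple[Image, Image]:
    """Return a (face_input, body_input) pair for a two-input model.

    A missing crop is replaced by a blank placeholder so the downstream
    model always receives both inputs.
    """
    if face_box is None and body_box is None:
        raise ValueError("need at least a face box or a body box")
    height, width = blank_size
    face = crop(image, face_box) if face_box is not None else blank(height, width)
    body = crop(image, body_box) if body_box is not None else blank(height, width)
    return face, body


if __name__ == "__main__":
    # A 6 x 6 toy image whose pixel value encodes its (row, column) position.
    image = [[float(r * 10 + c) for c in range(6)] for r in range(6)]
    # Face occluded: only the body box is known.
    face, body = build_inputs(image, face_box=None, body_box=(1, 1, 5, 4))
    print(face)
    print(len(body), len(body[0]))
```

In a real pipeline the two crops would also be resized and normalized to the resolution the network expects; that step is left out here to keep the example short.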

Maintenance

GIT_LFS_SKIP_SMUDGE=1 git clone git@hf.co:monet-joe/human-detector

Mirror

https://www.modelscope.cn/models/monetjoe/human-detector

Reference

[1] https://github.com/WildChlamydia/MiVOLO

