---
license: mit
tags:
- transformers
language:
- en
---

# FaceXFormer Model Card

<div align="center">

[**Project Page**](https://kartik-3004.github.io/facexformer_web/) **|** [**Paper (ArXiv)**](https://arxiv.org/abs/2403.12960) **|** [**Code**](https://github.com/Kartik-3004/facexformer)


</div>

## Introduction

FaceXFormer is an end-to-end unified model that handles a comprehensive range of facial analysis tasks: face parsing,
landmark detection, head pose estimation, attribute recognition, age/gender/race estimation, and landmark visibility prediction.

<div  align="center">
<img src='assets/intro_viz.png'>
</div>

## Model Details

FaceXFormer is a transformer-based encoder-decoder architecture where each task is treated as a learnable token, enabling the 
integration of multiple tasks within a single framework.

<div  align="center">
<img src='assets/main_archi.png'>
</div>
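
To make the task-token idea concrete, the following is a minimal, dependency-free sketch of the bookkeeping only: one token per task is placed in front of the face tokens, and after decoding the output at each task's position is routed back to that task. It is not the FaceXFormer implementation. The names `TASKS`, `EMBED_DIM`, `init_task_tokens`, `build_sequence`, and `split_outputs` are hypothetical, the token values are random placeholders, and in a real model the tokens would be trainable parameters processed by the decoder.

```python
import random

# Task names taken from the introduction; the identifiers are illustrative only.
TASKS = [
    "face_parsing",
    "landmark_detection",
    "head_pose",
    "attributes",
    "age_gender_race",
    "landmark_visibility",
]
EMBED_DIM = 4  # hypothetical, tiny embedding size for readability


def init_task_tokens(tasks, dim, seed=0):
    """Create one placeholder vector per task (trainable in a real model)."""
    rng = random.Random(seed)
    return {task: [rng.gauss(0.0, 0.02) for _ in range(dim)] for task in tasks}


def build_sequence(task_tokens, face_tokens):
    """Put the task tokens in front of the face tokens, remembering their order."""
    order = list(task_tokens)
    sequence = [task_tokens[task] for task in order] + list(face_tokens)
    return order, sequence


def split_outputs(order, decoded):
    """Route each decoded task-token position back to its task."""
    per_task = {task: decoded[i] for i, task in enumerate(order)}
    remaining_face_tokens = decoded[len(order):]
    return per_task, remaining_face_tokens
```

Because every task reads its result from its own token position, supporting another task in this sketch only means adding another entry to `TASKS`; the shared sequence handling stays unchanged.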

## Usage

The model weights can be downloaded directly from this repository or with Python:
```python
from huggingface_hub import hf_hub_download

hf_hub_download(repo_id="kartiknarayan/facexformer", filename="ckpts/model.pt", local_dir="./")
```
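
If the download runs as part of a script, it can be convenient to skip it when the checkpoint is already on disk. The sketch below wraps the same `hf_hub_download` call shown above; the helper name `ensure_checkpoint` and the constants are hypothetical and not part of the FaceXFormer code base.

```python
from pathlib import Path

REPO_ID = "kartiknarayan/facexformer"
CKPT_FILENAME = "ckpts/model.pt"


def ensure_checkpoint(local_dir="./"):
    """Return the local checkpoint path, downloading the file only if it is missing."""
    target = Path(local_dir) / CKPT_FILENAME
    if target.is_file():
        # Already present: no network access needed.
        return target

    # Imported lazily so the helper works offline when the file already exists.
    from huggingface_hub import hf_hub_download

    downloaded = hf_hub_download(
        repo_id=REPO_ID, filename=CKPT_FILENAME, local_dir=local_dir
    )
    return Path(downloaded)
```

Calling `ensure_checkpoint("./")` returns the path to `ckpts/model.pt` under the chosen directory, fetching it from the Hub on first use.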

## Citation
```bibtex
@misc{narayan2024facexformer,
      title={FaceXFormer : A Unified Transformer for Facial Analysis},
      author={Kartik Narayan and Vibashan VS and Rama Chellappa and Vishal M. Patel},
      year={2024},
      eprint={2403.12960},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}
```

Please check our [GitHub repository](https://github.com/Kartik-3004/facexformer) for complete inference instructions.