Commit f573903 by LianJC · verified · 1 Parent(s): 623e09f

Update README.md

Files changed (1)
  1. README.md +19 -15
README.md CHANGED
@@ -1,36 +1,40 @@
  ---
  license: bsd-3-clause
- library_name: pytorch
- pipeline_tag: image-classification
  tags:
- - facial-forgery-detection
- - multi-label-classification
- - vit
- - deepfake
  - acl-2026
  ---

- # Face-ViT: Multi-Label Facial Forgery Region Classifier

  ## 📖 Model Description
- This is the **Face-ViT** auxiliary perception module proposed in the ACL 2026 paper:
  *"Generating Attribution Reports for Manipulated Facial Images: A Dataset and Baseline"*.

- Face-ViT is a multi-label classifier based on the **ViT-H/14** architecture. It is specifically trained to recognize 21 different types of facial manipulations (e.g., eye modification, skin smoothing, mouth tampering). In the DFF framework, it provides fine-grained visual cues that guide the large language model to generate accurate forensic explanations.

  ## 🛠️ Model Details
- - **Architecture**: ViT-H/14 with an additional CNN branch and max-pooling for multi-label support.
- - **Input Size**: 224x224 RGB images.
- - **Number of Classes**: 21 (Facial attributes/manipulation types).
- - **Training Objective**: Joint loss including BCE, Focal, Dice, and Jaccard loss.

  ## 🚀 Links
  - **Official Code**: [Generating-Attribution-Reports](https://github.com/JingchunLian/Generating-Attribution-Reports)
- - **Main Framework (DFF)**: [LianJC/DFF-InstructBLIP-Detection](https://huggingface.co/LianJC/DFF-InstructBLIP-Detection)
  - **Dataset (MMTT)**: [LianJC/MMTT-Dataset](https://huggingface.co/datasets/LianJC/MMTT-Dataset)

  ## 📜 Citation
- If you find this model useful, please cite:
  ```bibtex
  @inproceedings{lian2026generating,
  title={Generating Attribution Reports for Manipulated Facial Images: A Dataset and Baseline},
 
  ---
  license: bsd-3-clause
+ library_name: lavis
+ pipeline_tag: visual-question-answering
  tags:
+ - explainable-ai
+ - deepfake-detection
+ - vlm
+ - instructblip
+ - forensic-explanation
  - acl-2026
  ---

+ # DFF: InstructBLIP-based Explainable DeepFake Detection

  ## 📖 Model Description
+ This is the core **DFF (DeepFake Detection and Forensic Explanation Framework)** model as described in the ACL 2026 paper:
  *"Generating Attribution Reports for Manipulated Facial Images: A Dataset and Baseline"*.

+ DFF is built upon the **InstructBLIP (Flan-T5 XL)** architecture. By integrating the Face-ViT auxiliary classifier, it achieves state-of-the-art performance in both **forgery localization (mask generation)** and **forensic explanation (captioning)**.
+
+ ## 🌟 Key Capabilities
+ 1. **Forgery Localization**: Generates high-resolution binary masks highlighting manipulated facial regions.
+ 2. **Natural Language Explanation**: Produces detailed text describing why a specific image is considered a forgery (e.g., "The texture around the eyes is unnatural due to GAN-based blending").

  ## 🛠️ Model Details
+ - **Base LLM**: Flan-T5 XL.
+ - **Visual Encoder**: EVA-ViT-G.
+ - **Auxiliary Module**: Face-ViT (Multi-label perception).
+ - **Task**: Explainable Detection & Multi-modal Attribution Reporting.

  ## 🚀 Links
  - **Official Code**: [Generating-Attribution-Reports](https://github.com/JingchunLian/Generating-Attribution-Reports)
+ - **Auxiliary Classifier**: [LianJC/Face-ViT-MultiLabel](https://huggingface.co/LianJC/Face-ViT-MultiLabel)
  - **Dataset (MMTT)**: [LianJC/MMTT-Dataset](https://huggingface.co/datasets/LianJC/MMTT-Dataset)

  ## 📜 Citation
  ```bibtex
  @inproceedings{lian2026generating,
  title={Generating Attribution Reports for Manipulated Facial Images: A Dataset and Baseline},