Update README.md
README.md CHANGED
@@ -1,36 +1,40 @@
Old (lines 1-36):

     ---
     license: bsd-3-clause
    -library_name:
    -pipeline_tag:
     tags:
    --
    --
    --
    --
     - acl-2026
     ---

    -#

     ## Model Description
    -This is the **
     *"Generating Attribution Reports for Manipulated Facial Images: A Dataset and Baseline"*.

    -

     ## Model Details
    -- **
    -- **
    -- **
    -- **

     ## Links
     - **Official Code**: [Generating-Attribution-Reports](https://github.com/JingchunLian/Generating-Attribution-Reports)
    -- **
     - **Dataset (MMTT)**: [LianJC/MMTT-Dataset](https://huggingface.co/datasets/LianJC/MMTT-Dataset)

     ## Citation
    -If you find this model useful, please cite:
     ```bibtex
     @inproceedings{lian2026generating,
     title={Generating Attribution Reports for Manipulated Facial Images: A Dataset and Baseline},

New (lines 1-40):

     ---
     license: bsd-3-clause
    +library_name: lavis
    +pipeline_tag: visual-question-answering
     tags:
    +- explainable-ai
    +- deepfake-detection
    +- vlm
    +- instructblip
    +- forensic-explanation
     - acl-2026
     ---

    +# DFF: InstructBLIP-based Explainable DeepFake Detection

     ## Model Description
    +This is the core **DFF (DeepFake Detection and Forensic Explanation Framework)** model as described in the ACL 2026 paper:
     *"Generating Attribution Reports for Manipulated Facial Images: A Dataset and Baseline"*.

    +DFF is built upon the **InstructBLIP (Flan-T5 XL)** architecture. By integrating the Face-ViT auxiliary classifier, it achieves state-of-the-art performance in both **forgery localization (mask generation)** and **forensic explanation (captioning)**.
    +
    +## Key Capabilities
    +1. **Forgery Localization**: Generates high-resolution binary masks highlighting manipulated facial regions.
    +2. **Natural Language Explanation**: Produces detailed text describing why a specific image is considered a forgery (e.g., "The texture around the eyes is unnatural due to GAN-based blending").
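
Binary localization masks like the ones named in the first capability are usually scored against a ground-truth mask. The README does not say which metric the paper reports, so the sketch below uses intersection-over-union (IoU) purely as an assumed, common choice. The function name `mask_iou` and the nested-list mask format are hypothetical and not part of the DFF code.

```python
from typing import Sequence

# A mask is a 2D grid of 0/1 values (1 = pixel flagged as manipulated).
Mask = Sequence[Sequence[int]]


def mask_iou(pred: Mask, target: Mask) -> float:
    """Intersection-over-union between two binary masks of equal shape."""
    if len(pred) != len(target) or any(
        len(p_row) != len(t_row) for p_row, t_row in zip(pred, target)
    ):
        raise ValueError("masks must have the same shape")

    intersection = 0
    union = 0
    for p_row, t_row in zip(pred, target):
        for p, t in zip(p_row, t_row):
            p_on, t_on = bool(p), bool(t)
            intersection += int(p_on and t_on)
            union += int(p_on or t_on)

    # Two empty masks agree perfectly; avoid dividing by zero.
    return 1.0 if union == 0 else intersection / union


if __name__ == "__main__":
    predicted = [[0, 1, 1], [0, 1, 0]]
    ground_truth = [[0, 1, 0], [0, 1, 1]]
    print(mask_iou(predicted, ground_truth))
```

In the toy example, two pixels are flagged in both masks and four pixels are flagged in at least one, so the score is 2/4.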

     ## Model Details
    +- **Base LLM**: Flan-T5 XL.
    +- **Visual Encoder**: EVA-ViT-G.
    +- **Auxiliary Module**: Face-ViT (Multi-label perception).
    +- **Task**: Explainable Detection & Multi-modal Attribution Reporting.
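
The metadata and details above point to LAVIS and InstructBLIP (Flan-T5 XL) as the base. A minimal sketch of loading that base model through LAVIS follows, assuming the `salesforce-lavis` package and Pillow are installed. It loads the stock InstructBLIP weights, not the DFF checkpoint, and it leaves out the Face-ViT auxiliary module entirely. The prompt text and the helper names (`build_sample`, `load_base_instructblip`, `describe`) are illustrative assumptions; the official code linked in this README is the reference for the real DFF inference path.

```python
from typing import Any, Dict

# Hypothetical prompt: the prompt actually used to train or query DFF is not
# stated in this README.
DEFAULT_PROMPT = (
    "Is this face image manipulated? Explain which regions look forged and why."
)


def build_sample(image_tensor: Any, prompt: str = DEFAULT_PROMPT) -> Dict[str, Any]:
    """Package inputs in the dict form that LAVIS InstructBLIP generate() takes."""
    return {"image": image_tensor, "prompt": prompt}


def load_base_instructblip(device: str = "cpu"):
    """Load the base InstructBLIP (Flan-T5 XL) from LAVIS, NOT the DFF weights."""
    # Imported lazily so the helpers above work without LAVIS installed.
    from lavis.models import load_model_and_preprocess

    return load_model_and_preprocess(
        name="blip2_t5_instruct",
        model_type="flant5xl",
        is_eval=True,
        device=device,
    )


def describe(image_path: str, device: str = "cpu") -> str:
    """Run one image through the base model and return the generated text."""
    from PIL import Image

    model, vis_processors, _ = load_base_instructblip(device)
    raw_image = Image.open(image_path).convert("RGB")
    # The eval processor resizes and normalizes; unsqueeze adds the batch axis.
    image = vis_processors["eval"](raw_image).unsqueeze(0).to(device)
    return model.generate(build_sample(image))[0]
```

Calling `describe("face.jpg")` downloads several gigabytes of base weights on first use, so the heavy imports are kept inside the functions rather than at module level.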

     ## Links
     - **Official Code**: [Generating-Attribution-Reports](https://github.com/JingchunLian/Generating-Attribution-Reports)
    +- **Auxiliary Classifier**: [LianJC/Face-ViT-MultiLabel](https://huggingface.co/LianJC/Face-ViT-MultiLabel)
     - **Dataset (MMTT)**: [LianJC/MMTT-Dataset](https://huggingface.co/datasets/LianJC/MMTT-Dataset)

     ## Citation
     ```bibtex
     @inproceedings{lian2026generating,
     title={Generating Attribution Reports for Manipulated Facial Images: A Dataset and Baseline},