Kilich committed
Commit bf88262 · 1 Parent(s): dbbf2c4

Upload README.md

Files changed (1)
  1. README.md +64 -11
README.md CHANGED
@@ -1,11 +1,64 @@
- ---
- title: Affective Visdial
- emoji: 🦀
- colorFrom: gray
- colorTo: blue
- sdk: static
- pinned: false
- license: mit
- ---
-
- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
+ <div align="center">
+ <p align="center">
+ <img src="assets/img/web_teaser.png" width="500px"/>
+ </p>
+ <h1 align="center">
+ Affective Visual Dialog: A Large-Scale Benchmark for Emotional Reasoning Based on Visually Grounded Conversations
+ </h1>
+
+ [![arXiv](https://img.shields.io/badge/📚%20arXiv-grey?logoColor=white&logoWidth=20)](#)
+ [![Download (coming soon)](https://img.shields.io/badge/📦%20Download-grey?logoColor=white&logoWidth=20)](#)
+ [![Website](https://img.shields.io/badge/🌐%20Website-green?logoColor=white&logoWidth=20)](https://affective-visual-dialog.github.io/)
+
+ </div>
+
+ ## 📰 News
+
+ - **30/08/2023**: The preprint of our paper is now available on [arXiv](https://arxiv.org/abs/2308.16349).
+
+ ## Summary
+
+ - [📰 News](#-news)
+ - [Summary](#summary)
+ - [📚 Introduction](#-introduction)
+ - [📊 Baselines](#-baselines)
+ - [Citation](#citation)
+ - [References](#references)
+
+ <br>
+
+ ## 📚 Introduction
+
+ AffectVisDial is a large-scale dataset consisting of 50K 10-turn visually grounded dialogs, together with concluding emotion attributions and dialog-informed textual emotion explanations.
+
+ <br>
+
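As a minimal sketch of how one such entry could be organized, the snippet below lays out a hypothetical record combining a visually grounded 10-turn dialog with its concluding emotion attribution and textual explanation. The download is still marked as coming soon, so every field name here (`image_id`, `dialog`, `emotion`, `explanation`) is an illustrative assumption, not the official schema.

```python
"""Illustrative sketch only: the field names below are hypothetical placeholders
showing how a record with a grounded dialog, a concluding emotion attribution,
and a textual explanation could be laid out. They are not the released schema."""
import json

example_record = {
    "image_id": "example_0001",      # hypothetical id of the grounding image
    "dialog": [                      # one entry per turn (10 turns per dialog)
        {"question": "What is happening in the scene?",
         "answer": "Two people are saying goodbye at a train station."},
        # ... 9 more turns ...
    ],
    "emotion": "sadness",            # concluding emotion attribution
    "explanation": "The farewell at the station makes the moment feel sad.",
}

print(json.dumps(example_record, indent=2))
```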
+ ## 📊 Baselines
+
+ We provide baseline models for the explanation generation task (a generic usage sketch is shown below the list):
+ - [GenLM](./baselines/GenLM/): BERT- and BART-based models [3, 4]
+ - [NLX-GPT](./baselines/nlx): NLX-GPT-based model [1]
+
+ <br>
+
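The sketch below shows, in broad strokes, what explanation generation with a BART-style sequence-to-sequence model [3] can look like using the Hugging Face `transformers` library. It is not the GenLM code under `./baselines/GenLM/`; the `facebook/bart-base` checkpoint and the prompt format are assumptions made for illustration.

```python
"""Illustrative sketch only, NOT the GenLM baseline implementation.
It shows how a BART checkpoint can be conditioned on a flattened dialog
plus an emotion label to generate a textual explanation."""
from transformers import BartForConditionalGeneration, BartTokenizer

MODEL_NAME = "facebook/bart-base"  # generic public checkpoint, not a released baseline
tokenizer = BartTokenizer.from_pretrained(MODEL_NAME)
model = BartForConditionalGeneration.from_pretrained(MODEL_NAME)

# A toy dialog flattened into a single conditioning string (format is assumed).
dialog = (
    "Q: What is happening in the scene? A: Two people are saying goodbye. "
    "Q: How do they look? A: They are hugging and one is crying."
)
prompt = f"emotion: sadness. dialog: {dialog} explanation:"

inputs = tokenizer(prompt, return_tensors="pt", truncation=True, max_length=512)
output_ids = model.generate(**inputs, max_new_tokens=40, num_beams=4)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```

Without fine-tuning on the dataset's dialog–explanation pairs the generated text will not be a meaningful emotion explanation; the actual training and inference code lives in the baseline directories linked above.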
+ ## Citation
+
+ If you use our dataset, please cite the following reference:
+
+ ```bibtex
+ @article{haydarov2023affective,
+   title={Affective Visual Dialog: A Large-Scale Benchmark for Emotional Reasoning Based on Visually Grounded Conversations},
+   author={Haydarov, Kilichbek and Shen, Xiaoqian and Madasu, Avinash and Salem, Mahmoud and Li, Li-Jia and Elsayed, Gamaleldin and Elhoseiny, Mohamed},
+   journal={arXiv preprint arXiv:2308.16349},
+   year={2023}
+ }
+ ```
+ <br>
+
+ ## References
+ 1. _[Sammani et al., 2022]_ - NLX-GPT: A Model for Natural Language Explanations in Vision and Vision-Language Tasks
+ 2. _[Li et al., 2022]_ - BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
+ 3. _[Lewis et al., 2019]_ - BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension
+ 4. _[Devlin et al., 2018]_ - BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding