yonatanbitton committed on
Commit
52b6252
1 Parent(s): 71f6f34

Update README.md

---

Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference

# Dataset Card for WHOOPS!

- [Dataset Description](#dataset-description)
- [Contribute Images to Extend WHOOPS!](#contribute-images-to-extend-whoops)
- [Languages](#languages)
- [Dataset](#dataset)
  - [Data Fields](#data-fields)
  - [Data Splits](#data-splits)
  - [Data Loading](#data-loading)
- [Licensing Information](#licensing-information)
- [Annotations](#annotations)
- [Considerations for Using the Data](#considerations-for-using-the-data)
- [Citation Information](#citation-information)

## Dataset Description
WHOOPS! is a dataset and benchmark for visual commonsense. It comprises purposefully commonsense-defying images created by designers using publicly available image-generation tools such as Midjourney. The images defy commonsense for a wide range of reasons, including deviations from expected social norms and everyday knowledge.

The WHOOPS! benchmark includes four tasks:
1. Explanation-of-violation, a novel task: generating a detailed explanation of what makes the image weird.
2. Generating a literal caption.
3. Distinguishing between detailed and underspecified captions.
4. Answering questions that test compositional understanding.

The results show that state-of-the-art models such as GPT3 and BLIP2 still lag behind human performance on WHOOPS!.

* Homepage: https://whoops-benchmark.github.io/
* Paper: https://arxiv.org/pdf/2303.07274.pdf
* WHOOPS! Explorer: https://huggingface.co/spaces/nlphuji/whoops-explorer-full
* Normal vs. Weird Explorer: https://huggingface.co/spaces/nlphuji/whoops-explorer-analysis
* Point of Contact: yonatanbitton1@gmail.com

[//]: # (Colab notebook code for WHOOPS evaluation)

## Contribute Images to Extend WHOOPS!
Would you like to add a commonsense-defying image to our database? Please send candidate images to yonatanbitton1@gmail.com. Thanks!

### Languages
English.

## Dataset
### Data Fields
- `image` (image) - The weird image.
- `designer_explanation` (string) - A detailed single-sentence explanation, given by the designer, of why the image is weird.
- `selected_caption` (string) - The caption selected from the crowd-collected captions.
- `crowd_captions` (list) - Crowd-collected captions describing what is seen in the image.
- `crowd_explanations` (list) - Crowd-collected single-sentence explanations of why the image is weird.
- `crowd_underspecified_captions` (list) - Crowd-collected underspecified captions describing what is seen in the image without mentioning the commonsense violation.
- `question_answering_pairs` (list) - Automatically generated question-answer pairs. FlanT5 XL was used to answer the questions and to filter out instances where the BEM metric is above 0.1.
- `commonsense_category` (string) - The commonsense category the image relates to (the full list of categories is in the [paper](https://arxiv.org/pdf/2303.07274.pdf)).
- `image_id` (string) - The unique ID of the image in the dataset.
- `image_designer` (string) - The name of the image designer.
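
To make the schema concrete, the sketch below shows the shape of a single record as a plain Python dictionary. Every value is an invented placeholder (the field names follow the list above, and in the real dataset the `image` field holds a PIL image rather than a string):

```python
# Hedged sketch of one WHOOPS! record as a plain Python dict.
# All values are invented placeholders; in the real dataset, "image"
# holds a PIL image object, not a string.
example = {
    "image": "<PIL image>",
    "designer_explanation": "placeholder single-sentence explanation",
    "selected_caption": "placeholder selected caption",
    "crowd_captions": ["placeholder caption 1", "placeholder caption 2"],
    "crowd_explanations": ["placeholder explanation 1"],
    "crowd_underspecified_captions": ["placeholder underspecified caption"],
    "question_answering_pairs": [["placeholder question?", "placeholder answer"]],
    "commonsense_category": "placeholder category",
    "image_id": "placeholder_id",
    "image_designer": "placeholder designer",
}

# The list-valued fields each hold one entry per crowd annotation.
print(sorted(example))
```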
### Data Splits
There is a single TEST split.
Although WHOOPS! is primarily intended as a challenging test set, we also trained on it to demonstrate the value of the data and to build a better model.
We will provide the splits in the future.

### Data Loading
You can load the data as follows (credit to [Winoground](https://huggingface.co/datasets/facebook/winoground)):
```python
from datasets import load_dataset

examples = load_dataset('nlphuji/whoops', use_auth_token='<YOUR USER ACCESS TOKEN>')
```
You can get `<YOUR USER ACCESS TOKEN>` by following these steps:
1) Log into your Hugging Face account
2) Click on your profile picture
3) Click "Settings"
4) Click "Access Tokens"
5) Generate an access token
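
Once loaded, the automatically generated QA pairs can be flattened for a VQA-style evaluation. The sketch below uses a tiny invented stand-in for the split so it runs offline; with the real dataset, iterate over the loaded split instead (the split name and placeholder records are assumptions):

```python
# `fake_split` is an invented stand-in for the loaded TEST split, so this
# sketch runs offline; with the real dataset, iterate the loaded split.
fake_split = [
    {"image_id": "0", "question_answering_pairs": [["Q1?", "A1"], ["Q2?", "A2"]]},
    {"image_id": "1", "question_answering_pairs": [["Q3?", "A3"]]},
]

# Flatten every (image, question, answer) triple across the split.
qa_pairs = [
    (ex["image_id"], question, answer)
    for ex in fake_split
    for question, answer in ex["question_answering_pairs"]
]
print(len(qa_pairs))  # 3
```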
## Licensing Information
[CC-By 4.0](https://creativecommons.org/licenses/by/4.0/)

Additional license information: [license_agreement.txt](https://huggingface.co/datasets/nlphuji/whoops/blob/main/license_agreement.txt)

By clicking "Access repository", you affirm that your intent is solely to use the dataset for research purposes, explicitly excluding the development of commercial chatbots, and you acknowledge acceptance of the terms in the [WHOOPS! license agreement](https://whoops-benchmark.github.io/static/pdfs/whoops_license_agreement.txt).
- The dataset is intended to facilitate academic research with the purpose of publications.
- Participants will not incorporate the Dataset into any other program, dataset, or product.
- Participants may report results on the dataset as a test set.

[//]: # (To evaluate WHOOPS! with a fine-tuned BLIP2, we split the images in WHOOPS! into 5 cross-validation splits. For these 5 splits independently, we train supervised models using 60% of the data for training, 20% for validation, and 20% for test.)

## Annotations
We paid designers to create images and to supply explanations of what makes each image weird.
We paid Amazon Mechanical Turk workers to supply explanations, captions, and underspecified captions for each image in our dataset.

## Considerations for Using the Data
We took measures to filter out potentially harmful or offensive images and texts in WHOOPS!, but some individuals may still find certain content objectionable.
If you come across any instances of harm, please report them to our point of contact. We will review any reported images and remove from the dataset those deemed harmful.

[//]: # (All images, explanations, captions, and underspecified captions were obtained with human annotators.)

### Citation Information
```bibtex
@article{bitton2023breaking,
  title={Breaking Common Sense: WHOOPS! A Vision-and-Language Benchmark of Synthetic and Compositional Images},
  author={Bitton-Guetta, Nitzan and Bitton, Yonatan and Hessel, Jack and Schmidt, Ludwig and Elovici, Yuval and Stanovsky, Gabriel and Schwartz, Roy},
  journal={arXiv preprint arXiv:2303.07274},
  year={2023}
}
```