metadata

title: WHOOPS! Explorer
emoji: 🔥
colorFrom: purple
colorTo: blue
sdk: gradio
sdk_version: 3.21.0
app_file: app.py
pinned: false

Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference

Dataset Card for WHOOPS!

Dataset Description
Contribute Images to Extend WHOOPS!
Languages
Dataset
Licensing Information
Annotations
Considerations for Using the Data
Citation Information

Dataset Description

WHOOPS! is a dataset and benchmark for visual commonsense. The dataset consists of purposefully commonsense-defying images created by designers using publicly available image generation tools such as Midjourney. The images defy commonsense for a wide range of reasons, including deviations from expected social norms and everyday knowledge.

The WHOOPS! benchmark includes four tasks:

A novel task of explanation-of-violation: generating a detailed explanation for what makes the image weird.
Generating a literal caption
Distinguishing between detailed and underspecified captions
Answering questions that test compositional understanding

The results show that state-of-the-art models such as GPT3 and BLIP2 still lag behind human performance on WHOOPS!.
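
As a rough illustration of the third task, the sketch below computes how often a scorer prefers the detailed caption over the underspecified one. The function names, the scorer signature, and the toy captions are all assumptions made for illustration; they are not taken from the dataset or the paper, and the paper's exact evaluation protocol may differ.

```python
from typing import Callable, Iterable, Tuple

# Hypothetical scorer type: takes (image, caption) and returns a matching score.
Scorer = Callable[[object, str], float]


def detailed_caption_accuracy(
    examples: Iterable[Tuple[object, str, str]],
    score: Scorer,
) -> float:
    """Fraction of examples where the detailed caption outscores the underspecified one.

    Each example is (image, detailed_caption, underspecified_caption).
    """
    total = 0
    correct = 0
    for image, detailed, underspecified in examples:
        total += 1
        if score(image, detailed) > score(image, underspecified):
            correct += 1
    if total == 0:
        raise ValueError("no examples to evaluate")
    return correct / total


def toy_score(image: object, caption: str) -> float:
    # Toy stand-in for a real image-text model: it ignores the image and
    # simply scores longer captions higher.
    return float(len(caption.split()))


# Invented captions for illustration only; the image slot is left empty.
toy_examples = [
    (None, "A snowman sunbathing on a tropical beach", "A snowman outside"),
    (None, "A goldfish flying", "An animal up in the sky"),
]

print(detailed_caption_accuracy(toy_examples, toy_score))
```

In practice the toy scorer would be replaced by an image-text matching model; the second toy example shows how a length-based shortcut fails when the underspecified caption happens to be longer.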

Homepage: https://whoops-benchmark.github.io/
Paper: https://arxiv.org/pdf/2303.07274.pdf
WHOOPS! Explorer: https://huggingface.co/spaces/nlphuji/whoops-explorer-full
Normal vs. Weird Explorer: https://huggingface.co/spaces/nlphuji/whoops-explorer-analysis
Point of Contact: yonatanbitton1@gmail.com

Contribute Images to Extend WHOOPS!

Would you like to add a commonsense-defying image to our database? Please send candidate images to yonatanbitton1@gmail.com. Thanks!

Languages

English.

Dataset

Data Fields

image (image) - The weird image.
designer_explanation (string) - Detailed single-sentence explanation given by the designer, explaining why the image is weird.
selected_caption (string) - The caption that was selected from the crowd-collected captions.
crowd_captions (list) - Crowd-collected captions, depicting what is seen in the image.
crowd_explanations (list) - Crowd-collected single-sentence explanations, explaining why the image is weird.
crowd_underspecified_captions (list) - Crowd-collected underspecified captions, depicting what is seen in the image without depicting the commonsense violation.
question_answering_pairs (list) - Automatically generated Q-A pairs. FlanT5 XL was used to answer the questions and to filter out instances where the BEM metric is above 0.1.
commonsense_category (string) - The commonsense category the image relates to (the full list of categories can be found in the [paper](https://arxiv.org/pdf/2303.07274.pdf)).
image_id (string) - The unique ID of the image in the dataset.
image_designer (string) - The name of the image designer.
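
The fields above can be sketched as a typed record. This is only an illustration: the class name `WhoopsRecord`, the helper `summarize`, and the placeholder values are hypothetical, and the element types of the list fields are assumed to be strings where the card does not say otherwise.

```python
from typing import Any, List, TypedDict


class WhoopsRecord(TypedDict):
    """Hypothetical typed view of one WHOOPS! example, mirroring the field list."""

    image: Any  # decoded image object; the exact type depends on the loader
    designer_explanation: str
    selected_caption: str
    crowd_captions: List[str]  # element type assumed to be str
    crowd_explanations: List[str]  # element type assumed to be str
    crowd_underspecified_captions: List[str]  # element type assumed to be str
    question_answering_pairs: List[Any]  # element layout is not specified here
    commonsense_category: str
    image_id: str
    image_designer: str


def summarize(record: WhoopsRecord) -> str:
    """Build a one-line summary from a record's text fields."""
    return (
        f"{record['image_id']} [{record['commonsense_category']}]: "
        f"{record['selected_caption']} "
        f"({len(record['crowd_captions'])} crowd captions, "
        f"{len(record['crowd_explanations'])} crowd explanations)"
    )


# Placeholder values only; none of this text comes from the dataset.
example: WhoopsRecord = {
    "image": None,
    "designer_explanation": "placeholder explanation",
    "selected_caption": "placeholder caption",
    "crowd_captions": ["caption a", "caption b"],
    "crowd_explanations": ["explanation a"],
    "crowd_underspecified_captions": ["short caption"],
    "question_answering_pairs": [],
    "commonsense_category": "placeholder category",
    "image_id": "placeholder-id",
    "image_designer": "placeholder designer",
}

print(summarize(example))
```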

Data Splits

There is a single TEST split. Although WHOOPS! is primarily intended as a challenging test set, we also trained on it to demonstrate the value of the data and to create a better model. We will provide the splits in the future.

Data Loading

You can load the data as follows (credit to Winoground):

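from datasets import load_dataset
# Replace the placeholder with your own access token. Recent releases of datasets name this argument token instead of use_auth_token.
examples = load_dataset('nlphuji/whoops', use_auth_token='<YOUR USER ACCESS TOKEN>')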

You can get <YOUR USER ACCESS TOKEN> by following these steps:

log into your Hugging Face account
click on your profile picture
click "Settings"
click "Access Tokens"
generate an access token
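
Once a token has been generated, one option is to keep it out of source code by reading it from an environment variable. The sketch below assumes the token is stored in `HF_TOKEN`, that a recent `datasets` release is installed (so the argument is `token` rather than `use_auth_token`), and that the single split is exposed under the key `test`. The helper names `read_token` and `load_whoops_test` are hypothetical.

```python
import os


def read_token(env_var: str = "HF_TOKEN") -> str:
    """Return the access token stored in the given environment variable."""
    token = os.environ.get(env_var, "").strip()
    if not token:
        raise RuntimeError(f"set {env_var} to your Hugging Face access token")
    return token


def load_whoops_test():
    """Load WHOOPS! and return its single split (assumed to be named 'test')."""
    # Imported lazily so the token helper works without `datasets` installed.
    from datasets import load_dataset

    # Assumption: `token=` is accepted by the installed `datasets` release;
    # older releases take `use_auth_token=` as in the snippet above.
    dataset = load_dataset("nlphuji/whoops", token=read_token())
    return dataset["test"]


# Usage (needs network access and an accepted license agreement):
#   test_split = load_whoops_test()
#   print(test_split[0]["selected_caption"])
```

Failing early with a clear error when the variable is missing avoids a less obvious authentication failure later in the download.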

Licensing Information

CC-BY 4.0
Additional license information: license_agreement.txt
By clicking “Access repository”, you affirm that you intend to use the dataset solely for research purposes, explicitly excluding the development of commercial chatbots, and you acknowledge acceptance of the terms in the WHOOPS! license agreement.

The dataset is intended to facilitate academic research for the purpose of publication.
Participants will not incorporate the Dataset into any other program, dataset, or product.
Participants may report results on the dataset as a test set.

Annotations

We paid designers to create images and to supply explanations of what makes each image weird. We paid Amazon Mechanical Turk workers to supply explanations, captions, and underspecified captions for each image in our dataset.

Considerations for Using the Data

We took measures to filter out potentially harmful or offensive images and texts in WHOOPS!, but it is still possible that some individuals may find certain content objectionable. If you come across any instances of harm, please report them to our point of contact. We will review and eliminate any images from the dataset that are deemed harmful.

Citation Information

@article{bitton2023breaking,
  title={Breaking Common Sense: WHOOPS! A Vision-and-Language Benchmark of Synthetic and Compositional Images},
  author={Bitton-Guetta, Nitzan and Bitton, Yonatan and Hessel, Jack and Schmidt, Ludwig and Elovici, Yuval and Stanovsky, Gabriel and Schwartz, Roy},
  journal={arXiv preprint arXiv:2303.07274},
  year={2023}
}