pminervini committed on
Commit
00a1742
1 Parent(s): d354e12
Files changed (1)
  1. src/display/about.py +8 -0
src/display/about.py CHANGED
@@ -10,6 +10,14 @@ The backend of the Hallucinations leaderboard is based on the [Eleuther AI Langu
  Metrics and datasets used by the Hallucinations Leaderboard were identified while writing our [awesome-hallucinations-detection](https://github.com/EdinburghNLP/awesome-hallucination-detection) page (you are encouraged to contribute to this list via pull requests).
  If you have comments or suggestions on datasets and metrics, please [reach out to us in our discussion forum](https://huggingface.co/spaces/hallucinations-leaderboard/leaderboard/discussions).
 
+ The Hallucination Leaderboard includes a variety of tasks identified while working on the [awesome-hallucination-detection](https://github.com/EdinburghNLP/awesome-hallucination-detection) repository:
+ - **Closed-book Open-domain QA** -- [NQ Open](https://huggingface.co/datasets/nq_open) (8-shot and 64-shot), [TriviaQA](https://huggingface.co/datasets/trivia_qa) (8-shot and 64-shot), [TruthfulQA](https://huggingface.co/datasets/truthful_qa) ([MC1](https://huggingface.co/datasets/truthful_qa/viewer/multiple_choice), [MC2](https://huggingface.co/datasets/truthful_qa/viewer/multiple_choice), and [Generative](https://huggingface.co/datasets/truthful_qa/viewer/generation))
+ - **Summarisation** -- [XSum](https://huggingface.co/datasets/EdinburghNLP/xsum), [CNN/DM](https://huggingface.co/datasets/cnn_dailymail)
+ - **Reading Comprehension** -- [RACE](https://huggingface.co/datasets/EleutherAI/race)
+ - **Instruction Following** -- [MemoTrap](https://huggingface.co/datasets/pminervini/inverse-scaling/viewer/memo-trap), [IFEval](https://huggingface.co/datasets/wis-k/instruction-following-eval)
+ - **Hallucination Detection** -- [FaithDial](https://huggingface.co/datasets/McGill-NLP/FaithDial), [True-False](https://huggingface.co/datasets/pminervini/true-false), [HaluEval](https://huggingface.co/datasets/pminervini/HaluEval) ([QA](https://huggingface.co/datasets/pminervini/HaluEval/viewer/qa_samples), [Summarisation](https://huggingface.co/datasets/pminervini/HaluEval/viewer/summarization_samples), and [Dialogue](https://huggingface.co/datasets/pminervini/HaluEval/viewer/dialogue_samples))
+ - **Self-Consistency** -- [SelfCheckGPT](https://huggingface.co/datasets/potsawee/wiki_bio_gpt3_hallucination)
+
  For more information about the leaderboard, check our [HuggingFace Blog article](https://huggingface.co/blog/leaderboards-on-the-hub-hallucinations).
  """
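
For readers who prefer to see the grouping as data, the task list added in this commit could be sketched as a plain mapping from category to dataset names. This is only an illustration: `TASK_GROUPS` and `render_task_list` are hypothetical names that do not appear in `src/display/about.py`, and the sketch leaves out the links, the few-shot settings, and the sub-task variants.

```python
# Hypothetical mapping that mirrors the task list added in this commit.
# The names below are illustrative assumptions; they are not taken from
# src/display/about.py or any other file in the leaderboard repository.
TASK_GROUPS: dict[str, list[str]] = {
    "Closed-book Open-domain QA": ["NQ Open", "TriviaQA", "TruthfulQA"],
    "Summarisation": ["XSum", "CNN/DM"],
    "Reading Comprehension": ["RACE"],
    "Instruction Following": ["MemoTrap", "IFEval"],
    "Hallucination Detection": ["FaithDial", "True-False", "HaluEval"],
    "Self-Consistency": ["SelfCheckGPT"],
}


def render_task_list(groups: dict[str, list[str]]) -> str:
    """Render the groups as Markdown bullets in the style of the added lines."""
    return "\n".join(
        f"- **{category}** -- {', '.join(datasets)}"
        for category, datasets in groups.items()
    )


if __name__ == "__main__":
    print(render_task_list(TASK_GROUPS))
```

Keeping the grouping in one structure like this would let the Markdown text be generated rather than hand-edited, although the commit itself simply adds the lines to the docstring.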