osanseviero (HF staff) committed
Commit 2e1dca3 • Parent: 83312d5

Update README.md

Files changed (1): README.md (+3 −3)
README.md CHANGED
@@ -23,7 +23,7 @@ Authors: *Linting Xue, Aditya Barua, Noah Constant, Rami Al-Rfou, Sharan Narang,
 
 ## Details of the downstream task (Question Answering) - Dataset 📚
 
- [TweetQA](hhttps://huggingface.co/datasets/tweet_qa)
+ [TweetQA](https://huggingface.co/datasets/tweet_qa)
 
 
 With social media becoming increasingly more popular, lots of news and real-time events are being covered. Developing automated question answering systems is critical to the effectiveness of many applications that rely on real-time knowledge. While previous question answering (QA) datasets have focused on formal text such as news and Wikipedia, we present the first large-scale dataset for QA over social media data. To make sure that the tweets are meaningful and contain interesting information, we gather tweets used by journalists to write news articles. We then ask human annotators to write questions and answers upon these tweets. Unlike other QA datasets like SQuAD (in which the answers are extractive), we allow the answers to be abstractive. The task requires the model to read a short tweet and a question and outputs a text phrase (does not need to be in the tweet) as the answer.
@@ -36,7 +36,7 @@ Sample
 {
   "Question": "who is the tallest host?",
   "Answer": ["sam bee","sam bee"],
- "Tweet": "Don't believe @ConanOBrien's height lies. Sam Bee is the tallest host in late night. #alternativefacts\\\\\\\\u2014 Full Frontal (@FullFrontalSamB) January 22, 2017",
+ "Tweet": "Don't believe @ConanOBrien's height lies. Sam Bee is the tallest host in late night. #alternativefacts\\\\\\\\\\\\\\\\u2014 Full Frontal (@FullFrontalSamB) January 22, 2017",
   "qid": "3554ee17d86b678be34c4dc2c04e334f"
 }
 ```
@@ -77,7 +77,7 @@ def get_answer(question, context):
   return tokenizer.decode(output[0], skip_special_tokens=True)
 
 
- context = "Don't believe @ConanOBrien's height lies. Sam Bee is the tallest host in late night. #alternativefacts\\\\\\\\u2014 Full Frontal (@FullFrontalSamB) January 22, 2017"
+ context = "Don't believe @ConanOBrien's height lies. Sam Bee is the tallest host in late night. #alternativefacts\\\\\\\\\\\\\\\\u2014 Full Frontal (@FullFrontalSamB) January 22, 2017"
 
 question = "who is the tallest host?"
 
 get_answer(question, context)
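
For readers following the hunks above: the first hunk fixes the link to the TweetQA dataset card, and the second shows one raw sample. As a hedged illustration only (the Hub dataset id `tweet_qa` and the split names are assumptions, not part of this commit), a sample with the same fields can be inspected with the 🤗 `datasets` library:

```python
# Hedged sketch: load TweetQA from the Hugging Face Hub and print one training
# example with the fields shown in the README sample ("Question", "Answer",
# "Tweet", "qid"). The dataset id "tweet_qa" is an assumption about the Hub listing.
from datasets import load_dataset

tweet_qa = load_dataset("tweet_qa")   # splits assumed: train / validation / test
sample = tweet_qa["train"][0]         # one record, returned as a plain dict

print(sample["Question"])  # natural-language question about the tweet
print(sample["Answer"])    # list of human-written (possibly abstractive) answers
print(sample["Tweet"])     # the tweet text used as context
print(sample["qid"])       # unique question id
```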
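
The third hunk only touches the `context` string, so the `get_answer` helper it belongs to appears in the diff as a single fragment (`return tokenizer.decode(output[0], skip_special_tokens=True)`). Below is a minimal sketch of how such a helper is typically wired up with 🤗 Transformers; the checkpoint name, the prompt format, and the generation settings are assumptions, not taken from this README.

```python
# Minimal sketch consistent with the fragment shown in the diff; not the model
# card's exact code. MODEL_NAME is a placeholder checkpoint (assumption).
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

MODEL_NAME = "google/mt5-small"  # placeholder; substitute the fine-tuned checkpoint

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_NAME)

def get_answer(question, context):
    # Build a text-to-text prompt from the question and the tweet used as context
    # (prompt format assumed).
    input_text = f"question: {question}  context: {context}"
    features = tokenizer([input_text], return_tensors="pt")
    # Generate the (possibly abstractive) answer phrase.
    output = model.generate(
        input_ids=features["input_ids"],
        attention_mask=features["attention_mask"],
        max_length=32,
    )
    return tokenizer.decode(output[0], skip_special_tokens=True)

context = ("Don't believe @ConanOBrien's height lies. Sam Bee is the tallest host "
           "in late night. #alternativefacts\u2014 Full Frontal (@FullFrontalSamB) "
           "January 22, 2017")
question = "who is the tallest host?"

print(get_answer(question, context))
```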