potsawee committed
Commit 08bd308
1 Parent(s): 4e34578

Create README.md

Files changed (1):
  1. README.md +69 -0
---
license: apache-2.0
datasets:
- squad
language:
- en
library_name: transformers
pipeline_tag: text2text-generation
---
# t5-large fine-tuned on SQuAD for Generating Question+Answer
- Input: `context` (e.g. a news article)
- Output: `question <sep> answer`
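The model's decoded output is a single string in this `question <sep> answer` layout. A minimal parsing sketch — the literal token strings below are T5 defaults and an assumption here; in practice read `sep_token`, `pad_token`, and `eos_token` from the loaded tokenizer:

```python
def split_question_answer(decoded, sep="<sep>", pad="<pad>", eos="</s>"):
    """Parse a decoded `question <sep> answer` string into its two parts.

    The default token strings are assumptions (T5 defaults); prefer
    tokenizer.sep_token / pad_token / eos_token from the loaded tokenizer.
    """
    text = decoded.replace(pad, "").replace(eos, "")
    question, answer = text.split(sep, 1)  # split once on the separator
    return question.strip(), answer.strip()
```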

The answers in the training data (SQuAD) are highly extractive, so this model generates **extractive** answers. If you would like **abstractive** questions/answers, you can use our model trained on the RACE dataset: https://huggingface.co/potsawee/t5-large-generation-race-QuestionAnswer.

## Model Details

The t5-large model is fine-tuned on the SQuAD dataset: the input is the context/passage and the output is the question followed by the answer. This is the first component in the question generation pipeline (i.e. `g1`) in our [MQAG paper](https://arxiv.org/abs/2301.12307); please also refer to the project's GitHub repo: https://github.com/potsawee/mqag0.

## How to Use the Model

Use the code below to get started with the model. You can also set `do_sample=True` in `generate()` to obtain different question-answer pairs.

```python
>>> from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

>>> tokenizer = AutoTokenizer.from_pretrained("potsawee/t5-large-generation-squad-QuestionAnswer")
>>> model = AutoModelForSeq2SeqLM.from_pretrained("potsawee/t5-large-generation-squad-QuestionAnswer")

>>> context = r"""Chelsea's mini-revival continued with a third victory in a row as they consigned struggling Leicester City to a fifth consecutive defeat.
Buoyed by their Champions League win over Borussia Dortmund, Chelsea started brightly and Ben Chilwell volleyed in from a tight angle against his old club.
Chelsea's Joao Felix and Leicester's Kiernan Dewsbury-Hall hit the woodwork in the space of two minutes, then Felix had a goal ruled out by the video assistant referee for offside.
Patson Daka rifled home an excellent equaliser after Ricardo Pereira won the ball off the dawdling Felix outside the box.
But Kai Havertz pounced six minutes into first-half injury time with an excellent dinked finish from Enzo Fernandez's clever aerial ball.
Mykhailo Mudryk thought he had his first goal for the Blues after the break but his effort was disallowed for offside.
Mateo Kovacic sealed the win as he volleyed in from Mudryk's header.
The sliding Foxes, who ended with 10 men following Wout Faes' late dismissal for a second booking, now just sit one point outside the relegation zone.
""".replace('\n', ' ')

>>> inputs = tokenizer(context, return_tensors="pt")
>>> outputs = model.generate(**inputs, max_length=100)
>>> question_answer = tokenizer.decode(outputs[0], skip_special_tokens=False)
>>> question_answer = question_answer.replace(tokenizer.pad_token, "").replace(tokenizer.eos_token, "")
>>> question, answer = question_answer.split(tokenizer.sep_token)

>>> print("question:", question)
question: Who scored the winner for Chelsea?
>>> print("answer:", answer)
answer: Mateo Kovacic
```
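To sample several different pairs from one context, the steps above can be wrapped in a small helper. This is a sketch, not part of the released code: `generate_qa_pairs` and its defaults are our own naming, and it assumes `model` and `tokenizer` are loaded as shown above.

```python
def generate_qa_pairs(model, tokenizer, context, num_pairs=3, max_length=100):
    """Sample several (question, answer) pairs from one context.

    Sketch only: assumes a seq2seq model and tokenizer loaded as above.
    """
    inputs = tokenizer(context, return_tensors="pt")
    outputs = model.generate(
        **inputs,
        max_length=max_length,
        do_sample=True,                  # sampling -> different pairs on each call
        num_return_sequences=num_pairs,  # one decoded sequence per requested pair
    )
    pairs = []
    for seq in outputs:
        text = tokenizer.decode(seq, skip_special_tokens=False)
        text = text.replace(tokenizer.pad_token, "").replace(tokenizer.eos_token, "")
        if tokenizer.sep_token in text:  # keep only well-formed outputs
            question, answer = text.split(tokenizer.sep_token, 1)
            pairs.append((question.strip(), answer.strip()))
    return pairs
```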

## Generating Distractors (other options in a multiple-choice setup)

```Context ---> Question + (A) Answer (B) Distractor1 (C) Distractor2 (D) Distractor3```

Please refer to our distractor generation model, e.g. https://huggingface.co/potsawee/t5-large-generation-race-Distractor
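A sketch of building an input for the distractor model from a generated pair. The `question <sep> answer <sep> context` layout and the literal `<sep>` string are assumptions here — check the distractor model's card for the exact expected format:

```python
def build_distractor_input(question, answer, context, sep="<sep>"):
    # Assumed layout: question <sep> answer <sep> context
    # (in practice use sep_token from the distractor model's own tokenizer)
    return f"{question} {sep} {answer} {sep} {context}"
```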

## Citation

```bibtex
@article{manakul2023mqag,
  title={MQAG: Multiple-choice Question Answering and Generation for Assessing Information Consistency in Summarization},
  author={Manakul, Potsawee and Liusie, Adian and Gales, Mark JF},
  journal={arXiv preprint arXiv:2301.12307},
  year={2023}
}
```