asahi417 committed on
Commit
3af8cf4
1 Parent(s): a368b6c

model update

README.md ADDED
@@ -0,0 +1,149 @@
+
+ ---
+ license: cc-by-4.0
+ metrics:
+ - bleu4
+ - meteor
+ - rouge-l
+ - bertscore
+ - moverscore
+ language: en
+ datasets:
+ - lmqg/qag_tweetqa
+ pipeline_tag: text2text-generation
+ tags:
+ - questions and answers generation
+ widget:
+ - text: "generate question and answer: Beyonce further expanded her acting career, starring as blues singer Etta James in the 2008 musical biopic, Cadillac Records."
+   example_title: "Questions & Answers Generation Example 1"
+ model-index:
+ - name: lmqg/t5-large-tweetqa-qag
+   results:
+   - task:
+       name: Text2text Generation
+       type: text2text-generation
+     dataset:
+       name: lmqg/qag_tweetqa
+       type: default
+       args: default
+     metrics:
+     - name: BLEU4
+       type: bleu4
+       value: 5.960482240567237e-10
+     - name: ROUGE-L
+       type: rouge-l
+       value: 0.0054045507102811466
+     - name: METEOR
+       type: meteor
+       value: 0.0029513976825252613
+     - name: BERTScore
+       type: bertscore
+       value: 0.03922946683914634
+     - name: MoverScore
+       type: moverscore
+       value: 0.45608571714273055
+ ---
+
+ # Model Card of `lmqg/t5-large-tweetqa-qag`
+ This model is a fine-tuned version of [t5-large](https://huggingface.co/t5-large) for the question and answer generation task on the
+ [lmqg/qag_tweetqa](https://huggingface.co/datasets/lmqg/qag_tweetqa) dataset (dataset_name: default) via [`lmqg`](https://github.com/asahi417/lm-question-generation).
+ It is fine-tuned for end-to-end question and answer generation.
+
+ Please cite our paper if you use the model ([https://arxiv.org/abs/2210.03992](https://arxiv.org/abs/2210.03992)).
+
+ ```
+
+ @inproceedings{ushio-etal-2022-generative,
+     title = "{G}enerative {L}anguage {M}odels for {P}aragraph-{L}evel {Q}uestion {G}eneration",
+     author = "Ushio, Asahi and
+         Alva-Manchego, Fernando and
+         Camacho-Collados, Jose",
+     booktitle = "Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing",
+     month = dec,
+     year = "2022",
+     address = "Abu Dhabi, U.A.E.",
+     publisher = "Association for Computational Linguistics",
+ }
+
+ ```
+
+ ### Overview
+ - **Language model:** [t5-large](https://huggingface.co/t5-large)
+ - **Language:** en
+ - **Training data:** [lmqg/qag_tweetqa](https://huggingface.co/datasets/lmqg/qag_tweetqa) (default)
+ - **Online Demo:** [https://autoqg.net/](https://autoqg.net/)
+ - **Repository:** [https://github.com/asahi417/lm-question-generation](https://github.com/asahi417/lm-question-generation)
+ - **Paper:** [https://arxiv.org/abs/2210.03992](https://arxiv.org/abs/2210.03992)
+
+ ### Usage
+ - With [`lmqg`](https://github.com/asahi417/lm-question-generation#lmqg-language-model-for-question-generation-)
+ ```python
+
+ from lmqg import TransformersQG
+ # initialize model
+ model = TransformersQG(language='en', model='lmqg/t5-large-tweetqa-qag')
+ # model prediction: end-to-end QAG takes only the context, so no answer is supplied
+ question_answer_pairs = model.generate_qa(list_context=["William Turner was an English painter who specialised in watercolour landscapes"])
+
+ ```
+
+ - With `transformers`
+ ```python
+
+ from transformers import pipeline
+ # initialize model
+ pipe = pipeline("text2text-generation", 'lmqg/t5-large-tweetqa-qag')
+ # question and answer generation
+ output = pipe('generate question and answer: Beyonce further expanded her acting career, starring as blues singer Etta James in the 2008 musical biopic, Cadillac Records.')
+
+ ```
+
+ ## Evaluation Metrics
+
+
+ ### Metrics
+
+ | Dataset | Type | BLEU4 | ROUGE-L | METEOR | BERTScore | MoverScore | Link |
+ |:--------|:-----|------:|--------:|-------:|----------:|-----------:|-----:|
+ | [lmqg/qag_tweetqa](https://huggingface.co/datasets/lmqg/qag_tweetqa) | default | 0.0 | 0.005 | 0.003 | 0.039 | 0.456 | [link](https://huggingface.co/lmqg/t5-large-tweetqa-qag/raw/main/eval/metric.first.sentence.paragraph.questions_answers.lmqg_qag_tweetqa.default.json) |
+
+
+
+
+ ## Training hyperparameters
+
+ The following hyperparameters were used during fine-tuning:
+ - dataset_path: lmqg/qag_tweetqa
+ - dataset_name: default
+ - input_types: ['paragraph']
+ - output_types: ['questions_answers']
+ - prefix_types: ['qag']
+ - model: t5-large
+ - max_length: 256
+ - max_length_output: 128
+ - epoch: 15
+ - batch: 16
+ - lr: 5e-05
+ - fp16: False
+ - random_seed: 1
+ - gradient_accumulation_steps: 4
+ - label_smoothing: 0.15
+
+ The full configuration can be found in the [fine-tuning config file](https://huggingface.co/lmqg/t5-large-tweetqa-qag/raw/main/trainer_config.json).
+
+ ## Citation
+ ```
+
+ @inproceedings{ushio-etal-2022-generative,
+     title = "{G}enerative {L}anguage {M}odels for {P}aragraph-{L}evel {Q}uestion {G}eneration",
+     author = "Ushio, Asahi and
+         Alva-Manchego, Fernando and
+         Camacho-Collados, Jose",
+     booktitle = "Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing",
+     month = dec,
+     year = "2022",
+     address = "Abu Dhabi, U.A.E.",
+     publisher = "Association for Computational Linguistics",
+ }
+
+ ```
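The `transformers` snippet in the README above yields raw generated text, not structured pairs. The following is a minimal post-processing sketch, not part of `lmqg` or `transformers`. It assumes the model emits pairs serialized as `question: ..., answer: ...` and joined by ` | `; that format is an assumption about the QAG target serialization, and `parse_qa_pairs` is a hypothetical helper introduced only for illustration.

```python
def parse_qa_pairs(generated_text, pair_sep=" | ",
                   question_prefix="question: ", answer_prefix=", answer: "):
    """Split QAG output into (question, answer) tuples, skipping malformed chunks.

    The separator and prefixes are assumptions about the output format.
    """
    pairs = []
    for chunk in generated_text.split(pair_sep):
        chunk = chunk.strip()
        # Ignore chunks that do not follow the assumed serialization.
        if not chunk.startswith(question_prefix) or answer_prefix not in chunk:
            continue
        body = chunk[len(question_prefix):]
        question, answer = body.split(answer_prefix, 1)
        pairs.append((question.strip(), answer.strip()))
    return pairs


# Hand-written sample in the assumed format, not real model output.
sample = ("question: who did Beyonce play in Cadillac Records?, answer: Etta James"
          " | question: when was Cadillac Records released?, answer: 2008")
qa_pairs = parse_qa_pairs(sample)
```

With the pipeline from the README, the string to parse would be `pipe(...)[0]["generated_text"]`, since a `text2text-generation` pipeline returns a list of dicts with that key.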
config.json CHANGED
@@ -1,5 +1,5 @@
 {
-  "_name_or_path": "lmqg_output/t5_large_tweetqa/best_model",
+  "_name_or_path": "lmqg_output/t5_large_tweetqa/model_woixzh/epoch_10",
   "add_prefix": true,
   "architectures": [
     "T5ForConditionalGeneration"
eval/metric.first.sentence.paragraph.questions_answers.lmqg_qag_tweetqa.default.json ADDED
@@ -0,0 +1 @@
+ {"validation": {"Bleu_1": 2.949014119691204e-10, "Bleu_2": 7.288962467520874e-11, "Bleu_3": 1.603926876322537e-11, "Bleu_4": 1.3497983619005407e-15, "METEOR": 0.0013936164132520032, "ROUGE_L": 0.00245043948954712, "BERTScore": 0.013843970846129832, "MoverScore": 0.4563235198215272}, "test": {"Bleu_1": 0.00018229942730569628, "Bleu_2": 3.209939830954064e-05, "Bleu_3": 7.944684191265035e-06, "Bleu_4": 5.960482240567237e-10, "METEOR": 0.0029513976825252613, "ROUGE_L": 0.0054045507102811466, "BERTScore": 0.03922946683914634, "MoverScore": 0.45608571714273055}}
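As a quick consistency check, the `test` entries in this metric file can be rounded to three decimals and compared with the row in the README table. The sketch below embeds an abbreviated copy of the `test` split as a string so it runs without network access; the variable names are illustrative only.

```python
import json

# Abbreviated copy of the "test" split from the metric file above.
metric_json = (
    '{"test": {"Bleu_4": 5.960482240567237e-10, '
    '"ROUGE_L": 0.0054045507102811466, '
    '"METEOR": 0.0029513976825252613, '
    '"BERTScore": 0.03922946683914634, '
    '"MoverScore": 0.45608571714273055}}'
)

test_scores = json.loads(metric_json)["test"]
# Round to three decimals, the precision shown in the README table.
table_row = {name: round(value, 3) for name, value in test_scores.items()}
print(table_row)
```

The rounded values match the README row (BLEU4 0.0, ROUGE-L 0.005, METEOR 0.003, BERTScore 0.039, MoverScore 0.456), so the table reports the `test` split, not the `validation` split.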
pytorch_model.bin CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
- oid sha256:d464da93cab952b1e544e1dada97cb9e4890d0f55fdde0cfcc68ae55c4da0a92
- size 2950727111
+ oid sha256:6c7f153ea41c7a5eb2abd6e1e982d3592c8be3c73e953e18f13dc53b557c8024
+ size 2950734215
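The `pytorch_model.bin` diff only touches a Git LFS pointer, a small text file of `key value` lines standing in for the real weights. A minimal sketch of reading both pointers and comparing them follows; `parse_lfs_pointer` is a hypothetical helper written for this note, not part of any Git LFS tooling.

```python
def parse_lfs_pointer(text):
    """Parse a Git LFS pointer file into its version, hash algorithm, digest and size."""
    # Each non-empty line has the form "<key> <value>".
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


old = parse_lfs_pointer(
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:d464da93cab952b1e544e1dada97cb9e4890d0f55fdde0cfcc68ae55c4da0a92\n"
    "size 2950727111\n"
)
new = parse_lfs_pointer(
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:6c7f153ea41c7a5eb2abd6e1e982d3592c8be3c73e953e18f13dc53b557c8024\n"
    "size 2950734215\n"
)
size_delta = new["size"] - old["size"]  # bytes added by the new checkpoint file
```

The digest changes and the file grows by 7104 bytes, which fits a swapped checkpoint (`best_model` replaced by `model_woixzh/epoch_10` in the config diffs) with no change to the architecture.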
tokenizer_config.json CHANGED
@@ -104,7 +104,7 @@
   "eos_token": "</s>",
   "extra_ids": 100,
   "model_max_length": 512,
-  "name_or_path": "lmqg_output/t5_large_tweetqa/best_model",
+  "name_or_path": "lmqg_output/t5_large_tweetqa/model_woixzh/epoch_10",
   "pad_token": "<pad>",
   "special_tokens_map_file": null,
   "tokenizer_class": "T5Tokenizer",