asahi417 committed on
Commit
63a00d3
1 Parent(s): eb406b6

model update

Files changed (4)
  1. README.md +149 -0
  2. config.json +1 -1
  3. pytorch_model.bin +2 -2
  4. tokenizer_config.json +1 -1
README.md ADDED
@@ -0,0 +1,149 @@
1
+
2
+ ---
3
+ license: cc-by-4.0
4
+ metrics:
5
+ - bleu4
6
+ - meteor
7
+ - rouge-l
8
+ - bertscore
9
+ - moverscore
10
+ language: en
11
+ datasets:
12
+ - lmqg/qag_tweetqa
13
+ pipeline_tag: text2text-generation
14
+ tags:
15
+ - questions and answers generation
16
+ widget:
17
+ - text: " Beyonce further expanded her acting career, starring as blues singer Etta James in the 2008 musical biopic, Cadillac Records."
18
+ example_title: "Questions & Answers Generation Example 1"
19
+ model-index:
20
+ - name: lmqg/t5-small-tweetqa-qag-np
21
+ results:
22
+ - task:
23
+ name: Text2text Generation
24
+ type: text2text-generation
25
+ dataset:
26
+ name: lmqg/qag_tweetqa
27
+ type: default
28
+ args: default
29
+ metrics:
30
+ - name: BLEU4
31
+ type: bleu4
32
+ value: 0.1071170387718974
33
+ - name: ROUGE-L
34
+ type: rouge-l
35
+ value: 0.34768502761194764
36
+ - name: METEOR
37
+ type: meteor
38
+ value: 0.2780003860872953
39
+ - name: BERTScore
40
+ type: bertscore
41
+ value: 0.8947686245265588
42
+ - name: MoverScore
43
+ type: moverscore
44
+ value: 0.6053481203449237
45
+ ---
46
+
47
+ # Model Card of `lmqg/t5-small-tweetqa-qag-np`
48
+ This model is a fine-tuned version of [t5-small](https://huggingface.co/t5-small) for the question generation task on the
49
+ [lmqg/qag_tweetqa](https://huggingface.co/datasets/lmqg/qag_tweetqa) (dataset_name: default) via [`lmqg`](https://github.com/asahi417/lm-question-generation).
50
+ This model is fine-tuned for end-to-end question and answer generation.
51
+
52
+ Please cite our paper if you use the model ([https://arxiv.org/abs/2210.03992](https://arxiv.org/abs/2210.03992)).
53
+
54
+ ```
55
+
56
+ @inproceedings{ushio-etal-2022-generative,
57
+ title = "{G}enerative {L}anguage {M}odels for {P}aragraph-{L}evel {Q}uestion {G}eneration",
58
+ author = "Ushio, Asahi and
59
+ Alva-Manchego, Fernando and
60
+ Camacho-Collados, Jose",
61
+ booktitle = "Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing",
62
+ month = dec,
63
+ year = "2022",
64
+ address = "Abu Dhabi, U.A.E.",
65
+ publisher = "Association for Computational Linguistics",
66
+ }
67
+
68
+ ```
69
+
70
+ ### Overview
71
+ - **Language model:** [t5-small](https://huggingface.co/t5-small)
72
+ - **Language:** en
73
+ - **Training data:** [lmqg/qag_tweetqa](https://huggingface.co/datasets/lmqg/qag_tweetqa) (default)
74
+ - **Online Demo:** [https://autoqg.net/](https://autoqg.net/)
75
+ - **Repository:** [https://github.com/asahi417/lm-question-generation](https://github.com/asahi417/lm-question-generation)
76
+ - **Paper:** [https://arxiv.org/abs/2210.03992](https://arxiv.org/abs/2210.03992)
77
+
78
+ ### Usage
79
+ - With [`lmqg`](https://github.com/asahi417/lm-question-generation#lmqg-language-model-for-question-generation-)
80
+ ```python
81
+
82
+ from lmqg import TransformersQG
83
+ # initialize model
84
+ model = TransformersQG(language='en', model='lmqg/t5-small-tweetqa-qag-np')
85
+ # model prediction
86
+ question = model.generate_qa(list_context=["William Turner was an English painter who specialised in watercolour landscapes"], list_answer=["William Turner"])
87
+
88
+ ```
89
+
90
+ - With `transformers`
91
+ ```python
92
+
93
+ from transformers import pipeline
94
+ # initialize model
95
+ pipe = pipeline("text2text-generation", 'lmqg/t5-small-tweetqa-qag-np')
96
+ # question generation
97
+ question = pipe(' Beyonce further expanded her acting career, starring as blues singer Etta James in the 2008 musical biopic, Cadillac Records.')
98
+
99
+ ```
100
+
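The `transformers` pipeline call above returns a list of dicts, each with a `generated_text` key. As a minimal sketch, assuming the model emits pairs in the form `question: ..., answer: ... | question: ..., answer: ...` (this output format is an assumption and is not stated in this card; `parse_qa_pairs` and the sample string are hypothetical), the text could be split into pairs roughly like this:

```python
import re

# Hypothetical helper: assumes "question: Q, answer: A" pairs joined by " | ".
_PAIR = re.compile(r"^question:\s*(?P<q>.+?),\s*answer:\s*(?P<a>.+)$")


def parse_qa_pairs(generated_text):
    """Split generated text into (question, answer) tuples, skipping malformed chunks."""
    pairs = []
    for chunk in generated_text.split(" | "):
        match = _PAIR.match(chunk.strip())
        if match:
            pairs.append((match.group("q").strip(), match.group("a").strip()))
    return pairs


# Illustrative string only, not real model output.
sample = (
    "question: Who did Beyonce play?, answer: Etta James"
    " | question: When was the film released?, answer: 2008"
)
print(parse_qa_pairs(sample))
```

With the pipeline from the README, the input would be `question[0]["generated_text"]`. Chunks that do not match the assumed pattern are dropped rather than raising, since generated text is not guaranteed to be well formed.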
101
+ ## Evaluation Metrics
102
+
103
+
104
+ ### Metrics
105
+
106
+ | Dataset | Type | BLEU4 | ROUGE-L | METEOR | BERTScore | MoverScore | Link |
107
+ |:--------|:-----|------:|--------:|-------:|----------:|-----------:|-----:|
108
+ | [lmqg/qag_tweetqa](https://huggingface.co/datasets/lmqg/qag_tweetqa) | default | 0.107 | 0.348 | 0.278 | 0.895 | 0.605 | [link](https://huggingface.co/lmqg/t5-small-tweetqa-qag-np/raw/main/eval/metric.first.answer.paragraph.questions_answers.lmqg_qag_tweetqa.default.json) |
109
+
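The table values appear to be the `model-index` scores from the YAML header rounded to three decimals. A short sketch that reproduces the table row from the raw values (the variable names are illustrative only):

```python
# Raw scores copied from the `model-index` block of this README.
raw_scores = {
    "BLEU4": 0.1071170387718974,
    "ROUGE-L": 0.34768502761194764,
    "METEOR": 0.2780003860872953,
    "BERTScore": 0.8947686245265588,
    "MoverScore": 0.6053481203449237,
}

# Round to three decimals, matching the precision used in the table.
table_scores = {name: round(value, 3) for name, value in raw_scores.items()}

for name, value in table_scores.items():
    print(f"{name}: {value:.3f}")
```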
110
+
111
+
112
+
113
+ ## Training hyperparameters
114
+
115
+ The following hyperparameters were used during fine-tuning:
116
+ - dataset_path: lmqg/qag_tweetqa
117
+ - dataset_name: default
118
+ - input_types: ['paragraph']
119
+ - output_types: ['questions_answers']
120
+ - prefix_types: None
121
+ - model: t5-small
122
+ - max_length: 256
123
+ - max_length_output: 128
124
+ - epoch: 16
125
+ - batch: 64
126
+ - lr: 0.0001
127
+ - fp16: False
128
+ - random_seed: 1
129
+ - gradient_accumulation_steps: 1
130
+ - label_smoothing: 0.15
131
+
132
+ The full configuration can be found in the [fine-tuning config file](https://huggingface.co/lmqg/t5-small-tweetqa-qag-np/raw/main/trainer_config.json).
133
+
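To illustrate the `label_smoothing: 0.15` setting listed above, here is a minimal sketch of the common uniform formulation, in which the target class receives `1 - eps + eps / K` and every other class receives `eps / K`. Whether `lmqg` uses exactly this formulation is an assumption; the function name and the four-class toy example are hypothetical.

```python
def smooth_one_hot(target_index, num_classes, epsilon=0.15):
    """Return a label-smoothed target distribution (uniform smoothing, assumed)."""
    off_value = epsilon / num_classes
    dist = [off_value] * num_classes
    # The true class keeps the remaining probability mass.
    dist[target_index] += 1.0 - epsilon
    return dist


# Toy example: 4 classes, true class at index 2.
smoothed = smooth_one_hot(target_index=2, num_classes=4)
print(smoothed)
```

In this toy case the true class ends up with probability 0.8875 and each other class with 0.0375, so the distribution still sums to 1.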
134
+ ## Citation
135
+ ```
136
+
137
+ @inproceedings{ushio-etal-2022-generative,
138
+ title = "{G}enerative {L}anguage {M}odels for {P}aragraph-{L}evel {Q}uestion {G}eneration",
139
+ author = "Ushio, Asahi and
140
+ Alva-Manchego, Fernando and
141
+ Camacho-Collados, Jose",
142
+ booktitle = "Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing",
143
+ month = dec,
144
+ year = "2022",
145
+ address = "Abu Dhabi, U.A.E.",
146
+ publisher = "Association for Computational Linguistics",
147
+ }
148
+
149
+ ```
config.json CHANGED
@@ -1,5 +1,5 @@
1
  {
2
- "_name_or_path": "lmqg_output/t5_small_tweetqa/best_model",
3
  "add_prefix": false,
4
  "architectures": [
5
  "T5ForConditionalGeneration"
 
1
  {
2
+ "_name_or_path": "lmqg_output/t5_small_tweetqa/model_eszyci/epoch_15",
3
  "add_prefix": false,
4
  "architectures": [
5
  "T5ForConditionalGeneration"
pytorch_model.bin CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:838e6c4ab72a2c943e2140f89cd4f0caaa16dde03102b20cfa071cc80714d848
3
- size 242014489
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:0016419353e7d7f80a14aa8daf64879f65b06e56c3a9a019ad70ed8fb198a6a1
3
+ size 242016345
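The `pytorch_model.bin` change above is a Git LFS pointer update: only the `oid` and `size` lines differ. As a rough sketch (the helper name `parse_lfs_pointer` is hypothetical), the two pointers can be parsed and compared like this:

```python
def parse_lfs_pointer(text):
    """Parse a Git LFS pointer file into its version, hash and size fields."""
    fields = {}
    for line in text.strip().splitlines():
        # Each pointer line is "<key> <value>", separated by a single space.
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


old = parse_lfs_pointer(
    """version https://git-lfs.github.com/spec/v1
oid sha256:838e6c4ab72a2c943e2140f89cd4f0caaa16dde03102b20cfa071cc80714d848
size 242014489"""
)
new = parse_lfs_pointer(
    """version https://git-lfs.github.com/spec/v1
oid sha256:0016419353e7d7f80a14aa8daf64879f65b06e56c3a9a019ad70ed8fb198a6a1
size 242016345"""
)

# Difference in stored file size between the two commits, in bytes.
print(new["size"] - old["size"])
```

The new weights file is 1856 bytes larger than the previous one, and the differing digests confirm that the binary content changed.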
tokenizer_config.json CHANGED
@@ -104,7 +104,7 @@
104
  "eos_token": "</s>",
105
  "extra_ids": 100,
106
  "model_max_length": 512,
107
- "name_or_path": "lmqg_output/t5_small_tweetqa/best_model",
108
  "pad_token": "<pad>",
109
  "special_tokens_map_file": null,
110
  "tokenizer_class": "T5Tokenizer",
 
104
  "eos_token": "</s>",
105
  "extra_ids": 100,
106
  "model_max_length": 512,
107
+ "name_or_path": "lmqg_output/t5_small_tweetqa/model_eszyci/epoch_15",
108
  "pad_token": "<pad>",
109
  "special_tokens_map_file": null,
110
  "tokenizer_class": "T5Tokenizer",