---
license: afl-3.0
---

# Generating Declarative Statements from QA Pairs

There are already some rule-based models that can accomplish this task, but I haven't seen any transformer-based models that can do so. Therefore, I trained this model based on `BART-base` to transform QA pairs into declarative statements.
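
For illustration, the model can be loaded with the standard `transformers` seq2seq classes. This is a minimal sketch, not taken from the model's own documentation: the repository id and the way the question and answer are concatenated into a single input are assumptions, so adjust both to match how the model was actually trained.

```python
from transformers import BartForConditionalGeneration, BartTokenizer

# Assumed Hub id -- replace with this model's actual repository id.
MODEL_ID = "MarkS/bart-base-qa2d"

tokenizer = BartTokenizer.from_pretrained(MODEL_ID)
model = BartForConditionalGeneration.from_pretrained(MODEL_ID)

# Assumed input format: question and answer joined into one sequence.
qa_pair = "question: Where was the 2008 Olympics held? answer: Beijing"
inputs = tokenizer(qa_pair, return_tensors="pt")
output_ids = model.generate(**inputs, num_beams=4, max_length=64)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
# Expected output along the lines of: "The 2008 Olympics were held in Beijing."
```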

I compared my model with other rule-based models, including

> [paper1](https://aclanthology.org/D19-5401.pdf) (2019), which proposes the **2 Encoder Pointer-Gen model**

and

> [paper2](https://arxiv.org/pdf/2112.03849.pdf) (2021), which proposes the **RBV2 model**

**Here are the results compared with the 2 Encoder Pointer-Gen model (on the test sets released by paper1).**

Test on the main test set

| Model   | 2 Encoder Pointer-Gen (2019) | BART-base  |
| ------- | ---------------------------- | ---------- |
| BLEU    | 74.05                        | **78.878** |
| ROUGE-1 | 91.24                        | **91.937** |
| ROUGE-2 | 81.91                        | **82.177** |
| ROUGE-L | 86.25                        | **87.172** |

Test on the NewsQA test set

| Model   | 2 Encoder Pointer-Gen | BART       |
| ------- | --------------------- | ---------- |
| BLEU    | 73.29                 | **74.966** |
| ROUGE-1 | **95.38**             | 89.328     |
| ROUGE-2 | **87.18**             | 78.538     |
| ROUGE-L | **93.65**             | 87.583     |

Test on the free_base test set

| Model   | 2 Encoder Pointer-Gen | BART       |
| ------- | --------------------- | ---------- |
| BLEU    | 75.41                 | **76.082** |
| ROUGE-1 | **93.46**             | 92.693     |
| ROUGE-2 | **82.29**             | 81.216     |
| ROUGE-L | **87.5**              | 86.834     |

**As paper2 doesn't release its own dataset, it's hard to make a fair comparison. But according to the results reported in paper2, the BLEU and ROUGE scores of their model are lower than those of MPG, which is the same model as the 2 Encoder Pointer-Gen model.**

| Model        | BLEU | ROUGE-1 | ROUGE-2 | ROUGE-L |
| ------------ | ---- | ------- | ------- | ------- |
| RBV2         | 74.8 | 95.3    | 83.1    | 90.3    |
| RBV2+BERT    | 71.5 | 93.9    | 82.4    | 89.5    |
| RBV2+RoBERTa | 72.1 | 94      | 83.1    | 89.8    |
| RBV2+XLNET   | 71.2 | 93.6    | 82.3    | 89.4    |
| MPG          | 75.8 | 94.4    | 87.4    | 91.6    |

There is therefore reason to believe that my model also performs better than RBV2.

To sum up, my model performs nearly as well as the SOTA rule-based model when evaluated with BLEU and ROUGE scores. However, its output sentence patterns lack diversity.

(It's worth mentioning that even though I tried my best to conduct objective tests, the test sets I could find were more or less different from those introduced in the papers.)
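
For context on the scores above, here is a minimal sketch of how BLEU and ROUGE can be computed with the Hugging Face `evaluate` library. This is an assumption about tooling, not the exact setup used here or in the papers; different BLEU variants and tokenizers produce somewhat different numbers, so results will not match the tables exactly.

```python
import evaluate

# Toy data -- replace with model outputs and gold declarative sentences.
predictions = ["The 2008 Olympics were held in Beijing."]
references = [["The 2008 Olympics were held in Beijing."]]

# sacreBLEU takes a list of reference lists, one list per prediction.
bleu = evaluate.load("sacrebleu")
print(bleu.compute(predictions=predictions, references=references)["score"])

# ROUGE takes one reference string per prediction.
rouge = evaluate.load("rouge")
scores = rouge.compute(predictions=predictions,
                       references=[refs[0] for refs in references])
print(scores["rouge1"], scores["rouge2"], scores["rougeL"])
```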