nbansal committed on
Commit
eb36db1
1 Parent(s): 27a1559

Added deviation section in README

Files changed (2)
  1. README.md +26 -0
  2. pipeline.png +0 -0
README.md CHANGED
@@ -89,6 +89,32 @@ SentenceTransformer, such as `all-mpnet-base-v2` or `roberta-base`. You can exte
extending the `Encoder` base class in the `encoder_models.py` file.

## Deviations from Published Methodology
In our implementation, we expand upon the methodology presented in the original paper, which focused solely on extractive model summaries. The primary approach in the paper involved ranking sentences in the source document based on ground-truth reference sentences. The Normalized Cumulative Gain (NCG) score was computed using the formula:

$$\text{NCG} = \frac{\text{cumulative gain}}{\text{ideal cumulative gain}}$$

as depicted in the following image:

![pipeline](pipeline.png)
Key deviations in our implementation from the paper include:

1. **Inclusion of Abstractive Model Summaries:** Unlike the paper, which exclusively considered extractive model summaries, our implementation supports both extractive and abstractive summarization models.
2. **Enhanced Calculation of NCG Scores:** For both extractive and abstractive summaries, we compute rankings based on both the reference/ground truth (`gt_gain`) and predicted summaries (`pred_gain`). The NCG score is calculated using the method shown below:
```python
def compute_ncg(pred_gains, gt_gains, k: int) -> float:
    # Gains are (sentence_position, gain) pairs, ranked in descending
    # order of gain.
    gt_dict = dict(gt_gains)
    # Ideal cumulative gain: top-k gains from the ground-truth ranking.
    gt_rel = [v for _, v in gt_gains[:k]]
    # Model cumulative gain: ground-truth gains of the top-k predicted positions.
    model_rel = [gt_dict[position] for position, _ in pred_gains[:k]]
    return sum(model_rel) / sum(gt_rel)
```

This approach allows us to evaluate summarization quality across both extractive and abstractive methods, providing a more comprehensive assessment than the original methodology.
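
For concreteness, here is a self-contained sketch of how `compute_ncg` behaves on hypothetical gain lists (the definition is repeated for completeness, and the gain values below are made up purely for illustration):

```python
def compute_ncg(pred_gains, gt_gains, k: int) -> float:
    gt_dict = dict(gt_gains)
    gt_rel = [v for _, v in gt_gains[:k]]
    model_rel = [gt_dict[position] for position, _ in pred_gains[:k]]
    return sum(model_rel) / sum(gt_rel)

# Hypothetical gains: (sentence_position, gain) pairs, ranked by gain.
gt_gains = [(0, 3.0), (2, 2.0), (1, 1.0)]    # ideal (ground-truth) ranking
pred_gains = [(2, 0.9), (1, 0.8), (0, 0.1)]  # model's predicted ranking

# Ideal top-2 gain = 3.0 + 2.0 = 5.0; the model's top-2 positions (2, 1)
# have ground-truth gains 2.0 + 1.0 = 3.0, so NCG@2 = 3.0 / 5.0 = 0.6.
print(compute_ncg(pred_gains, gt_gains, k=2))  # → 0.6
```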

## Citation
```bibtex
pipeline.png ADDED