BridgeAI-Lab/Sem-nCG

nbansal committed on Jul 8, 2024

Commit

eb36db1

1 Parent(s): 27a1559

Added deviation section in README

Files changed (2)

README.md +26 -0
pipeline.png +0 -0

README.md CHANGED

@@ -89,6 +89,32 @@ SentenceTransformer, such as `all-mpnet-base-v2` or `roberta-base`. You can exte
 extending the `Encoder` base class in the `encoder_models.py` file.
 ## Deviations from Published Methodology
+In our implementation, we expand upon the methodology presented in the original paper, which focused solely on
+extractive model summaries. The primary approach in the paper involved ranking sentences in the source document based on
+ground-truth reference sentences. The Normalized Cumulative Gain (NCG) score was computed using the formula:
+$$\text{NCG} = \frac{\text{cumulative gain}}{\text{ideal cumulative gain}}$$
+as depicted in the following image:
+![img.png](pipeline.png)
+Key deviations in our implementation from the paper include:
+1. **Inclusion of Abstractive Model Summaries:** Unlike the paper, which exclusively considered extractive model
+summaries, our implementation supports both extractive and abstractive summarization models.
+2. **Enhanced Calculation of NCG Scores:** For both extractive and abstractive summaries, we compute two rankings: one
+based on the reference (ground-truth) summary (`gt_gains`) and one based on the predicted summary (`pred_gains`). The
+NCG score is then calculated using the method shown below:
+```python
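+def compute_ncg(pred_gains, gt_gains, k: int) -> float:
+    # Both inputs are lists of (position, gain) pairs in ranked order.
+    gt_dict = dict(gt_gains)  # position -> ground-truth gain
+    # Ideal cumulative gain: the ground-truth gains of the top-k reference ranking.
+    gt_rel = [v for _, v in gt_gains[:k]]
+    # Cumulative gain: ground-truth gains of the positions the prediction ranks in its top k.
+    model_rel = [gt_dict[position] for position, _ in pred_gains[:k]]
+    return sum(model_rel) / sum(gt_rel)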
+```
+This approach lets us evaluate summarization quality for both extractive and abstractive models, whereas the original
+methodology covered extractive models only.
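As a rough illustration of how `compute_ncg` behaves, the sketch below repeats the function from the README addition and runs it on a small hand-made input. The `(position, gain)` pairs and their integer gain values are invented for this example and are not taken from the repository; the sketch also assumes both lists are already sorted by descending gain, which the slicing `[:k]` suggests but the README does not state explicitly.

```python
def compute_ncg(pred_gains, gt_gains, k: int) -> float:
    # Same logic as the function added in this commit.
    gt_dict = dict(gt_gains)
    gt_rel = [v for _, v in gt_gains[:k]]
    model_rel = [gt_dict[position] for position, _ in pred_gains[:k]]
    return sum(model_rel) / sum(gt_rel)


# Hypothetical rankings over four source sentences (positions 0..3).
# Each pair is (sentence position, gain); the values are made up.
gt_gains = [(2, 4), (0, 3), (1, 2), (3, 1)]    # ranking induced by the reference
pred_gains = [(2, 4), (1, 3), (0, 2), (3, 1)]  # ranking induced by the model summary

# k = 1: both rankings put sentence 2 first, so the score is 4 / 4.
ncg_at_1 = compute_ncg(pred_gains, gt_gains, k=1)

# k = 2: the model's top two are sentences 2 and 1 (ground-truth gains 4 + 2),
# while the ideal top two are sentences 2 and 0 (4 + 3), giving 6 / 7.
ncg_at_2 = compute_ncg(pred_gains, gt_gains, k=2)

print(ncg_at_1, ncg_at_2)
```

Note that the function as written raises `ZeroDivisionError` when the top-k ground-truth gains sum to zero, and `KeyError` when a predicted position is absent from `gt_gains`, so callers may want to check for those cases.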
 ## Citation
 ```bibtex

pipeline.png ADDED