Added deviation section in README
- README.md +26 -0
- pipeline.png +0 -0
README.md
CHANGED
@@ -89,6 +89,32 @@ SentenceTransformer, such as `all-mpnet-base-v2` or `roberta-base`. You can exte
extending the `Encoder` base class in the `encoder_models.py` file.

## Deviations from Published Methodology
Our implementation extends the methodology of the original paper, which focused solely on extractive model
summaries. The paper's primary approach ranks the sentences of the source document based on the ground-truth
reference sentences and computes the Normalized Cumulative Gain (NCG) score using the formula:

$$\text{NCG} = \frac{\text{cumulative gain}}{\text{ideal cumulative gain}}$$

as depicted in the following image:
![img.png](pipeline.png)
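
As a worked illustration of the formula, with purely hypothetical numbers: suppose that for $k = 2$ the two
highest-ranked ground-truth sentences have gains $0.9$ and $0.8$, while the two sentences picked by the model have
ground-truth gains $0.9$ and $0.5$. Under these assumed values the score would be:

$$\text{NCG} = \frac{0.9 + 0.5}{0.9 + 0.8} = \frac{1.4}{1.7} \approx 0.82$$

A model whose top-$k$ selection matches the ground-truth top-$k$ would score $1$ under the same formula.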

Our implementation deviates from the paper in the following key ways:
1. **Inclusion of Abstractive Model Summaries:** Unlike the paper, which exclusively considered extractive model
   summaries, our implementation supports both extractive and abstractive summarization models.
2. **Enhanced Calculation of NCG Scores:** For both extractive and abstractive summaries, we compute rankings based on
   both the reference (ground-truth) summary (`gt_gains`) and the predicted summary (`pred_gains`). The NCG score is
   calculated using the method shown below:
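```python
def compute_ncg(pred_gains, gt_gains, k: int) -> float:
    # Both inputs are lists of (position, gain) pairs, assumed ranked by descending gain.
    gt_dict = dict(gt_gains)  # position -> ground-truth gain
    gt_rel = [v for _, v in gt_gains[:k]]  # gains of the ground-truth top-k (ideal cumulative gain)
    # Ground-truth gains of the positions the model ranked in its top-k
    model_rel = [gt_dict[position] for position, _ in pred_gains[:k]]
    return sum(model_rel) / sum(gt_rel)
```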

This approach lets us evaluate summarization quality for both extractive and abstractive models with the same metric,
giving a more comprehensive assessment than the original methodology.
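
As a rough usage sketch, assume `pred_gains` and `gt_gains` are lists of `(position, gain)` pairs sorted by descending
gain. The toy values below are hypothetical, and `compute_ncg` is repeated from above only so that the snippet runs on
its own:

```python
def compute_ncg(pred_gains, gt_gains, k: int) -> float:
    gt_dict = dict(gt_gains)
    gt_rel = [v for _, v in gt_gains[:k]]
    model_rel = [gt_dict[position] for position, _ in pred_gains[:k]]
    return sum(model_rel) / sum(gt_rel)


# Hypothetical toy data: (sentence position, gain), sorted by descending gain.
gt_gains = [(2, 0.9), (0, 0.8), (3, 0.5), (1, 0.1)]
pred_gains = [(2, 0.7), (3, 0.6), (0, 0.4), (1, 0.2)]

# k=2: the model's top-2 positions are 2 and 3, whose ground-truth gains are 0.9 and 0.5.
score_at_2 = compute_ncg(pred_gains, gt_gains, k=2)
# k=4: every position is included, so the model recovers the full ideal gain.
score_at_4 = compute_ncg(pred_gains, gt_gains, k=4)

print(round(score_at_2, 4))  # 0.8235
print(round(score_at_4, 4))  # 1.0
```

Note that, as written, the function raises a `KeyError` if a predicted position is missing from `gt_gains`, and a
`ZeroDivisionError` if the top-`k` ground-truth gains sum to zero, so callers may want to guard against both cases.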

## Citation
```bibtex
pipeline.png
ADDED