Update Space (evaluate main: b4d71080)
Files changed:
- README.md +3 -0
- requirements.txt +1 -1
README.md
CHANGED
@@ -116,6 +116,9 @@ While the correlation between METEOR and human judgments was measured for Chines
 
 Furthermore, while the alignment and matching done in METEOR is based on unigrams, using multiple word entities (e.g. bigrams) could contribute to improving its accuracy -- this has been proposed in [more recent publications](https://www.cs.cmu.edu/~alavie/METEOR/pdf/meteor-naacl-2010.pdf) on the subject.
 
+Scores differ by up to **±10 points** across v1.0↔v1.5 and flag combinations (`-l`, `-norm`, `-vOut`).
+Pin the Java package and document your flags. This uses the NLTK implementation (METEOR v1.0).
+[Lübbers, 2024](https://github.com/cluebbers/Reproducibility-METEOR-NLP)
 
 ## Citation
 
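One way to act on the "document your flags" advice in the added README lines is to store the implementation, its version, and the scoring parameters next to every reported score. The sketch below is only an illustration: `describe_metric_setup` is a hypothetical helper, not part of `evaluate` or NLTK, and the `alpha`, `beta`, and `gamma` values are assumed to match the defaults of the NLTK METEOR implementation, so they should be checked against the installed version.

```python
from importlib import metadata


def describe_metric_setup(package: str, settings: dict) -> dict:
    """Return a small record describing how a metric was computed.

    Hypothetical helper: it only looks up the installed version of
    `package` and copies the settings that were passed in.
    """
    try:
        version = metadata.version(package)
    except metadata.PackageNotFoundError:
        # Keep the record usable even when the package is missing.
        version = "not installed"
    return {"package": package, "version": version, "settings": dict(settings)}


# Assumed values; verify them against the METEOR implementation you run.
meteor_settings = {"alpha": 0.9, "beta": 3.0, "gamma": 0.5}
record = describe_metric_setup("nltk", meteor_settings)
print(record)
```

Publishing a record like this alongside a score lets a reader tell which implementation and parameters produced it.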
requirements.txt
CHANGED
@@ -1,2 +1,2 @@
-git+https://github.com/huggingface/evaluate@
+git+https://github.com/huggingface/evaluate@b4d710804b601459dc9266ad4622dcbe9b056d26
 nltk
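The requirements.txt change pins `evaluate` to a full commit hash. A minimal sketch of how such a pin could be checked automatically is shown below; `is_pinned_to_commit` is a hypothetical helper, and it assumes that a pinned `git+` requirement ends in `@` followed by a 40-character lowercase hex SHA, with no trailing `#egg=` or `#subdirectory=` fragment.

```python
import re

# Assumption: a pinned VCS requirement ends in "@" plus a full 40-hex-char SHA.
_FULL_SHA_SUFFIX = re.compile(r"@[0-9a-f]{40}$")


def is_pinned_to_commit(requirement: str) -> bool:
    """Return True if a `git+` requirement line ends in a full commit SHA."""
    line = requirement.strip()
    if not line.startswith("git+"):
        # Plain package names such as "nltk" are not VCS requirements.
        return False
    return _FULL_SHA_SUFFIX.search(line) is not None


old_line = "git+https://github.com/huggingface/evaluate@"
new_line = (
    "git+https://github.com/huggingface/evaluate"
    "@b4d710804b601459dc9266ad4622dcbe9b056d26"
)

for line in (old_line, new_line, "nltk"):
    print(f"{line!r}: pinned={is_pinned_to_commit(line)}")
```

Note that `nltk` itself stays unpinned in this commit, so the METEOR implementation version can still drift between builds.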