lvwerra HF Staff committed on
Commit 4b57ab1 · 1 Parent(s): 8ce4d9c

Update Space (evaluate main: b4d71080)

Files changed (2)
  1. README.md +3 -0
  2. requirements.txt +1 -1
README.md CHANGED
@@ -116,6 +116,9 @@ While the correlation between METEOR and human judgments was measured for Chines
 
 Furthermore, while the alignment and matching done in METEOR is based on unigrams, using multiple word entities (e.g. bigrams) could contribute to improving its accuracy -- this has been proposed in [more recent publications](https://www.cs.cmu.edu/~alavie/METEOR/pdf/meteor-naacl-2010.pdf) on the subject.
 
+Scores differ by up to **±10 points** across v1.0↔v1.5 and flag combinations (`-l`, `-norm`, `-vOut`).
+Pin the Java package and document your flags. This uses the NLTK implementation (METEOR v1.0).
+[Lübbers, 2024](https://github.com/cluebbers/Reproducibility-METEOR-NLP)
 
 ## Citation
 
requirements.txt CHANGED
@@ -1,2 +1,2 @@
-git+https://github.com/huggingface/evaluate@56af7abbb160fa2a5a3c0d268f2bfd3baff8015c
+git+https://github.com/huggingface/evaluate@b4d710804b601459dc9266ad4622dcbe9b056d26
 nltk
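
The README note added in this commit asks users to pin the METEOR implementation and document their flags. One way to act on that is to store the configuration next to every reported score and refuse to compare scores whose configurations differ. The sketch below is only an illustration: `MeteorConfig`, `comparable`, and `report` are hypothetical names, not part of `evaluate`, NLTK, or the METEOR Java package, and the score passed in is a placeholder rather than a measured result.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class MeteorConfig:
    """Hypothetical record of how a METEOR score was produced."""
    implementation: str  # e.g. "nltk" or "java"
    version: str         # e.g. "1.0" or "1.5"
    flags: tuple = ()    # command-line flags as passed, e.g. ("-l", "-norm")


def comparable(a: MeteorConfig, b: MeteorConfig) -> bool:
    """Treat two scores as comparable only if every setting matches."""
    return a == b


def report(score: float, config: MeteorConfig) -> str:
    """Serialize a score together with its configuration."""
    return json.dumps({"meteor": score, "config": asdict(config)}, sort_keys=True)


# Placeholder configurations for illustration only.
nltk_cfg = MeteorConfig(implementation="nltk", version="1.0")
java_cfg = MeteorConfig(implementation="java", version="1.5", flags=("-l", "-norm"))

print(report(0.5, nltk_cfg))           # 0.5 is a placeholder score
print(comparable(nltk_cfg, java_cfg))  # different implementation, version, and flags
```

Keeping the configuration in the same record as the score means a results table cannot silently mix versions or flag combinations.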