MiriUll committed
Commit c2da874
1 Parent(s): 9a4336b

Update README.md

Files changed (1)
  1. README.md +48 -4
README.md CHANGED
@@ -1,12 +1,56 @@
  ---
  title: Negbleurt
- emoji: 🏢
- colorFrom: purple
- colorTo: purple
  sdk: gradio
  sdk_version: 3.38.0
  app_file: app.py
  pinned: false
  ---
 
- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference

  ---
  title: Negbleurt
+ emoji: 🌖
+ colorFrom: indigo
+ colorTo: indigo
  sdk: gradio
  sdk_version: 3.38.0
  app_file: app.py
  pinned: false
+ license: mit
  ---
+ # Metric Card for NegBLEURT
 
+ 
+ ## Metric Description
+ 
+ NegBLEURT is the negation-aware version of the BLEURT metric. It can be used to evaluate generated text against a reference.
+ BLEURT is a learnt evaluation metric for Natural Language Generation. It is built using multiple phases of transfer learning, starting from a pretrained BERT model (Devlin et al. 2018) and then employing another pre-training phase on synthetic data. Finally, it is trained on WMT human annotations and the CANNOT negation awareness dataset.
+ 
+ ## How to Use
+ 
+ At minimum, this metric requires predictions and references as inputs.
+ 
+ ```python
+ >>> import evaluate
+ >>> negBLEURT = evaluate.load('tum-nlp/negbleurt')
+ >>> predictions = ["Ray Charles is a legend.", "Ray Charles isn’t legendary."]
+ >>> references = ["Ray Charles is legendary.", "Ray Charles is legendary."]
+ >>> results = negBLEURT.compute(predictions=predictions, references=references)
+ >>> print(results)
+ {'negBLEURT': [0.8409, 0.2601]}
+ ```
+ 
+ 
+ ### Inputs
+ - **predictions** (list of `str`): list of predictions to score. Each prediction should be a string.
+ - **references** (list of `str`): list of references, one for each prediction. Each reference should be a string.
+ - **batch_size** (`int`, optional): batch size for model inference. Default is 16. See the sketch below for a call that sets it explicitly.
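+ 
+ For larger evaluation sets, `batch_size` can be adjusted; a minimal sketch reusing the example above (the value 8 is only illustrative):
+ ```python
+ >>> import evaluate
+ >>> negBLEURT = evaluate.load('tum-nlp/negbleurt')
+ >>> predictions = ["Ray Charles is a legend.", "Ray Charles isn’t legendary."]
+ >>> references = ["Ray Charles is legendary.", "Ray Charles is legendary."]
+ >>> # score in batches of 8 sentence pairs instead of the default 16
+ >>> results = negBLEURT.compute(predictions=predictions, references=references, batch_size=8)
+ ```
+ 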
+ ### Output Values
+ - **negBLEURT** (list of `float`): NegBLEURT scores, one per prediction. Values usually range between 0 and 1, where 1 indicates a perfect prediction and 0 indicates a poor fit.
+ 
+ Output Example(s):
+ ```python
+ {'negBLEURT': [0.8409, 0.2601]}
+ ```
+ This metric outputs a dictionary containing the list of negBLEURT scores.
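+ Since the scores are returned in input order, they can be paired back with the candidates; a minimal sketch reusing `predictions` and `results` from the example above:
+ ```python
+ >>> # pick the candidate closest to the reference according to NegBLEURT
+ >>> best = max(zip(predictions, results['negBLEURT']), key=lambda pair: pair[1])
+ >>> print(best)
+ ('Ray Charles is a legend.', 0.8409)
+ ```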
+ 
+ 
+ ## Limitations and Bias
+ This metric is based on BERT (Devlin et al. 2018) and as such inherits its biases and weaknesses. However, it was trained in a negation-aware setting and thus overcomes BERT's issues with negation awareness.
+ 
+ Currently, NegBLEURT is only available in English.
+ ## Citation(s)
+ ```bibtex
+ tba
+ ```
+ ## Further References
+ - The original [NegBLEURT GitHub repo](https://github.com/MiriUll/negation_aware_evaluation)