Manuel de Prada committed on
Commit
21d7631
1 Parent(s): 80dcff0

beer metric

Files changed (1)
  1. README.md +1 -5
README.md CHANGED
@@ -11,11 +11,7 @@ tags:
  - evaluate
  - metric
  description: >-
- BEER 2.0 (BEtter Evaluation as Ranking) is a trained machine translation evaluation metric with high correlation with human judgment both on sentence and corpus level. It is a linear model-based metric for sentence-level evaluation in machine translation (MT) that combines 33 relatively dense features, including character n-grams and reordering features.
- It employs a learning-to-rank framework to differentiate between function and non-function words and weighs each word type according to its importance for evaluation.
- The model is trained on ranking similar translations using a vector of feature values for each system output.
- BEER outperforms the strong baseline metric METEOR in five out of eight language pairs, showing that less sparse features at the sentence level can lead to state-of-the-art results.
- Features on character n-grams are crucial, and higher-order character n-grams are less prone to sparse counts than word n-grams.
+ BEER 2.0 (BEtter Evaluation as Ranking) is a trained machine translation evaluation metric with high correlation with human judgment both on sentence and corpus level. It is a linear model-based metric for sentence-level evaluation in machine translation (MT) that combines 33 relatively dense features, including character n-grams and reordering features. It employs a learning-to-rank framework to differentiate between function and non-function words and weighs each word type according to its importance for evaluation. The model is trained on ranking similar translations using a vector of feature values for each system output. BEER outperforms the strong baseline metric METEOR in five out of eight language pairs, showing that less sparse features at the sentence level can lead to state-of-the-art results. Features on character n-grams are crucial, and higher-order character n-grams are less prone to sparse counts than word n-grams.
  ---
 
  # Metric Card for BEER