ferran-espuna committed
Commit 8d5e095 (1 parent: 76b490a)

Update README.md

Files changed (1)
  1. README.md +14 -14
README.md CHANGED
@@ -927,20 +927,20 @@ Further details on all tasks and criteria, a full list of results compared to ot
 
  | **Category** | **Dataset** | **Metric** | **es** | **ca** | **gl** | **eu** | **en** |
  |---------|---------|-----------|-------|-------|-------|-------|-------|
- | **Commonsense Reasoning** | **XStoryCloze** | Ending Coherence (1 to 5) | 3.24 / 0.63 | 3.12 / 0.51 | 2.87 / 0.59 | 2.16 / 0.52 | 3.71 / 0.50 |
- | **Paraphrasing** | **PAWS** | Paraphrase Completeness (0/1) | 0.86 / 0.07 | 0.82 / 0.09 | 0.78 / 0.10 | ---- / ---- | 0.92 / 0.05 |
- | | | Paraphrase Generation (1 to 5) | 3.81 / 0.54 | 3.67 / 0.55 | 3.56 / 0.57 | ---- / ---- | 3.98 / 0.37 |
- | | | Paraphrase Grammatical Correctness (0/1) | 0.93 / 0.03 | 0.92 / 0.05 | 0.89 / 0.06 | ---- / ---- | 0.96 / 0.03 |
- | **Reading Comprehension** | **Belebele** | Passage Comprehension (1 to 5) | 3.43 / 0.43 | 3.28 / 0.50 | 3.02 / 0.56 | 2.61 / 0.43 | 3.43 / 0.58 |
- | | | Answer Relevance (0/1) | 0.86 / 0.05 | 0.84 / 0.05 | 0.75 / 0.08 | 0.65 / 0.11 | 0.83 / 0.06 |
- | **Extreme Summarization** | **XLSum & caBreu & summarization_gl** | Extreme Summarization Informativeness (1 to 5) | 3.37 / 0.34 | 3.57 / 0.31 | 3.40 / 0.31 | ---- / ---- | 3.32 / 0.26 |
- | | | Extreme Summarization Conciseness (1 to 5) | 3.06 / 0.34 | 2.88 / 0.50 | 3.09 / 0.38 | ---- / ---- | 3.32 / 0.22 |
- | **Mathematics** | **mgsm** | Reasoning Capability (1 to 5) | 3.29 / 0.72 | 3.16 / 0.65 | 3.33 / 0.60 | 2.56 / 0.52 | 3.35 / 0.65 |
- | | | Mathematical Correctness (0/1) | 0.68 / 0.12 | 0.65 / 0.13 | 0.73 / 0.11 | 0.59 / 0.13 | 0.67 / 0.12 |
- | **Translation form Language** | **FLoRes** | Translation Fluency (1 to 5) | 3.95 / 0.11 | 3.88 / 0.15 | ---- / ---- | ---- / ---- | 3.92 / 0.14 |
- | | | Translation Accuracy (1 to 5) | 4.22 / 0.15 | 4.25 / 0.21 | ---- / ---- | ---- / ---- | 4.25 / 0.23 |
- | **Translation to Language** | **FLoRes** | Translation Fluency (1 to 5) | 3.92 / 0.11 | 3.84 / 0.14 | ---- / ---- | ---- / ---- | 4.19 / 0.14 |
- | | | Translation Accuracy (1 to 5) | 4.31 / 0.16 | 4.18 / 0.20 | ---- / ---- | ---- / ---- | 4.63 / 0.15 |
+ | **Commonsense Reasoning** | **XStoryCloze** | Ending Coherence (1 to 5) | 3.24/0.63 | 3.12/0.51 | 2.87/0.59 | 2.16/0.52 | 3.71/0.50 |
+ | **Paraphrasing** | **PAWS** | Paraphrase Completeness (0/1) | 0.86/0.07 | 0.82/0.09 | 0.78/0.10 | ----/---- | 0.92/0.05 |
+ | | | Paraphrase Generation (1 to 5) | 3.81/0.54 | 3.67/0.55 | 3.56/0.57 | ----/---- | 3.98/0.37 |
+ | | | Paraphrase Grammatical Correctness (0/1) | 0.93/0.03 | 0.92/0.05 | 0.89/0.06 | ----/---- | 0.96/0.03 |
+ | **Reading Comprehension** | **Belebele** | Passage Comprehension (1 to 5) | 3.43/0.43 | 3.28/0.50 | 3.02/0.56 | 2.61/0.43 | 3.43/0.58 |
+ | | | Answer Relevance (0/1) | 0.86/0.05 | 0.84/0.05 | 0.75/0.08 | 0.65/0.11 | 0.83/0.06 |
+ | **Extreme Summarization** | **XLSum & caBreu & summarization_gl** | Extreme Summarization Informativeness (1 to 5) | 3.37/0.34 | 3.57/0.31 | 3.40/0.31 | ----/---- | 3.32/0.26 |
+ | | | Extreme Summarization Conciseness (1 to 5) | 3.06/0.34 | 2.88/0.50 | 3.09/0.38 | ----/---- | 3.32/0.22 |
+ | **Mathematics** | **mgsm** | Reasoning Capability (1 to 5) | 3.29/0.72 | 3.16/0.65 | 3.33/0.60 | 2.56/0.52 | 3.35/0.65 |
+ | | | Mathematical Correctness (0/1) | 0.68/0.12 | 0.65/0.13 | 0.73/0.11 | 0.59/0.13 | 0.67/0.12 |
+ | **Translation form Language** | **FLoRes** | Translation Fluency (1 to 5) | 3.95/0.11 | 3.88/0.15 | ----/---- | ----/---- | 3.92/0.14 |
+ | | | Translation Accuracy (1 to 5) | 4.22/0.15 | 4.25/0.21 | ----/---- | ----/---- | 4.25/0.23 |
+ | **Translation to Language** | **FLoRes** | Translation Fluency (1 to 5) | 3.92/0.11 | 3.84/0.14 | ----/---- | ----/---- | 4.19/0.14 |
+ | | | Translation Accuracy (1 to 5) | 4.31/0.16 | 4.18/0.20 | ----/---- | ----/---- | 4.63/0.15 |
 
  ---
 
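
The hunk changes only whitespace: every score pair written as `a / b` becomes `a/b`, while the values themselves stay the same. As a rough illustration, a rewrite like this could be scripted rather than done by hand. The sketch below is not taken from the commit; the helper name `tighten_scores` and the regular expression are assumptions made for this example.

```python
import re

# Hypothetical helper, not part of the commit: collapse " / " into "/" only
# when it sits between two score tokens (digits or the "----" placeholder).
# Metric labels such as "(0/1)" contain no spaces and are left untouched.
SCORE_SEP = re.compile(r"(?<=[\d-]) / (?=[\d-])")


def tighten_scores(row: str) -> str:
    """Return the table row with spaces around score-pair slashes removed."""
    return SCORE_SEP.sub("/", row)


if __name__ == "__main__":
    old_row = (
        "| | | Answer Relevance (0/1) | 0.86 / 0.05 | 0.84 / 0.05 "
        "| 0.75 / 0.08 | 0.65 / 0.11 | 0.83 / 0.06 |"
    )
    print(tighten_scores(old_row))
```

Anchoring the pattern on digits and hyphens keeps the substitution away from any other ` / ` that might appear in prose elsewhere in a README, although a real run would still need a review of the resulting diff.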