vectara/hallucination_evaluation_model


Forrest Bao committed on Aug 5

Commit 9c966da • 1 Parent(s): 402fb1d

fix typos in performance numbers

Files changed (1)

README.md +3 -3


@@ -68,10 +68,10 @@ The tables below compare the two models on the [AggreFact](https://arxiv.org/pdf
 Table 1: Performance on AggreFact-SOTA
 | model     |    Balanced Accuracy | F1     | Recall | Precision |
 |:------------------------|---------:|-------:|-------:|----------:|
-| HHEM-1.0                | 78.87%   | 90.47% | 70.81% | 67.28%    |
+| HHEM-1.0                | 78.87%   | 90.47% | 70.81% | 67.27%    |
 | HHEM-2.1-Open           | 76.55%   | 66.77% | 68.48% | 65.13%    |
-| GPT-3.5-Turbo zero-shot | 72.19%   | 60.88% | 58.48% | 63.48%    |
-| GPT-4 06-13 zero-shot   | 73.78%   | 63.86% | 53.03% | 80.27%    |
+| GPT-3.5-Turbo zero-shot | 72.19%   | 60.88% | 58.48% | 63.49%    |
+| GPT-4 06-13 zero-shot   | 73.78%   | 63.87% | 53.03% | 80.28%    |
 Table 2: Performance on RAGTruth-Summ
 | model     |    Balanced Accuracy | F1         | Recall    | Precision |