## Output

Models output text only.

## Model Evaluation Results

Experiments on Arabic MMLU and EXAMs. 'Average', 'STEM', 'Humanities', 'Social Sciences' and 'Others (Business, Health, Misc)' belong to Arabic MMLU. The best performance is in bold and the second best is underlined.

| Model | Average | STEM | Humanities | Social Sciences | Others (Business, Health, Misc) | EXAMs |
|-----------------|---------|------|------------|-----------------|---------------------------------|-------|
| Bloomz (Muennighoff et al., 2022) | 30.95 | 32.32 | 26.71 | 35.85 | 28.95 | 33.89 |
| Llama2-7B | 28.81 | 28.48 | 26.68 | 29.88 | 30.18 | 23.48 |
| Llama2-13B | 31.25 | 31.06 | 27.11 | 35.50 | 31.35 | 25.45 |
| Jais-13B-base | 30.01 | 27.85 | 25.42 | 39.70 | 27.06 | 35.67 |
| AceGPT-7B-base | 30.36 | 26.63 | 28.17 | 35.15 | 31.50 | 31.96 |
| AceGPT-13B-base | <u>37.26</u> | <u>35.16</u> | <u>30.30</u> | <u>47.34</u> | <u>36.25</u> | <u>36.63</u> |
| ChatGPT | <b>46.07</b> | <b>44.17</b> | <b>35.33</b> | <b>61.26</b> | <b>43.52</b> | <b>45.63</b> |
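As a quick sanity check (ours, not part of the original evaluation), the reported 'Average' appears to be the unweighted mean of the four Arabic MMLU category columns; the sketch below verifies this for three rows of the table, using scores copied from it:

```python
# Sanity check (ours, not from the README): for the rows checked here, the
# reported Arabic MMLU "Average" matches the unweighted mean of the four
# category scores (STEM, Humanities, Social Sciences, Others).
rows = {
    # model: (STEM, Humanities, Social Sciences, Others, reported Average)
    "Jais-13B-base":   (27.85, 25.42, 39.70, 27.06, 30.01),
    "AceGPT-13B-base": (35.16, 30.30, 47.34, 36.25, 37.26),
    "ChatGPT":         (44.17, 35.33, 61.26, 43.52, 46.07),
}

for model, (*cats, reported) in rows.items():
    mean = sum(cats) / len(cats)
    print(f"{model}: mean={mean:.2f}, reported={reported:.2f}")
    assert abs(mean - reported) < 0.005
```

Note that EXAMs is scored separately and does not enter this average.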

---

## Samples

#### Sample1(Vicuna)