leaderboard-pr-bot committed on
Commit fa3d45b
1 Parent(s): 8a5bda1

Adding Evaluation Results


This is an automated PR created with https://huggingface.co/spaces/Weyaxi/open-llm-leaderboard-results-pr

The purpose of this PR is to add evaluation results from the Open LLM Leaderboard to your model card.

If you encounter any issues, please report them to https://huggingface.co/spaces/Weyaxi/open-llm-leaderboard-results-pr/discussions

Files changed (1)
  1. README.md +14 -1
README.md CHANGED
@@ -68,4 +68,17 @@ GOAT-7B-Community model weights are available under LLAMA-2 license. Note that t
 
  ### Risks and Biases
 
- GOAT-7B-Community model can produce factually incorrect output and should not be relied on to deliver factually accurate information. The model was trained on various private and public datasets. Therefore, the GOAT-7B-Community model could possibly generate wrong, biased, or otherwise offensive outputs.
+ GOAT-7B-Community model can produce factually incorrect output and should not be relied on to deliver factually accurate information. The model was trained on various private and public datasets. Therefore, the GOAT-7B-Community model could possibly generate wrong, biased, or otherwise offensive outputs.
+ # [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
+ Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_GOAT-AI__GOAT-7B-Community)
+
+ | Metric | Value |
+ |-----------------------|---------------------------|
+ | Avg. | 42.74 |
+ | ARC (25-shot) | 48.81 |
+ | HellaSwag (10-shot) | 74.63 |
+ | MMLU (5-shot) | 49.58 |
+ | TruthfulQA (0-shot) | 42.48 |
+ | Winogrande (5-shot) | 72.3 |
+ | GSM8K (5-shot) | 4.47 |
+ | DROP (3-shot) | 6.91 |
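
The `Avg.` row can be sanity-checked against the seven benchmark rows. The sketch below assumes the average is a plain unweighted mean of those seven scores; the PR does not state how it is aggregated, and the helper name `leaderboard_average` is hypothetical, not part of any leaderboard tooling.

```python
# Scores copied from the table added in this PR.
scores = {
    "ARC (25-shot)": 48.81,
    "HellaSwag (10-shot)": 74.63,
    "MMLU (5-shot)": 49.58,
    "TruthfulQA (0-shot)": 42.48,
    "Winogrande (5-shot)": 72.3,
    "GSM8K (5-shot)": 4.47,
    "DROP (3-shot)": 6.91,
}


def leaderboard_average(values):
    """Unweighted arithmetic mean (assumed aggregation, not confirmed by the PR)."""
    values = list(values)
    return sum(values) / len(values)


average = leaderboard_average(scores.values())
print(f"{average:.2f}")  # prints 42.74
```

Under that assumption the result, rounded to two decimals, agrees with the 42.74 reported in the `Avg.` row.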