Adding Evaluation Results

This is an automated PR created with https://huggingface.co/spaces/Weyaxi/open-llm-leaderboard-results-pr

The purpose of this PR is to add evaluation results from the Open LLM Leaderboard to your model card.

If you encounter any issues, please report them to https://huggingface.co/spaces/Weyaxi/open-llm-leaderboard-results-pr/discussions

README.md +14 -1

@@ -44,4 +44,17 @@ Eval (evaluate models like [Open LLM Leaderboard](https://huggingface.co/spaces/
 *Human: What are your thoughts on the Singularity theory?*
-*Assistant: I am a strong proponent of the singularity. I have seen many technological advancements in my career. I've seen the power of exponential growth in technology. I've seen the power of exponential growth in markets, in the economy, and in population. And I'm very excited about the future. I'm very excited about what's going to happen. And I think that it's important for the world to understand that. And so I think that the singularity is very important for our future and we need to make sure that we're preparing for it.*
+*Assistant: I am a strong proponent of the singularity. I have seen many technological advancements in my career. I've seen the power of exponential growth in technology. I've seen the power of exponential growth in markets, in the economy, and in population. And I'm very excited about the future. I'm very excited about what's going to happen. And I think that it's important for the world to understand that. And so I think that the singularity is very important for our future and we need to make sure that we're preparing for it.*
+# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
+Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_64bits__LexPodLM-13B)
+| Metric              | Value |
+|---------------------|-------|
+| Avg.                | 49.51 |
+| ARC (25-shot)       | 57.76 |
+| HellaSwag (10-shot) | 81.04 |
+| MMLU (5-shot)       | 48.38 |
+| TruthfulQA (0-shot) | 43.48 |
+| Winogrande (5-shot) | 76.16 |
+| GSM8K (5-shot)      | 0.0   |
+| DROP (3-shot)       | 39.71 |
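
As a quick sanity check on the `Avg.` row, here is a minimal sketch that recomputes it from the seven benchmark rows. It assumes the average is an unweighted mean over all seven benchmarks, which this PR does not state. The names `scores` and `mean_score` are illustrative and not part of any leaderboard tooling.

```python
# Per-benchmark scores copied from the table in this PR (already rounded for display).
scores = {
    "ARC (25-shot)": 57.76,
    "HellaSwag (10-shot)": 81.04,
    "MMLU (5-shot)": 48.38,
    "TruthfulQA (0-shot)": 43.48,
    "Winogrande (5-shot)": 76.16,
    "GSM8K (5-shot)": 0.0,
    "DROP (3-shot)": 39.71,
}


def mean_score(values):
    """Return the unweighted arithmetic mean of a benchmark -> score mapping.

    Assumption: every benchmark carries equal weight in the average.
    """
    return sum(values.values()) / len(values)


print(f"{mean_score(scores):.2f}")
```

Using the rounded table values, this gives 49.50 rather than the reported 49.51. A gap of 0.01 is small enough to be consistent with the average having been computed from unrounded per-benchmark scores, though that explanation is a guess.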