leaderboard-pr-bot committed on
Commit aa7adf1
1 Parent(s): 3553d84

Adding Evaluation Results

This is an automated PR created with https://huggingface.co/spaces/Weyaxi/open-llm-leaderboard-results-pr

The purpose of this PR is to add evaluation results from the Open LLM Leaderboard to your model card.

If you encounter any issues, please report them to https://huggingface.co/spaces/Weyaxi/open-llm-leaderboard-results-pr/discussions

Files changed (1)
  1. README.md +14 -1
README.md CHANGED
@@ -44,4 +44,17 @@ Eval (evaluate models like [Open LLM Leaderboard](https://huggingface.co/spaces/
 
  *Human: What are your thoughts on the Singularity theory?*
 
- *Assistant: I am a strong proponent of the singularity. I have seen many technological advancements in my career. I've seen the power of exponential growth in technology. I've seen the power of exponential growth in markets, in the economy, and in population. And I'm very excited about the future. I'm very excited about what's going to happen. And I think that it's important for the world to understand that. And so I think that the singularity is very important for our future and we need to make sure that we're preparing for it.*
+ *Assistant: I am a strong proponent of the singularity. I have seen many technological advancements in my career. I've seen the power of exponential growth in technology. I've seen the power of exponential growth in markets, in the economy, and in population. And I'm very excited about the future. I'm very excited about what's going to happen. And I think that it's important for the world to understand that. And so I think that the singularity is very important for our future and we need to make sure that we're preparing for it.*
+ # [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
+ Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_64bits__LexPodLM-13B)
+
+ | Metric | Value |
+ |-----------------------|---------------------------|
+ | Avg. | 49.51 |
+ | ARC (25-shot) | 57.76 |
+ | HellaSwag (10-shot) | 81.04 |
+ | MMLU (5-shot) | 48.38 |
+ | TruthfulQA (0-shot) | 43.48 |
+ | Winogrande (5-shot) | 76.16 |
+ | GSM8K (5-shot) | 0.0 |
+ | DROP (3-shot) | 39.71 |
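
As a quick sanity check on the table above, the `Avg.` row can be recomputed from the seven benchmark rows. This is a minimal sketch that assumes `Avg.` is the unweighted mean of those rows, which the PR does not state; the names `scores`, `mean_score`, `recomputed`, and `reported` are illustrative and not part of the leaderboard tooling.

```python
# Scores copied from the table above, rounded as reported there.
scores = {
    "ARC (25-shot)": 57.76,
    "HellaSwag (10-shot)": 81.04,
    "MMLU (5-shot)": 48.38,
    "TruthfulQA (0-shot)": 43.48,
    "Winogrande (5-shot)": 76.16,
    "GSM8K (5-shot)": 0.0,
    "DROP (3-shot)": 39.71,
}


def mean_score(values):
    """Unweighted arithmetic mean (assumed, not confirmed, to match 'Avg.')."""
    values = list(values)
    return sum(values) / len(values)


recomputed = mean_score(scores.values())
reported = 49.51  # the 'Avg.' row from the table

print(f"recomputed: {recomputed:.2f}, reported: {reported:.2f}")
# prints "recomputed: 49.50, reported: 49.51"
```

The 0.01 gap may simply come from averaging the already-rounded values shown in the table rather than the unrounded scores; that explanation is a guess, not something the PR confirms.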