Adding Evaluation Results

#1
Files changed (1)
  1. README.md +14 -1
README.md CHANGED
@@ -115,4 +115,17 @@ ASSISTANT: No, not every sibling shares their mother. For example, I have two br
  USER: Let’s assume they all share the same mother and father. Now explain with multi-step reasoning.
  ASSISTANT: If Sally has three brothers then that means she has six siblings altogether. Each brother has two sisters, which means that there are twelve female relatives in her family - one for each of her brothers and herself. Since there are only six females in the entire family, that means that Sally’s mother has no other children than her own three sons - so she can’t be the mother of any of the sisters. So all those sisters have to belong to one of the brothers, and since they are shared by two brothers, then each brother has a single sister. Therefore, there is only one sister for Sally, making the final answer 1.
  ```
- - pretty sure that reasoning isn't correct, and the final answer of having 1 sister is a lucky guess
+ - pretty sure that reasoning isn't correct, and the final answer of having 1 sister is a lucky guess
+ # [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
+ Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_openaccess-ai-collective__manticore-13b-chat-pyg)
+
+ | Metric              | Value |
+ |---------------------|-------|
+ | Avg.                | 47.71 |
+ | ARC (25-shot)       | 58.53 |
+ | HellaSwag (10-shot) | 81.96 |
+ | MMLU (5-shot)       | 48.76 |
+ | TruthfulQA (0-shot) | 48.76 |
+ | Winogrande (5-shot) | 77.19 |
+ | GSM8K (5-shot)      | 9.55  |
+ | DROP (3-shot)       | 9.19  |
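
As a quick sanity check on the table, the `Avg.` row appears to be the unweighted mean of the seven benchmark scores. The sketch below assumes that is how it was computed (the PR does not state the formula); the dictionary and the `leaderboard_average` helper are illustrative names, not part of any leaderboard tooling.

```python
# Scores copied from the table in this PR (metric name -> value).
scores = {
    "ARC (25-shot)": 58.53,
    "HellaSwag (10-shot)": 81.96,
    "MMLU (5-shot)": 48.76,
    "TruthfulQA (0-shot)": 48.76,
    "Winogrande (5-shot)": 77.19,
    "GSM8K (5-shot)": 9.55,
    "DROP (3-shot)": 9.19,
}


def leaderboard_average(values):
    """Unweighted arithmetic mean (assumed formula, not confirmed by the PR)."""
    values = list(values)
    return sum(values) / len(values)


average = leaderboard_average(scores.values())
print(f"Avg. = {average:.2f}")  # prints "Avg. = 47.71"
```

Under that assumption the computed mean matches the reported `Avg.` of 47.71.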