eren23 leaderboard-pr-bot committed on
Commit
57e9c59
1 Parent(s): f2ecd6f

Adding Evaluation Results (#3)

- Adding Evaluation Results (c1f616dc8792f0b682d60a0d075689fcdad59c1d)


Co-authored-by: Open LLM Leaderboard PR Bot <leaderboard-pr-bot@users.noreply.huggingface.co>

Files changed (1)
  1. README.md +121 -5
README.md CHANGED
@@ -1,4 +1,8 @@
  ---
  tags:
  - merge
  - mergekit
@@ -10,10 +14,109 @@ base_model:
  - mlabonne/Monarch-7B
  - paulml/OGNO-7B
  - bardsai/jaskier-7b-dpo-v5.6
- license: cc-by-nc-4.0
- language:
- - en
- library_name: diffusers
  ---

  # DPO Fine-tuned version
@@ -81,4 +184,17 @@ outputs = pipeline(prompt, max_new_tokens=256, do_sample=True, temperature=0.7,
  print(outputs[0]["generated_text"])
  ```

- GGUF Version: https://huggingface.co/eren23/ogno-monarch-jaskier-merge-7b-GGUF
  ---
+ language:
+ - en
+ license: cc-by-nc-4.0
+ library_name: diffusers
  tags:
  - merge
  - mergekit

  - mlabonne/Monarch-7B
  - paulml/OGNO-7B
  - bardsai/jaskier-7b-dpo-v5.6
+ model-index:
+ - name: ogno-monarch-jaskier-merge-7b
+   results:
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: AI2 Reasoning Challenge (25-Shot)
+       type: ai2_arc
+       config: ARC-Challenge
+       split: test
+       args:
+         num_few_shot: 25
+     metrics:
+     - type: acc_norm
+       value: 73.04
+       name: normalized accuracy
+     source:
+       url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=eren23/ogno-monarch-jaskier-merge-7b
+       name: Open LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: HellaSwag (10-Shot)
+       type: hellaswag
+       split: validation
+       args:
+         num_few_shot: 10
+     metrics:
+     - type: acc_norm
+       value: 89.09
+       name: normalized accuracy
+     source:
+       url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=eren23/ogno-monarch-jaskier-merge-7b
+       name: Open LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: MMLU (5-Shot)
+       type: cais/mmlu
+       config: all
+       split: test
+       args:
+         num_few_shot: 5
+     metrics:
+     - type: acc
+       value: 64.78
+       name: accuracy
+     source:
+       url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=eren23/ogno-monarch-jaskier-merge-7b
+       name: Open LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: TruthfulQA (0-shot)
+       type: truthful_qa
+       config: multiple_choice
+       split: validation
+       args:
+         num_few_shot: 0
+     metrics:
+     - type: mc2
+       value: 77.44
+     source:
+       url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=eren23/ogno-monarch-jaskier-merge-7b
+       name: Open LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: Winogrande (5-shot)
+       type: winogrande
+       config: winogrande_xl
+       split: validation
+       args:
+         num_few_shot: 5
+     metrics:
+     - type: acc
+       value: 84.77
+       name: accuracy
+     source:
+       url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=eren23/ogno-monarch-jaskier-merge-7b
+       name: Open LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: GSM8k (5-shot)
+       type: gsm8k
+       config: main
+       split: test
+       args:
+         num_few_shot: 5
+     metrics:
+     - type: acc
+       value: 69.45
+       name: accuracy
+     source:
+       url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=eren23/ogno-monarch-jaskier-merge-7b
+       name: Open LLM Leaderboard
  ---

  # DPO Fine-tuned version
 
  print(outputs[0]["generated_text"])
  ```

+ GGUF Version: https://huggingface.co/eren23/ogno-monarch-jaskier-merge-7b-GGUF
+ # [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
+ Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_eren23__ogno-monarch-jaskier-merge-7b)
+
+ |             Metric              |Value|
+ |---------------------------------|----:|
+ |Avg.                             |76.43|
+ |AI2 Reasoning Challenge (25-Shot)|73.04|
+ |HellaSwag (10-Shot)              |89.09|
+ |MMLU (5-Shot)                    |64.78|
+ |TruthfulQA (0-shot)              |77.44|
+ |Winogrande (5-shot)              |84.77|
+ |GSM8k (5-shot)                   |69.45|
+
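
The `Avg.` row in the results table is consistent with an unweighted mean of the six benchmark scores. The commit does not say how the leaderboard computes it, so the following is only a minimal sketch for checking the arithmetic, assuming a plain mean rounded to two decimals. The names `scores` and `mean_score` are illustrative and not part of any leaderboard tooling.

```python
# Benchmark scores copied from the results table added in this commit.
scores = {
    "AI2 Reasoning Challenge (25-Shot)": 73.04,
    "HellaSwag (10-Shot)": 89.09,
    "MMLU (5-Shot)": 64.78,
    "TruthfulQA (0-shot)": 77.44,
    "Winogrande (5-shot)": 84.77,
    "GSM8k (5-shot)": 69.45,
}


def mean_score(values):
    """Return the unweighted arithmetic mean, rounded to two decimals.

    Assumption: the leaderboard average is a plain mean of the six scores.
    """
    values = list(values)
    return round(sum(values) / len(values), 2)


average = mean_score(scores.values())
print(f"Avg. = {average}")  # compare against the Avg. row in the table
```

Under that assumption the computed value matches the 76.43 reported in the table.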