leaderboard-pr-bot committed on
Commit
732f7bb
1 Parent(s): e88684d

Adding Evaluation Results

This is an automated PR created with https://huggingface.co/spaces/Weyaxi/open-llm-leaderboard-results-pr

The purpose of this PR is to add evaluation results from the Open LLM Leaderboard to your model card.

If you encounter any issues, please report them to https://huggingface.co/spaces/Weyaxi/open-llm-leaderboard-results-pr/discussions

Files changed (1)
  1. README.md +118 -2
README.md CHANGED
@@ -1,11 +1,114 @@
1
  ---
2
- license: cc-by-nc-4.0
3
  language:
4
  - en
 
5
  tags:
6
  - mixtral
7
  - uncensored
8
  - high-intelligence
9
  ---
10
 
11
  # Orochi
@@ -39,4 +142,17 @@ As an uncensored model, Orochi may generate content that is unsuitable for all a
39
 
40
  Orochi is a product of numerous contributions from the fields of machine learning and language modeling. Special thanks to the teams behind Mixtral, mergekit, and all the individual models integrated into Orochi.
41
 
42
- ---
1
  ---
 
2
  language:
3
  - en
4
+ license: cc-by-nc-4.0
5
  tags:
6
  - mixtral
7
  - uncensored
8
  - high-intelligence
9
+ model-index:
10
+ - name: MixtralOrochi8x7B
11
+ results:
12
+ - task:
13
+ type: text-generation
14
+ name: Text Generation
15
+ dataset:
16
+ name: AI2 Reasoning Challenge (25-Shot)
17
+ type: ai2_arc
18
+ config: ARC-Challenge
19
+ split: test
20
+ args:
21
+ num_few_shot: 25
22
+ metrics:
23
+ - type: acc_norm
24
+ value: 70.31
25
+ name: normalized accuracy
26
+ source:
27
+ url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=smelborp/MixtralOrochi8x7B
28
+ name: Open LLM Leaderboard
29
+ - task:
30
+ type: text-generation
31
+ name: Text Generation
32
+ dataset:
33
+ name: HellaSwag (10-Shot)
34
+ type: hellaswag
35
+ split: validation
36
+ args:
37
+ num_few_shot: 10
38
+ metrics:
39
+ - type: acc_norm
40
+ value: 86.1
41
+ name: normalized accuracy
42
+ source:
43
+ url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=smelborp/MixtralOrochi8x7B
44
+ name: Open LLM Leaderboard
45
+ - task:
46
+ type: text-generation
47
+ name: Text Generation
48
+ dataset:
49
+ name: MMLU (5-Shot)
50
+ type: cais/mmlu
51
+ config: all
52
+ split: test
53
+ args:
54
+ num_few_shot: 5
55
+ metrics:
56
+ - type: acc
57
+ value: 70.13
58
+ name: accuracy
59
+ source:
60
+ url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=smelborp/MixtralOrochi8x7B
61
+ name: Open LLM Leaderboard
62
+ - task:
63
+ type: text-generation
64
+ name: Text Generation
65
+ dataset:
66
+ name: TruthfulQA (0-shot)
67
+ type: truthful_qa
68
+ config: multiple_choice
69
+ split: validation
70
+ args:
71
+ num_few_shot: 0
72
+ metrics:
73
+ - type: mc2
74
+ value: 63.99
75
+ source:
76
+ url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=smelborp/MixtralOrochi8x7B
77
+ name: Open LLM Leaderboard
78
+ - task:
79
+ type: text-generation
80
+ name: Text Generation
81
+ dataset:
82
+ name: Winogrande (5-shot)
83
+ type: winogrande
84
+ config: winogrande_xl
85
+ split: validation
86
+ args:
87
+ num_few_shot: 5
88
+ metrics:
89
+ - type: acc
90
+ value: 79.87
91
+ name: accuracy
92
+ source:
93
+ url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=smelborp/MixtralOrochi8x7B
94
+ name: Open LLM Leaderboard
95
+ - task:
96
+ type: text-generation
97
+ name: Text Generation
98
+ dataset:
99
+ name: GSM8k (5-shot)
100
+ type: gsm8k
101
+ config: main
102
+ split: test
103
+ args:
104
+ num_few_shot: 5
105
+ metrics:
106
+ - type: acc
107
+ value: 17.29
108
+ name: accuracy
109
+ source:
110
+ url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=smelborp/MixtralOrochi8x7B
111
+ name: Open LLM Leaderboard
112
  ---
113
 
114
  # Orochi
 
142
 
143
  Orochi is a product of numerous contributions from the fields of machine learning and language modeling. Special thanks to the teams behind Mixtral, mergekit, and all the individual models integrated into Orochi.
144
 
145
+ ---
146
+ # [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
147
+ Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_smelborp__MixtralOrochi8x7B)
148
+
149
+ | Metric |Value|
150
+ |---------------------------------|----:|
151
+ |Avg. |64.62|
152
+ |AI2 Reasoning Challenge (25-Shot)|70.31|
153
+ |HellaSwag (10-Shot) |86.10|
154
+ |MMLU (5-Shot) |70.13|
155
+ |TruthfulQA (0-shot) |63.99|
156
+ |Winogrande (5-shot) |79.87|
157
+ |GSM8k (5-shot) |17.29|
158
+
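
The `Avg.` row in the table above appears to be the unweighted mean of the six benchmark scores. Below is a minimal sketch for checking that, assuming a simple arithmetic mean. The names `scores` and `average` are illustrative and are not part of the PR or the leaderboard tooling.

```python
# Benchmark scores copied from the table added by this PR.
scores = {
    "AI2 Reasoning Challenge (25-Shot)": 70.31,
    "HellaSwag (10-Shot)": 86.10,
    "MMLU (5-Shot)": 70.13,
    "TruthfulQA (0-shot)": 63.99,
    "Winogrande (5-shot)": 79.87,
    "GSM8k (5-shot)": 17.29,
}

# Assumption: "Avg." is the plain (unweighted) mean of the six values.
average = sum(scores.values()) / len(scores)

# Compare against the reported value, allowing for rounding to two decimals.
reported_avg = 64.62
matches_reported = abs(average - reported_avg) < 0.01

print(f"computed mean: {average:.3f}, matches reported: {matches_reported}")
```

The six values sum to 387.69, so the mean is about 64.615, which agrees with the reported 64.62 up to rounding.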