Files changed (1)
  1. README.md +120 -6
README.md CHANGED
@@ -1,10 +1,14 @@
 ---
 license: apache-2.0
 tags:
 - LLMs
 - mistral
 - math
 - Intel
 model-index:
 - name: neural-chat-7b-v3-2
   results:
@@ -12,11 +16,11 @@ model-index:
       type: Large Language Model
       name: Large Language Model
     dataset:
-      type: meta-math/MetaMathQA
       name: meta-math/MetaMathQA
     metrics:
     - type: ARC (25-shot)
-      value: 67.49
       name: ARC (25-shot)
       verified: true
     - type: HellaSwag (10-shot)
@@ -39,10 +43,106 @@ model-index:
       value: 55.12
       name: GSM8K (5-shot)
       verified: true
-datasets:
-- meta-math/MetaMathQA
-language:
-- en
 ---
 
 ## Model Details: Neural-Chat-v3-2
@@ -248,3 +348,17 @@ Here are a couple of useful links to learn more about Intel's AI software:
 ## Disclaimer
 
 The license on this model does not constitute legal advice. We are not responsible for the actions of third parties who use this model. Please consult an attorney before using this model for commercial purposes.

 ---
+language:
+- en
 license: apache-2.0
 tags:
 - LLMs
 - mistral
 - math
 - Intel
+datasets:
+- meta-math/MetaMathQA
 model-index:
 - name: neural-chat-7b-v3-2
   results:

       type: Large Language Model
       name: Large Language Model
     dataset:
       name: meta-math/MetaMathQA
+      type: meta-math/MetaMathQA
     metrics:
     - type: ARC (25-shot)
+      value: 67.49
       name: ARC (25-shot)
       verified: true
     - type: HellaSwag (10-shot)

       value: 55.12
       name: GSM8K (5-shot)
       verified: true
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: AI2 Reasoning Challenge (25-Shot)
+      type: ai2_arc
+      config: ARC-Challenge
+      split: test
+      args:
+        num_few_shot: 25
+    metrics:
+    - type: acc_norm
+      value: 67.49
+      name: normalized accuracy
+    source:
+      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Intel/neural-chat-7b-v3-2
+      name: Open LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: HellaSwag (10-Shot)
+      type: hellaswag
+      split: validation
+      args:
+        num_few_shot: 10
+    metrics:
+    - type: acc_norm
+      value: 83.92
+      name: normalized accuracy
+    source:
+      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Intel/neural-chat-7b-v3-2
+      name: Open LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: MMLU (5-Shot)
+      type: cais/mmlu
+      config: all
+      split: test
+      args:
+        num_few_shot: 5
+    metrics:
+    - type: acc
+      value: 63.55
+      name: accuracy
+    source:
+      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Intel/neural-chat-7b-v3-2
+      name: Open LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: TruthfulQA (0-shot)
+      type: truthful_qa
+      config: multiple_choice
+      split: validation
+      args:
+        num_few_shot: 0
+    metrics:
+    - type: mc2
+      value: 59.68
+    source:
+      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Intel/neural-chat-7b-v3-2
+      name: Open LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: Winogrande (5-shot)
+      type: winogrande
+      config: winogrande_xl
+      split: validation
+      args:
+        num_few_shot: 5
+    metrics:
+    - type: acc
+      value: 79.95
+      name: accuracy
+    source:
+      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Intel/neural-chat-7b-v3-2
+      name: Open LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: GSM8k (5-shot)
+      type: gsm8k
+      config: main
+      split: test
+      args:
+        num_few_shot: 5
+    metrics:
+    - type: acc
+      value: 55.12
+      name: accuracy
+    source:
+      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Intel/neural-chat-7b-v3-2
+      name: Open LLM Leaderboard
 ---
 
 ## Model Details: Neural-Chat-v3-2
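The `model-index` entries added above follow the metadata schema the Open LLM Leaderboard uses for its automated card updates, so the reported scores can be read back out of the card programmatically. A minimal sketch, assuming the updated README.md is saved locally and PyYAML is installed; the file path and field access are illustrative, not part of this change:

```python
import yaml  # pip install pyyaml

# Pull the YAML front matter (the block between the two leading "---" fences)
# out of the edited model card and print every reported metric.
with open("README.md", encoding="utf-8") as f:  # local path is an assumption
    text = f.read()

# Quick-and-dirty split: index 1 is the content between the first two "---" fences.
front_matter = text.split("---")[1]
card = yaml.safe_load(front_matter)

for result in card["model-index"][0]["results"]:
    dataset = result.get("dataset", {}).get("name", "unknown dataset")
    for metric in result.get("metrics", []):
        print(f'{dataset}: {metric["type"]} = {metric.get("value")}')
```

The same front matter is also exposed as structured data by `huggingface_hub.ModelCard`, so downstream tooling does not have to split the file by hand.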
 
 ## Disclaimer
 
 The license on this model does not constitute legal advice. We are not responsible for the actions of third parties who use this model. Please consult an attorney before using this model for commercial purposes.
+
+# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
+Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_Intel__neural-chat-7b-v3-2)
+
+| Metric                          |Value|
+|---------------------------------|----:|
+|Avg.                             |68.29|
+|AI2 Reasoning Challenge (25-Shot)|67.49|
+|HellaSwag (10-Shot)              |83.92|
+|MMLU (5-Shot)                    |63.55|
+|TruthfulQA (0-shot)              |59.68|
+|Winogrande (5-shot)              |79.95|
+|GSM8k (5-shot)                   |55.12|
+
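For reference, the Avg. row in the added table is the plain arithmetic mean of the six benchmark scores. A quick check with the values copied from the table above (a sketch, not part of the card):

```python
# Benchmark scores from the Open LLM Leaderboard table above.
scores = {
    "AI2 Reasoning Challenge (25-Shot)": 67.49,
    "HellaSwag (10-Shot)": 83.92,
    "MMLU (5-Shot)": 63.55,
    "TruthfulQA (0-shot)": 59.68,
    "Winogrande (5-shot)": 79.95,
    "GSM8k (5-shot)": 55.12,
}

# Simple mean over the six benchmarks.
average = sum(scores.values()) / len(scores)
print(f"Average: {average:.3f}")  # 68.285, reported in the table rounded to 68.29
```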