leaderboard-pr-bot committed on
Commit
fc7b43a
•
1 Parent(s): 0f57b17

Adding Evaluation Results

This is an automated PR created with https://huggingface.co/spaces/Weyaxi/open-llm-leaderboard-results-pr

The purpose of this PR is to add evaluation results from the Open LLM Leaderboard to your model card.

If you encounter any issues, please report them to https://huggingface.co/spaces/Weyaxi/open-llm-leaderboard-results-pr/discussions

Files changed (1)
  1. README.md +140 -19
README.md CHANGED
@@ -1,11 +1,11 @@
1
  ---
2
- license: apache-2.0
3
  language:
4
  - en
5
- pipeline_tag: text-generation
6
  datasets:
7
  - Skylion007/openwebtext
8
  - Locutusque/TM-DATA
 
9
  inference:
10
  parameters:
11
  do_sample: true
@@ -15,22 +15,130 @@ inference:
15
  max_new_tokens: 250
16
  repetition_penalty: 1.16
17
  widget:
18
- - text: >-
19
- TITLE: Dirichlet density QUESTION [5 upvotes]: How to solve the following
20
- exercise: Let $q$ be prime. Show that the set of primes p for which $p
21
- \equiv 1\pmod q$ and $2^{(p-1)/q} \equiv 1 \pmod p$ has Dirichlet density
22
- $\dfrac{1}{q(q-1)}$. I want to show that $X^q-2$ (mod $p$) has a solution
23
- and $q$ divides $p-1$ , these two conditions are simultaneonusly satisfied
24
- iff p splits completely in $\Bbb{Q}(\zeta_q,2^{\frac{1}{q}})$. $\zeta_q $ is
25
- primitive $q^{th}$ root of unity. If this is proved the I can conclude the
26
- result by Chebotarev density theorem. REPLY [2 votes]:
27
- - text: >-
28
- An emerging clinical approach to treat substance abuse disorders involves a
29
- form of cognitive-behavioral therapy whereby addicts learn to reduce their
30
- reactivity to drug-paired stimuli through cue-exposure or extinction
31
- training. It is, however,
32
- - text: >-
33
- \begin{document} \begin{frontmatter} \author{Mahouton Norbert Hounkonnou\corref{cor1}${}^1$} \cortext[cor1]{norbert.hounkonnou@cipma.uac.bj} \author{Sama Arjika\corref{cor2}${}^1$} \cortext[cor2]{rjksama2008@gmail.com} \author{ Won Sang Chung\corref{cor3}${}^2$ } \cortext[cor3]{mimip4444@hanmail.net} \title{\bf New families of $q$ and $(q;p)-$Hermite polynomials } \address{${}^1$International Chair of Mathematical Physics and Applications \\ (ICMPA-UNESCO Chair), University of Abomey-Calavi,\\ 072 B. P.: 50 Cotonou, Republic of Benin,\\ ${}^2$Department of Physics and Research Institute of Natural Science, \\ College of Natural Science, \\ Gyeongsang National University, Jinju 660-701, Korea } \begin{abstract} In this paper, we construct a new family of $q-$Hermite polynomials denoted by $H_n(x,s|q).$ Main properties and relations are established and
34
  ---
35
  # Training
36
  This model was trained on two datasets, shown in this model page.
@@ -41,4 +149,17 @@ Training took approximately 500 GPU hours on a single Titan V.
41
  You can look at the training metrics here:
42
  https://wandb.ai/locutusque/TinyMistral-V2/runs/g0rvw6wc
43
 
44
- 🔥 This model performed excellently on TruthfulQA, outperforming models more than 720x its size. These models include: mistralai/Mixtral-8x7B-v0.1, tiiuae/falcon-180B, berkeley-nest/Starling-LM-7B-alpha, upstage/SOLAR-10.7B-v1.0, and more. 🔥
1
  ---
 
2
  language:
3
  - en
4
+ license: apache-2.0
5
  datasets:
6
  - Skylion007/openwebtext
7
  - Locutusque/TM-DATA
8
+ pipeline_tag: text-generation
9
  inference:
10
  parameters:
11
  do_sample: true
 
15
  max_new_tokens: 250
16
  repetition_penalty: 1.16
17
  widget:
18
+ - text: 'TITLE: Dirichlet density QUESTION [5 upvotes]: How to solve the following
19
+ exercise: Let $q$ be prime. Show that the set of primes p for which $p \equiv
20
+ 1\pmod q$ and $2^{(p-1)/q} \equiv 1 \pmod p$ has Dirichlet density $\dfrac{1}{q(q-1)}$.
21
+ I want to show that $X^q-2$ (mod $p$) has a solution and $q$ divides $p-1$ , these
22
+ two conditions are simultaneonusly satisfied iff p splits completely in $\Bbb{Q}(\zeta_q,2^{\frac{1}{q}})$.
23
+ $\zeta_q $ is primitive $q^{th}$ root of unity. If this is proved the I can conclude
24
+ the result by Chebotarev density theorem. REPLY [2 votes]:'
25
+ - text: An emerging clinical approach to treat substance abuse disorders involves
26
+ a form of cognitive-behavioral therapy whereby addicts learn to reduce their reactivity
27
+ to drug-paired stimuli through cue-exposure or extinction training. It is, however,
28
+ - text: '\begin{document} \begin{frontmatter} \author{Mahouton Norbert Hounkonnou\corref{cor1}${}^1$}
29
+ \cortext[cor1]{norbert.hounkonnou@cipma.uac.bj} \author{Sama Arjika\corref{cor2}${}^1$}
30
+ \cortext[cor2]{rjksama2008@gmail.com} \author{ Won Sang Chung\corref{cor3}${}^2$
31
+ } \cortext[cor3]{mimip4444@hanmail.net} \title{\bf New families of $q$ and $(q;p)-$Hermite
32
+ polynomials } \address{${}^1$International Chair of Mathematical Physics and Applications
33
+ \\ (ICMPA-UNESCO Chair), University of Abomey-Calavi,\\ 072 B. P.: 50 Cotonou,
34
+ Republic of Benin,\\ ${}^2$Department of Physics and Research Institute of Natural
35
+ Science, \\ College of Natural Science, \\ Gyeongsang National University, Jinju
36
+ 660-701, Korea } \begin{abstract} In this paper, we construct a new family of
37
+ $q-$Hermite polynomials denoted by $H_n(x,s|q).$ Main properties and relations
38
+ are established and'
39
+ model-index:
40
+ - name: TinyMistral-248M-v2
41
+ results:
42
+ - task:
43
+ type: text-generation
44
+ name: Text Generation
45
+ dataset:
46
+ name: AI2 Reasoning Challenge (25-Shot)
47
+ type: ai2_arc
48
+ config: ARC-Challenge
49
+ split: test
50
+ args:
51
+ num_few_shot: 25
52
+ metrics:
53
+ - type: acc_norm
54
+ value: 21.25
55
+ name: normalized accuracy
56
+ source:
57
+ url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Locutusque/TinyMistral-248M-v2
58
+ name: Open LLM Leaderboard
59
+ - task:
60
+ type: text-generation
61
+ name: Text Generation
62
+ dataset:
63
+ name: HellaSwag (10-Shot)
64
+ type: hellaswag
65
+ split: validation
66
+ args:
67
+ num_few_shot: 10
68
+ metrics:
69
+ - type: acc_norm
70
+ value: 26.56
71
+ name: normalized accuracy
72
+ source:
73
+ url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Locutusque/TinyMistral-248M-v2
74
+ name: Open LLM Leaderboard
75
+ - task:
76
+ type: text-generation
77
+ name: Text Generation
78
+ dataset:
79
+ name: MMLU (5-Shot)
80
+ type: cais/mmlu
81
+ config: all
82
+ split: test
83
+ args:
84
+ num_few_shot: 5
85
+ metrics:
86
+ - type: acc
87
+ value: 23.39
88
+ name: accuracy
89
+ source:
90
+ url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Locutusque/TinyMistral-248M-v2
91
+ name: Open LLM Leaderboard
92
+ - task:
93
+ type: text-generation
94
+ name: Text Generation
95
+ dataset:
96
+ name: TruthfulQA (0-shot)
97
+ type: truthful_qa
98
+ config: multiple_choice
99
+ split: validation
100
+ args:
101
+ num_few_shot: 0
102
+ metrics:
103
+ - type: mc2
104
+ value: 49.6
105
+ source:
106
+ url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Locutusque/TinyMistral-248M-v2
107
+ name: Open LLM Leaderboard
108
+ - task:
109
+ type: text-generation
110
+ name: Text Generation
111
+ dataset:
112
+ name: Winogrande (5-shot)
113
+ type: winogrande
114
+ config: winogrande_xl
115
+ split: validation
116
+ args:
117
+ num_few_shot: 5
118
+ metrics:
119
+ - type: acc
120
+ value: 51.85
121
+ name: accuracy
122
+ source:
123
+ url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Locutusque/TinyMistral-248M-v2
124
+ name: Open LLM Leaderboard
125
+ - task:
126
+ type: text-generation
127
+ name: Text Generation
128
+ dataset:
129
+ name: GSM8k (5-shot)
130
+ type: gsm8k
131
+ config: main
132
+ split: test
133
+ args:
134
+ num_few_shot: 5
135
+ metrics:
136
+ - type: acc
137
+ value: 0.0
138
+ name: accuracy
139
+ source:
140
+ url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Locutusque/TinyMistral-248M-v2
141
+ name: Open LLM Leaderboard
142
  ---
143
  # Training
144
  This model was trained on two datasets, shown in this model page.
 
149
  You can look at the training metrics here:
150
  https://wandb.ai/locutusque/TinyMistral-V2/runs/g0rvw6wc
151
 
152
+ 🔥 This model performed excellently on TruthfulQA, outperforming models more than 720x its size. These models include: mistralai/Mixtral-8x7B-v0.1, tiiuae/falcon-180B, berkeley-nest/Starling-LM-7B-alpha, upstage/SOLAR-10.7B-v1.0, and more. 🔥
153
+ # [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
154
+ Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_Locutusque__TinyMistral-248M-v2)
155
+
156
+ | Metric |Value|
157
+ |---------------------------------|----:|
158
+ |Avg. |28.78|
159
+ |AI2 Reasoning Challenge (25-Shot)|21.25|
160
+ |HellaSwag (10-Shot) |26.56|
161
+ |MMLU (5-Shot) |23.39|
162
+ |TruthfulQA (0-shot) |49.60|
163
+ |Winogrande (5-shot) |51.85|
164
+ |GSM8k (5-shot) | 0.00|
165
+
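
The `Avg.` row in the table above appears to be the unweighted mean of the six benchmark scores. The following is a minimal sketch for checking that, assuming a plain arithmetic mean; the aggregation rule is an assumption rather than something stated in this PR, and the names `scores` and `leaderboard_average` are hypothetical, introduced only for illustration.

```python
# Benchmark scores copied from the results table in this PR.
scores = {
    "AI2 Reasoning Challenge (25-Shot)": 21.25,
    "HellaSwag (10-Shot)": 26.56,
    "MMLU (5-Shot)": 23.39,
    "TruthfulQA (0-shot)": 49.60,
    "Winogrande (5-shot)": 51.85,
    "GSM8k (5-shot)": 0.00,
}


def leaderboard_average(values):
    """Return the unweighted arithmetic mean (assumed aggregation rule)."""
    values = list(values)
    return sum(values) / len(values)


avg = leaderboard_average(scores.values())
print(f"mean of {len(scores)} benchmarks: {avg:.3f}")
```

Under that assumption the mean comes out at about 28.775, which is consistent with the reported 28.78 once rounded to two decimals. Note that the GSM8k score of 0.00 is included in the mean, so it lowers the average noticeably.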