leaderboard-pt-pr-bot committed on
Commit
f335745
1 Parent(s): ba96502

Adding the Open Portuguese LLM Leaderboard Evaluation Results

This is an automated PR created with https://huggingface.co/spaces/eduagarcia-temp/portuguese-leaderboard-results-to-modelcard

The purpose of this PR is to add evaluation results from the [🚀 Open Portuguese LLM Leaderboard](https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard) to your model card.

If you encounter any issues, please report them to https://huggingface.co/spaces/eduagarcia-temp/portuguese-leaderboard-results-to-modelcard/discussions

Files changed (1)
  1. README.md +171 -6
README.md CHANGED
@@ -1,8 +1,4 @@
 ---
-license: other
-license_name: tongyi-qianwen-license-agreement
-license_link: >-
-  https://huggingface.co/Qwen/Qwen1.5-14B/blob/39b74a78357df4d2296e838d87565967d663a67a/LICENSE
 language:
 - zh
 - en
@@ -13,9 +9,159 @@ language:
 - it
 - ru
 - fi
+license: other
+library_name: transformers
+license_name: tongyi-qianwen-license-agreement
+license_link: https://huggingface.co/Qwen/Qwen1.5-14B/blob/39b74a78357df4d2296e838d87565967d663a67a/LICENSE
 pipeline_tag: text-generation
 inference: false
-library_name: transformers
+model-index:
+- name: openbuddy-qwen1.5-32b-v21.2-32k
+  results:
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: ENEM Challenge (No Images)
+      type: eduagarcia/enem_challenge
+      split: train
+      args:
+        num_few_shot: 3
+    metrics:
+    - type: acc
+      value: 75.02
+      name: accuracy
+    source:
+      url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=OpenBuddy/openbuddy-qwen1.5-32b-v21.2-32k
+      name: Open Portuguese LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: BLUEX (No Images)
+      type: eduagarcia-temp/BLUEX_without_images
+      split: train
+      args:
+        num_few_shot: 3
+    metrics:
+    - type: acc
+      value: 61.34
+      name: accuracy
+    source:
+      url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=OpenBuddy/openbuddy-qwen1.5-32b-v21.2-32k
+      name: Open Portuguese LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: OAB Exams
+      type: eduagarcia/oab_exams
+      split: train
+      args:
+        num_few_shot: 3
+    metrics:
+    - type: acc
+      value: 51.44
+      name: accuracy
+    source:
+      url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=OpenBuddy/openbuddy-qwen1.5-32b-v21.2-32k
+      name: Open Portuguese LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: Assin2 RTE
+      type: assin2
+      split: test
+      args:
+        num_few_shot: 15
+    metrics:
+    - type: f1_macro
+      value: 92.21
+      name: f1-macro
+    source:
+      url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=OpenBuddy/openbuddy-qwen1.5-32b-v21.2-32k
+      name: Open Portuguese LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: Assin2 STS
+      type: eduagarcia/portuguese_benchmark
+      split: test
+      args:
+        num_few_shot: 15
+    metrics:
+    - type: pearson
+      value: 80.36
+      name: pearson
+    source:
+      url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=OpenBuddy/openbuddy-qwen1.5-32b-v21.2-32k
+      name: Open Portuguese LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: FaQuAD NLI
+      type: ruanchaves/faquad-nli
+      split: test
+      args:
+        num_few_shot: 15
+    metrics:
+    - type: f1_macro
+      value: 81.69
+      name: f1-macro
+    source:
+      url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=OpenBuddy/openbuddy-qwen1.5-32b-v21.2-32k
+      name: Open Portuguese LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: HateBR Binary
+      type: ruanchaves/hatebr
+      split: test
+      args:
+        num_few_shot: 25
+    metrics:
+    - type: f1_macro
+      value: 88.6
+      name: f1-macro
+    source:
+      url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=OpenBuddy/openbuddy-qwen1.5-32b-v21.2-32k
+      name: Open Portuguese LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: PT Hate Speech Binary
+      type: hate_speech_portuguese
+      split: test
+      args:
+        num_few_shot: 25
+    metrics:
+    - type: f1_macro
+      value: 65.98
+      name: f1-macro
+    source:
+      url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=OpenBuddy/openbuddy-qwen1.5-32b-v21.2-32k
+      name: Open Portuguese LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: tweetSentBR
+      type: eduagarcia/tweetsentbr_fewshot
+      split: test
+      args:
+        num_few_shot: 25
+    metrics:
+    - type: f1_macro
+      value: 73.38
+      name: f1-macro
+    source:
+      url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=OpenBuddy/openbuddy-qwen1.5-32b-v21.2-32k
+      name: Open Portuguese LLM Leaderboard
 ---
 
 
@@ -70,4 +216,23 @@ By using OpenBuddy, you agree to these terms and conditions, and acknowledge tha
 
 OpenBuddy按“原样”提供,不附带任何种类的明示或暗示的保证,包括但不限于适销性、特定目的的适用性和非侵权的暗示保证。在任何情况下,作者、贡献者或版权所有者均不对因软件或使用或其他软件交易而产生的任何索赔、损害赔偿或其他责任(无论是合同、侵权还是其他原因)承担责任。
 
-使用OpenBuddy即表示您同意这些条款和条件,并承认您了解其使用可能带来的潜在风险。您还同意赔偿并使作者、贡献者和版权所有者免受因您使用OpenBuddy而产生的任何索赔、损害赔偿或责任的影响。
+使用OpenBuddy即表示您同意这些条款和条件,并承认您了解其使用可能带来的潜在风险。您还同意赔偿并使作者、贡献者和版权所有者免受因您使用OpenBuddy而产生的任何索赔、损害赔偿或责任的影响。
+
+
+# Open Portuguese LLM Leaderboard Evaluation Results
+
+Detailed results can be found [here](https://huggingface.co/datasets/eduagarcia-temp/llm_pt_leaderboard_raw_results/tree/main/OpenBuddy/openbuddy-qwen1.5-32b-v21.2-32k) and on the [🚀 Open Portuguese LLM Leaderboard](https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard)
+
+| Metric | Value |
+|--------------------------|---------|
+|Average |**74.45**|
+|ENEM Challenge (No Images)| 75.02|
+|BLUEX (No Images) | 61.34|
+|OAB Exams | 51.44|
+|Assin2 RTE | 92.21|
+|Assin2 STS | 80.36|
+|FaQuAD NLI | 81.69|
+|HateBR Binary | 88.60|
+|PT Hate Speech Binary | 65.98|
+|tweetSentBR | 73.38|
+
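
The Average row in the results table can be cross-checked against the nine per-task scores. The sketch below is a minimal check, assuming the leaderboard average is the unweighted arithmetic mean of the task scores rounded to two decimals. The PR does not state the formula, and the variable names are illustrative only.

```python
# Per-task scores copied from the results table added by this PR.
scores = {
    "ENEM Challenge (No Images)": 75.02,
    "BLUEX (No Images)": 61.34,
    "OAB Exams": 51.44,
    "Assin2 RTE": 92.21,
    "Assin2 STS": 80.36,
    "FaQuAD NLI": 81.69,
    "HateBR Binary": 88.60,
    "PT Hate Speech Binary": 65.98,
    "tweetSentBR": 73.38,
}

# Assumption: the reported average is a plain unweighted mean,
# rounded to two decimal places.
average = round(sum(scores.values()) / len(scores), 2)
print(average)
```

Under that assumption the nine scores sum to 670.02, and dividing by 9 gives 74.45, which matches the Average row.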