leaderboard-pt-pr-bot committed on
Commit
904c51c
1 Parent(s): 2e73faa

Adding the Open Portuguese LLM Leaderboard Evaluation Results


This is an automated PR created with https://huggingface.co/spaces/eduagarcia-temp/portuguese-leaderboard-results-to-modelcard

The purpose of this PR is to add evaluation results from the Open Portuguese LLM Leaderboard to your model card.

If you encounter any issues, please report them to https://huggingface.co/spaces/eduagarcia-temp/portuguese-leaderboard-results-to-modelcard/discussions

Files changed (1)
  1. README.md +166 -3
README.md CHANGED
@@ -9,6 +9,8 @@ tags:
  - gemma
  - portugues
  - instrucao
  pipeline_tag: text-generation
  widget:
  - text: Me explique como funciona um computador.
@@ -19,8 +21,153 @@ widget:
    example_title: História.
  - text: Escreva um poema bem interessante sobre o Sol e as flores.
    example_title: Escreva um poema.
- datasets:
- - rhaymison/superset
  ---

  # gemma-portuguese-2b-luana
@@ -151,4 +298,20 @@ email: rhaymisoncristian@gmail.com
  <a href="https://github.com/rhaymisonbetini" target="_blank">
  <img src="https://img.shields.io/badge/GitHub-100000?style=for-the-badge&logo=github&logoColor=white">
  </a>
- </div>
  - gemma
  - portugues
  - instrucao
+ datasets:
+ - rhaymison/superset
  pipeline_tag: text-generation
  widget:
  - text: Me explique como funciona um computador.

    example_title: História.
  - text: Escreva um poema bem interessante sobre o Sol e as flores.
    example_title: Escreva um poema.
+ model-index:
+ - name: gemma-portuguese-luana-2b
+   results:
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: ENEM Challenge (No Images)
+       type: eduagarcia/enem_challenge
+       split: train
+       args:
+         num_few_shot: 3
+     metrics:
+     - type: acc
+       value: 24.42
+       name: accuracy
+     source:
+       url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=rhaymison/gemma-portuguese-luana-2b
+       name: Open Portuguese LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: BLUEX (No Images)
+       type: eduagarcia-temp/BLUEX_without_images
+       split: train
+       args:
+         num_few_shot: 3
+     metrics:
+     - type: acc
+       value: 24.34
+       name: accuracy
+     source:
+       url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=rhaymison/gemma-portuguese-luana-2b
+       name: Open Portuguese LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: OAB Exams
+       type: eduagarcia/oab_exams
+       split: train
+       args:
+         num_few_shot: 3
+     metrics:
+     - type: acc
+       value: 27.11
+       name: accuracy
+     source:
+       url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=rhaymison/gemma-portuguese-luana-2b
+       name: Open Portuguese LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: Assin2 RTE
+       type: assin2
+       split: test
+       args:
+         num_few_shot: 15
+     metrics:
+     - type: f1_macro
+       value: 70.86
+       name: f1-macro
+     source:
+       url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=rhaymison/gemma-portuguese-luana-2b
+       name: Open Portuguese LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: Assin2 STS
+       type: eduagarcia/portuguese_benchmark
+       split: test
+       args:
+         num_few_shot: 15
+     metrics:
+     - type: pearson
+       value: 1.51
+       name: pearson
+     source:
+       url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=rhaymison/gemma-portuguese-luana-2b
+       name: Open Portuguese LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: FaQuAD NLI
+       type: ruanchaves/faquad-nli
+       split: test
+       args:
+         num_few_shot: 15
+     metrics:
+     - type: f1_macro
+       value: 43.97
+       name: f1-macro
+     source:
+       url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=rhaymison/gemma-portuguese-luana-2b
+       name: Open Portuguese LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: HateBR Binary
+       type: ruanchaves/hatebr
+       split: test
+       args:
+         num_few_shot: 25
+     metrics:
+     - type: f1_macro
+       value: 40.05
+       name: f1-macro
+     source:
+       url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=rhaymison/gemma-portuguese-luana-2b
+       name: Open Portuguese LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: PT Hate Speech Binary
+       type: hate_speech_portuguese
+       split: test
+       args:
+         num_few_shot: 25
+     metrics:
+     - type: f1_macro
+       value: 51.83
+       name: f1-macro
+     source:
+       url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=rhaymison/gemma-portuguese-luana-2b
+       name: Open Portuguese LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: tweetSentBR
+       type: eduagarcia/tweetsentbr_fewshot
+       split: test
+       args:
+         num_few_shot: 25
+     metrics:
+     - type: f1_macro
+       value: 30.42
+       name: f1-macro
+     source:
+       url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=rhaymison/gemma-portuguese-luana-2b
+       name: Open Portuguese LLM Leaderboard
  ---

  # gemma-portuguese-2b-luana

  <a href="https://github.com/rhaymisonbetini" target="_blank">
  <img src="https://img.shields.io/badge/GitHub-100000?style=for-the-badge&logo=github&logoColor=white">
  </a>
+ </div>
+ # Open Portuguese LLM Leaderboard Evaluation Results
+ Detailed results can be found [here](https://huggingface.co/datasets/eduagarcia-temp/llm_pt_leaderboard_raw_results/tree/main/rhaymison/gemma-portuguese-luana-2b) and on the [Open Portuguese LLM Leaderboard](https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard)
+
+ |          Metric          |  Value  |
+ |--------------------------|---------|
+ |Average                   |**34.94**|
+ |ENEM Challenge (No Images)|    24.42|
+ |BLUEX (No Images)         |    24.34|
+ |OAB Exams                 |    27.11|
+ |Assin2 RTE                |    70.86|
+ |Assin2 STS                |     1.51|
+ |FaQuAD NLI                |    43.97|
+ |HateBR Binary             |    40.05|
+ |PT Hate Speech Binary     |    51.83|
+ |tweetSentBR               |    30.42|
+
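
The Average row in the table can be sanity-checked against the nine per-task scores. The sketch below is not part of the PR: it assumes the leaderboard average is an unweighted arithmetic mean over the nine tasks, and the names `scores` and `mean_score` are hypothetical. Because it uses the two-decimal values shown in the table rather than the raw results, the computed mean may differ from the reported 34.94 in the last digit.

```python
# Per-task scores copied from the results table (already rounded to two decimals).
scores = {
    "ENEM Challenge (No Images)": 24.42,
    "BLUEX (No Images)": 24.34,
    "OAB Exams": 27.11,
    "Assin2 RTE": 70.86,
    "Assin2 STS": 1.51,
    "FaQuAD NLI": 43.97,
    "HateBR Binary": 40.05,
    "PT Hate Speech Binary": 51.83,
    "tweetSentBR": 30.42,
}


def mean_score(values):
    """Return the unweighted arithmetic mean of an iterable of scores.

    Assumption: the leaderboard weights every task equally.
    """
    values = list(values)
    return sum(values) / len(values)


average = mean_score(scores.values())
print(f"mean of rounded task scores: {average:.4f}")
```

The mean of the rounded values lands within 0.01 of the reported average, which is consistent with the equal-weight assumption; the small gap presumably comes from averaging unrounded scores upstream.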