leaderboard-pt-pr-bot committed on
Commit c8439bd
1 Parent(s): e0e28b6

Adding the Open Portuguese LLM Leaderboard Evaluation Results

This is an automated PR created with https://huggingface.co/spaces/eduagarcia-temp/portuguese-leaderboard-results-to-modelcard

The purpose of this PR is to add evaluation results from the Open Portuguese LLM Leaderboard to your model card.

If you encounter any issues, please report them to https://huggingface.co/spaces/eduagarcia-temp/portuguese-leaderboard-results-to-modelcard/discussions

Files changed (1)
  1. README.md +139 -2
README.md CHANGED
@@ -1,7 +1,7 @@
 ---
-license: mit
 language:
 - pt
+license: mit
 tags:
 - gervasio-pt*
 - gervasio-ptpt
@@ -18,6 +18,127 @@ tags:
 datasets:
 - PORTULAN/extraglue
 - PORTULAN/extraglue-instruct
+model-index:
+- name: gervasio-7b-portuguese-ptbr-decoder
+  results:
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: ENEM Challenge (No Images)
+      type: eduagarcia/enem_challenge
+      split: train
+      args:
+        num_few_shot: 3
+    metrics:
+    - type: acc
+      value: 21.34
+      name: accuracy
+    source:
+      url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=PORTULAN/gervasio-7b-portuguese-ptbr-decoder
+      name: Open Portuguese LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: BLUEX (No Images)
+      type: eduagarcia-temp/BLUEX_without_images
+      split: train
+      args:
+        num_few_shot: 3
+    metrics:
+    - type: acc
+      value: 21.0
+      name: accuracy
+    source:
+      url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=PORTULAN/gervasio-7b-portuguese-ptbr-decoder
+      name: Open Portuguese LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: OAB Exams
+      type: eduagarcia/oab_exams
+      split: train
+      args:
+        num_few_shot: 3
+    metrics:
+    - type: acc
+      value: 26.29
+      name: accuracy
+    source:
+      url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=PORTULAN/gervasio-7b-portuguese-ptbr-decoder
+      name: Open Portuguese LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: Assin2 RTE
+      type: assin2
+      split: test
+      args:
+        num_few_shot: 15
+    metrics:
+    - type: f1_macro
+      value: 83.15
+      name: f1-macro
+    - type: pearson
+      value: 69.55
+      name: pearson
+    source:
+      url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=PORTULAN/gervasio-7b-portuguese-ptbr-decoder
+      name: Open Portuguese LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: FaQuAD NLI
+      type: ruanchaves/faquad-nli
+      split: test
+      args:
+        num_few_shot: 15
+    metrics:
+    - type: f1_macro
+      value: 18.59
+      name: f1-macro
+    source:
+      url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=PORTULAN/gervasio-7b-portuguese-ptbr-decoder
+      name: Open Portuguese LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: HateBR Binary
+      type: eduagarcia/portuguese_benchmark
+      split: test
+      args:
+        num_few_shot: 25
+    metrics:
+    - type: f1_macro
+      value: 53.8
+      name: f1-macro
+    - type: f1_macro
+      value: 47.24
+      name: f1-macro
+    source:
+      url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=PORTULAN/gervasio-7b-portuguese-ptbr-decoder
+      name: Open Portuguese LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: tweetSentBR
+      type: eduagarcia-temp/tweetsentbr
+      split: test
+      args:
+        num_few_shot: 25
+    metrics:
+    - type: f1_macro
+      value: 14.21
+      name: f1-macro
+    source:
+      url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=PORTULAN/gervasio-7b-portuguese-ptbr-decoder
+      name: Open Portuguese LLM Leaderboard
 ---
 </br>
 </br>
@@ -172,4 +293,20 @@ grant PINFRA/22117/2016; research project GPT-PT - Transformer-based Decoder for
 grant CPCA-IAC/AV/478395/2022; innovation project
 ACCELERAT.AI - Multilingual Intelligent Contact Centers, funded by IAPMEI, I.P. - Agência para a Competitividade e Inovação
 under the grant C625734525-00462629, of Plano de Recuperação e Resiliência,
-call RE-C05-i01.01 – Agendas/Alianças Mobilizadoras para a Reindustrialização.
+call RE-C05-i01.01 – Agendas/Alianças Mobilizadoras para a Reindustrialização.
+# [Open Portuguese LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard)
+Detailed results can be found [here](https://huggingface.co/datasets/eduagarcia-temp/llm_pt_leaderboard_raw_results/tree/main/PORTULAN/gervasio-7b-portuguese-ptbr-decoder)
+
+|          Metric          |  Value  |
+|--------------------------|---------|
+|Average                   |**39.46**|
+|ENEM Challenge (No Images)|    21.34|
+|BLUEX (No Images)         |       21|
+|OAB Exams                 |    26.29|
+|Assin2 RTE                |    83.15|
+|Assin2 STS                |    69.55|
+|FaQuAD NLI                |    18.59|
+|HateBR Binary             |    53.80|
+|PT Hate Speech Binary     |    47.24|
+|tweetSentBR               |    14.21|
+
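
The PR does not say how the "Average" row in the results table is computed. As a quick sanity check, the sketch below assumes it is the unweighted mean of the nine per-task scores listed in the table; that aggregation rule is an assumption rather than something stated by the leaderboard, and the `scores` dictionary is simply the table values copied by hand.

```python
# Per-task scores copied from the results table added by this PR.
scores = {
    "ENEM Challenge (No Images)": 21.34,
    "BLUEX (No Images)": 21.0,
    "OAB Exams": 26.29,
    "Assin2 RTE": 83.15,
    "Assin2 STS": 69.55,
    "FaQuAD NLI": 18.59,
    "HateBR Binary": 53.80,
    "PT Hate Speech Binary": 47.24,
    "tweetSentBR": 14.21,
}

# Assumption: the leaderboard average is a plain (unweighted) mean,
# rounded to two decimals as shown in the table.
average = round(sum(scores.values()) / len(scores), 2)

print(f"Average over {len(scores)} tasks: {average}")
```

Under that assumption the result agrees with the 39.46 reported in the table. Note that the `model-index` metadata groups these nine scores under seven dataset entries, so a script that reads the YAML instead of the table would need to collect every item in each `metrics` list.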