leaderboard-pt-pr-bot committed on
Commit 899003d
1 Parent(s): 863ba2a

Adding the Open Portuguese LLM Leaderboard Evaluation Results

This is an automated PR created with https://huggingface.co/spaces/eduagarcia-temp/portuguese-leaderboard-results-to-modelcard

The purpose of this PR is to add evaluation results from the Open Portuguese LLM Leaderboard to your model card.

If you encounter any issues, please report them to https://huggingface.co/spaces/eduagarcia-temp/portuguese-leaderboard-results-to-modelcard/discussions

Files changed (1)
  1. README.md +147 -10
README.md CHANGED
@@ -1,18 +1,139 @@
  ---
- tags:
- - text-generation
- - pytorch
- - LLM
- - Portuguese
- - Llama 2
- inference: false
- license: llama2
  language:
  - pt
- pipeline_tag: text-generation
  library_name: transformers
  datasets:
  - dominguesm/CC-MAIN-2023-23
  ---

  <p align="center">
@@ -94,4 +215,20 @@ Glória, e sua governanta, a governanta Josefa. No entanto, no outono de
  Capitu, uma moça de 14 anos, que se tornará sua companheira por muitos anos.
  ```

- **NOTE**: README under construction

  ---
  language:
  - pt
+ license: llama2
  library_name: transformers
+ tags:
+ - text-generation
+ - pytorch
+ - LLM
+ - Portuguese
+ - Llama 2
  datasets:
  - dominguesm/CC-MAIN-2023-23
+ inference: false
+ pipeline_tag: text-generation
+ model-index:
+ - name: Canarim-7B-Instruct
+   results:
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: ENEM Challenge (No Images)
+       type: eduagarcia/enem_challenge
+       split: train
+       args:
+         num_few_shot: 3
+     metrics:
+     - type: acc
+       value: 27.5
+       name: accuracy
+     source:
+       url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=dominguesm/Canarim-7B-Instruct
+       name: Open Portuguese LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: BLUEX (No Images)
+       type: eduagarcia-temp/BLUEX_without_images
+       split: train
+       args:
+         num_few_shot: 3
+     metrics:
+     - type: acc
+       value: 26.15
+       name: accuracy
+     source:
+       url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=dominguesm/Canarim-7B-Instruct
+       name: Open Portuguese LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: OAB Exams
+       type: eduagarcia/oab_exams
+       split: train
+       args:
+         num_few_shot: 3
+     metrics:
+     - type: acc
+       value: 29.93
+       name: accuracy
+     source:
+       url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=dominguesm/Canarim-7B-Instruct
+       name: Open Portuguese LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: Assin2 RTE
+       type: assin2
+       split: test
+       args:
+         num_few_shot: 15
+     metrics:
+     - type: f1_macro
+       value: 75.74
+       name: f1-macro
+     - type: pearson
+       value: 12.08
+       name: pearson
+     source:
+       url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=dominguesm/Canarim-7B-Instruct
+       name: Open Portuguese LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: FaQuAD NLI
+       type: ruanchaves/faquad-nli
+       split: test
+       args:
+         num_few_shot: 15
+     metrics:
+     - type: f1_macro
+       value: 43.92
+       name: f1-macro
+     source:
+       url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=dominguesm/Canarim-7B-Instruct
+       name: Open Portuguese LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: HateBR Binary
+       type: eduagarcia/portuguese_benchmark
+       split: test
+       args:
+         num_few_shot: 25
+     metrics:
+     - type: f1_macro
+       value: 79.57
+       name: f1-macro
+     - type: f1_macro
+       value: 64.01
+       name: f1-macro
+     source:
+       url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=dominguesm/Canarim-7B-Instruct
+       name: Open Portuguese LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: tweetSentBR
+       type: eduagarcia-temp/tweetsentbr
+       split: test
+       args:
+         num_few_shot: 25
+     metrics:
+     - type: f1_macro
+       value: 66.0
+       name: f1-macro
+     source:
+       url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=dominguesm/Canarim-7B-Instruct
+       name: Open Portuguese LLM Leaderboard
  ---

  <p align="center">
 
  Capitu, uma moça de 14 anos, que se tornará sua companheira por muitos anos.
  ```

+ **NOTE**: README under construction
+ # [Open Portuguese LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard)
+ Detailed results can be found [here](https://huggingface.co/datasets/eduagarcia-temp/llm_pt_leaderboard_raw_results/tree/main/dominguesm/Canarim-7B-Instruct)
+
+ |          Metric          |  Value  |
+ |--------------------------|---------|
+ |Average                   |**47.21**|
+ |ENEM Challenge (No Images)|    27.50|
+ |BLUEX (No Images)         |    26.15|
+ |OAB Exams                 |    29.93|
+ |Assin2 RTE                |    75.74|
+ |Assin2 STS                |    12.08|
+ |FaQuAD NLI                |    43.92|
+ |HateBR Binary             |    79.57|
+ |PT Hate Speech Binary     |    64.01|
+ |tweetSentBR               |       66|
+
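
The Average row in the results table is consistent with an unweighted arithmetic mean of the nine task scores. The Python sketch below reproduces that check. It assumes a plain mean rounded to two decimals, which this PR does not confirm as the leaderboard's actual aggregation, and the names `scores` and `average_score` are illustrative only.

```python
# Scores copied from the results table added to README.md in this PR.
scores = {
    "ENEM Challenge (No Images)": 27.50,
    "BLUEX (No Images)": 26.15,
    "OAB Exams": 29.93,
    "Assin2 RTE": 75.74,
    "Assin2 STS": 12.08,
    "FaQuAD NLI": 43.92,
    "HateBR Binary": 79.57,
    "PT Hate Speech Binary": 64.01,
    "tweetSentBR": 66.0,
}


def average_score(values):
    """Unweighted mean rounded to two decimals (assumed aggregation, not confirmed)."""
    values = list(values)
    return round(sum(values) / len(values), 2)


average = average_score(scores.values())
print(f"Average: {average:.2f}")  # prints "Average: 47.21"
```

The nine scores sum to 424.90, and 424.90 / 9 ≈ 47.21, which matches the bolded value in the table.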