eduagarcia committed fb0f1c7 (1 parent: 1266df4)

Adding the Open Portuguese LLM Leaderboard Evaluation Results

This is an automated PR created with https://huggingface.co/spaces/eduagarcia-temp/portuguese-leaderboard-results-to-modelcard

The purpose of this PR is to add evaluation results from the Open Portuguese LLM Leaderboard to your model card.

If you encounter any issues, please report them to https://huggingface.co/spaces/eduagarcia-temp/portuguese-leaderboard-results-to-modelcard/discussions

Files changed (1)
  1. README.md +138 -0
README.md CHANGED
@@ -1,3 +1,141 @@
 ---
 license: llama2
+model-index:
+- name: cabrita_7b_pt_850000
+  results:
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: ENEM Challenge (No Images)
+      type: eduagarcia/enem_challenge
+      split: train
+      args:
+        num_few_shot: 3
+    metrics:
+    - type: acc
+      value: 22.53
+      name: accuracy
+    source:
+      url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=22h/cabrita_7b_pt_850000
+      name: Open Portuguese LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: BLUEX (No Images)
+      type: eduagarcia-temp/BLUEX_without_images
+      split: train
+      args:
+        num_few_shot: 3
+    metrics:
+    - type: acc
+      value: 23.09
+      name: accuracy
+    source:
+      url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=22h/cabrita_7b_pt_850000
+      name: Open Portuguese LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: OAB Exams
+      type: eduagarcia/oab_exams
+      split: train
+      args:
+        num_few_shot: 3
+    metrics:
+    - type: acc
+      value: 29.2
+      name: accuracy
+    source:
+      url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=22h/cabrita_7b_pt_850000
+      name: Open Portuguese LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: Assin2 RTE
+      type: assin2
+      split: test
+      args:
+        num_few_shot: 15
+    metrics:
+    - type: f1_macro
+      value: 33.33
+      name: f1-macro
+    - type: pearson
+      value: 12.65
+      name: pearson
+    source:
+      url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=22h/cabrita_7b_pt_850000
+      name: Open Portuguese LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: FaQuAD NLI
+      type: ruanchaves/faquad-nli
+      split: test
+      args:
+        num_few_shot: 15
+    metrics:
+    - type: f1_macro
+      value: 17.72
+      name: f1-macro
+    source:
+      url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=22h/cabrita_7b_pt_850000
+      name: Open Portuguese LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: HateBR Binary
+      type: eduagarcia/portuguese_benchmark
+      split: test
+      args:
+        num_few_shot: 25
+    metrics:
+    - type: f1_macro
+      value: 55.98
+      name: f1-macro
+    - type: f1_macro
+      value: 49.02
+      name: f1-macro
+    source:
+      url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=22h/cabrita_7b_pt_850000
+      name: Open Portuguese LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: tweetSentBR
+      type: eduagarcia-temp/tweetsentbr
+      split: test
+      args:
+        num_few_shot: 25
+    metrics:
+    - type: f1_macro
+      value: 45.75
+      name: f1-macro
+    source:
+      url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=22h/cabrita_7b_pt_850000
+      name: Open Portuguese LLM Leaderboard
 ---
+
+# [Open Portuguese LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard)
+Detailed results can be found [here](https://huggingface.co/datasets/eduagarcia-temp/llm_pt_leaderboard_raw_results/tree/main/22h/cabrita_7b_pt_850000)
+
+|          Metric          |  Value  |
+|--------------------------|---------|
+|Average                   |**32.14**|
+|ENEM Challenge (No Images)|    22.53|
+|BLUEX (No Images)         |    23.09|
+|OAB Exams                 |    29.20|
+|Assin2 RTE                |    33.33|
+|Assin2 STS                |    12.65|
+|FaQuAD NLI                |    17.72|
+|HateBR Binary             |    55.98|
+|PT Hate Speech Binary     |    49.02|
+|tweetSentBR               |    45.75|
+
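
The PR does not say how the Average row is computed. The reported 32.14 is consistent with an unweighted arithmetic mean of the nine task scores, rounded to two decimals. The sketch below checks that reading; the aggregation rule and the helper name `leaderboard_average` are assumptions for illustration, not something taken from the leaderboard's code.

```python
# Scores copied from the results table in this PR.
scores = {
    "ENEM Challenge (No Images)": 22.53,
    "BLUEX (No Images)": 23.09,
    "OAB Exams": 29.20,
    "Assin2 RTE": 33.33,
    "Assin2 STS": 12.65,
    "FaQuAD NLI": 17.72,
    "HateBR Binary": 55.98,
    "PT Hate Speech Binary": 49.02,
    "tweetSentBR": 45.75,
}


def leaderboard_average(values):
    """Unweighted mean rounded to two decimals (assumed aggregation rule)."""
    values = list(values)
    return round(sum(values) / len(values), 2)


average = leaderboard_average(scores.values())
print(f"Average over {len(scores)} tasks: {average:.2f}")
```

If the leaderboard weighted tasks or grouped them by category before averaging, the result would generally differ from the table, so the match here supports the plain-mean reading without proving it.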