leaderboard-pt-pr-bot committed
Commit 2d2eaa7
1 Parent(s): cd4ad4d

Fixing some errors of the leaderboard evaluation results in the ModelCard yaml


The names of a few benchmarks are incorrect in the model metadata. This commit fixes some minor errors from the [last PR](1) in the ModelCard YAML metadata.
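One quick way to catch this class of error is to check that every `type` listed under `dataset:` in the model-index resolves to a real dataset id on the Hub. A minimal sketch of that check (not part of this commit), assuming the `huggingface_hub` client is installed:

```python
from huggingface_hub import HfApi

api = HfApi()

# Dataset ids referenced by the corrected `type` fields in the model-index.
corrected_types = [
    "eduagarcia/portuguese_benchmark",  # Assin2 STS
    "ruanchaves/hatebr",                # HateBR Binary
    "hate_speech_portuguese",           # PT Hate Speech Binary
]

for repo_id in corrected_types:
    # Raises an error if the id does not resolve to a dataset repository.
    info = api.dataset_info(repo_id)
    print(f"{repo_id}: found ({info.id})")
```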

Files changed (1)
  1. README.md +27 -1
README.md CHANGED
@@ -106,6 +106,19 @@ model-index:
       - type: f1_macro
         value: 43.77
         name: f1-macro
+    source:
+      url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=nicholasKluge/TeenyTinyLlama-460m-Chat
+      name: Open Portuguese LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: Assin2 STS
+      type: eduagarcia/portuguese_benchmark
+      split: test
+      args:
+        num_few_shot: 15
+    metrics:
       - type: pearson
         value: 4.52
         name: pearson
@@ -133,7 +146,7 @@ model-index:
       name: Text Generation
     dataset:
       name: HateBR Binary
-      type: eduagarcia/portuguese_benchmark
+      type: ruanchaves/hatebr
       split: test
       args:
         num_few_shot: 25
@@ -141,6 +154,19 @@ model-index:
       - type: f1_macro
         value: 33.49
         name: f1-macro
+    source:
+      url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=nicholasKluge/TeenyTinyLlama-460m-Chat
+      name: Open Portuguese LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: PT Hate Speech Binary
+      type: hate_speech_portuguese
+      split: test
+      args:
+        num_few_shot: 25
+    metrics:
       - type: f1_macro
         value: 22.99
         name: f1-macro
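Once merged, the corrected metadata can be read back through the model card parser to confirm that each benchmark now points at the intended dataset. A minimal sketch, assuming `ModelCard` and its `eval_results` field behave as in recent `huggingface_hub` releases:

```python
from huggingface_hub import ModelCard

# Parse the model-index section of the card for the affected model.
card = ModelCard.load("nicholasKluge/TeenyTinyLlama-460m-Chat")

for result in card.data.eval_results or []:
    print(
        f"{result.dataset_name}: type={result.dataset_type}, "
        f"{result.metric_type}={result.metric_value}"
    )

# Based on the values in this commit, the output should include lines such as:
#   Assin2 STS: type=eduagarcia/portuguese_benchmark, pearson=4.52
#   HateBR Binary: type=ruanchaves/hatebr, f1_macro=33.49
#   PT Hate Speech Binary: type=hate_speech_portuguese, f1_macro=22.99
```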