Commit 4f58c7d
1 Parent(s): 78d0a3b

Fixing some errors of the leaderboard evaluation results in the ModelCard yaml (#8)

- Fixing some errors of the leaderboard evaluation results in the ModelCard yaml (334b6de9e9daa42e574efe72d84f2feefd202d52)


Co-authored-by: Open PT LLM Leaderboard PR Bot <leaderboard-pt-pr-bot@users.noreply.huggingface.co>

Files changed (1)
  1. README.md +27 -1
README.md CHANGED
@@ -82,6 +82,19 @@ model-index:
     - type: f1_macro
       value: 79.83
       name: f1-macro
+    source:
+      url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=recogna-nlp/bode-7b-alpaca-pt-br
+      name: Open Portuguese LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: Assin2 STS
+      type: eduagarcia/portuguese_benchmark
+      split: test
+      args:
+        num_few_shot: 15
+    metrics:
     - type: pearson
       value: 43.47
       name: pearson
@@ -109,7 +122,7 @@ model-index:
       name: Text Generation
     dataset:
       name: HateBR Binary
-      type: eduagarcia/portuguese_benchmark
+      type: ruanchaves/hatebr
       split: test
       args:
         num_few_shot: 25
@@ -117,6 +130,19 @@ model-index:
     - type: f1_macro
       value: 85.06
       name: f1-macro
+    source:
+      url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=recogna-nlp/bode-7b-alpaca-pt-br
+      name: Open Portuguese LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: PT Hate Speech Binary
+      type: hate_speech_portuguese
+      split: test
+      args:
+        num_few_shot: 25
+    metrics:
     - type: f1_macro
       value: 65.73
       name: f1-macro
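
The error this commit repairs (two metrics sitting under a single result entry because the `source`, `task`, `dataset` and `metrics` lines between them were missing) can be caught mechanically. The following is a minimal sketch, not the leaderboard bot's actual check: `check_results` and `make_entry` are hypothetical helpers, the "one metric per entry" rule is an assumption inferred from the entries visible in this diff, and the sample data is built by hand from the HateBR values above (including a `source` block on the pre-fix entry, which the diff does not show) rather than parsed from the real README.md.

```python
from typing import Any

# Keys each result entry is expected to carry, going by the entries in this diff.
REQUIRED_KEYS = ("task", "dataset", "metrics", "source")

SOURCE = {
    "url": "https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard"
           "?query=recogna-nlp/bode-7b-alpaca-pt-br",
    "name": "Open Portuguese LLM Leaderboard",
}


def check_results(results: list[dict[str, Any]]) -> list[str]:
    """Return human-readable problems found in a model-index `results` list."""
    problems: list[str] = []
    for i, entry in enumerate(results):
        for key in REQUIRED_KEYS:
            if key not in entry:
                problems.append(f"results[{i}]: missing '{key}'")
        metrics = entry.get("metrics", [])
        # Assumption: the leaderboard reports one metric per benchmark entry.
        if len(metrics) > 1:
            types = ", ".join(str(m.get("type")) for m in metrics)
            problems.append(
                f"results[{i}]: {len(metrics)} metrics in one entry ({types})"
            )
    return problems


def make_entry(name: str, dataset_type: str, shots: int,
               metrics: list[dict[str, Any]]) -> dict[str, Any]:
    """Build one result entry in the shape used by the model card (hypothetical helper)."""
    return {
        "task": {"type": "text-generation", "name": "Text Generation"},
        "dataset": {"name": name, "type": dataset_type, "split": "test",
                    "args": {"num_few_shot": shots}},
        "metrics": metrics,
        "source": SOURCE,
    }


def f1(value: float) -> dict[str, Any]:
    return {"type": "f1_macro", "value": value, "name": "f1-macro"}


# Before the commit: both f1_macro values end up under "HateBR Binary".
before = [make_entry("HateBR Binary", "eduagarcia/portuguese_benchmark", 25,
                     [f1(85.06), f1(65.73)])]

# After the commit: one entry per benchmark, each with its own dataset type.
after = [
    make_entry("HateBR Binary", "ruanchaves/hatebr", 25, [f1(85.06)]),
    make_entry("PT Hate Speech Binary", "hate_speech_portuguese", 25, [f1(65.73)]),
]

if __name__ == "__main__":
    print(check_results(before))
    print(check_results(after))
```

Run against the hand-built samples, the pre-fix list yields one problem for the doubled metric and the post-fix list yields none. To apply the same check to a real card, the `results` list would first have to be loaded from the README front matter with a YAML parser, which this sketch leaves out.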