leaderboard-pt-pr-bot committed on
Commit 9484868
1 Parent(s): d6e2d34

Adding the Open Portuguese LLM Leaderboard Evaluation Results

This is an automated PR created with https://huggingface.co/spaces/eduagarcia-temp/portuguese-leaderboard-results-to-modelcard

The purpose of this PR is to add evaluation results from the Open Portuguese LLM Leaderboard to your model card.

If you encounter any issues, please report them to https://huggingface.co/spaces/eduagarcia-temp/portuguese-leaderboard-results-to-modelcard/discussions

Files changed (1)
  1. README.md +139 -1
README.md CHANGED
@@ -1,8 +1,129 @@
  ---
- license: other
  language:
  - pt
  - en
+ license: other
+ model-index:
+ - name: Cabra
+   results:
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: ENEM Challenge (No Images)
+       type: eduagarcia/enem_challenge
+       split: train
+       args:
+         num_few_shot: 3
+     metrics:
+     - type: acc
+       value: 33.24
+       name: accuracy
+     source:
+       url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=nicolasdec/Cabra
+       name: Open Portuguese LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: BLUEX (No Images)
+       type: eduagarcia-temp/BLUEX_without_images
+       split: train
+       args:
+         num_few_shot: 3
+     metrics:
+     - type: acc
+       value: 33.38
+       name: accuracy
+     source:
+       url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=nicolasdec/Cabra
+       name: Open Portuguese LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: OAB Exams
+       type: eduagarcia/oab_exams
+       split: train
+       args:
+         num_few_shot: 3
+     metrics:
+     - type: acc
+       value: 32.39
+       name: accuracy
+     source:
+       url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=nicolasdec/Cabra
+       name: Open Portuguese LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: Assin2 RTE
+       type: assin2
+       split: test
+       args:
+         num_few_shot: 15
+     metrics:
+     - type: f1_macro
+       value: 81.12
+       name: f1-macro
+     - type: pearson
+       value: 29.38
+       name: pearson
+     source:
+       url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=nicolasdec/Cabra
+       name: Open Portuguese LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: FaQuAD NLI
+       type: ruanchaves/faquad-nli
+       split: test
+       args:
+         num_few_shot: 15
+     metrics:
+     - type: f1_macro
+       value: 47.29
+       name: f1-macro
+     source:
+       url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=nicolasdec/Cabra
+       name: Open Portuguese LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: HateBR Binary
+       type: eduagarcia/portuguese_benchmark
+       split: test
+       args:
+         num_few_shot: 25
+     metrics:
+     - type: f1_macro
+       value: 81.71
+       name: f1-macro
+     - type: f1_macro
+       value: 64.84
+       name: f1-macro
+     source:
+       url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=nicolasdec/Cabra
+       name: Open Portuguese LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: tweetSentBR
+       type: eduagarcia-temp/tweetsentbr
+       split: test
+       args:
+         num_few_shot: 25
+     metrics:
+     - type: f1_macro
+       value: 50.33
+       name: f1-macro
+     source:
+       url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=nicolasdec/Cabra
+       name: Open Portuguese LLM Leaderboard
  ---
  **Check out our other (much better) models: [Cabra](https://huggingface.co/collections/botbot-ai/models-6604c2069ceef04f834ba99b)**
 
@@ -12,3 +133,20 @@ Cabra 7b is a QLoRA fine-tune of [LLaMA 2 7b Chat](https://huggingface.co/meta
 
  The model needs more training and may generate lies or falsehoods.
 
+
+ # [Open Portuguese LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard)
+ Detailed results can be found [here](https://huggingface.co/datasets/eduagarcia-temp/llm_pt_leaderboard_raw_results/tree/main/nicolasdec/Cabra)
+
+ | Metric                   | Value   |
+ |--------------------------|---------|
+ |Average                   |**50.41**|
+ |ENEM Challenge (No Images)|    33.24|
+ |BLUEX (No Images)         |    33.38|
+ |OAB Exams                 |    32.39|
+ |Assin2 RTE                |    81.12|
+ |Assin2 STS                |    29.38|
+ |FaQuAD NLI                |    47.29|
+ |HateBR Binary             |    81.71|
+ |PT Hate Speech Binary     |    64.84|
+ |tweetSentBR               |    50.33|
+
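
The Average row in the table above can be sanity-checked against the nine per-task scores. The sketch below assumes the leaderboard average is a plain unweighted mean rounded to two decimals; the PR does not state this. The `scores` dictionary is a hypothetical name used only for illustration, with values copied from the table.

```python
# Per-task scores copied from the results table added by this PR.
scores = {
    "ENEM Challenge (No Images)": 33.24,
    "BLUEX (No Images)": 33.38,
    "OAB Exams": 32.39,
    "Assin2 RTE": 81.12,
    "Assin2 STS": 29.38,
    "FaQuAD NLI": 47.29,
    "HateBR Binary": 81.71,
    "PT Hate Speech Binary": 64.84,
    "tweetSentBR": 50.33,
}

# Assumption: the reported average is an unweighted mean over all nine tasks,
# rounded to two decimal places.
average = round(sum(scores.values()) / len(scores), 2)
print(f"Average: {average}")
```

Under that assumption the result agrees with the 50.41 reported in the Average row.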