leaderboard-pt-pr-bot committed on
Commit
9d130f5
1 Parent(s): d8d56fe

Adding the Open Portuguese LLM Leaderboard Evaluation Results

This is an automated PR created with https://huggingface.co/spaces/eduagarcia-temp/portuguese-leaderboard-results-to-modelcard

The purpose of this PR is to add evaluation results from the Open Portuguese LLM Leaderboard to your model card.

If you encounter any issues, please report them to https://huggingface.co/spaces/eduagarcia-temp/portuguese-leaderboard-results-to-modelcard/discussions

Files changed (1)
  1. README.md +143 -6
README.md CHANGED
@@ -1,16 +1,137 @@
 ---
-library_name: transformers
-base_model: codellama/CodeLlama-7b-Instruct-hf
-license: llama2
-datasets:
-- semantixai/Test-Dataset-Lloro
 language:
 - pt
+license: llama2
+library_name: transformers
 tags:
 - code
 - analytics
 - analise-dados
 - portugues-BR
+base_model: codellama/CodeLlama-7b-Instruct-hf
+datasets:
+- semantixai/Test-Dataset-Lloro
+model-index:
+- name: LloroV2
+  results:
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: ENEM Challenge (No Images)
+      type: eduagarcia/enem_challenge
+      split: train
+      args:
+        num_few_shot: 3
+    metrics:
+    - type: acc
+      value: 26.03
+      name: accuracy
+    source:
+      url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=semantixai/LloroV2
+      name: Open Portuguese LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: BLUEX (No Images)
+      type: eduagarcia-temp/BLUEX_without_images
+      split: train
+      args:
+        num_few_shot: 3
+    metrics:
+    - type: acc
+      value: 29.07
+      name: accuracy
+    source:
+      url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=semantixai/LloroV2
+      name: Open Portuguese LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: OAB Exams
+      type: eduagarcia/oab_exams
+      split: train
+      args:
+        num_few_shot: 3
+    metrics:
+    - type: acc
+      value: 32.53
+      name: accuracy
+    source:
+      url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=semantixai/LloroV2
+      name: Open Portuguese LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: Assin2 RTE
+      type: assin2
+      split: test
+      args:
+        num_few_shot: 15
+    metrics:
+    - type: f1_macro
+      value: 57.19
+      name: f1-macro
+    - type: pearson
+      value: 26.81
+      name: pearson
+    source:
+      url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=semantixai/LloroV2
+      name: Open Portuguese LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: FaQuAD NLI
+      type: ruanchaves/faquad-nli
+      split: test
+      args:
+        num_few_shot: 15
+    metrics:
+    - type: f1_macro
+      value: 43.77
+      name: f1-macro
+    source:
+      url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=semantixai/LloroV2
+      name: Open Portuguese LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: HateBR Binary
+      type: eduagarcia/portuguese_benchmark
+      split: test
+      args:
+        num_few_shot: 25
+    metrics:
+    - type: f1_macro
+      value: 68.02
+      name: f1-macro
+    - type: f1_macro
+      value: 38.53
+      name: f1-macro
+    source:
+      url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=semantixai/LloroV2
+      name: Open Portuguese LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: tweetSentBR
+      type: eduagarcia-temp/tweetsentbr
+      split: test
+      args:
+        num_few_shot: 25
+    metrics:
+    - type: f1_macro
+      value: 35.21
+      name: f1-macro
+    source:
+      url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=semantixai/LloroV2
+      name: Open Portuguese LLM Leaderboard
 ---
 
 **Lloro 7B**
@@ -166,4 +287,20 @@ The following parameters related with the Quantized Low-Rank Adaptation and Qua
 | Datasets | 2.14.3 |
 | Pytorch | 2.0.1 |
 | Tokenizers | 0.14.1 |
-| Transformers | 4.34.0 |
+| Transformers | 4.34.0 |
+# [Open Portuguese LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard)
+Detailed results can be found [here](https://huggingface.co/datasets/eduagarcia-temp/llm_pt_leaderboard_raw_results/tree/main/semantixai/LloroV2)
+
+|          Metric          |  Value  |
+|--------------------------|---------|
+|Average                   |**39.68**|
+|ENEM Challenge (No Images)|    26.03|
+|BLUEX (No Images)         |    29.07|
+|OAB Exams                 |    32.53|
+|Assin2 RTE                |    57.19|
+|Assin2 STS                |    26.81|
+|FaQuAD NLI                |    43.77|
+|HateBR Binary             |    68.02|
+|PT Hate Speech Binary     |    38.53|
+|tweetSentBR               |    35.21|
+
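The Average row in the results table can be checked against the nine per-task scores. The sketch below assumes the average is an unweighted arithmetic mean rounded to two decimals. The PR does not state this, but the numbers are consistent with it. The `scores` dictionary and the `leaderboard_average` helper are illustrative names, not part of the leaderboard tooling.

```python
from statistics import mean

# Per-task scores copied from the results table added to README.md.
scores = {
    "ENEM Challenge (No Images)": 26.03,
    "BLUEX (No Images)": 29.07,
    "OAB Exams": 32.53,
    "Assin2 RTE": 57.19,
    "Assin2 STS": 26.81,
    "FaQuAD NLI": 43.77,
    "HateBR Binary": 68.02,
    "PT Hate Speech Binary": 38.53,
    "tweetSentBR": 35.21,
}


def leaderboard_average(values):
    """Unweighted mean of the task scores, rounded to two decimals.

    Assumption: the leaderboard weights every task equally.
    """
    return round(mean(values), 2)


average = leaderboard_average(list(scores.values()))
print(f"Average: {average:.2f}")
```

The table lists nine tasks while the `model-index` block has seven result entries: the Assin2 STS score (26.81) appears as the `pearson` metric under Assin2 RTE, and the PT Hate Speech Binary score (38.53) appears as the second `f1_macro` metric under HateBR Binary.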