Adding the Open Portuguese LLM Leaderboard Evaluation Results

#1
Files changed (1)
  1. README.md +168 -2
README.md CHANGED
@@ -7,11 +7,158 @@ tags:
  - LoRA
  - Llama
  - Stanford-Alpaca
+ base_model: decapoda-research/llama-7b-hf
  datasets:
  - dominguesm/alpaca-data-pt-br
  thumbnail: https://huggingface.co/dominguesm/alpaca-lora-ptbr-7b/resolve/main/assets/alpaca_br_juliet_2.jpg
  inference: false
- base_model: decapoda-research/llama-7b-hf
+ model-index:
+ - name: alpaca-lora-ptbr-7b
+   results:
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: ENEM Challenge (No Images)
+       type: eduagarcia/enem_challenge
+       split: train
+       args:
+         num_few_shot: 3
+     metrics:
+     - type: acc
+       value: 22.32
+       name: accuracy
+     source:
+       url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=RogerioPiazzon/alpaca-lora-ptbr-7b
+       name: Open Portuguese LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: BLUEX (No Images)
+       type: eduagarcia-temp/BLUEX_without_images
+       split: train
+       args:
+         num_few_shot: 3
+     metrics:
+     - type: acc
+       value: 23.5
+       name: accuracy
+     source:
+       url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=RogerioPiazzon/alpaca-lora-ptbr-7b
+       name: Open Portuguese LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: OAB Exams
+       type: eduagarcia/oab_exams
+       split: train
+       args:
+         num_few_shot: 3
+     metrics:
+     - type: acc
+       value: 26.51
+       name: accuracy
+     source:
+       url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=RogerioPiazzon/alpaca-lora-ptbr-7b
+       name: Open Portuguese LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: Assin2 RTE
+       type: assin2
+       split: test
+       args:
+         num_few_shot: 15
+     metrics:
+     - type: f1_macro
+       value: 33.7
+       name: f1-macro
+     source:
+       url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=RogerioPiazzon/alpaca-lora-ptbr-7b
+       name: Open Portuguese LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: Assin2 STS
+       type: eduagarcia/portuguese_benchmark
+       split: test
+       args:
+         num_few_shot: 15
+     metrics:
+     - type: pearson
+       value: 18.17
+       name: pearson
+     source:
+       url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=RogerioPiazzon/alpaca-lora-ptbr-7b
+       name: Open Portuguese LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: FaQuAD NLI
+       type: ruanchaves/faquad-nli
+       split: test
+       args:
+         num_few_shot: 15
+     metrics:
+     - type: f1_macro
+       value: 56.27
+       name: f1-macro
+     source:
+       url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=RogerioPiazzon/alpaca-lora-ptbr-7b
+       name: Open Portuguese LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: HateBR Binary
+       type: ruanchaves/hatebr
+       split: test
+       args:
+         num_few_shot: 25
+     metrics:
+     - type: f1_macro
+       value: 33.33
+       name: f1-macro
+     source:
+       url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=RogerioPiazzon/alpaca-lora-ptbr-7b
+       name: Open Portuguese LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: PT Hate Speech Binary
+       type: hate_speech_portuguese
+       split: test
+       args:
+         num_few_shot: 25
+     metrics:
+     - type: f1_macro
+       value: 22.99
+       name: f1-macro
+     source:
+       url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=RogerioPiazzon/alpaca-lora-ptbr-7b
+       name: Open Portuguese LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: tweetSentBR
+       type: eduagarcia/tweetsentbr_fewshot
+       split: test
+       args:
+         num_few_shot: 25
+     metrics:
+     - type: f1_macro
+       value: 46.95
+       name: f1-macro
+     source:
+       url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=RogerioPiazzon/alpaca-lora-ptbr-7b
+       name: Open Portuguese LLM Leaderboard
  ---

  ## 🦙🇧🇷 Alpaca-LoRA-PTBR: Low-Rank LLaMA Instruct-Tuning
@@ -222,4 +369,23 @@ LLaMA is a foundational model, and as such, it should not be used for downstream
  ## References

  * Workout descriptions and script based on work done by [Eric J. Wang](https://github.com/tloen/alpaca-lora)
- * Training data based on original [Stanford Alpaca](https://crfm.stanford.edu/2023/03/13/alpaca.html) work
+ * Training data based on original [Stanford Alpaca](https://crfm.stanford.edu/2023/03/13/alpaca.html) work
+
+
+ # Open Portuguese LLM Leaderboard Evaluation Results
+
+ Detailed results can be found [here](https://huggingface.co/datasets/eduagarcia-temp/llm_pt_leaderboard_raw_results/tree/main/RogerioPiazzon/alpaca-lora-ptbr-7b) and on the [🚀 Open Portuguese LLM Leaderboard](https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard)
+
+ | Metric | Value |
+ |--------------------------|---------|
+ |Average |**31.53**|
+ |ENEM Challenge (No Images)| 22.32|
+ |BLUEX (No Images) | 23.50|
+ |OAB Exams | 26.51|
+ |Assin2 RTE | 33.70|
+ |Assin2 STS | 18.17|
+ |FaQuAD NLI | 56.27|
+ |HateBR Binary | 33.33|
+ |PT Hate Speech Binary | 22.99|
+ |tweetSentBR | 46.95|
+
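
Reviewer note: the Average row in the table above is consistent with an unweighted mean of the nine task scores. The following is a minimal sketch for checking that, assuming a plain arithmetic mean rounded to two decimals; the PR does not state how the leaderboard actually aggregates, and the names `scores` and `unweighted_mean` are illustrative only.

```python
# Scores copied from the results table added in this PR.
scores = {
    "ENEM Challenge (No Images)": 22.32,
    "BLUEX (No Images)": 23.50,
    "OAB Exams": 26.51,
    "Assin2 RTE": 33.70,
    "Assin2 STS": 18.17,
    "FaQuAD NLI": 56.27,
    "HateBR Binary": 33.33,
    "PT Hate Speech Binary": 22.99,
    "tweetSentBR": 46.95,
}


def unweighted_mean(values):
    """Return the arithmetic mean of an iterable of numbers."""
    values = list(values)
    return sum(values) / len(values)


# Assumption: the leaderboard average is a simple mean, rounded to 2 decimals.
average = round(unweighted_mean(scores.values()), 2)
print(f"Average over {len(scores)} tasks: {average}")
```

If the rounded result matches the Average value in the table, the per-task rows and the summary row agree with each other under this assumption.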