Committed by leaderboard-pt-pr-bot
Commit d86b9ab • 1 parent: 8c87091
Adding the Open Portuguese LLM Leaderboard Evaluation Results
This is an automated PR created with https://huggingface.co/spaces/eduagarcia-temp/portuguese-leaderboard-results-to-modelcard
The purpose of this PR is to add evaluation results from the [🚀 Open Portuguese LLM Leaderboard](https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard) to your model card.
If you encounter any issues, please report them to https://huggingface.co/spaces/eduagarcia-temp/portuguese-leaderboard-results-to-modelcard/discussions
README.md CHANGED
@@ -1,14 +1,14 @@
 ---
-
+license: mit
 tags:
 - alignment-handbook
 - generated_from_trainer
+base_model: google/gemma-2-9b-it
 datasets:
 - princeton-nlp/gemma2-ultrafeedback-armorm
 model-index:
-- name: princeton-nlp/gemma-2-9b-it-SimPO
+- name: princeton-nlp/gemma-2-9b-it-SimPO
   results: []
-license: mit
 ---
 
 # gemma-2-9b-it-SimPO Model Card
@@ -135,4 +135,23 @@ ArmoRM paper:
 journal={arXiv preprint arXiv:2406.12845},
 year={2024}
 }
-```
+```
+
+
+# Open Portuguese LLM Leaderboard Evaluation Results
+
+Detailed results can be found [here](https://huggingface.co/datasets/eduagarcia-temp/llm_pt_leaderboard_raw_results/tree/main/princeton-nlp/gemma-2-9b-it-SimPO) and on the [🚀 Open Portuguese LLM Leaderboard](https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard)
+
+|          Metric          |  Value  |
+|--------------------------|---------|
+|Average                   |**73.28**|
+|ENEM Challenge (No Images)|    75.09|
+|BLUEX (No Images)         |    65.37|
+|OAB Exams                 |    54.21|
+|Assin2 RTE                |    93.82|
+|Assin2 STS                |    77.82|
+|FaQuAD NLI                |    70.45|
+|HateBR Binary             |    89.76|
+|PT Hate Speech Binary     |    66.68|
+|tweetSentBR               |    66.28|
+
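As a quick sanity check on the table added by this PR, the sketch below recomputes the Average row from the nine per-task scores. It assumes the leaderboard average is the unweighted arithmetic mean of those scores, which the PR does not state explicitly; the variable names are illustrative only.

```python
# Per-task scores copied from the results table added in this PR.
scores = {
    "ENEM Challenge (No Images)": 75.09,
    "BLUEX (No Images)": 65.37,
    "OAB Exams": 54.21,
    "Assin2 RTE": 93.82,
    "Assin2 STS": 77.82,
    "FaQuAD NLI": 70.45,
    "HateBR Binary": 89.76,
    "PT Hate Speech Binary": 66.68,
    "tweetSentBR": 66.28,
}

# Assumption: "Average" is the plain (unweighted) mean of the task scores.
average = sum(scores.values()) / len(scores)

print(f"Average: {average:.2f}")  # prints "Average: 73.28"
```

Under that assumption the recomputed value matches the reported 73.28, so the table is internally consistent.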