[DEMO] (https://jjhooww-mistral-reloadbr.hf.space/)
[HF] (JJhooww/MistralReloadBR_v2_ptbr)
LLM Studio Prompt: Mistral Instruction
Evaluations on Brazilian Portuguese benchmarks were performed using a Portuguese implementation of the EleutherAI LM Evaluation Harness (created by Eduardo Garcia).
Tasks | Version | n-shot | Metric | Value |
---|---|---|---|---|
enem | 1.1 | 3 | acc | 0.6193 |
assin2_rte | 1.1 | 15 | f1_macro | 0.9137 |
assin2_sts | 1.1 | 15 | pearson | 0.7758 |
bluex | 1.1 | 3 | acc | 0.4562 |
faquad_nli | 1.1 | 15 | f1_macro | 0.6580 |
hatebr_offensive_binary | 1.0 | 25 | f1_macro | 0.7059 |
oab_exams | 1.5 | 3 | acc | 0.4064 |
portuguese_hate_speech_binary | 1.0 | 25 | f1_macro | 0.6476 |
Complete:
Tasks | Version | Filter | n-shot | Metric | Value | Stderr | |
---|---|---|---|---|---|---|---|
assin2_rte | 1.1 | all | 15 | f1_macro | 0.9137 | ± | 0.0040 |
all | 15 | acc | 0.9138 | ± | 0.0040 | ||
assin2_sts | 1.1 | all | 15 | pearson | 0.7758 | ± | 0.0068 |
all | 15 | mse | 0.4613 | ± | N/A | ||
bluex | 1.1 | all | 3 | acc | 0.4562 | ± | 0.0107 |
exam_id__USP_2024 | 3 | acc | 0.5610 | ± | 0.0447 | ||
exam_id__USP_2020 | 3 | acc | 0.4464 | ± | 0.0384 | ||
exam_id__USP_2019 | 3 | acc | 0.4000 | ± | 0.0447 | ||
exam_id__USP_2021 | 3 | acc | 0.4615 | ± | 0.0399 | ||
exam_id__UNICAMP_2021_1 | 3 | acc | 0.4565 | ± | 0.0424 | ||
exam_id__UNICAMP_2021_2 | 3 | acc | 0.4510 | ± | 0.0401 | ||
exam_id__UNICAMP_2023 | 3 | acc | 0.5581 | ± | 0.0438 | ||
exam_id__USP_2022 | 3 | acc | 0.4490 | ± | 0.0411 | ||
exam_id__UNICAMP_2024 | 3 | acc | 0.4222 | ± | 0.0425 | ||
exam_id__USP_2023 | 3 | acc | 0.5682 | ± | 0.0429 | ||
exam_id__UNICAMP_2022 | 3 | acc | 0.5385 | ± | 0.0461 | ||
exam_id__UNICAMP_2019 | 3 | acc | 0.4000 | ± | 0.0400 | ||
exam_id__USP_2018 | 3 | acc | 0.4259 | ± | 0.0389 | ||
exam_id__UNICAMP_2020 | 3 | acc | 0.4545 | ± | 0.0389 | ||
exam_id__UNICAMP_2018 | 3 | acc | 0.3148 | ± | 0.0364 | ||
enem | 1.1 | all | 3 | acc | 0.6193 | ± | 0.0074 |
exam_id__2017 | 3 | acc | 0.6207 | ± | 0.0259 | ||
exam_id__2014 | 3 | acc | 0.6972 | ± | 0.0254 | ||
exam_id__2016 | 3 | acc | 0.6281 | ± | 0.0253 | ||
exam_id__2016_2 | 3 | acc | 0.5935 | ± | 0.0256 | ||
exam_id__2010 | 3 | acc | 0.5812 | ± | 0.0264 | ||
exam_id__2015 | 3 | acc | 0.5798 | ± | 0.0261 | ||
exam_id__2013 | 3 | acc | 0.5926 | ± | 0.0273 | ||
exam_id__2022 | 3 | acc | 0.6015 | ± | 0.0245 | ||
exam_id__2011 | 3 | acc | 0.6752 | ± | 0.0250 | ||
exam_id__2012 | 3 | acc | 0.6034 | ± | 0.0262 | ||
exam_id__2023 | 3 | acc | 0.6667 | ± | 0.0235 | ||
exam_id__2009 | 3 | acc | 0.5913 | ± | 0.0265 | ||
faquad_nli | 1.1 | all | 15 | f1_macro | 0.6580 | ± | 0.0177 |
all | 15 | acc | 0.8308 | ± | 0.0104 | ||
hatebr_offensive_binary | 1.0 | all | 25 | f1_macro | 0.7059 | ± | 0.0089 |
all | 25 | acc | 0.7250 | ± | 0.0084 | ||
oab_exams | 1.5 | all | 3 | acc | 0.4064 | ± | 0.0061 |
exam_id__2011-04 | 3 | acc | 0.4500 | ± | 0.0321 | ||
exam_id__2015-16 | 3 | acc | 0.3500 | ± | 0.0308 | ||
exam_id__2017-22 | 3 | acc | 0.4625 | ± | 0.0322 | ||
exam_id__2016-19 | 3 | acc | 0.4744 | ± | 0.0328 | ||
exam_id__2017-23 | 3 | acc | 0.4000 | ± | 0.0317 | ||
exam_id__2016-20 | 3 | acc | 0.4250 | ± | 0.0319 | ||
exam_id__2013-10 | 3 | acc | 0.4750 | ± | 0.0323 | ||
exam_id__2012-06a | 3 | acc | 0.4000 | ± | 0.0314 | ||
exam_id__2010-02 | 3 | acc | 0.4000 | ± | 0.0283 | ||
exam_id__2010-01 | 3 | acc | 0.3647 | ± | 0.0300 | ||
exam_id__2012-08 | 3 | acc | 0.3375 | ± | 0.0305 | ||
exam_id__2012-09 | 3 | acc | 0.2597 | ± | 0.0289 | ||
exam_id__2015-18 | 3 | acc | 0.4375 | ± | 0.0320 | ||
exam_id__2015-17 | 3 | acc | 0.5385 | ± | 0.0326 | ||
exam_id__2016-21 | 3 | acc | 0.3500 | ± | 0.0307 | ||
exam_id__2013-11 | 3 | acc | 0.4875 | ± | 0.0323 | ||
exam_id__2012-06 | 3 | acc | 0.4375 | ± | 0.0319 | ||
exam_id__2014-14 | 3 | acc | 0.5250 | ± | 0.0322 | ||
exam_id__2016-20a | 3 | acc | 0.3750 | ± | 0.0312 | ||
exam_id__2011-05 | 3 | acc | 0.3750 | ± | 0.0313 | ||
exam_id__2011-03 | 3 | acc | 0.3737 | ± | 0.0280 | ||
exam_id__2014-13 | 3 | acc | 0.3375 | ± | 0.0305 | ||
exam_id__2017-24 | 3 | acc | 0.3125 | ± | 0.0299 | ||
exam_id__2018-25 | 3 | acc | 0.4125 | ± | 0.0317 | ||
exam_id__2012-07 | 3 | acc | 0.3875 | ± | 0.0315 | ||
exam_id__2014-15 | 3 | acc | 0.4487 | ± | 0.0325 | ||
exam_id__2013-12 | 3 | acc | 0.3875 | ± | 0.0315 | ||
portuguese_hate_speech_binary | 1.0 | all | 25 | f1_macro | 0.6476 | ± | 0.0119 |
all | 25 | acc | 0.6710 | ± | 0.0114 |
- Downloads last month
- 59