dominguesm committed
Commit 2f2e3b6
Parent: 885bd35

Update README.md

Files changed (1)
  1. README.md +12 -18
README.md CHANGED
@@ -101,21 +101,15 @@ You can use the classic `generate` API:
 
 Evaluations on Brazilian Portuguese benchmarks were performed using a [Portuguese implementation of the EleutherAI LM Evaluation Harness](https://github.com/eduagarcia/lm-evaluation-harness-pt) (created by [Eduardo Garcia](https://github.com/eduagarcia/lm-evaluation-harness-pt)).
 
- | | **ASSIN2 RTE** | **ASSIN2 STS** | **BLUEX** | **ENEM** | **FAQUAD NLI** | **HateBR** | **OAB Exams** | **Average** |
- |--------------------|----------------|----------------|-----------|----------|----------------|------------|---------------|-------------|
- | Qwen-1.8B | 64.83 | 19.53 | 26.15 | 30.23 | 43.97 | 33.33 | 27.20 | 35.03 |
- | TinyLlama-1.1B | 58.93 | 13.57 | 22.81 | 22.25 | 43.97 | 36.92 | 23.64 | 31.72 |
- | TTL-460m | 53.93 | 12.66 | 22.81 | 19.87 | 49.01 | 33.59 | 27.06 | 31.27 |
- | XGLM-564m | 49.61 | 22.91 | 19.61 | 19.38 | 43.97 | 33.99 | 23.42 | 30.41 |
- | Bloom-1b7 | 53.60 | 4.81 | 21.42 | 18.96 | 43.97 | 34.89 | 23.05 | 28.67 |
- | TTL-160m | 53.36 | 2.58 | 21.84 | 18.75 | 43.97 | 36.88 | 22.60 | 28.56 |
- | OPT-125m | 39.77 | 2.00 | 21.84 | 17.42 | 43.97 | 47.04 | 22.78 | 27.83 |
- | Pythia-160 | 33.33 | 12.81 | 16.13 | 16.66 | 50.36 | 41.09 | 22.82 | 27.60 |
- | OLMo-1b | 34.12 | 9.28 | 18.92 | 20.29 | 43.97 | 41.33 | 22.96 | 27.26 |
- | Bloom-560m | 33.33 | 8.48 | 18.92 | 19.03 | 43.97 | 37.07 | 23.05 | 26.26 |
- | Pythia-410m | 33.33 | 4.80 | 19.47 | 19.45 | 43.97 | 33.33 | 23.01 | 25.33 |
- | OPT-350m | 33.33 | 3.65 | 20.72 | 17.35 | 44.71 | 33.33 | 23.01 | 25.15 |
- | GPT-2 small | 33.26 | 0.00 | 10.43 | 11.20 | 43.52 | 33.68 | 13.12 | 20.74 |
- | GPorTuguese | 33.33 | 3.85 | 14.74 | 3.01 | 28.81 | 33.33 | 21.23 | 19.75 |
- | **Mambarim-110M** | 40.64 | 3.11 | 13.90 | 14.76 | 00.15 | 49.00 | 20.27 | 17.72 |
- | Samba-1.1B | 33.33 | 1.30 | 8.07 | 10.22 | 17.72 | 35.79 | 15.03 | 17.35 |
+ Detailed results can be found [here](https://huggingface.co/datasets/eduagarcia-temp/llm_pt_leaderboard_raw_results/tree/main/dominguesm/mambarim-110m).
+
+ | Model | **Average** | ENEM | BLUEX | OAB Exams | ASSIN2 RTE | ASSIN2 STS | FAQUAD NLI | HateBR | PT Hate Speech | tweetSentBR | **Architecture** |
+ | -------------------------------------- | ----------- | ----- | ----- | --------- | ---------- | ---------- | ---------- | ------ | -------------- | ----------- | ------------------ |
+ | nicholasKluge/TeenyTinyLlama-460m | 28.86 | 20.15 | 25.73 | 27.02 | 53.61 | 13 | 46.41 | 33.59 | 22.99 | 17.28 | LlamaForCausalLM |
+ | nicholasKluge/TeenyTinyLlama-160m | 28.2 | 19.24 | 23.09 | 22.37 | 53.97 | 0.24 | 43.97 | 36.92 | 42.63 | 11.39 | LlamaForCausalLM |
+ | MulaBR/Mula-4x160-v0.1 | 26.24 | 21.34 | 25.17 | 25.06 | 33.57 | 11.35 | 43.97 | 41.5 | 22.99 | 11.24 | MixtralForCausalLM |
+ | nicholasKluge/TeenyTinyLlama-460m-Chat | 25.49 | 20.29 | 25.45 | 26.74 | 43.77 | 4.52 | 34 | 33.49 | 22.99 | 18.13 | LlamaForCausalLM |
+ | **dominguesm/mambarim-110m** | **14.16** | 18.4 | 10.57 | 21.87 | 16.09 | 1.89 | 9.29 | 15.75 | 17.77 | 15.79 | MambaForCausalLM |
+ | NOVA-vision-language/GloriaTA-3B | 4.09 | 1.89 | 3.2 | 5.19 | 0 | 2.32 | 0.26 | 0.28 | 23.52 | 0.19 | GPTNeoForCausalLM |
+
+
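For anyone wanting to reproduce rows like these, the sketch below shows how such a run might be scripted against the linked harness fork. It is an illustration under stated assumptions, not the exact setup used for this table: it assumes the fork keeps the upstream harness's `lm_eval.simple_evaluate` Python entry point (present in upstream v0.4), and the task names passed to `tasks` are guesses based on the leaderboard's benchmark names; check the fork's task registry for the real identifiers.

```python
# Hypothetical sketch of an evaluation run with the Portuguese harness fork.
# Assumptions (unverified here): the fork exposes the upstream v0.4
# `simple_evaluate` API, and these task names exist in its registry.
#
# Install the fork linked above, e.g.:
#   pip install git+https://github.com/eduagarcia/lm-evaluation-harness-pt.git

import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",                                        # Hugging Face transformers backend
    model_args="pretrained=dominguesm/mambarim-110m",  # model under evaluation
    tasks=["assin2_rte", "assin2_sts", "faquad_nli"],  # illustrative task names
    batch_size=8,
)

# `results["results"]` maps each task name to its metric dict.
for task, metrics in results["results"].items():
    print(task, metrics)
```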