File size: 5,920 Bytes

---
license: apache-2.0
tags:
- JJhooww/Mistral-7B-v0.2-Base_ptbr
- J-LAB/BRisa
model-index:
- name: BRisa-7B-Instruct-v0.2
  results:
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: ENEM Challenge (No Images)
      type: eduagarcia/enem_challenge
      split: train
      args:
        num_few_shot: 3
    metrics:
    - type: acc
      value: 65.08
      name: accuracy
    source:
      url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=J-LAB/BRisa-7B-Instruct-v0.2
      name: Open Portuguese LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: BLUEX (No Images)
      type: eduagarcia-temp/BLUEX_without_images
      split: train
      args:
        num_few_shot: 3
    metrics:
    - type: acc
      value: 53.69
      name: accuracy
    source:
      url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=J-LAB/BRisa-7B-Instruct-v0.2
      name: Open Portuguese LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: OAB Exams
      type: eduagarcia/oab_exams
      split: train
      args:
        num_few_shot: 3
    metrics:
    - type: acc
      value: 43.37
      name: accuracy
    source:
      url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=J-LAB/BRisa-7B-Instruct-v0.2
      name: Open Portuguese LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: Assin2 RTE
      type: assin2
      split: test
      args:
        num_few_shot: 15
    metrics:
    - type: f1_macro
      value: 91.5
      name: f1-macro
    source:
      url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=J-LAB/BRisa-7B-Instruct-v0.2
      name: Open Portuguese LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: Assin2 STS
      type: eduagarcia/portuguese_benchmark
      split: test
      args:
        num_few_shot: 15
    metrics:
    - type: pearson
      value: 73.61
      name: pearson
    source:
      url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=J-LAB/BRisa-7B-Instruct-v0.2
      name: Open Portuguese LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: FaQuAD NLI
      type: ruanchaves/faquad-nli
      split: test
      args:
        num_few_shot: 15
    metrics:
    - type: f1_macro
      value: 68.31
      name: f1-macro
    source:
      url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=J-LAB/BRisa-7B-Instruct-v0.2
      name: Open Portuguese LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: HateBR Binary
      type: ruanchaves/hatebr
      split: test
      args:
        num_few_shot: 25
    metrics:
    - type: f1_macro
      value: 74.28
      name: f1-macro
    source:
      url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=J-LAB/BRisa-7B-Instruct-v0.2
      name: Open Portuguese LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: PT Hate Speech Binary
      type: hate_speech_portuguese
      split: test
      args:
        num_few_shot: 25
    metrics:
    - type: f1_macro
      value: 65.12
      name: f1-macro
    source:
      url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=J-LAB/BRisa-7B-Instruct-v0.2
      name: Open Portuguese LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: tweetSentBR
      type: eduagarcia/tweetsentbr_fewshot
      split: test
      args:
        num_few_shot: 25
    metrics:
    - type: f1_macro
      value: 60.77
      name: f1-macro
    source:
      url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=J-LAB/BRisa-7B-Instruct-v0.2
      name: Open Portuguese LLM Leaderboard
---

# BRisa 7B Instruct

This is an instruction model trained for good performance in Portuguese. The initial base is the Mistral 7B v2 Model ([source](https://huggingface.co/mistral-community/Mistral-7B-v0.2)). We utilized the JJhooww/Mistral-7B-v0.2-Base_ptbr version pre-trained on 1 billion tokens in Portuguese ([source](https://huggingface.co/JJhooww/Mistral-7B-v0.2-Base_ptbr)).

The base model has good performance in Portuguese but faces significant challenges following instructions. We therefore used the version mistralai/Mistral-7B-Instruct-v0.2 and fine-tuned it for responses in Portuguese, then merged it with the base JJhooww/Mistral-7B-v0.2-Base_ptbr (https://huggingface.co/JJhooww/Mistral-7B-v0.2-Base_ptbr).

- **Developed by:** ([J-LAB](https://huggingface.co/J-LAB/))
- **Language(s) (NLP):** Portuguese
- **License:** *APACHE*
- **Finetuned from model:** ([source](https://huggingface.co/JJhooww/Mistral-7B-v0.2-Base_ptbr))

### Model Sources

- **Demo:** ([Demonstracao da Versão DPO](https://huggingface.co/spaces/J-LAB/BRisa-7B))


# Open Portuguese LLM Leaderboard Evaluation Results  

Detailed results can be found [here](https://huggingface.co/datasets/eduagarcia-temp/llm_pt_leaderboard_raw_results/tree/main/J-LAB/BRisa-7B-Instruct-v0.2) and on the [🚀 Open Portuguese LLM Leaderboard](https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard)

|          Metric          |  Value  |
|--------------------------|---------|
|Average                   |**66.19**|
|ENEM Challenge (No Images)|    65.08|
|BLUEX (No Images)         |    53.69|
|OAB Exams                 |    43.37|
|Assin2 RTE                |    91.50|
|Assin2 STS                |    73.61|
|FaQuAD NLI                |    68.31|
|HateBR Binary             |    74.28|
|PT Hate Speech Binary     |    65.12|
|tweetSentBR               |    60.77|