---
language:
- pt
license: mit
library_name: peft
tags:
- gptq
- ptbr
base_model: TheBloke/zephyr-7B-beta-GPTQ
revision: gptq-8bit-32g-actorder_True
model-index:
- name: cesar-ptbr
  results:
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: ENEM Challenge (No Images)
      type: eduagarcia/enem_challenge
      split: train
      args:
        num_few_shot: 3
    metrics:
    - type: acc
      value: 53.74
      name: accuracy
    source:
      url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=matheusrdgsf/cesar-ptbr
      name: Open Portuguese LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: BLUEX (No Images)
      type: eduagarcia-temp/BLUEX_without_images
      split: train
      args:
        num_few_shot: 3
    metrics:
    - type: acc
      value: 46.87
      name: accuracy
    source:
      url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=matheusrdgsf/cesar-ptbr
      name: Open Portuguese LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: OAB Exams
      type: eduagarcia/oab_exams
      split: train
      args:
        num_few_shot: 3
    metrics:
    - type: acc
      value: 38.27
      name: accuracy
    source:
      url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=matheusrdgsf/cesar-ptbr
      name: Open Portuguese LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: Assin2 RTE
      type: assin2
      split: test
      args:
        num_few_shot: 15
    metrics:
    - type: f1_macro
      value: 58.32
      name: f1-macro
    source:
      url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=matheusrdgsf/cesar-ptbr
      name: Open Portuguese LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: Assin2 STS
      type: eduagarcia/portuguese_benchmark
      split: test
      args:
        num_few_shot: 15
    metrics:
    - type: pearson
      value: 68.49
      name: pearson
    source:
      url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=matheusrdgsf/cesar-ptbr
      name: Open Portuguese LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: FaQuAD NLI
      type: ruanchaves/faquad-nli
      split: test
      args:
        num_few_shot: 15
    metrics:
    - type: f1_macro
      value: 73.81
      name: f1-macro
    source:
      url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=matheusrdgsf/cesar-ptbr
      name: Open Portuguese LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: HateBR Binary
      type: ruanchaves/hatebr
      split: test
      args:
        num_few_shot: 25
    metrics:
    - type: f1_macro
      value: 83.3
      name: f1-macro
    source:
      url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=matheusrdgsf/cesar-ptbr
      name: Open Portuguese LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: PT Hate Speech Binary
      type: hate_speech_portuguese
      split: test
      args:
        num_few_shot: 25
    metrics:
    - type: f1_macro
      value: 67.49
      name: f1-macro
    source:
      url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=matheusrdgsf/cesar-ptbr
      name: Open Portuguese LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: tweetSentBR
      type: eduagarcia/tweetsentbr_fewshot
      split: test
      args:
        num_few_shot: 25
    metrics:
    - type: f1_macro
      value: 42.71
      name: f1-macro
    source:
      url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=matheusrdgsf/cesar-ptbr
      name: Open Portuguese LLM Leaderboard
---

## Training procedure

The following GPTQ quantization config was used during training:
- quant_method: gptq
- bits: 8
- tokenizer: None
- dataset: None
- group_size: 32
- damp_percent: 0.1
- desc_act: True
- sym: True
- true_sequential: True
- use_cuda_fp16: False
- model_seqlen: 4096
- block_name_to_quantize: model.layers
- module_name_preceding_first_block: ['model.embed_tokens']
- batch_size: 1
- pad_token_id: None
- disable_exllama: True
- max_input_length: None

### Framework versions

- PEFT 0.5.0

# Load model

```python
from peft import PeftModel, PeftConfig
from transformers import AutoModelForCausalLM

config = PeftConfig.from_pretrained("matheusrdgsf/cesar-ptbr")

# Load the GPTQ-quantized base model, then attach the PEFT adapter.
model = AutoModelForCausalLM.from_pretrained(
    "TheBloke/zephyr-7B-beta-GPTQ",
    revision="gptq-8bit-32g-actorder_True",
    device_map="auto",
)
model = PeftModel.from_pretrained(model, "matheusrdgsf/cesar-ptbr")
```

# Easy inference

```python
import time

from transformers import AutoTokenizer, GenerationConfig

tokenizer_model = AutoTokenizer.from_pretrained("TheBloke/zephyr-7B-beta-GPTQ")
# Borrow the chat template from the zephyr-7b-alpha tokenizer.
tokenizer_template = AutoTokenizer.from_pretrained("HuggingFaceH4/zephyr-7b-alpha")

generation_config = GenerationConfig(
    do_sample=True,
    temperature=0.1,
    top_p=0.25,
    top_k=0,
    max_new_tokens=512,
    repetition_penalty=1.1,
    eos_token_id=tokenizer_model.eos_token_id,
    pad_token_id=tokenizer_model.eos_token_id,
)


def get_inference(
    text,
    model,
    tokenizer_model=tokenizer_model,
    tokenizer_template=tokenizer_template,
    generation_config=generation_config,
):
    st_time = time.time()
    inputs = tokenizer_model(
        tokenizer_template.apply_chat_template(
            [
                {
                    "role": "system",
                    # "You are a movie-recommendation chatbot. Reply in Portuguese,
                    # politely suggesting movies to users."
                    "content": "Você é um chatbot para indicação de filmes. "
                    "Responda em português e de maneira educada sugestões de filmes para os usuários.",
                },
                {"role": "user", "content": text},
            ],
            tokenize=False,
        ),
        return_tensors="pt",
    ).to("cuda")
    outputs = model.generate(**inputs, generation_config=generation_config)
    print("inference time:", time.time() - st_time)
    # Keep only the assistant's final line of the decoded output.
    return tokenizer_model.decode(outputs[0], skip_special_tokens=True).split("\n")[-1]


# "Could you suggest action movies up to 2 hours long?"
get_inference("Poderia indicar filmes de ação de até 2 horas?", model)
```

# Open Portuguese LLM Leaderboard Evaluation Results

Detailed results can be found [here](https://huggingface.co/datasets/eduagarcia-temp/llm_pt_leaderboard_raw_results/tree/main/matheusrdgsf/cesar-ptbr) and on the [🚀 Open Portuguese LLM Leaderboard](https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard).

| Metric                     | Value     |
|----------------------------|-----------|
| Average                    | **59.22** |
| ENEM Challenge (No Images) | 53.74     |
| BLUEX (No Images)          | 46.87     |
| OAB Exams                  | 38.27     |
| Assin2 RTE                 | 58.32     |
| Assin2 STS                 | 68.49     |
| FaQuAD NLI                 | 73.81     |
| HateBR Binary              | 83.30     |
| PT Hate Speech Binary      | 67.49     |
| tweetSentBR                | 42.71     |
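For reference, the reported average is the plain arithmetic mean of the nine task scores in the table above, which can be checked with a few lines of Python:

```python
# Per-task scores copied from the leaderboard table above.
scores = {
    "ENEM Challenge (No Images)": 53.74,
    "BLUEX (No Images)": 46.87,
    "OAB Exams": 38.27,
    "Assin2 RTE": 58.32,
    "Assin2 STS": 68.49,
    "FaQuAD NLI": 73.81,
    "HateBR Binary": 83.30,
    "PT Hate Speech Binary": 67.49,
    "tweetSentBR": 42.71,
}

# Unweighted mean across the nine tasks.
average = sum(scores.values()) / len(scores)
print(round(average, 2))  # 59.22
```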
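As an aside on the inference example earlier in this card: `apply_chat_template` with the zephyr tokenizer expands the message list into a tagged prompt string. The sketch below approximates that expansion so you can see what the model actually receives; `zephyr_prompt` is a hypothetical helper for illustration only, and the authoritative template ships with the `HuggingFaceH4/zephyr-7b-alpha` tokenizer.

```python
def zephyr_prompt(system: str, user: str) -> str:
    """Approximate the zephyr-style chat prompt for one system + one user turn."""
    return f"<|system|>\n{system}</s>\n<|user|>\n{user}</s>\n<|assistant|>\n"


# Example: the system/user pair from the inference snippet.
prompt = zephyr_prompt(
    "Você é um chatbot para indicação de filmes.",
    "Poderia indicar filmes de ação de até 2 horas?",
)
print(prompt)
```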