---
language:
- pt
license: mit
library_name: peft
tags:
- gptq
- ptbr
base_model: TheBloke/zephyr-7B-beta-GPTQ
revision: gptq-8bit-32g-actorder_True
model-index:
- name: cesar-ptbr
  results:
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: ENEM Challenge (No Images)
      type: eduagarcia/enem_challenge
      split: train
      args:
        num_few_shot: 3
    metrics:
    - type: acc
      value: 53.74
      name: accuracy
    source:
      url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=matheusrdgsf/cesar-ptbr
      name: Open Portuguese LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: BLUEX (No Images)
      type: eduagarcia-temp/BLUEX_without_images
      split: train
      args:
        num_few_shot: 3
    metrics:
    - type: acc
      value: 46.87
      name: accuracy
    source:
      url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=matheusrdgsf/cesar-ptbr
      name: Open Portuguese LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: OAB Exams
      type: eduagarcia/oab_exams
      split: train
      args:
        num_few_shot: 3
    metrics:
    - type: acc
      value: 38.27
      name: accuracy
    source:
      url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=matheusrdgsf/cesar-ptbr
      name: Open Portuguese LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: Assin2 RTE
      type: assin2
      split: test
      args:
        num_few_shot: 15
    metrics:
    - type: f1_macro
      value: 58.32
      name: f1-macro
    source:
      url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=matheusrdgsf/cesar-ptbr
      name: Open Portuguese LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: Assin2 STS
      type: eduagarcia/portuguese_benchmark
      split: test
      args:
        num_few_shot: 15
    metrics:
    - type: pearson
      value: 68.49
      name: pearson
    source:
      url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=matheusrdgsf/cesar-ptbr
      name: Open Portuguese LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: FaQuAD NLI
      type: ruanchaves/faquad-nli
      split: test
      args:
        num_few_shot: 15
    metrics:
    - type: f1_macro
      value: 73.81
      name: f1-macro
    source:
      url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=matheusrdgsf/cesar-ptbr
      name: Open Portuguese LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: HateBR Binary
      type: ruanchaves/hatebr
      split: test
      args:
        num_few_shot: 25
    metrics:
    - type: f1_macro
      value: 83.3
      name: f1-macro
    source:
      url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=matheusrdgsf/cesar-ptbr
      name: Open Portuguese LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: PT Hate Speech Binary
      type: hate_speech_portuguese
      split: test
      args:
        num_few_shot: 25
    metrics:
    - type: f1_macro
      value: 67.49
      name: f1-macro
    source:
      url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=matheusrdgsf/cesar-ptbr
      name: Open Portuguese LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: tweetSentBR
      type: eduagarcia/tweetsentbr_fewshot
      split: test
      args:
        num_few_shot: 25
    metrics:
    - type: f1_macro
      value: 42.71
      name: f1-macro
    source:
      url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=matheusrdgsf/cesar-ptbr
      name: Open Portuguese LLM Leaderboard
---
## Training procedure
The following GPTQ quantization config was used during training (an equivalent `GPTQConfig` sketch follows the list):
- quant_method: gptq
- bits: 8
- tokenizer: None
- dataset: None
- group_size: 32
- damp_percent: 0.1
- desc_act: True
- sym: True
- true_sequential: True
- use_cuda_fp16: False
- model_seqlen: 4096
- block_name_to_quantize: model.layers
- module_name_preceding_first_block: ['model.embed_tokens']
- batch_size: 1
- pad_token_id: None
- disable_exllama: True
- max_input_length: None
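
For reference, a sketch of the same settings expressed as a `transformers` `GPTQConfig` (assumes `optimum` and `auto-gptq` are installed; useful if you want to re-quantize a base model with identical parameters):

```python
from transformers import GPTQConfig

# Mirrors the config listed above; a calibration `dataset` (and `tokenizer`)
# must also be supplied when actually quantizing a model.
gptq_config = GPTQConfig(
    bits=8,
    group_size=32,
    damp_percent=0.1,
    desc_act=True,
    sym=True,
    true_sequential=True,
    use_cuda_fp16=False,
    model_seqlen=4096,
    batch_size=1,
)
```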
### Framework versions

- PEFT 0.5.0

# Load model
```python
from peft import PeftModel, PeftConfig
from transformers import AutoModelForCausalLM

config = PeftConfig.from_pretrained("matheusrdgsf/cesar-ptbr")
# Load the GPTQ-quantized base model at the matching revision.
model = AutoModelForCausalLM.from_pretrained(
    "TheBloke/zephyr-7B-beta-GPTQ",
    revision="gptq-8bit-32g-actorder_True",
    device_map="auto",
)
# Attach the PEFT adapter on top of the quantized base.
model = PeftModel.from_pretrained(model, "matheusrdgsf/cesar-ptbr")
```
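
A quick sanity check that the adapter attached correctly (a short sketch using the PEFT API):

```python
# Confirm the PEFT wrapper is in place and switch to inference mode.
model.eval()                        # disable dropout for inference
print(model.active_adapter)         # name of the loaded adapter, e.g. "default"
model.print_trainable_parameters()  # adapter parameters vs. frozen base parameters
```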
# Easy inference
```python
import time

from transformers import AutoTokenizer, GenerationConfig

# Tokenizer of the GPTQ base model (encodes prompts, decodes outputs).
tokenizer_model = AutoTokenizer.from_pretrained('TheBloke/zephyr-7B-beta-GPTQ')
# Tokenizer carrying the zephyr chat template (used only to format prompts).
tokenizer_template = AutoTokenizer.from_pretrained('HuggingFaceH4/zephyr-7b-alpha')

generation_config = GenerationConfig(
    do_sample=True,
    temperature=0.1,
    top_p=0.25,
    top_k=0,
    max_new_tokens=512,
    repetition_penalty=1.1,
    eos_token_id=tokenizer_model.eos_token_id,
    pad_token_id=tokenizer_model.eos_token_id,
)


def get_inference(
    text,
    model,
    tokenizer_model=tokenizer_model,
    tokenizer_template=tokenizer_template,
    generation_config=generation_config,
):
    st_time = time.time()
    # Build the chat prompt with the zephyr template, then tokenize it.
    inputs = tokenizer_model(
        tokenizer_template.apply_chat_template(
            [
                {
                    # System prompt (PT-BR): a movie-recommendation chatbot
                    # that answers users politely, in Portuguese.
                    "role": "system",
                    "content": "Você é um chatbot para indicação de filmes. Responda em português e de maneira educada sugestões de filmes para os usuários.",
                },
                {"role": "user", "content": text},
            ],
            tokenize=False,
        ),
        return_tensors="pt",
    ).to("cuda")
    outputs = model.generate(**inputs, generation_config=generation_config)
    print('inference time:', time.time() - st_time)
    # The decoded text includes the prompt; the answer is the last line.
    return tokenizer_model.decode(outputs[0], skip_special_tokens=True).split('\n')[-1]


# "Could you recommend action movies up to 2 hours long?"
get_inference('Poderia indicar filmes de ação de até 2 horas?', model)
```
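
For interactive use, the same setup can stream tokens as they are generated; a minimal sketch using `transformers`' `TextStreamer` (reuses `model`, the tokenizers, and `generation_config` from above):

```python
from transformers import TextStreamer

# Prints decoded tokens to stdout as soon as they are generated.
streamer = TextStreamer(tokenizer_model, skip_prompt=True, skip_special_tokens=True)

# "Could you recommend comedy movies?" (illustrative query)
prompt = tokenizer_template.apply_chat_template(
    [{"role": "user", "content": "Poderia indicar filmes de comédia?"}],
    tokenize=False,
)
inputs = tokenizer_model(prompt, return_tensors="pt").to("cuda")
model.generate(**inputs, generation_config=generation_config, streamer=streamer)
```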
# Open Portuguese LLM Leaderboard Evaluation Results
Detailed results can be found [here](https://huggingface.co/datasets/eduagarcia-temp/llm_pt_leaderboard_raw_results/tree/main/matheusrdgsf/cesar-ptbr) and on the [🚀 Open Portuguese LLM Leaderboard](https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard).
| Metric | Value |
|--------------------------|---------|
|Average |**59.22**|
|ENEM Challenge (No Images)| 53.74|
|BLUEX (No Images) | 46.87|
|OAB Exams | 38.27|
|Assin2 RTE | 58.32|
|Assin2 STS | 68.49|
|FaQuAD NLI | 73.81|
|HateBR Binary | 83.30|
|PT Hate Speech Binary | 67.49|
|tweetSentBR | 42.71|