---
language:
- pt
license: mit
library_name: peft
tags:
- gptq
- ptbr
base_model: TheBloke/zephyr-7B-beta-GPTQ
revision: gptq-8bit-32g-actorder_True
model-index:
- name: cesar-ptbr
  results:
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: ENEM Challenge (No Images)
      type: eduagarcia/enem_challenge
      split: train
      args:
        num_few_shot: 3
    metrics:
    - type: acc
      value: 53.74
      name: accuracy
    source:
      url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=matheusrdgsf/cesar-ptbr
      name: Open Portuguese LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: BLUEX (No Images)
      type: eduagarcia-temp/BLUEX_without_images
      split: train
      args:
        num_few_shot: 3
    metrics:
    - type: acc
      value: 46.87
      name: accuracy
    source:
      url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=matheusrdgsf/cesar-ptbr
      name: Open Portuguese LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: OAB Exams
      type: eduagarcia/oab_exams
      split: train
      args:
        num_few_shot: 3
    metrics:
    - type: acc
      value: 38.27
      name: accuracy
    source:
      url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=matheusrdgsf/cesar-ptbr
      name: Open Portuguese LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: Assin2 RTE
      type: assin2
      split: test
      args:
        num_few_shot: 15
    metrics:
    - type: f1_macro
      value: 58.32
      name: f1-macro
    source:
      url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=matheusrdgsf/cesar-ptbr
      name: Open Portuguese LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: Assin2 STS
      type: eduagarcia/portuguese_benchmark
      split: test
      args:
        num_few_shot: 15
    metrics:
    - type: pearson
      value: 68.49
      name: pearson
    source:
      url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=matheusrdgsf/cesar-ptbr
      name: Open Portuguese LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: FaQuAD NLI
      type: ruanchaves/faquad-nli
      split: test
      args:
        num_few_shot: 15
    metrics:
    - type: f1_macro
      value: 73.81
      name: f1-macro
    source:
      url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=matheusrdgsf/cesar-ptbr
      name: Open Portuguese LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: HateBR Binary
      type: ruanchaves/hatebr
      split: test
      args:
        num_few_shot: 25
    metrics:
    - type: f1_macro
      value: 83.3
      name: f1-macro
    source:
      url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=matheusrdgsf/cesar-ptbr
      name: Open Portuguese LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: PT Hate Speech Binary
      type: hate_speech_portuguese
      split: test
      args:
        num_few_shot: 25
    metrics:
    - type: f1_macro
      value: 67.49
      name: f1-macro
    source:
      url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=matheusrdgsf/cesar-ptbr
      name: Open Portuguese LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: tweetSentBR
      type: eduagarcia/tweetsentbr_fewshot
      split: test
      args:
        num_few_shot: 25
    metrics:
    - type: f1_macro
      value: 42.71
      name: f1-macro
    source:
      url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=matheusrdgsf/cesar-ptbr
      name: Open Portuguese LLM Leaderboard
---

## Training procedure

The following GPTQ quantization config was used during training:
- quant_method: gptq
- bits: 8
- tokenizer: None
- dataset: None
- group_size: 32
- damp_percent: 0.1
- desc_act: True
- sym: True
- true_sequential: True
- use_cuda_fp16: False
- model_seqlen: 4096
- block_name_to_quantize: model.layers
- module_name_preceding_first_block: ['model.embed_tokens']
- batch_size: 1
- pad_token_id: None
- disable_exllama: True
- max_input_length: None

### Framework versions

- PEFT 0.5.0

# Load model

```python
from peft import PeftModel, PeftConfig
from transformers import AutoModelForCausalLM

config = PeftConfig.from_pretrained("matheusrdgsf/cesar-ptbr")

# Load the GPTQ-quantized base model, then attach the PEFT adapter.
model = AutoModelForCausalLM.from_pretrained(
    "TheBloke/zephyr-7B-beta-GPTQ",
    revision="gptq-8bit-32g-actorder_True",
    device_map="auto",
)
model = PeftModel.from_pretrained(model, "matheusrdgsf/cesar-ptbr")
```

# Easy inference

```python
import time

from transformers import AutoTokenizer, GenerationConfig

tokenizer_model = AutoTokenizer.from_pretrained("TheBloke/zephyr-7B-beta-GPTQ")
# Borrow the chat template from the zephyr-7b-alpha tokenizer.
tokenizer_template = AutoTokenizer.from_pretrained("HuggingFaceH4/zephyr-7b-alpha")

generation_config = GenerationConfig(
    do_sample=True,
    temperature=0.1,
    top_p=0.25,
    top_k=0,
    max_new_tokens=512,
    repetition_penalty=1.1,
    eos_token_id=tokenizer_model.eos_token_id,
    pad_token_id=tokenizer_model.eos_token_id,
)


def get_inference(
    text,
    model,
    tokenizer_model=tokenizer_model,
    tokenizer_template=tokenizer_template,
    generation_config=generation_config,
):
    st_time = time.time()
    inputs = tokenizer_model(
        tokenizer_template.apply_chat_template(
            [
                {
                    "role": "system",
                    # "You are a movie-recommendation chatbot. Reply in Portuguese,
                    # politely suggesting movies to users."
                    "content": "Você é um chatbot para indicação de filmes. "
                    "Responda em português e de maneira educada sugestões de filmes para os usuários.",
                },
                {"role": "user", "content": text},
            ],
            tokenize=False,
        ),
        return_tensors="pt",
    ).to("cuda")
    outputs = model.generate(**inputs, generation_config=generation_config)
    print("inference time:", time.time() - st_time)
    # Keep only the assistant's final line of the decoded output.
    return tokenizer_model.decode(outputs[0], skip_special_tokens=True).split("\n")[-1]


# "Could you suggest action movies up to 2 hours long?"
get_inference("Poderia indicar filmes de ação de até 2 horas?", model)
```

# Open Portuguese LLM Leaderboard Evaluation Results

Detailed results can be found [here](https://huggingface.co/datasets/eduagarcia-temp/llm_pt_leaderboard_raw_results/tree/main/matheusrdgsf/cesar-ptbr) and on the [🚀 Open Portuguese LLM Leaderboard](https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard).

| Metric                     | Value     |
|----------------------------|-----------|
| Average                    | **59.22** |
| ENEM Challenge (No Images) | 53.74     |
| BLUEX (No Images)          | 46.87     |
| OAB Exams                  | 38.27     |
| Assin2 RTE                 | 58.32     |
| Assin2 STS                 | 68.49     |
| FaQuAD NLI                 | 73.81     |
| HateBR Binary              | 83.30     |
| PT Hate Speech Binary      | 67.49     |
| tweetSentBR                | 42.71     |
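For reference, the reported average is the plain arithmetic mean of the nine task scores in the table above, which can be checked with a few lines of Python:

```python
# Per-task scores copied from the leaderboard table above.
scores = {
    "ENEM Challenge (No Images)": 53.74,
    "BLUEX (No Images)": 46.87,
    "OAB Exams": 38.27,
    "Assin2 RTE": 58.32,
    "Assin2 STS": 68.49,
    "FaQuAD NLI": 73.81,
    "HateBR Binary": 83.30,
    "PT Hate Speech Binary": 67.49,
    "tweetSentBR": 42.71,
}

# Unweighted mean across the nine tasks.
average = sum(scores.values()) / len(scores)
print(round(average, 2))  # 59.22
```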
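As an aside on the inference example earlier in this card: `apply_chat_template` with the zephyr tokenizer expands the message list into a tagged prompt string. The sketch below approximates that expansion so you can see what the model actually receives; `zephyr_prompt` is a hypothetical helper for illustration only, and the authoritative template ships with the `HuggingFaceH4/zephyr-7b-alpha` tokenizer.

```python
def zephyr_prompt(system: str, user: str) -> str:
    """Approximate the zephyr-style chat prompt for one system + one user turn."""
    return f"<|system|>\n{system}</s>\n<|user|>\n{user}</s>\n<|assistant|>\n"


# Example: the system/user pair from the inference snippet.
prompt = zephyr_prompt(
    "Você é um chatbot para indicação de filmes.",
    "Poderia indicar filmes de ação de até 2 horas?",
)
print(prompt)
```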