
Lloro 7B


Lloro, developed by Semantix Research Labs, is a language model trained to perform data analysis in Python for Portuguese-language requests. It is a fine-tuned version of codellama/CodeLlama-7b-Instruct-hf trained on synthetic datasets. The fine-tuning was performed using the QLoRA methodology on an A100 GPU with 40 GB of memory.

Model description

Model type: A 7B parameter model fine-tuned on synthetic datasets.

Language(s) (NLP): Primarily Portuguese, but the model can understand English as well

Finetuned from model: codellama/CodeLlama-7b-Instruct-hf

What are Lloro's intended uses?

Lloro is built for data analysis in Portuguese contexts.

Input: Text

Output: Text (code)

Usage

Using Transformers

#Import required libraries
import torch
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer
)

#Load Model
model_name = "semantixai/LloroV2"
base_model = AutoModelForCausalLM.from_pretrained(
        model_name,
        return_dict=True,
        torch_dtype=torch.float16,
        device_map="auto",
    )

#Load Tokenizer
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)


#Define Prompt
#(The user prompt asks, in Portuguese, for a Python algorithm that computes the mean and median of sale prices per product material type.)
user_prompt = "Desenvolva um algoritmo em Python para calcular a média e a mediana dos preços de vendas por tipo de material do produto."
system = "Provide answers in Python without explanations, only the code"
prompt_template = f"[INST] <<SYS>>\n{system}\n<</SYS>>\n\n{user_prompt}[/INST]"

#Call the model
input_ids = tokenizer([prompt_template], return_tensors="pt")["input_ids"].to("cuda")

outputs = base_model.generate(
    input_ids,
    do_sample=True,
    top_p=0.95,
    max_new_tokens=1024,
    temperature=0.1,
)

#Decode and retrieve Output
output_text = tokenizer.batch_decode(outputs, skip_special_tokens=True)
print(output_text[0])
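
Note that batch_decode above returns the full sequence, prompt included. If only the newly generated code is wanted, a common pattern (not specific to Lloro) is to slice off the prompt tokens before decoding:

#Decode only the tokens generated after the prompt
generated_tokens = outputs[0][input_ids.shape[-1]:]
generated_code = tokenizer.decode(generated_tokens, skip_special_tokens=True)
print(generated_code)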

Using an OpenAI compatible inference server (like vLLM)

from openai import OpenAI

client = OpenAI(
    api_key="EMPTY",
    base_url="http://localhost:8000/v1",
)
user_prompt = "Desenvolva um algoritmo em Python para calcular a média e a mediana dos preços de vendas por tipo de material do produto."
completion = client.chat.completions.create(
    model="semantixai/Lloro",
    temperature=0.1,
    frequency_penalty=0.1,
    messages=[
        {"role": "system", "content": "Provide answers in Python without explanations, only the code"},
        {"role": "user", "content": user_prompt},
    ],
)
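
This example assumes an OpenAI-compatible server is already running locally on port 8000 and serving the model under the name passed in the request. As a sketch (exact flags may differ across vLLM versions), the server can be started and the generated code read back as follows:

#Start the server in a shell, e.g. with vLLM (flags may vary by version):
#  python -m vllm.entrypoints.openai.api_server --model semantixai/Lloro --port 8000

#Read the generated code from the chat completion response
print(completion.choices[0].message.content)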

Training Parameters

| Params | Training Data | Examples | Tokens | LR |
|---|---|---|---|---|
| 7B | Pairs of synthetic instructions/code | 74,222 | 9,351,532 | 2e-4 |

Model Sources

Test Dataset Repository: https://huggingface.co/datasets/semantixai/LloroV3

Model Dates: Lloro was trained between February 2024 and April 2024.

Performance

| Model | LLM as Judge | CodeBLEU Score | ROUGE-L | CodeBERT Precision | CodeBERT Recall | CodeBERT F1 | CodeBERT F3 |
|---|---|---|---|---|---|---|---|
| GPT-3.5 | 94.29% | 0.3538 | 0.3756 | 0.8099 | 0.8176 | 0.8128 | 0.8164 |
| Instruct-Base | 88.77% | 0.3666 | 0.3351 | 0.8244 | 0.8025 | 0.8121 | 0.8052 |
| Instruct-FT | 97.95% | 0.5967 | 0.6717 | 0.9090 | 0.9182 | 0.9131 | 0.9171 |
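
The exact evaluation harness is not included in this card. Purely as an illustration, a ROUGE-L score like the one reported above can be computed with the Hugging Face evaluate library (the prediction and reference strings below are made up):

import evaluate

#Compare generated code against a reference solution (illustrative strings only)
rouge = evaluate.load("rouge")
predictions = ["df.groupby('material')['preco'].agg(['mean', 'median'])"]
references = ["df.groupby('material')['preco'].agg(['mean', 'median'])"]
scores = rouge.compute(predictions=predictions, references=references)
print(scores["rougeL"])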

Training Info: The following hyperparameters were used during training:

| Parameter | Value |
|---|---|
| learning_rate | 2e-4 |
| weight_decay | 0.0001 |
| train_batch_size | 7 |
| eval_batch_size | 7 |
| seed | 42 |
| optimizer | Adam (paged_adamw_32bit) |
| lr_scheduler_type | cosine |
| lr_scheduler_warmup_ratio | 0.06 |
| num_epochs | 4.0 |
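
The training script itself is not published in this card. As an illustration only, the hyperparameters above map onto standard Hugging Face TrainingArguments roughly as follows; the argument names are the usual Transformers ones, not taken from the original code, and output_dir is a placeholder:

from transformers import TrainingArguments

#Hypothetical mapping of the reported hyperparameters onto TrainingArguments
training_args = TrainingArguments(
    output_dir="lloro-7b-finetune",   # placeholder path
    learning_rate=2e-4,
    weight_decay=0.0001,
    per_device_train_batch_size=7,
    per_device_eval_batch_size=7,
    seed=42,
    optim="paged_adamw_32bit",
    lr_scheduler_type="cosine",
    warmup_ratio=0.06,
    num_train_epochs=4.0,
)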

QLoRA hyperparameters: The following parameters related to Quantized Low-Rank Adaptation (QLoRA) and quantization were used during training:

| Parameter | Value |
|---|---|
| lora_r | 64 |
| lora_alpha | 256 |
| lora_dropout | 0.1 |
| storage_dtype | "nf4" |
| compute_dtype | "bfloat16" |
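
As a sketch of how these settings are typically expressed with the bitsandbytes and PEFT libraries (not the original training code), the 4-bit quantization and LoRA adapters would be configured roughly like this:

import torch
from transformers import BitsAndBytesConfig
from peft import LoraConfig

#Hypothetical reconstruction of the reported QLoRA settings
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",              # storage dtype
    bnb_4bit_compute_dtype=torch.bfloat16,  # compute dtype
)

lora_config = LoraConfig(
    r=64,
    lora_alpha=256,
    lora_dropout=0.1,
    task_type="CAUSAL_LM",
)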

Experiments

| Model | Epochs | Overfitting | Final Epochs | Training Hours | CO2 Emission (kg) |
|---|---|---|---|---|---|
| Code Llama Instruct | 1 | No | 1 | 3.01 | 0.43 |
| Code Llama Instruct | 4 | Yes | 3 | 9.25 | 1.32 |

Framework versions

| Library | Version |
|---|---|
| bitsandbytes | 0.40.2 |
| Datasets | 2.14.3 |
| PyTorch | 2.0.1 |
| Tokenizers | 0.14.1 |
| Transformers | 4.34.0 |