fernandofinardi committed
Commit 316a7a6
1 Parent(s): 503b198

Update README.md

---
library_name: peft
base_model: codellama/CodeLlama-7b-Instruct-hf
---

**Lloro 7B**

Lloro, developed by Semantix Research Labs, is a language model trained to effectively perform data analysis in Portuguese. It is a fine-tuned version of codellama/CodeLlama-7b-Instruct-hf trained on synthetic datasets. The fine-tuning process was performed using the QLoRA methodology on a V100 GPU with 16 GB of RAM.

**Model description**

Model type: A 7B-parameter model fine-tuned on synthetic datasets.

Language(s) (NLP): Primarily Portuguese, but the model can understand English as well.

Finetuned from model: codellama/CodeLlama-7b-Instruct-hf

**What is Lloro's intended use(s)?**

Lloro is built for data analysis in Portuguese contexts.

Input: Text

Output: Text (Code)
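
Lloro is distributed as a PEFT adapter on top of codellama/CodeLlama-7b-Instruct-hf, so a typical way to try it is to load the base model (here in 4-bit, mirroring the QLoRA setup) and attach the adapter with `peft`. The sketch below is illustrative only: the adapter id is a placeholder for this repository, and the Portuguese prompt is just an example, not an official template.

```python
# Minimal usage sketch (illustrative). ADAPTER_ID is a placeholder for this
# repository's id; the prompt wording is an example, not an official template.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel

BASE_MODEL_ID = "codellama/CodeLlama-7b-Instruct-hf"
ADAPTER_ID = "<this-repo-id>"  # replace with the Lloro adapter repository

# Load the base model in 4-bit (nf4) to keep memory usage close to the training setup.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)
tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL_ID)
base_model = AutoModelForCausalLM.from_pretrained(
    BASE_MODEL_ID, quantization_config=bnb_config, device_map="auto"
)
model = PeftModel.from_pretrained(base_model, ADAPTER_ID)

# Portuguese data-analysis request in; Python code out.
prompt = "Carregue o arquivo vendas.csv com pandas e mostre a média da coluna 'valor'."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```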

**Params**

Training Parameters

| Params | Training Data                    | Examples | Tokens    | LR   |
|--------|----------------------------------|----------|-----------|------|
| 7B     | Synthetic instruction/code pairs | 28,907   | 3,031,188 | 1e-5 |

**Model Sources**

Repository: https://gitlab.com/semantix-labs/generative-ai/lloro

Dataset Repository: https://gitlab.com/semantix-labs/generative-ai/lloro-datasets

Model Dates: Lloro was trained between November 2023 and January 2024.

**Performance**

| Model         | LLM as Judge | CodeBLEU Score | ROUGE-L | CodeBERT-Precision | CodeBERT-Recall | CodeBERT-F1 | CodeBERT-F3 |
|---------------|--------------|----------------|---------|--------------------|-----------------|-------------|-------------|
| GPT 3.5       | 99.65%       | 0.2936         | 0.1371  | 0.7326             | 0.6679          | 0.698       | 0.6736      |
| Instruct-Base | 91.16%       | 0.2487         | 0.1146  | 0.6997             | 0.6473          | 0.6713      | 0.6518      |
| Instruct-FT   | 97.74%       | 0.3264         | 0.3602  | 0.7942             | 0.8178          | 0.8042      | 0.8147      |
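
The full evaluation pipeline (LLM-as-judge prompts, CodeBLEU, CodeBERTScore) is not reproduced in this card. As an assumed sketch of the tooling, ROUGE-L between generated and reference code can be computed with the `evaluate` library as shown below.

```python
# Sketch: ROUGE-L between generated and reference code using the `evaluate` library.
# This is an assumption about tooling, not the card's published evaluation script.
import evaluate

rouge = evaluate.load("rouge")
predictions = ["df['valor'].mean()"]  # model output (example)
references = ["df['valor'].mean()"]   # reference solution (example)
scores = rouge.compute(predictions=predictions, references=references)
print(scores["rougeL"])
```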

**Training Info:**

The following hyperparameters were used during training:

| Parameter                 | Value                    |
|---------------------------|--------------------------|
| learning_rate             | 1e-5                     |
| weight_decay              | 0.0001                   |
| train_batch_size          | 1                        |
| eval_batch_size           | 1                        |
| seed                      | 42                       |
| optimizer                 | Adam (paged_adamw_32bit) |
| lr_scheduler_type         | cosine                   |
| lr_scheduler_warmup_ratio | 0.03                     |
| num_epochs                | 5.0                      |
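
For reference, these values map roughly onto Hugging Face `TrainingArguments` as sketched below. This is a reconstruction from the table, not the original training script, and the output directory is a placeholder.

```python
# Reconstruction of the hyperparameter table as transformers TrainingArguments.
# Illustrative only; this is not the original training script.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="./lloro-finetune",   # placeholder path
    learning_rate=1e-5,
    weight_decay=0.0001,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=1,
    seed=42,
    optim="paged_adamw_32bit",       # paged AdamW optimizer (bitsandbytes)
    lr_scheduler_type="cosine",
    warmup_ratio=0.03,
    num_train_epochs=5.0,
)
```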

**QLoRA hyperparameters**

The following parameters related to Quantized Low-Rank Adaptation (QLoRA) and quantization were used during training:

| Parameter     | Value     |
|---------------|-----------|
| lora_r        | 16        |
| lora_alpha    | 64        |
| lora_dropout  | 0.1       |
| storage_dtype | "nf4"     |
| compute_dtype | "float16" |
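
Expressed with `peft` and `bitsandbytes`, the values above correspond roughly to the configuration sketched below. The `target_modules` choice is an assumption (a common default for Llama-style models) and is not specified in this card.

```python
# QLoRA configuration matching the table above (sketch).
# target_modules is an assumption; it is not specified in this card.
import torch
from transformers import BitsAndBytesConfig
from peft import LoraConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",             # storage_dtype
    bnb_4bit_compute_dtype=torch.float16,  # compute_dtype
)

lora_config = LoraConfig(
    r=16,
    lora_alpha=64,
    lora_dropout=0.1,
    bias="none",
    task_type="CAUSAL_LM",
    target_modules=["q_proj", "v_proj"],   # assumed; adjust to the actual setup
)
```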

**Experiments**

| Model               | Epochs | Overfitting | Final Epochs | Training Hours | CO2 Emission (kg) |
|---------------------|--------|-------------|--------------|----------------|-------------------|
| Code Llama Instruct | 1      | No          | 1            | 8.1            | 1.337             |
| Code Llama Instruct | 5      | Yes         | 3            | 45.6           | 9.12              |

**Framework versions**

| Library      | Version |
|--------------|---------|
| bitsandbytes | 0.40.2  |
| Datasets     | 2.14.3  |
| PyTorch      | 2.0.1   |
| Tokenizers   | 0.14.1  |
| Transformers | 4.34.0  |