---
license: llama2
library_name: peft
tags:
- axolotl
- generated_from_trainer
base_model: codellama/CodeLlama-7b-hf
model-index:
- name: EvilCodeLlama-7b
  results: []
---

[Built with Axolotl](https://github.com/OpenAccess-AI-Collective/axolotl)
See axolotl config

axolotl version: `0.3.0`
```yaml
base_model: codellama/CodeLlama-7b-hf
base_model_config: codellama/CodeLlama-7b-hf
model_type: LlamaForCausalLM
tokenizer_type: LlamaTokenizer
is_llama_derived_model: true
hub_model_id: EvilCodeLlama-7b

load_in_8bit: false
load_in_4bit: true
strict: false

datasets:
  - path: dhuynh95/Magicoder-Evol-Instruct-110K-Filtered_0.35
    type: alpaca
dataset_prepared_path: last_run_prepared
val_set_size: 0.02
output_dir: ./qlora-out-evil-codellama

adapter: qlora
lora_model_dir:

eval_sample_packing: false
sequence_len: 2048
sample_packing: true

lora_r: 32
lora_alpha: 16
lora_dropout: 0.05
lora_target_modules:
lora_target_linear: true
lora_fan_in_fan_out:

wandb_project: axolotl
wandb_entity:
wandb_watch:
wandb_run_id:
wandb_log_model:

gradient_accumulation_steps: 4
micro_batch_size: 16
num_epochs: 1
optimizer: paged_adamw_32bit
lr_scheduler: cosine
learning_rate: 0.0002

train_on_inputs: true
group_by_length: false
bf16: true
fp16: false
tf32: false

gradient_checkpointing: true
early_stopping_patience:
resume_from_checkpoint:
local_rank:
logging_steps: 1
xformers_attention:
flash_attention: true

warmup_steps: 100
eval_steps: 0.01
save_strategy: epoch
save_steps:
debug:
deepspeed:
weight_decay: 0.0
fsdp:
fsdp_config:
special_tokens:
  bos_token: ""
  eos_token: ""
  unk_token: ""
```
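For reference, the QLoRA settings in the config above roughly correspond to the transformers/peft setup sketched below. This is only an illustration, not the exact code Axolotl runs: the `nf4` quantization type, double quantization, and the explicit list of target modules are assumptions (Axolotl resolves `lora_target_linear: true` to all linear layers internally).

```python
# Rough sketch of the quantization + LoRA setup implied by the config above.
# Assumed details are marked in the comments; Axolotl handles all of this itself.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # load_in_4bit: true
    bnb_4bit_compute_dtype=torch.bfloat16,  # bf16: true
    bnb_4bit_quant_type="nf4",              # assumed QLoRA default, not in the config
    bnb_4bit_use_double_quant=True,         # assumed QLoRA default, not in the config
)

model = AutoModelForCausalLM.from_pretrained(
    "codellama/CodeLlama-7b-hf",
    quantization_config=bnb_config,
    device_map="auto",
)
model = prepare_model_for_kbit_training(model)

lora_config = LoraConfig(
    r=32,              # lora_r
    lora_alpha=16,     # lora_alpha
    lora_dropout=0.05, # lora_dropout
    # lora_target_linear: true -> all linear projections of the Llama architecture
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
```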

# EvilCodeLlama-7b

This model is a QLoRA fine-tuned version of [codellama/CodeLlama-7b-hf](https://huggingface.co/codellama/CodeLlama-7b-hf) on the dhuynh95/Magicoder-Evol-Instruct-110K-Filtered_0.35 dataset (see the Axolotl config above).
It achieves the following results on the evaluation set:
- Loss: 1.1701

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.0002
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- gradient_accumulation_steps: 4
- total_train_batch_size: 64
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_steps: 100
- num_epochs: 1

### Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 1.2543        | 0.04  | 1    | 1.2447          |
| 1.2781        | 0.08  | 2    | 1.2445          |
| 1.2677        | 0.12  | 3    | 1.2446          |
| 1.2725        | 0.16  | 4    | 1.2447          |
| 1.2704        | 0.21  | 5    | 1.2440          |
| 1.2572        | 0.25  | 6    | 1.2442          |
| 1.2875        | 0.29  | 7    | 1.2439          |
| 1.2672        | 0.33  | 8    | 1.2434          |
| 1.2601        | 0.37  | 9    | 1.2430          |
| 1.2808        | 0.41  | 10   | 1.2421          |
| 1.2665        | 0.45  | 11   | 1.2411          |
| 1.2572        | 0.49  | 12   | 1.2400          |
| 1.2505        | 0.54  | 13   | 1.2384          |
| 1.264         | 0.58  | 14   | 1.2365          |
| 1.2809        | 0.62  | 15   | 1.2338          |
| 1.2054        | 0.66  | 16   | 1.2308          |
| 1.2732        | 0.7   | 17   | 1.2269          |
| 1.2586        | 0.74  | 18   | 1.2219          |
| 1.2939        | 0.78  | 19   | 1.2161          |
| 1.2713        | 0.82  | 20   | 1.2086          |
| 1.2154        | 0.87  | 21   | 1.2008          |
| 1.213         | 0.91  | 22   | 1.1917          |
| 1.2183        | 0.95  | 23   | 1.1813          |
| 1.1594        | 0.99  | 24   | 1.1701          |

### Framework versions

- PEFT 0.7.2.dev0
- Transformers 4.37.0.dev0
- Pytorch 2.0.1+cu117
- Datasets 2.16.1
- Tokenizers 0.15.0
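## Example usage

The snippet below is a minimal inference sketch, not an official usage recipe: it loads the base model in 4-bit (matching the QLoRA training setup) and attaches this adapter via PEFT. The Alpaca-style prompt is an assumption based on `type: alpaca` in the training config, and the adapter repo id must be prefixed with the owning Hub namespace.

```python
# Minimal sketch: load the 4-bit base model and attach the EvilCodeLlama-7b adapter.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel

base_id = "codellama/CodeLlama-7b-hf"
adapter_id = "EvilCodeLlama-7b"  # hub_model_id from the config; prefix with the owner's namespace

tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(
    base_id,
    quantization_config=BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_compute_dtype=torch.bfloat16,
    ),
    device_map="auto",
)
model = PeftModel.from_pretrained(model, adapter_id)

# Alpaca-style prompt (assumed from `type: alpaca` in the training config).
prompt = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\nWrite a Python function that reverses a string.\n\n"
    "### Response:\n"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    output = model.generate(**inputs, max_new_tokens=256, do_sample=False)
print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```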