yuvraj17/EvolCodeLlama-3.1-8B-Instruct


yuvraj17 committed on Aug 28, 2024

Commit 22aeb4c • 1 Parent(s): bdd2978

End of training

Files changed (2)

README.md +224 -0
adapter_model.bin +3 -0

README.md ADDED

	@@ -0,0 +1,224 @@

+---
+base_model: meta-llama/Meta-Llama-3.1-8B-Instruct
+library_name: peft
+license: llama3.1
+tags:
+- axolotl
+- generated_from_trainer
+model-index:
+- name: EvolCodeLlama-3.1-8B-Instruct
+  results: []
+---
+<!-- This model card has been generated automatically according to the information the Trainer had access to. You
+should probably proofread and complete it, then remove this comment. -->
+[<img src="https://raw.githubusercontent.com/axolotl-ai-cloud/axolotl/main/image/axolotl-badge-web.png" alt="Built with Axolotl" width="200" height="32"/>](https://github.com/axolotl-ai-cloud/axolotl)
+<details><summary>See axolotl config</summary>
+axolotl version: `0.4.1`
+```yaml
+base_model: meta-llama/Meta-Llama-3.1-8B-Instruct
+model_type: LlamaForCausalLM
+tokenizer_type: AutoTokenizer
+is_llama_derived_model: true
+hub_model_id: EvolCodeLlama-3.1-8B-Instruct
+load_in_8bit: false
+load_in_4bit: true
+strict: false
+datasets:
+  - path: mlabonne/Evol-Instruct-Python-1k
+    type: alpaca
+dataset_prepared_path: last_run_prepared
+val_set_size: 0.02
+output_dir: ./qlora-out
+adapter: qlora
+lora_model_dir:
+sequence_len: 2048
+sample_packing: true
+lora_r: 32
+lora_alpha: 16
+lora_dropout: 0.05
+lora_target_modules:
+lora_target_linear: true
+lora_fan_in_fan_out:
+wandb_project: axolotl
+wandb_entity:
+wandb_watch:
+wandb_run_id:
+wandb_log_model:
+gradient_accumulation_steps: 4
+micro_batch_size: 2
+num_epochs: 3
+optimizer: paged_adamw_32bit
+lr_scheduler: cosine
+learning_rate: 0.0002
+train_on_inputs: false
+group_by_length: false
+bf16: true
+fp16: false
+tf32: false
+gradient_checkpointing: true
+early_stopping_patience:
+resume_from_checkpoint:
+local_rank:
+logging_steps: 1
+xformers_attention:
+flash_attention: true
+warmup_steps: 100
+eval_steps: 0.01
+save_strategy: epoch
+save_steps:
+debug:
+deepspeed:
+weight_decay: 0.0
+fsdp:
+fsdp_config:
+special_tokens:
+  pad_token: "<|end_of_text|>"
+```
+</details><br>
+# EvolCodeLlama-3.1-8B-Instruct
+This model is a fine-tuned version of [meta-llama/Meta-Llama-3.1-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3.1-8B-Instruct) on the [mlabonne/Evol-Instruct-Python-1k](https://huggingface.co/datasets/mlabonne/Evol-Instruct-Python-1k) dataset.
+It achieves the following results on the evaluation set:
+- Loss: 0.4057
+## Model description
+More information needed
+## Intended uses & limitations
+More information needed
+## Training and evaluation data
+More information needed
+## Training procedure
+### Training hyperparameters
+The following hyperparameters were used during training:
+- learning_rate: 0.0002
+- train_batch_size: 2
+- eval_batch_size: 2
+- seed: 42
+- gradient_accumulation_steps: 4
+- total_train_batch_size: 8
+- optimizer: paged_adamw_32bit (AdamW with betas=(0.9,0.999) and epsilon=1e-08)
+- lr_scheduler_type: cosine
+- lr_scheduler_warmup_steps: 100
+- num_epochs: 3
+### Training results
+| Training Loss | Epoch  | Step | Validation Loss |
+|:-------------:|:------:|:----:|:---------------:|
+| 0.388         | 0.0120 | 1    | 0.4443          |
+| 0.3646        | 0.0359 | 3    | 0.4441          |
+| 0.3216        | 0.0719 | 6    | 0.4439          |
+| 0.3628        | 0.1078 | 9    | 0.4435          |
+| 0.2506        | 0.1437 | 12   | 0.4417          |
+| 0.2855        | 0.1796 | 15   | 0.4379          |
+| 0.2472        | 0.2156 | 18   | 0.4310          |
+| 0.3146        | 0.2515 | 21   | 0.4243          |
+| 0.2829        | 0.2874 | 24   | 0.4185          |
+| 0.2926        | 0.3234 | 27   | 0.4139          |
+| 0.3832        | 0.3593 | 30   | 0.4099          |
+| 0.3           | 0.3952 | 33   | 0.4069          |
+| 0.2759        | 0.4311 | 36   | 0.4051          |
+| 0.341         | 0.4671 | 39   | 0.4017          |
+| 0.2268        | 0.5030 | 42   | 0.3989          |
+| 0.3938        | 0.5389 | 45   | 0.3971          |
+| 0.3478        | 0.5749 | 48   | 0.3951          |
+| 0.2745        | 0.6108 | 51   | 0.3935          |
+| 0.2623        | 0.6467 | 54   | 0.3920          |
+| 0.3743        | 0.6826 | 57   | 0.3903          |
+| 0.3205        | 0.7186 | 60   | 0.3898          |
+| 0.332         | 0.7545 | 63   | 0.3897          |
+| 0.268         | 0.7904 | 66   | 0.3876          |
+| 0.2842        | 0.8263 | 69   | 0.3873          |
+| 0.3677        | 0.8623 | 72   | 0.3868          |
+| 0.212         | 0.8982 | 75   | 0.3857          |
+| 0.2656        | 0.9341 | 78   | 0.3854          |
+| 0.2499        | 0.9701 | 81   | 0.3844          |
+| 0.3512        | 1.0060 | 84   | 0.3850          |
+| 0.3069        | 1.0269 | 87   | 0.3848          |
+| 0.3037        | 1.0629 | 90   | 0.3856          |
+| 0.2785        | 1.0988 | 93   | 0.3864          |
+| 0.206         | 1.1347 | 96   | 0.3873          |
+| 0.3354        | 1.1707 | 99   | 0.3912          |
+| 0.3281        | 1.2066 | 102  | 0.3882          |
+| 0.3452        | 1.2425 | 105  | 0.3849          |
+| 0.3153        | 1.2784 | 108  | 0.3851          |
+| 0.3846        | 1.3144 | 111  | 0.3851          |
+| 0.2847        | 1.3503 | 114  | 0.3842          |
+| 0.3128        | 1.3862 | 117  | 0.3842          |
+| 0.282         | 1.4222 | 120  | 0.3866          |
+| 0.2186        | 1.4581 | 123  | 0.3876          |
+| 0.2122        | 1.4940 | 126  | 0.3862          |
+| 0.2877        | 1.5299 | 129  | 0.3837          |
+| 0.2771        | 1.5659 | 132  | 0.3822          |
+| 0.3518        | 1.6018 | 135  | 0.3820          |
+| 0.302         | 1.6377 | 138  | 0.3829          |
+| 0.2653        | 1.6737 | 141  | 0.3833          |
+| 0.3281        | 1.7096 | 144  | 0.3832          |
+| 0.2933        | 1.7455 | 147  | 0.3821          |
+| 0.1959        | 1.7814 | 150  | 0.3824          |
+| 0.2013        | 1.8174 | 153  | 0.3830          |
+| 0.1909        | 1.8533 | 156  | 0.3824          |
+| 0.2321        | 1.8892 | 159  | 0.3812          |
+| 0.2695        | 1.9251 | 162  | 0.3798          |
+| 0.2516        | 1.9611 | 165  | 0.3796          |
+| 0.2148        | 1.9970 | 168  | 0.3796          |
+| 0.2233        | 2.0180 | 171  | 0.3802          |
+| 0.234         | 2.0539 | 174  | 0.3844          |
+| 0.2615        | 2.0898 | 177  | 0.3938          |
+| 0.1582        | 2.1257 | 180  | 0.4031          |
+| 0.218         | 2.1617 | 183  | 0.4071          |
+| 0.2438        | 2.1976 | 186  | 0.4072          |
+| 0.1822        | 2.2335 | 189  | 0.4050          |
+| 0.2163        | 2.2695 | 192  | 0.4028          |
+| 0.1513        | 2.3054 | 195  | 0.4021          |
+| 0.1898        | 2.3413 | 198  | 0.4031          |
+| 0.1857        | 2.3772 | 201  | 0.4059          |
+| 0.1909        | 2.4132 | 204  | 0.4075          |
+| 0.1119        | 2.4491 | 207  | 0.4092          |
+| 0.1794        | 2.4850 | 210  | 0.4091          |
+| 0.1188        | 2.5210 | 213  | 0.4081          |
+| 0.1525        | 2.5569 | 216  | 0.4073          |
+| 0.1897        | 2.5928 | 219  | 0.4069          |
+| 0.1785        | 2.6287 | 222  | 0.4064          |
+| 0.169         | 2.6647 | 225  | 0.4064          |
+| 0.1518        | 2.7006 | 228  | 0.4060          |
+| 0.1896        | 2.7365 | 231  | 0.4052          |
+| 0.1675        | 2.7725 | 234  | 0.4055          |
+| 0.2193        | 2.8084 | 237  | 0.4055          |
+| 0.1887        | 2.8443 | 240  | 0.4057          |
+| 0.1639        | 2.8802 | 243  | 0.4055          |
+| 0.1701        | 2.9162 | 246  | 0.4058          |
+| 0.2019        | 2.9521 | 249  | 0.4057          |
+### Framework versions
+- PEFT 0.12.0
+- Transformers 4.44.0
+- Pytorch 2.4.0+cu121
+- Datasets 2.20.0
+- Tokenizers 0.19.1
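
A few of the numbers reported in the card can be cross-checked against the axolotl config and the results table. The sketch below is not part of the commit. It assumes a single training device (the card does not state the device count) and the standard LoRA scaling factor `lora_alpha / lora_r`, and it copies only a subset of the validation-loss rows. The names `effective_batch_size`, `lora_scale`, and `best_eval` are illustrative, not taken from axolotl or PEFT.

```python
# Values copied from the axolotl config and the hyperparameter list in the card.
MICRO_BATCH_SIZE = 2
GRADIENT_ACCUMULATION_STEPS = 4
LORA_R = 32
LORA_ALPHA = 16
NUM_DEVICES = 1  # assumption: the card does not state how many devices were used

# (step, validation loss) pairs copied from the training results table.
# Only a subset of rows is listed; it includes the lowest and the last entry.
EVAL_ROWS = [
    (1, 0.4443),
    (42, 0.3989),
    (81, 0.3844),
    (84, 0.3850),
    (126, 0.3862),
    (165, 0.3796),
    (168, 0.3796),
    (171, 0.3802),
    (186, 0.4072),
    (207, 0.4092),
    (249, 0.4057),
]


def effective_batch_size(micro: int, accum: int, devices: int) -> int:
    """Sequences consumed per optimizer step."""
    return micro * accum * devices


def lora_scale(alpha: float, r: int) -> float:
    """Standard LoRA scaling applied to the low-rank update (assumed, not rsLoRA)."""
    return alpha / r


def best_eval(rows):
    """Return the first (step, loss) pair with the lowest validation loss."""
    return min(rows, key=lambda row: row[1])


total_batch = effective_batch_size(
    MICRO_BATCH_SIZE, GRADIENT_ACCUMULATION_STEPS, NUM_DEVICES
)
scale = lora_scale(LORA_ALPHA, LORA_R)
best_step, best_loss = best_eval(EVAL_ROWS)
final_step, final_loss = EVAL_ROWS[-1]

print(f"total_train_batch_size: {total_batch}")
print(f"LoRA scale (alpha / r): {scale}")
print(f"lowest validation loss: {best_loss} at step {best_step}")
print(f"final validation loss:  {final_loss} at step {final_step}")
print(f"gap (final - lowest):   {final_loss - best_loss:.4f}")
```

Under these assumptions the computed batch size matches the card's `total_train_batch_size` of 8. The table's lowest validation loss (0.3796) occurs at step 165, close to the end of the second epoch, while the final reported value is 0.4057. Training loss keeps falling through the third epoch, so the rise in validation loss is consistent with overfitting. If the epoch-2 checkpoint written by `save_strategy: epoch` was kept, it may be worth comparing against the final adapter.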

adapter_model.bin ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:0629cb0f17dd639dbc5a071ba7abc0b7234e8d275d6873339267b721e47c4d93
+size 335706186
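
The three lines added for `adapter_model.bin` are a Git LFS pointer, not the weights themselves: the `oid` is the SHA-256 of the real file and `size` is its length in bytes (about 336 MB). A minimal sketch of checking a downloaded copy against the pointer follows. The helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical, and the local path in the usage note is an assumption.

```python
import hashlib
from pathlib import Path

# Pointer text copied from the diff above.
POINTER_TEXT = """\
version https://git-lfs.github.com/spec/v1
oid sha256:0629cb0f17dd639dbc5a071ba7abc0b7234e8d275d6873339267b721e47c4d93
size 335706186
"""


def parse_lfs_pointer(text: str) -> dict:
    """Parse the 'key value' lines of a Git LFS pointer into a dict."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer: dict) -> bool:
    """Check a local file's size and hash against a parsed pointer."""
    file_path = Path(path)
    if file_path.stat().st_size != pointer["size"]:
        return False
    hasher = hashlib.new(pointer["algo"])
    with file_path.open("rb") as handle:
        # Read in 1 MiB chunks so large weight files are not loaded at once.
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]


pointer = parse_lfs_pointer(POINTER_TEXT)
print(pointer["algo"], pointer["size"])
```

With the file downloaded to, say, `./adapter_model.bin` (an assumed location), `matches_pointer("./adapter_model.bin", pointer)` returns `True` only if both the byte count and the SHA-256 digest agree with the pointer.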