yuvraj17 committed
Commit 37435f9
1 Parent(s): 84e06d9

Update README.md

Files changed (1)
  1. README.md +5 -89
README.md CHANGED
@@ -1,95 +1,11 @@
- ---
- base_model: meta-llama/Meta-Llama-3.1-8B-Instruct
- library_name: peft
- license: llama3.1
- tags:
- - axolotl
- - generated_from_trainer
- model-index:
- - name: EvolCodeLlama-3.1-8B-Instruct
-   results: []
- ---

  <!-- This model card has been generated automatically according to the information the Trainer had access to. You
  should probably proofread and complete it, then remove this comment. -->

- [<img src="https://raw.githubusercontent.com/axolotl-ai-cloud/axolotl/main/image/axolotl-badge-web.png" alt="Built with Axolotl" width="200" height="32"/>](https://github.com/axolotl-ai-cloud/axolotl)
- <details><summary>See axolotl config</summary>
-
- axolotl version: `0.4.1`
- ```yaml
- base_model: meta-llama/Meta-Llama-3.1-8B-Instruct
- model_type: LlamaForCausalLM
- tokenizer_type: AutoTokenizer
- is_llama_derived_model: true
- hub_model_id: EvolCodeLlama-3.1-8B-Instruct
-
- load_in_8bit: false
- load_in_4bit: true
- strict: false
-
- datasets:
-   - path: mlabonne/Evol-Instruct-Python-1k
-     type: alpaca
- dataset_prepared_path: last_run_prepared
- val_set_size: 0.02
- output_dir: ./qlora-out
-
- adapter: qlora
- lora_model_dir:
-
- sequence_len: 2048
- sample_packing: true
-
- lora_r: 32
- lora_alpha: 16
- lora_dropout: 0.05
- lora_target_modules:
- lora_target_linear: true
- lora_fan_in_fan_out:
-
- wandb_project: axolotl
- wandb_entity:
- wandb_watch:
- wandb_run_id:
- wandb_log_model:
-
- gradient_accumulation_steps: 4
- micro_batch_size: 2
- num_epochs: 3
- optimizer: paged_adamw_32bit
- lr_scheduler: cosine
- learning_rate: 0.0002
-
- train_on_inputs: false
- group_by_length: false
- bf16: true
- fp16: false
- tf32: false
-
- gradient_checkpointing: true
- early_stopping_patience:
- resume_from_checkpoint:
- local_rank:
- logging_steps: 1
- xformers_attention:
- flash_attention: true
-
- warmup_steps: 100
- eval_steps: 0.01
- save_strategy: epoch
- save_steps:
- debug:
- deepspeed:
- weight_decay: 0.0
- fsdp:
- fsdp_config:
- special_tokens:
-   pad_token: "<|end_of_text|>"
-
- ```
-
- </details><br>

  # EvolCodeLlama-3.1-8B-Instruct

@@ -99,7 +15,7 @@ It achieves the following results on the evaluation set:

  ## Training:

- It was trained on an **A40** for more than 1 hour with the above-mentioned Axolotl YAML configuration.

  ### Training hyperparameters
 
+ ---
+ license: apache-2.0
+ base_model: meta-llama/Meta-Llama-3.1-8B-Instruct
+ ---

  <!-- This model card has been generated automatically according to the information the Trainer had access to. You
  should probably proofread and complete it, then remove this comment. -->

  # EvolCodeLlama-3.1-8B-Instruct

  ## Training:

+ It was trained on an **A40** for more than 1 hour using Axolotl.

  ### Training hyperparameters
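
For reference, the resulting QLoRA adapter (rank 32, trained in 4-bit on top of `meta-llama/Meta-Llama-3.1-8B-Instruct`) can be loaded with `transformers` and `peft`. This is a minimal sketch, assuming the adapter is published as `yuvraj17/EvolCodeLlama-3.1-8B-Instruct` (inferred from the commit author and the config's `hub_model_id`, not stated in the card); the gated base model also requires an authorized Hugging Face token.

```python
# Minimal sketch: load the QLoRA adapter on top of the Llama 3.1 base model.
# The adapter repo id below is an assumption inferred from the commit author
# and the hub_model_id in the config; it is not confirmed by the card.
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

BASE_ID = "meta-llama/Meta-Llama-3.1-8B-Instruct"
ADAPTER_ID = "yuvraj17/EvolCodeLlama-3.1-8B-Instruct"  # assumed repo id

tokenizer = AutoTokenizer.from_pretrained(BASE_ID)
base = AutoModelForCausalLM.from_pretrained(
    BASE_ID,
    torch_dtype=torch.bfloat16,  # matches bf16: true in the training config
    device_map="auto",
)
model = PeftModel.from_pretrained(base, ADAPTER_ID)

prompt = "Write a Python function that checks whether a number is prime."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Because training used the alpaca dataset format, wrapping requests in an Alpaca-style instruction template may produce better results than a raw prompt.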