Commit 1c077c0 (verified), committed by lucyknada
1 Parent(s): 72e87c1

Update README.md

Files changed (1)
  1. README.md +97 -0
README.md CHANGED
@@ -40,6 +40,103 @@ To create a working GGUF file, make the following adjustments:
 
 These modifications should allow you to use the model with llama.cpp, albeit with the mentioned context limitation.
 
+## axolotl config
+
+<details><summary>See axolotl config</summary>
+
+axolotl version: `0.4.1`
+```yaml
+base_model: IntervitensInc/Llama-3.1-Minitron-4B-Width-Base-chatml
+model_type: AutoModelForCausalLM
+tokenizer_type: AutoTokenizer
+
+load_in_8bit: false
+load_in_4bit: false
+strict: false
+
+datasets:
+  - path: anthracite-org/Gryphe-3.5-16k-Subset
+    type: sharegpt
+    conversation: chatml
+  - path: Epiculous/Synthstruct-Gens-v1-Filtered-n-Cleaned
+    type: sharegpt
+    conversation: chatml
+  - path: anthracite-org/Stheno-Data-Filtered
+    type: sharegpt
+    conversation: chatml
+  - path: Epiculous/SynthRP-Gens-v1-Filtered-n-Cleaned
+    type: sharegpt
+    conversation: chatml
+  - path: lodrick-the-lafted/NopmWritingStruct
+    type: sharegpt
+    conversation: chatml
+  - path: anthracite-org/kalo-opus-instruct-22k-no-refusal
+    type: sharegpt
+    conversation: chatml
+
+chat_template: chatml
+
+val_set_size: 0.01
+output_dir: ./outputs/out
+
+adapter:
+lora_r:
+lora_alpha:
+lora_dropout:
+lora_target_linear:
+
+sequence_len: 16384
+# sequence_len: 32768
+sample_packing: true
+eval_sample_packing: false
+pad_to_sequence_len: true
+
+wandb_project:
+wandb_entity:
+wandb_watch:
+wandb_name:
+wandb_log_model:
+
+gradient_accumulation_steps: 32
+micro_batch_size: 1
+num_epochs: 2
+optimizer: adamw_bnb_8bit
+lr_scheduler: cosine
+learning_rate: 0.00002
+weight_decay: 0.05
+
+train_on_inputs: false
+group_by_length: false
+bf16: auto
+fp16:
+tf32: true
+
+gradient_checkpointing: true
+early_stopping_patience:
+resume_from_checkpoint:
+local_rank:
+logging_steps: 1
+xformers_attention:
+flash_attention: true
+
+warmup_ratio: 0.1
+evals_per_epoch: 4
+eval_table_size:
+eval_max_new_tokens: 128
+saves_per_epoch: 1
+
+debug:
+deepspeed:
+fsdp:
+fsdp_config:
+
+special_tokens:
+  pad_token: <|finetune_right_pad_id|>
+
+```
+
+</details><br>
+
 ## Credits
 
 - [anthracite-org/Stheno-Data-Filtered](https://huggingface.co/datasets/anthracite-org/Stheno-Data-Filtered)
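
A few of the hyperparameters in the config added by this commit interact, so a rough Python sketch of what they imply may help when reading it. The constants are copied from the config. Everything else is an assumption made for illustration: the helper names `tokens_per_optimizer_step` and `learning_rate_at`, the `num_gpus` and `total_steps` parameters (the commit states neither the GPU count nor the dataset size), and the exact schedule shape, which is the usual linear-warmup-then-cosine-decay form and may differ in detail from what axolotl 0.4.1 runs.

```python
import math

# Values copied from the axolotl config in this commit.
SEQUENCE_LEN = 16384
MICRO_BATCH_SIZE = 1
GRADIENT_ACCUMULATION_STEPS = 32
LEARNING_RATE = 0.00002
WARMUP_RATIO = 0.1


def tokens_per_optimizer_step(num_gpus: int = 1) -> int:
    """Upper bound on tokens (including padding) per optimizer step.

    With sample_packing and pad_to_sequence_len enabled, each micro-batch
    row is filled or padded to SEQUENCE_LEN tokens. num_gpus is an
    assumption: the config does not say how many devices were used.
    """
    return SEQUENCE_LEN * MICRO_BATCH_SIZE * GRADIENT_ACCUMULATION_STEPS * num_gpus


def learning_rate_at(step: int, total_steps: int) -> float:
    """Linear warmup to LEARNING_RATE, then cosine decay to zero.

    total_steps is hypothetical; it depends on the packed dataset size.
    Rounding of the warmup length is an assumption as well.
    """
    warmup_steps = int(total_steps * WARMUP_RATIO)
    if step < warmup_steps:
        return LEARNING_RATE * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return LEARNING_RATE * 0.5 * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    print(tokens_per_optimizer_step())       # one GPU assumed
    print(learning_rate_at(50, 1000))        # halfway through warmup
    print(learning_rate_at(100, 1000))       # peak learning rate
```

Under the single-GPU assumption, one optimizer step covers 16384 × 1 × 32 = 524288 token positions. With a hypothetical 1000-step run, the first 100 steps would be warmup and the learning rate would peak at 0.00002 before decaying.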