Weyaxi committed on
Commit e4bec07
1 Parent(s): c2b97ce

add things I already know pre model card 2

Files changed (1)
  1. README.md +75 -1
README.md CHANGED
@@ -32,7 +32,81 @@ This model's training was sponsored by [sablo.ai](https://sablo.ai).
 
 axolotl version: `0.4.0`
 ```yaml
+base_model: meta-math/MetaMath-Mistral-7B
+model_type: MistralForCausalLM
+tokenizer_type: LlamaTokenizer
+is_mistral_derived_model: true
+
+load_in_8bit: false
+load_in_4bit: false
+strict: false
 
+chat_template: alpaca
+datasets:
+  - path: microsoft/orca-math-word-problems-200k
+    type: alpaca_chat.load_qa
+    conversation: alpaca
+
+  - path: TIGER-Lab/MathInstruct
+    type: alpaca
+    conversation: alpaca
+
+dataset_prepared_path: last_run_prepared
+val_set_size: 0.005
+#val_set_size: 0.0
+
+output_dir: ./EulerMath-Mistral-7B-model
+
+sequence_len: 8192
+sample_packing: true
+pad_to_sequence_len: true
+eval_sample_packing: false
+
+wandb_project: Euler
+wandb_entity:
+wandb_watch:
+wandb_name:
+wandb_log_model:
+hub_model_id: Weyaxi/EulerMath-Mistral-7B
+
+save_safetensors: true
+
+gradient_accumulation_steps: 4
+micro_batch_size: 2 # changed
+num_epochs: 2
+optimizer: adamw_bnb_8bit
+lr_scheduler: cosine
+learning_rate: 0.000005
+
+train_on_inputs: false
+group_by_length: false
+bf16: true
+fp16: false
+tf32: false
+
+gradient_checkpointing: true
+early_stopping_patience:
+resume_from_checkpoint:
+local_rank:
+logging_steps: 1
+xformers_attention:
+flash_attention: true
+
+warmup_steps: 10
+evals_per_epoch: 4 # changed
+eval_table_size:
+eval_table_max_new_tokens: 128
+saves_per_epoch: 1 # changed
+debug:
+
+deepspeed: zero3_bf16.json
+weight_decay: 0.0
+fsdp:
+fsdp_config:
+special_tokens:
+  bos_token: "<s>"
+  eos_token: "</s>"
+  unk_token: "<unk>"
 ```
 
  </details><br>
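
Both datasets in the config above are mapped to the Alpaca prompt format (`chat_template: alpaca`, dataset types `alpaca` and `alpaca_chat.load_qa`). Below is a minimal sketch of roughly what one rendered training sample would look like; it assumes the stock Alpaca instruction template and the `question`/`answer` fields of `microsoft/orca-math-word-problems-200k`, and it is not axolotl's actual loader code.

```python
# Illustrative sketch only -- not axolotl's implementation.
# The "question"/"answer" field names for the orca-math dataset
# and the exact template wording are assumptions.
ALPACA_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Response:\n{response}"
)

def render_alpaca(question: str, answer: str) -> str:
    """Format one question/answer pair as a single Alpaca-style training text."""
    return ALPACA_TEMPLATE.format(instruction=question, response=answer)

example = {
    "question": "A basket holds 5 apples. How many apples are in 3 baskets?",
    "answer": "3 baskets x 5 apples per basket = 15 apples.",
}
print(render_alpaca(example["question"], example["answer"]))
```

With `train_on_inputs: false`, only the response portion of such a prompt would contribute to the training loss.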
 
@@ -75,7 +149,7 @@ Quantizationed versions of this model is currently not available. It will be ava
 
 This model is full fine-tuned for 2 epoch.
 
-Total number of steps was x.
+Total number of steps was 544.
 
 <details><summary>Loss graph</summary>
 
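
The reported total of 544 optimizer steps follows the usual relation between packed samples, effective batch size, and epochs. A rough back-of-the-envelope sketch; the GPU count and the resulting packed-sample count below are assumptions for illustration, not values stated in the commit.

```python
import math

# Values taken from the config above.
micro_batch_size = 2
gradient_accumulation_steps = 4
num_epochs = 2

# ASSUMPTION: number of GPUs used for training (not stated in the commit).
world_size = 8

# Effective (global) batch size per optimizer step.
effective_batch = micro_batch_size * gradient_accumulation_steps * world_size  # 64

# Working backwards from the reported 544 steps:
steps_per_epoch = 544 // num_epochs                            # 272
packed_samples_per_epoch = steps_per_epoch * effective_batch   # packed 8192-token sequences

# Forward direction, given a packed-sample count per epoch:
total_steps = num_epochs * math.ceil(packed_samples_per_epoch / effective_batch)
print(effective_batch, steps_per_epoch, packed_samples_per_epoch, total_steps)
# -> 64 272 17408 544 (under the assumed world_size)
```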