Text Generation
Transformers
Safetensors
English
mistral
axolotl
Generated from Trainer
Mistral
instruct
finetune
chatml
gpt4
synthetic data
science
physics
chemistry
biology
math
conversational
Eval Results
Inference Endpoints
text-generation-inference
Weyaxi committed on
Commit
d440a1d
1 Parent(s): 4fb8dd7

Update README.md

Files changed (1)
  1. README.md +4 -210
README.md CHANGED
@@ -1,218 +1,12 @@
  ---
  base_model: alpindale/Mistral-7B-v0.2-hf
- tags:
- - axolotl
- - generated_from_trainer
  model-index:
  - name: Einstein-v6-7B
    results: []
  ---
 
- <!-- This model card has been generated automatically according to the information the Trainer had access to. You
- should probably proofread and complete it, then remove this comment. -->
 
- [<img src="https://raw.githubusercontent.com/OpenAccess-AI-Collective/axolotl/main/image/axolotl-badge-web.png" alt="Built with Axolotl" width="200" height="32"/>](https://github.com/OpenAccess-AI-Collective/axolotl)
- <details><summary>See axolotl config</summary>
-
- axolotl version: `0.4.0`
- ```yaml
- base_model: alpindale/Mistral-7B-v0.2-hf
- model_type: MistralForCausalLM
- tokenizer_type: LlamaTokenizer
- is_mistral_derived_model: true
-
- load_in_8bit: false
- load_in_4bit: false
- strict: false
-
- chat_template: chatml
- datasets:
-   - path: data/merged_all.json
-     ds_type: json
-     type: alpaca
-     conversation: chatml
-
-   - path: data/gpteacher-instruct-special-alpaca.json
-     ds_type: json
-     type: gpteacher
-     conversation: chatml
-
-   - path: data/wizardlm_evol_instruct_70k_random_half.json
-     ds_type: json
-     type: alpaca
-     conversation: chatml
-
-   - path: data/capybara_sharegpt.json
-     ds_type: json
-     type: sharegpt
-     conversation: chatml
-
-   - path: data/synthia-v1.3_sharegpt_12500.json
-     ds_type: json
-     type: sharegpt
-     conversation: chatml
-
-   - path: data/cot_alpaca_gpt4_extracted_openhermes_2.5_sharegpt.json
-     ds_type: json
-     type: sharegpt
-     conversation: chatml
-
-   - path: data/slimorca_dedup_filtered_95k_sharegpt.json
-     ds_type: json
-     type: sharegpt
-     conversation: chatml
-
-   - path: data/airoboros_3.2_without_contextual_slimorca_orca_sharegpt.json
-     ds_type: json
-     type: sharegpt
-     conversation: chatml
-
-   - path: data/allenai_wild_chat_gpt4_english_toxic_random_half_4k_sharegpt.json
-     ds_type: json
-     type: sharegpt
-     strict: false
-     conversation: chatml
-
-   - path: data/pippa_bagel_repo_3k_sharegpt.json
-     ds_type: json
-     type: sharegpt
-     conversation: chatml
-
-   - path: data/gpt4_data_lmys_1m_sharegpt.json
-     ds_type: json
-     type: sharegpt
-     conversation: chatml
-
-   - path: data/sharegpt_gpt4_english.json
-     ds_type: json
-     type: sharegpt
-     conversation: chatml
-
-   - path: data/no_robots_sharegpt.json
-     ds_type: json
-     type: sharegpt
-     strict: false
-     conversation: chatml
-
-   - path: data/oasst_top1_from_fusechatmixture_sharegpt.json
-     ds_type: json
-     type: sharegpt
-     strict: false
-     conversation: chatml
-
-   - path: data/everythinglm-data-v3_sharegpt.json
-     ds_type: json
-     type: sharegpt
-     strict: false
-     conversation: chatml
-
- dataset_prepared_path: last_run_prepared
- # val_set_size: 0.005
- val_set_size: 0.0
-
- do_bench_eval: true
-
- output_dir: ./Einstein-v6-7B-model
-
- sequence_len: 8192
- sample_packing: true
- pad_to_sequence_len: true
- eval_sample_packing: false
-
- wandb_project: Einstein
- wandb_entity:
- wandb_watch:
- wandb_name:
- wandb_log_model:
- hub_model_id: Weyaxi/Einstein-v6-7B
-
- save_safetensors: true
-
- gradient_accumulation_steps: 4
- micro_batch_size: 1
- num_epochs: 2
- optimizer: adamw_bnb_8bit
- lr_scheduler: cosine
- learning_rate: 0.000005
-
- train_on_inputs: false
- group_by_length: false
- bf16: true
- fp16: false
- tf32: false
-
- gradient_checkpointing: true
- early_stopping_patience:
- resume_from_checkpoint:
- local_rank:
- logging_steps: 1
- xformers_attention:
- flash_attention: true
-
- warmup_steps: 10
- evals_per_epoch: 3 # changed
- eval_table_size:
- eval_table_max_new_tokens: 128
- saves_per_epoch: 2 # changed
- debug:
-
- deepspeed: zero3_bf16.json
- weight_decay: 0.0
- fsdp:
- fsdp_config:
- special_tokens:
-   bos_token: "<s>"
-   eos_token: "<|im_end|>"
-   unk_token: "<unk>"
- tokens:
-   - "<|im_start|>"
-
- ```
-
- </details><br>
-
- # Einstein-v6-7B
-
- This model is a fine-tuned version of [alpindale/Mistral-7B-v0.2-hf](https://huggingface.co/alpindale/Mistral-7B-v0.2-hf) on the None dataset.
-
- ## Model description
-
- More information needed
-
- ## Intended uses & limitations
-
- More information needed
-
- ## Training and evaluation data
-
- More information needed
-
- ## Training procedure
-
- ### Training hyperparameters
-
- The following hyperparameters were used during training:
- - learning_rate: 5e-06
- - train_batch_size: 1
- - eval_batch_size: 1
- - seed: 42
- - distributed_type: multi-GPU
- - num_devices: 9
- - gradient_accumulation_steps: 4
- - total_train_batch_size: 36
- - total_eval_batch_size: 9
- - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- - lr_scheduler_type: cosine
- - lr_scheduler_warmup_steps: 10
- - num_epochs: 2
-
- ### Training results
-
-
-
- ### Framework versions
-
- - Transformers 4.38.2
- - Pytorch 2.1.2+cu118
- - Datasets 2.18.0
- - Tokenizers 0.15.0
  ---
  base_model: alpindale/Mistral-7B-v0.2-hf
  model-index:
  - name: Einstein-v6-7B
    results: []
+ license: other
  ---
 
+ ![image/png](https://cdn-uploads.huggingface.co/production/uploads/6468ce47e134d050a58aa89c/L47ZiS4p18ZGU3Dybecip.png)
+ Checkpoints of [Weyaxi/Einstein-v6-7B](https://huggingface.co/Weyaxi/Einstein-v6-7B). Head to the main model for more information :)
 
+ https://huggingface.co/Weyaxi/Einstein-v6-7B
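One detail worth keeping from the removed card: its batch-size figures are self-consistent. The effective train batch size is the per-device micro batch size times the gradient accumulation steps times the device count. A quick check in Python (variable names mirror the config keys; every number comes from the diff above):

```python
# Effective batch sizes implied by the removed training hyperparameters.
micro_batch_size = 1             # micro_batch_size: 1 (per device)
gradient_accumulation_steps = 4  # gradient_accumulation_steps: 4
num_devices = 9                  # distributed_type: multi-GPU, num_devices: 9

total_train_batch_size = micro_batch_size * gradient_accumulation_steps * num_devices
total_eval_batch_size = micro_batch_size * num_devices  # no accumulation at eval

print(total_train_batch_size)  # 36, matching total_train_batch_size: 36
print(total_eval_batch_size)   # 9, matching total_eval_batch_size: 9
```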
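The removed config also documents the prompt format: ChatML, with `<|im_end|>` as the end-of-turn token and `<|im_start|>` added to the vocabulary. A minimal usage sketch for these checkpoints with `transformers`, assuming the uploaded tokenizer ships that ChatML chat template; the example conversation and generation settings are illustrative, not from the card:

```python
# Minimal sketch: prompt the checkpoint with the ChatML template.
# Requires transformers (and accelerate for device_map="auto").
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Weyaxi/Einstein-v6-7B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)

# Illustrative conversation; roles follow the ChatML convention.
messages = [
    {"role": "system", "content": "You are a helpful science assistant."},
    {"role": "user", "content": "State Newton's second law."},
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# <|im_end|> closes each ChatML turn, so stop generation on it.
output = model.generate(
    input_ids,
    max_new_tokens=256,
    eos_token_id=tokenizer.convert_tokens_to_ids("<|im_end|>"),
)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```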