Text Generation
Transformers
Safetensors
English
mistral
axolotl
Generated from Trainer
Mistral
instruct
finetune
chatml
gpt4
synthetic data
science
physics
chemistry
biology
math
conversational
Eval Results
Inference Endpoints
text-generation-inference
Weyaxi committed on
Commit ef9d6fd
1 Parent(s): 4c6665d

model card update

Files changed (1):
  1. README.md +114 -38
README.md CHANGED
@@ -1,17 +1,72 @@
  ---
- base_model: alpindale/Mistral-7B-v0.2-hf
  tags:
  - axolotl
  - generated_from_trainer
- model-index:
- - name: Einstein-v6-7B
-   results: []
  ---

- <!-- This model card has been generated automatically according to the information the Trainer had access to. You
- should probably proofread and complete it, then remove this comment. -->

- [<img src="https://raw.githubusercontent.com/OpenAccess-AI-Collective/axolotl/main/image/axolotl-badge-web.png" alt="Built with Axolotl" width="200" height="32"/>](https://github.com/OpenAccess-AI-Collective/axolotl)
  <details><summary>See axolotl config</summary>

  axolotl version: `0.4.0`
@@ -166,53 +221,74 @@ special_tokens:
  unk_token: "<unk>"
  tokens:
  - "<|im_start|>"
-
  ```

  </details><br>

- # Einstein-v6-7B

- This model is a fine-tuned version of [alpindale/Mistral-7B-v0.2-hf](https://huggingface.co/alpindale/Mistral-7B-v0.2-hf) on the None dataset.

- ## Model description

- More information needed

- ## Intended uses & limitations

- More information needed

- ## Training and evaluation data

- More information needed

- ## Training procedure

- ### Training hyperparameters

- The following hyperparameters were used during training:
- - learning_rate: 5e-06
- - train_batch_size: 1
- - eval_batch_size: 1
- - seed: 42
- - distributed_type: multi-GPU
- - num_devices: 9
- - gradient_accumulation_steps: 4
- - total_train_batch_size: 36
- - total_eval_batch_size: 9
- - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- - lr_scheduler_type: cosine
- - lr_scheduler_warmup_steps: 10
- - num_epochs: 2

- ### Training results

- ### Framework versions

- - Transformers 4.38.2
- - Pytorch 2.1.2+cu118
- - Datasets 2.18.0
- - Tokenizers 0.15.0
 
  ---
+ license: other
  tags:
  - axolotl
  - generated_from_trainer
+ - Mistral
+ - instruct
+ - finetune
+ - chatml
+ - gpt4
+ - synthetic data
+ - science
+ - physics
+ - chemistry
+ - biology
+ - math
+ base_model: alpindale/Mistral-7B-v0.2-hf
+ datasets:
+ - allenai/ai2_arc
+ - camel-ai/physics
+ - camel-ai/chemistry
+ - camel-ai/biology
+ - camel-ai/math
+ - metaeval/reclor
+ - openbookqa
+ - mandyyyyii/scibench
+ - derek-thomas/ScienceQA
+ - TIGER-Lab/ScienceEval
+ - jondurbin/airoboros-3.2
+ - LDJnr/Capybara
+ - Cot-Alpaca-GPT4-From-OpenHermes-2.5
+ - STEM-AI-mtl/Electrical-engineering
+ - knowrohit07/saraswati-stem
+ - sablo/oasst2_curated
+ - lmsys/lmsys-chat-1m
+ - TIGER-Lab/MathInstruct
+ - bigbio/med_qa
+ - meta-math/MetaMathQA-40K
+ - openbookqa
+ - piqa
+ - metaeval/reclor
+ - derek-thomas/ScienceQA
+ - scibench
+ - sciq
+ - Open-Orca/SlimOrca
+ - migtissera/Synthia-v1.3
+ - TIGER-Lab/ScienceEval
+ - allenai/WildChat
+ - microsoft/orca-math-word-problems-200k
+ - openchat/openchat_sharegpt4_dataset
+ - teknium/GPTeacher-General-Instruct
+ - m-a-p/CodeFeedback-Filtered-Instruction
+ - totally-not-an-llm/EverythingLM-data-V3
+ - HuggingFaceH4/no_robots
+ - OpenAssistant/oasst_top1_2023-08-25
+ - WizardLM/WizardLM_evol_instruct_70k
+ language:
+ - en
  ---
+ ![image/png](https://cdn-uploads.huggingface.co/production/uploads/6468ce47e134d050a58aa89c/CxDk4KKhQqL-Pg0AMn1gb.png)

+ # 🔬 Einstein-v6-7B
+
+ This model is a fully fine-tuned version of [alpindale/Mistral-7B-v0.2-hf](https://huggingface.co/alpindale/Mistral-7B-v0.2-hf) on diverse datasets.
+
+ This model was fine-tuned on `8xRTX3090` + `1xRTXA6000` GPUs with [axolotl](https://github.com/OpenAccess-AI-Collective/axolotl).
+
+ This model's training was sponsored by [sablo.ai](https://sablo.ai).

  <details><summary>See axolotl config</summary>

  axolotl version: `0.4.0`
 
  unk_token: "<unk>"
  tokens:
  - "<|im_start|>"
  ```

  </details><br>
+ # 💬 Prompt Template
+
+ You can use the following prompt template with this model:
+
+ ### ChatML
+
+ ```
+ <|im_start|>system
+ {system}<|im_end|>
+ <|im_start|>user
+ {user}<|im_end|>
+ <|im_start|>assistant
+ {assistant}<|im_end|>
+ ```
+
+ This prompt template is available as a [chat template](https://huggingface.co/docs/transformers/main/chat_templating), which means you can format messages using the
+ `tokenizer.apply_chat_template()` method:
+
+ ```python
+ messages = [
+     {"role": "system", "content": "You are a helpful AI assistant."},
+     {"role": "user", "content": "Hello!"}
+ ]
+ gen_input = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt")
+ model.generate(**gen_input)
+ ```
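+
+ For an end-to-end run with 🤗 Transformers, a minimal sketch could look like the following (assumptions: the model is published as `Weyaxi/Einstein-v6-7B`, and the generation settings are only examples; adjust the repo id, device, dtype, and sampling parameters to your setup):
+
+ ```python
+ import torch
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+
+ # Repo id assumed from this card's title; replace with a local path or another mirror if needed.
+ model_id = "Weyaxi/Einstein-v6-7B"
+
+ tokenizer = AutoTokenizer.from_pretrained(model_id)
+ model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16, device_map="auto")
+
+ messages = [
+     {"role": "system", "content": "You are a helpful AI assistant."},
+     {"role": "user", "content": "Explain Newton's second law in one sentence."},
+ ]
+
+ # Render the ChatML template shown above and move the input ids to the model's device.
+ input_ids = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to(model.device)
+
+ # Generate, then decode only the newly produced assistant tokens.
+ output = model.generate(input_ids, max_new_tokens=256, do_sample=True, temperature=0.7)
+ print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
+ ```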
+
+ # 🔄 Quantized versions
+
+ ## GGUF [@bartowski](https://huggingface.co/bartowski)
+
+ - https://huggingface.co/bartowski/Einstein-v6-7B-GGUF
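+
+ One possible way to run the GGUF files locally is with `llama-cpp-python`; the sketch below assumes you have downloaded a quant from the repo above, and the filename used here is only a hypothetical example:
+
+ ```python
+ from llama_cpp import Llama
+
+ # Point model_path at a GGUF file downloaded from the repo linked above;
+ # "Einstein-v6-7B-Q4_K_M.gguf" is an example filename, not a guaranteed one.
+ llm = Llama(
+     model_path="Einstein-v6-7B-Q4_K_M.gguf",
+     n_ctx=4096,           # context window
+     chat_format="chatml",  # matches the ChatML prompt template above
+ )
+
+ result = llm.create_chat_completion(
+     messages=[
+         {"role": "system", "content": "You are a helpful AI assistant."},
+         {"role": "user", "content": "Hello!"},
+     ],
+     max_tokens=256,
+ )
+ print(result["choices"][0]["message"]["content"])
+ ```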
+
+ ## ExLlamaV2 [@bartowski](https://huggingface.co/bartowski)
+
+ - https://huggingface.co/bartowski/Einstein-v6-7B-exl2
+
+ # 🎯 [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
+
+ # 🤖 Additional information about training
+
+ This model was fully fine-tuned for 2 epochs.
+
+ Total number of steps was 2412.
+
+ <details><summary>Loss graph</summary>
+
+ ![image/png](https://cdn-uploads.huggingface.co/production/uploads/6468ce47e134d050a58aa89c/WsZYN8JHUy5iqPUa-xSdJ.png)
+
+ </details><br>
+
+ # 🤝 Acknowledgments
+
+ Thanks to [sablo.ai](https://sablo.ai) for sponsoring this model.
+
+ Thanks to all the dataset authors mentioned in the datasets section.
+
+ Thanks to [axolotl](https://github.com/OpenAccess-AI-Collective/axolotl) for providing the repository I used to train this model.
+
+ Thanks to the entire open-source AI community.
+
+ [<img src="https://raw.githubusercontent.com/OpenAccess-AI-Collective/axolotl/main/image/axolotl-badge-web.png" alt="Built with Axolotl" width="200" height="32"/>](https://github.com/OpenAccess-AI-Collective/axolotl)
+
+ If you would like to support me:
+
+ [☕ Buy Me a Coffee](https://www.buymeacoffee.com/weyaxi)