Weyaxi committed on
Commit
308e24c
1 Parent(s): 30a6ff8

End of training

README.md CHANGED
@@ -1,39 +1,18 @@
  ---
- license: other
  tags:
- - math
- - alpaca
- - synthetic data
- - instruct
  - axolotl
- - finetune
- - gpt4
- datasets:
- - TIGER-Lab/MathInstruct
- - microsoft/orca-math-word-problems-200k
- language:
- - en
- base_model: meta-math/MetaMath-Mistral-7B
  ---
- ![image/png](https://cdn-uploads.huggingface.co/production/uploads/6468ce47e134d050a58aa89c/jsw9mC64I69A_KwX0c6oi.png)
-
- <center><h1>📝 Note 📝</h1></center>
-
- 📢 This model has been trained for only 1 epoch so far; this is a pre-release. The main release will be available in 12 hours.
-
- -------------
-
- # 🔢 Einstein-v6-7B
-
- This model is a full fine-tuned version of [meta-math/MetaMath-Mistral-7B](https://huggingface.co/meta-math/MetaMath-Mistral-7B) on the following datasets:
-
- - 🧮 [TIGER-Lab/MathInstruct](https://huggingface.co/datasets/TIGER-Lab/MathInstruct)
- - 📐 [microsoft/orca-math-word-problems-200k](https://huggingface.co/datasets/microsoft/orca-math-word-problems-200k)
-
- This model was fine-tuned on `8xRTX3090` + `1xRTXA6000` using [axolotl](https://github.com/OpenAccess-AI-Collective/axolotl).
-
- This model's training was sponsored by [sablo.ai](https://sablo.ai).

  <details><summary>See axolotl config</summary>

  axolotl version: `0.4.0`
@@ -113,67 +92,66 @@ special_tokens:
  bos_token: "<s>"
  eos_token: "</s>"
  unk_token: "<unk>"
- ```
-
- </details><br>
-
- # 💬 Prompt Template
-
- You can use the following prompt template with this model:
-
- ### Alpaca
-
- ```
- Below is an instruction that describes a task. Write a response that appropriately completes the request.
-
- ### Instruction:
- {instruction}
-
- ### Response:

  ```
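When formatting prompts manually, `{instruction}` in the template above is filled with plain string formatting. A minimal sketch; the helper name and the sample instruction are illustrative, not from the repository:

```python
# Illustrative only: build an Alpaca-style prompt by hand.
ALPACA_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Response:\n"
)

prompt = ALPACA_TEMPLATE.format(instruction="Solve 3x + 5 = 20 for x.")
print(prompt)
```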

- This prompt template is available as a [chat template](https://huggingface.co/docs/transformers/main/chat_templating), which means you can format messages using the
- `tokenizer.apply_chat_template()` method:
-
- ```python
- messages = [
-     {"role": "system", "content": "You are a helpful AI assistant."},
-     {"role": "user", "content": "Hello!"}
- ]
- gen_input = tokenizer.apply_chat_template(messages, return_tensors="pt")
- model.generate(gen_input)
- ```
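A fully self-contained version of the snippet above, for copy-paste. This is a sketch that assumes the repository id `Weyaxi/EulerMath-Mistral-7B` (taken from this commit's model-index) and enough GPU memory to load the model:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Weyaxi/EulerMath-Mistral-7B"  # assumed repo id, from the model-index in this commit
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

messages = [
    {"role": "system", "content": "You are a helpful AI assistant."},
    {"role": "user", "content": "What is 17 * 23?"},
]
# Tokenize with the chat template and append the assistant turn prefix.
gen_input = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
output = model.generate(gen_input, max_new_tokens=256)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```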
-
- # 🔄 Quantized versions
-
- Quantized versions of this model are not yet available. They will be available soon :)
-
- # 🎯 [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
-
- # 🤖 Additional information about training
-
- This model was full fine-tuned for 2 epochs.
-
- The total number of steps was 544.
-
- <details><summary>Loss graph</summary>
-
- </details><br>
-
- # 🤝 Acknowledgments
-
- Thanks to [sablo.ai](https://sablo.ai) for sponsoring this model.
-
- Thanks to all the dataset authors mentioned in the datasets section.
-
- Thanks to [axolotl](https://github.com/OpenAccess-AI-Collective/axolotl) for providing the repository I used to train this model.
-
- Thanks to the entire open-source AI community.
-
- [<img src="https://raw.githubusercontent.com/OpenAccess-AI-Collective/axolotl/main/image/axolotl-badge-web.png" alt="Built with Axolotl" width="200" height="32"/>](https://github.com/OpenAccess-AI-Collective/axolotl)
-
- If you would like to support me:
-
- [☕ Buy Me a Coffee](https://www.buymeacoffee.com/weyaxi)
  ---
+ license: apache-2.0
+ base_model: meta-math/MetaMath-Mistral-7B
  tags:
  - axolotl
+ - generated_from_trainer
+ model-index:
+ - name: EulerMath-Mistral-7B
+   results: []
  ---

+ <!-- This model card has been generated automatically according to the information the Trainer had access to. You
+ should probably proofread and complete it, then remove this comment. -->

+ [<img src="https://raw.githubusercontent.com/OpenAccess-AI-Collective/axolotl/main/image/axolotl-badge-web.png" alt="Built with Axolotl" width="200" height="32"/>](https://github.com/OpenAccess-AI-Collective/axolotl)
  <details><summary>See axolotl config</summary>

  axolotl version: `0.4.0`

  bos_token: "<s>"
  eos_token: "</s>"
  unk_token: "<unk>"

  ```

+ </details><br>

+ # EulerMath-Mistral-7B

+ This model is a fine-tuned version of [meta-math/MetaMath-Mistral-7B](https://huggingface.co/meta-math/MetaMath-Mistral-7B) on the [TIGER-Lab/MathInstruct](https://huggingface.co/datasets/TIGER-Lab/MathInstruct) and [microsoft/orca-math-word-problems-200k](https://huggingface.co/datasets/microsoft/orca-math-word-problems-200k) datasets.
+ It achieves the following results on the evaluation set:
+ - Loss: 0.1956

+ ## Model description

+ More information needed

+ ## Intended uses & limitations

+ More information needed

+ ## Training and evaluation data

+ More information needed

+ ## Training procedure

+ ### Training hyperparameters

+ The following hyperparameters were used during training (the batch-size arithmetic is spelled out after the list):
+ - learning_rate: 5e-06
+ - train_batch_size: 2
+ - eval_batch_size: 2
+ - seed: 42
+ - distributed_type: multi-GPU
+ - num_devices: 9
+ - gradient_accumulation_steps: 4
+ - total_train_batch_size: 72
+ - total_eval_batch_size: 18
+ - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+ - lr_scheduler_type: cosine
+ - lr_scheduler_warmup_steps: 10
+ - num_epochs: 2
 
137
+ ### Training results
138
 
139
+ | Training Loss | Epoch | Step | Validation Loss |
140
+ |:-------------:|:-----:|:----:|:---------------:|
141
+ | 0.707 | 0.0 | 1 | 0.9061 |
142
+ | 0.3011 | 0.25 | 68 | 0.3263 |
143
+ | 0.2585 | 0.5 | 136 | 0.2836 |
144
+ | 0.2352 | 0.75 | 204 | 0.2544 |
145
+ | 0.2192 | 1.0 | 272 | 0.2268 |
146
+ | 0.1527 | 1.23 | 340 | 0.2144 |
147
+ | 0.1452 | 1.48 | 408 | 0.2032 |
148
+ | 0.144 | 1.73 | 476 | 0.1970 |
149
+ | 0.1441 | 1.98 | 544 | 0.1956 |
150
 
 
151
 
152
+ ### Framework versions
153
 
154
+ - Transformers 4.38.2
155
+ - Pytorch 2.1.2+cu118
156
+ - Datasets 2.18.0
157
+ - Tokenizers 0.15.0
generation_config.json ADDED
@@ -0,0 +1,7 @@
+ {
+   "_from_model_config": true,
+   "bos_token_id": 1,
+   "do_sample": true,
+   "eos_token_id": 2,
+   "transformers_version": "4.38.2"
+ }
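This file makes sampling the default decoding mode for the checkpoint. A quick way to confirm what was shipped; a sketch, again assuming the `Weyaxi/EulerMath-Mistral-7B` repo id:

```python
from transformers import GenerationConfig

# Loads the generation_config.json added in this commit.
gen_config = GenerationConfig.from_pretrained("Weyaxi/EulerMath-Mistral-7B")  # assumed repo id
print(gen_config.do_sample)     # True: generate() samples unless told otherwise
print(gen_config.bos_token_id)  # 1
print(gen_config.eos_token_id)  # 2
```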
model-00001-of-00003.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:fb7ddd132c950151879ee704033773a1c08f22fedfbe2459a71cf1304378ddad
+ oid sha256:d3e6645954961b8991f249065609b6491bf175453e49211f0ca8ee2fbf8ffeb7
  size 4943170528
model-00002-of-00003.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:254fae62a9850c1250d558ce0c0a152cbf3843311738cf4ef96d0b9eb71c8ba0
+ oid sha256:445c2dd56bda6dbe8914dcc5f16947ac46290e9d906f8566f9c0867481212964
  size 4999819336
model-00003-of-00003.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:e5b4497b7b6358ed1de5f189caf947738698ebcf00c3dec230c973c0552e5d86
+ oid sha256:be5900b554d420f18e739a39543dc322439881329fbd19177f398f008c1e3a31
  size 4540524536
model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:e322fb0a41c22afa338151a84fd9ec7c850cb8bbaf07519a6d94ef22b0f3b433
+ size 539576