Files changed (1)
README.md +161 -55
README.md CHANGED
@@ -1,9 +1,115 @@
  ---
  license: apache-2.0
  base_model: mistralai/Mistral-7B-Instruct-v0.2
+ datasets:
+ - CohereForAI/aya_dataset
  tags:
  - axolotl
+ - mistral
+ - 7b
  - generated_from_trainer
+ language:
+ - afr
+ - amh
+ - ara
+ - aze
+ - bel
+ - ben
+ - bul
+ - cat
+ - ceb
+ - ces
+ - cym
+ - dan
+ - deu
+ - ell
+ - eng
+ - epo
+ - est
+ - eus
+ - fin
+ - fil
+ - fra
+ - fry
+ - gla
+ - gle
+ - glg
+ - guj
+ - hat
+ - hau
+ - heb
+ - hin
+ - hun
+ - hye
+ - ibo
+ - ind
+ - isl
+ - ita
+ - jav
+ - jpn
+ - kan
+ - kat
+ - kaz
+ - khm
+ - kir
+ - kor
+ - kur
+ - lao
+ - lav
+ - lat
+ - lit
+ - ltz
+ - mal
+ - mar
+ - mkd
+ - mlg
+ - mlt
+ - mon
+ - mri
+ - msa
+ - mya
+ - nep
+ - nld
+ - nor
+ - nso
+ - nya
+ - ory
+ - pan
+ - pes
+ - pol
+ - por
+ - pus
+ - ron
+ - rus
+ - sin
+ - slk
+ - slv
+ - smo
+ - sna
+ - snd
+ - som
+ - sot
+ - spa
+ - sqi
+ - srp
+ - sun
+ - swa
+ - swe
+ - tam
+ - tel
+ - tgk
+ - tha
+ - tur
+ - twi
+ - ukr
+ - urd
+ - uzb
+ - vie
+ - xho
+ - yid
+ - yor
+ - zho
+ - zul
  model-index:
  - name: Mistral-7B-Instruct-KhanAcademy-v0.2
    results: []
@@ -12,6 +118,60 @@ model-index:
  <!-- This model card has been generated automatically according to the information the Trainer had access to. You
  should probably proofread and complete it, then remove this comment. -->

+ # Mistral-7B-Instruct-KhanAcademy-v0.2
+
+ This model is a fine-tuned version of [mistralai/Mistral-7B-Instruct-v0.2](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2) on the None dataset.
+ It achieves the following results on the evaluation set:
+ - Loss: 1.1502
+
+ ## Model description
+
+ More information needed
+
+ ## Intended uses & limitations
+
+ More information needed
+
+ ## Training and evaluation data
+
+ More information needed
+
+ ## Training procedure
+
+ ### Training hyperparameters
+
+ The following hyperparameters were used during training:
+ - learning_rate: 5e-06
+ - train_batch_size: 2
+ - eval_batch_size: 2
+ - seed: 42
+ - distributed_type: multi-GPU
+ - num_devices: 4
+ - gradient_accumulation_steps: 4
+ - total_train_batch_size: 32
+ - total_eval_batch_size: 8
+ - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+ - lr_scheduler_type: cosine
+ - lr_scheduler_warmup_steps: 10
+ - num_epochs: 1
+
+ ### Training results
+
+ | Training Loss | Epoch | Step | Validation Loss |
+ |:-------------:|:-----:|:----:|:---------------:|
+ | 1.9039 | 0.0 | 1 | 3.1495 |
+ | 0.9933 | 0.25 | 101 | 1.2402 |
+ | 0.9439 | 0.5 | 202 | 1.1683 |
+ | 0.9762 | 0.75 | 303 | 1.1502 |
+
+
+ ### Framework versions
+
+ - Transformers 4.39.0.dev0
+ - Pytorch 2.2.0+cu121
+ - Datasets 2.17.0
+ - Tokenizers 0.15.0
+
  [<img src="https://raw.githubusercontent.com/OpenAccess-AI-Collective/axolotl/main/image/axolotl-badge-web.png" alt="Built with Axolotl" width="200" height="32"/>](https://github.com/OpenAccess-AI-Collective/axolotl)
  <details><summary>See axolotl config</summary>

@@ -91,58 +251,4 @@ special_tokens:
    unk_token: "<unk>"
  ```

- </details><br>
-
- # Mistral-7B-Instruct-KhanAcademy-v0.2
-
- This model is a fine-tuned version of [mistralai/Mistral-7B-Instruct-v0.2](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2) on the None dataset.
- It achieves the following results on the evaluation set:
- - Loss: 1.1502
-
- ## Model description
-
- More information needed
-
- ## Intended uses & limitations
-
- More information needed
-
- ## Training and evaluation data
-
- More information needed
-
- ## Training procedure
-
- ### Training hyperparameters
-
- The following hyperparameters were used during training:
- - learning_rate: 5e-06
- - train_batch_size: 2
- - eval_batch_size: 2
- - seed: 42
- - distributed_type: multi-GPU
- - num_devices: 4
- - gradient_accumulation_steps: 4
- - total_train_batch_size: 32
- - total_eval_batch_size: 8
- - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- - lr_scheduler_type: cosine
- - lr_scheduler_warmup_steps: 10
- - num_epochs: 1
-
- ### Training results
-
- | Training Loss | Epoch | Step | Validation Loss |
- |:-------------:|:-----:|:----:|:---------------:|
- | 1.9039 | 0.0 | 1 | 3.1495 |
- | 0.9933 | 0.25 | 101 | 1.2402 |
- | 0.9439 | 0.5 | 202 | 1.1683 |
- | 0.9762 | 0.75 | 303 | 1.1502 |
-
-
- ### Framework versions
-
- - Transformers 4.39.0.dev0
- - Pytorch 2.2.0+cu121
- - Datasets 2.17.0
- - Tokenizers 0.15.0
+ </details><br>
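
The card names [mistralai/Mistral-7B-Instruct-v0.2](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2) as the base model but does not state the Hub repo id under which this fine-tune is published, so the following is only a minimal inference sketch with `transformers`: the `your-namespace/...` id and the prompt are placeholders, and it assumes the fine-tune keeps the base model's `[INST] ... [/INST]` chat template.

```python
# Minimal inference sketch. "your-namespace/Mistral-7B-Instruct-KhanAcademy-v0.2" is a
# placeholder repo id -- substitute the actual Hub path of this fine-tune.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "your-namespace/Mistral-7B-Instruct-KhanAcademy-v0.2"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# Assumes the tokenizer ships the Mistral-Instruct chat template.
messages = [{"role": "user", "content": "Explain the Pythagorean theorem to a ten-year-old."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
output_ids = model.generate(input_ids, max_new_tokens=256, do_sample=True, temperature=0.7)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```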
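
On the hyperparameters reported above: the effective training batch size is per-device batch 2 × 4 GPUs × 4 gradient-accumulation steps = 32, which is where `total_train_batch_size: 32` comes from. Training was driven by Axolotl, so the sketch below is only an approximate restatement of those numbers as stock Hugging Face `TrainingArguments`, not the actual launch configuration; the output directory is a placeholder.

```python
# Approximate restatement of the reported hyperparameters as Hugging Face TrainingArguments.
# This is a sketch for orientation, not the Axolotl config actually used for the run.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="mistral-7b-khanacademy",   # placeholder path
    learning_rate=5e-6,
    per_device_train_batch_size=2,         # 2 per device x 4 GPUs x 4 accum steps = 32 effective
    per_device_eval_batch_size=2,
    gradient_accumulation_steps=4,
    num_train_epochs=1,
    lr_scheduler_type="cosine",
    warmup_steps=10,
    adam_beta1=0.9,                        # Adam betas=(0.9, 0.999), epsilon=1e-08
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    seed=42,
)
```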
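
The updated front matter points at `CohereForAI/aya_dataset` and lists roughly a hundred language codes, while the generated card body still reads "on the None dataset", so treat the dataset reference as metadata rather than a confirmed description of the training data. A minimal sketch for inspecting that dataset with the `datasets` library:

```python
# Load the dataset referenced in the card's front matter and inspect its schema.
from datasets import load_dataset

aya = load_dataset("CohereForAI/aya_dataset", split="train")
print(aya.column_names)  # inspect field names rather than assuming them
print(aya[0])            # one example record
```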