Severian committed on
Commit
29de7f6
1 Parent(s): c3b773a

Update README.md

Files changed (1)
  1. README.md +239 -4
README.md CHANGED
@@ -1,13 +1,248 @@
  ---
- library_name: transformers
- pipeline_tag: text-generation
  license: mit
  datasets:
  - Severian/Internal-Knowledge-Map
  ---

  This model is the second trained on the experimental 'Internal Knowledge Map' dataset. Developed to go beyond ordinary data processing, it is trained with elaborate guidelines to build comprehensive understanding and reasoning across a wide range of knowledge domains. Its reasoning is grounded in a specially curated dataset that emphasizes the interrelations between diverse disciplines, with the aim of synthesizing, integrating, and applying complex information in ways that mimic human abstract reasoning and creative thought.

- At the very core of this model's development is the goal of having LLMs engage in cognitive activity that goes beyond memorization to abstract reasoning, problem-solving, and the generation of new insights. To achieve this, 'Nexus-IKM-Mistral-7B' was fine-tuned for 10 epochs on this unique dataset, which resulted in a model demonstrating greater capability for generating insights and solving problems in complex, multi-disciplinary settings. This includes an improved ability to draw links between different pieces of knowledge, reason through complex scenarios, and propose innovative solutions that cut across domains, including science, technology, environmental studies, and the humanities.
- Test this out and see if you find anything interesting or intriguing. I will keep iterating on new versions, but this one seems like a fun and useful place to start.
  ---
  license: mit
+ library_name: transformers
  datasets:
  - Severian/Internal-Knowledge-Map
+ pipeline_tag: text-generation
  ---

+ # New Fixed Version with extended training available now!
+
+ <img src="https://cdn-uploads.huggingface.co/production/uploads/64740cf7485a7c8e1bd51ac9/GO4MY_3adP2G9EHKZbZpg.webp" width="500" height="500">
+
  This model is the second trained on the experimental 'Internal Knowledge Map' dataset. Developed to go beyond ordinary data processing, it is trained with elaborate guidelines to build comprehensive understanding and reasoning across a wide range of knowledge domains. Its reasoning is grounded in a specially curated dataset that emphasizes the interrelations between diverse disciplines, with the aim of synthesizing, integrating, and applying complex information in ways that mimic human abstract reasoning and creative thought.

+ At the very core of this model's development is the goal of having LLMs engage in cognitive activity that goes beyond memorization to abstract reasoning, problem-solving, and the generation of new insights. To achieve this, 'Nexus-IKM-Mistral-7B' was fine-tuned to convergence at ~15 epochs on this unique dataset, which resulted in a model demonstrating greater capability for generating insights and solving problems in complex, multi-disciplinary settings. This includes an improved ability to draw links between different pieces of knowledge, reason through complex scenarios, and propose innovative solutions that cut across domains, including science, technology, environmental studies, and the humanities.
+
+ Test this out and see if you find anything interesting or intriguing. I will keep iterating on new versions, but this one seems like a fun and useful place to start.
+
+ **If you'd like to train your own version, here is the full notebook to recreate the training on Unsloth yourself (https://colab.research.google.com/drive/1828t77iO2nLRXVfB8HoI11eFu-79-Oe7?usp=sharing). You'll just have to drop the train.jsonl from the dataset repo (https://huggingface.co/datasets/Severian/Internal-Knowledge-Map) into your Colab directory and rename it dataset.jsonl.**
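The rename step can be scripted before launching the notebook; below is a minimal sketch. The one-line JSON record is a hypothetical stand-in — in Colab you would start from the real train.jsonl downloaded from the dataset repo, and only the rename matters:

```python
from pathlib import Path

# Hypothetical stand-in for the real train.jsonl from the dataset repo;
# in Colab the actual downloaded file would already be present.
src = Path("train.jsonl")
src.write_text('{"instruction": "example", "output": "example"}\n')

# The notebook expects the training file to be named dataset.jsonl.
src.rename("dataset.jsonl")
print(Path("dataset.jsonl").exists())  # → True
```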
+
+ ## Training Snapshot
+
+ ```
+ Step  Training Loss
+    1  3.223000
+    2  3.221300
+    3  3.215900
+    4  3.210600
+    5  3.203000
+    6  3.193500
+    7  3.184000
+    8  3.173400
+    9  3.162400
+   10  3.151500
+   11  3.140500
+   12  3.128800
+   13  3.117600
+   14  3.106700
+   15  3.095500
+   16  3.084700
+   17  3.073700
+   18  3.062700
+   19  3.052300
+   20  3.041800
+
+  201  1.273200
+  202  1.257600
+  203  1.241900
+  204  1.226100
+  205  1.210800
+  206  1.195500
+  207  1.180800
+  208  1.166000
+  209  1.151200
+  210  1.136900
+  211  1.122000
+  212  1.106600
+  213  1.091200
+  214  1.075200
+  215  1.059200
+  216  1.042900
+  217  1.026600
+  218  1.010300
+  219  0.994200
+
+  416  0.041700
+  417  0.041700
+  418  0.041600
+  419  0.041600
+  420  0.041600
+  421  0.041600
+  422  0.041500
+  423  0.041500
+  424  0.041500
+  425  0.041400
+  426  0.041400
+  427  0.041400
+  428  0.041400
+  429  0.041300
+  430  0.041300
+  431  0.041300
+  432  0.041200
+  433  0.041200
+  434  0.041200
+  435  0.041100
+  436  0.041200
+  437  0.041100
+  438  0.041100
+  439  0.041100
+  440  0.041000
+  441  0.041000
+  442  0.041000
+  443  0.040900
+  444  0.040900
+  445  0.040900
+
+  668  0.035200
+  669  0.035100
+  670  0.035100
+  671  0.035100
+  672  0.035100
+  673  0.035000
+  674  0.035000
+  675  0.035000
+  676  0.035000
+  677  0.034900
+  678  0.034900
+  679  0.034900
+  680  0.034800
+  681  0.034800
+  682  0.034800
+  683  0.034800
+  684  0.034800
+  685  0.034700
+  686  0.034700
+  687  0.034700
+  688  0.034700
+  689  0.034600
+  690  0.034600
+  691  0.034600
+  692  0.034600
+  693  0.034500
+  694  0.034500
+  695  0.034500
+  696  0.034400
+  697  0.034400
+  698  0.034400
+  699  0.034400
+  700  0.034300
+  701  0.034300
+  702  0.034300
+  703  0.034300
+  704  0.034200
+  705  0.034200
+  706  0.034200
+  707  0.034200
+  708  0.034100
+  709  0.034100
+  710  0.034100
+  711  0.034100
+  712  0.034000
+  713  0.034000
+  714  0.034000
+  715  0.034000
+  716  0.033900
+  717  0.033900
+  718  0.033800
+  719  0.033800
+  720  0.033800
+  721  0.033800
+
+ 1209  0.006600
+ 1210  0.006500
+ 1211  0.006300
+ 1212  0.006200
+ 1213  0.006100
+ 1214  0.006000
+ 1215  0.005800
+ 1216  0.005700
+ 1217  0.005600
+ 1218  0.005500
+ 1219  0.005400
+ 1220  0.005300
+ 1221  0.005100
+ 1222  0.004900
+ 1223  0.004800
+ 1224  0.004700
+ 1225  0.004600
+ 1226  0.004500
+ 1227  0.004400
+ 1228  0.004300
+ 1229  0.004200
+ 1230  0.004000
+ 1231  0.003900
+ 1232  0.003800
+ 1233  0.003700
+ 1234  0.003500
+ 1235  0.003400
+ 1236  0.003300
+ 1237  0.003200
+ 1238  0.003000
+ 1239  0.003000
+ 1240  0.002900
+ 1241  0.002800
+ 1242  0.002700
+ 1243  0.002600
+ 1244  0.002500
+ 1245  0.002400
+ 1246  0.002300
+ 1247  0.002200
+ 1248  0.002100
+ 1249  0.002000
+ 1250  0.001900
+ 1251  0.001800
+ 1252  0.001800
+ 1253  0.001700
+ 1254  0.001600
+ 1255  0.001600
+ 1256  0.001500
+ 1257  0.001400
+ 1258  0.001300
+ 1259  0.001300
+ 1260  0.001200
+ 1261  0.001200
+ 1262  0.001100
+ 1263  0.001100
+ 1264  0.001000
+ 1265  0.001000
+ 1266  0.000900
+ 1267  0.000900
+ 1268  0.000800
+ 1269  0.000800
+ 1270  0.000800
+ 1271  0.000800
+ 1272  0.000700
+ 1273  0.000700
+ 1274  0.000700
+ 1275  0.000600
+ 1276  0.000600
+ 1277  0.000600
+ 1278  0.000600
+ 1279  0.000500
+ 1280  0.000500
+ 1281  0.000500
+ 1282  0.000500
+ 1283  0.000500
+ 1284  0.000500
+ 1285  0.000500
+ 1286  0.000400
+ 1287  0.000400
+ 1288  0.000400
+ 1289  0.000400
+ 1290  0.000400
+ 1291  0.000400
+ 1292  0.000400
+ 1293  0.000400
+ 1294  0.000400
+ 1295  0.000400
+ 1296  0.000400
+ 1297  0.000300
+ 1298  0.000300
+ ```
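As a sanity check, the logged snapshot can be parsed with a few lines of Python to confirm the loss trend across the sampled checkpoints. This is an illustrative sketch over a hand-picked subset of the values above, not part of the training code:

```python
# Parse a sample of the logged (step, loss) pairs from the snapshot
# and check that the loss decreases across the sampled checkpoints.
log = """\
1 3.223000
20 3.041800
219 0.994200
445 0.040900
721 0.033800
1298 0.000300"""

pairs = [(int(step), float(loss))
         for step, loss in (line.split() for line in log.splitlines())]

# Each sampled checkpoint should have strictly lower loss than the last.
assert all(prev > cur for (_, prev), (_, cur) in zip(pairs, pairs[1:]))
print(pairs[-1])  # → (1298, 0.0003)
```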