---
language:
- en
license: mit
tags:
- text-generation-inference
- transformers
- unsloth
- mixture of experts
- jamba
datasets:
- Severian/Internal-Knowledge-Map
base_model: ai21labs/Jamba-v0.1
pipeline_tag: text-generation
---
<img src="https://cdn-uploads.huggingface.co/production/uploads/64740cf7485a7c8e1bd51ac9/SwdXRoyi08neRiI8pJrYI.webp" width="500" height="500">
# Jamba-Nexus-IKM-v1
## This model has been trained for 6.3 Epochs (2hrs/~3700 Steps) using Unsloth on the Internal Knowledge Map dataset.
---
### *It works! It just MASSIVELY overfit on the dataset. Jamba has been finicky to train, but I think a light rank and a deep alpha might be the trick. The newest version should be up today/tomorrow and hopefully won't overfit.*
*Example output:* "<|startoftext|>How could we use cheese to reignite the sun? Answer: node core, and of for context. - 'Internal Thought Processes and Mapping': Infer the logic or methodology behind formulating responses, drawing on the dataset's structure. - 'Cluster Integration and Interlock Mechanisms': Apply guidelines for tagging, linking, and navigating between clusters, ensuring coherent and scalable analysis. - **Contextual Backgrounds**: Engage with historical, cultural, or situational contexts to provide depth. - **Methodological Approaches**: Deduce reasoning or algorithms behind responses, leveraging computational methods or theoretical frameworks. - **Related Concepts and Terminology**: Explore key terms and concepts, enhancing understanding and connections. - **Real-World Applications**: Identify practical applications or impacts, bridging theory with practicality. - **Real-World Applications**: Identify practical applications or impacts, bridging theory with practicality. - **Real-World Applications**: Identify practical applications or impacts, bridging theory with practicality. - **Real-World Applications**: Identify practical applications or impacts, bridging theory with practicality. - **Real-World Applications**: Identify practical applications or impacts, bridging theory with practicality. - **Real-World Applications**: Identify practical applications or impacts, bridging theory with practicality. - **Real-World Applications**: Identify practical applications or impacts, bridging theory with practicality. - **Real-World Applications**: Identify practical applications or impacts, bridging theory with practicality. - **Real-World Applications**: Identify practical applications or impacts, bridging theory with practicality. - **Real-World Applications**: Identify practical applications or impacts, bridging theory with practicality. - **Real-World Applications**: Identify practical applications or impacts, bridging theory with practicality. - **Real-World Applications**: Identify practical applications or impacts, bridging theory with practicality. - **Real-World Applications**: Identify practical applications or impacts, bridging theory with practicality. - **Real-World Applications**: Identify practical applications or impacts, bridging theory with practicality. - **Real-World Applications**: Identify practical applications or impacts, bridging theory with practicality. - **Real-World Applications**: Identify practical applications or impacts, bridging theory with practicality. - **Real-World Applications**: Identify practical applications or impacts, bridging theory with practicality. - **Real-World Applications**: Identify practical applications or impacts, bridging theory with practicality. - **Real-World Applications**: Identify practical applications or impacts, bridging theory with practicality. - **Real-World Applications**: Identify practical applications or impacts, bridging theory with practicality. - **Real-World"]
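On the "light rank and a deep alpha" idea above, here is a minimal sketch of what that could look like as a PEFT `LoraConfig`. The specific values and the `target_modules` list are assumptions for illustration, not the settings used for this checkpoint.

```py
from peft import LoraConfig

# Hypothetical "light rank, deep alpha" adapter: a small r keeps the adapter
# light, while a larger lora_alpha scales its contribution up.
# The target_modules names are assumed attention projections, not the exact
# modules used to train this model.
lora_config = LoraConfig(
    r=8,
    lora_alpha=64,
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
)
```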
Since this is a base model, the IKM dataset greatly affects the output. The IKM dataset is purely Markdown-based, so results with different prompt formats are hit or miss. The format below is a reasonable starting point:
```
{System}
### Prompt:
{User}
### Response:
```
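A quick sketch of filling that template in Python (the system and user strings here are placeholders, not taken from the dataset):

```py
# Assemble a prompt following the template above.
system = "You are a helpful assistant."
user = "How could we use cheese to reignite the sun?"

prompt = f"""{system}
### Prompt:
{user}
### Response:
"""
```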
---
## Inference
```py
!pip install -qqq "transformers>=4.39.0" mamba-ssm "causal-conv1d>=1.2.0" accelerate bitsandbytes --progress-bar off
!pip install flash-attn --no-build-isolation
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
# Load model in 4-bit precision
quantization_config = BitsAndBytesConfig(
load_in_4bit=True,
llm_int8_skip_modules=["mamba"]
)
model = AutoModelForCausalLM.from_pretrained(
"Severian/Jamba-Nexus-IKM-v1",
trust_remote_code=True,
torch_dtype=torch.bfloat16,
attn_implementation="flash_attention_2",
quantization_config=quantization_config
)
tokenizer = AutoTokenizer.from_pretrained("Severian/Jamba-Nexus-IKM-v1")
# Tokenize input
prompt = """How could we use cheese to reignite the sun? Answer:"""
input_ids = tokenizer(
prompt,
return_tensors='pt'
).to(model.device)["input_ids"]
# Generate answer
outputs = model.generate(input_ids, max_new_tokens=216)
# Print output
print(tokenizer.batch_decode(outputs))
```
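Because the current checkpoint is heavily overfit and tends to loop (see the example output above), sampling with a repetition penalty may help. These are standard `generate` arguments; the specific values below are assumptions, not tuned recommendations.

```py
# Sampling settings that may reduce the repetition shown above; the values
# are illustrative, not tuned for this model.
outputs = model.generate(
    input_ids,
    max_new_tokens=216,
    do_sample=True,
    temperature=0.7,
    top_p=0.9,
    repetition_penalty=1.2,
)
print(tokenizer.batch_decode(outputs))
```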
```
[3731/5850 3:38:52 < 2:04:22, 0.28 it/s, Epoch 6.37/10]
Step Training Loss
1 10.109800
2 9.924600
3 9.919700
4 9.919100
5 9.917400
6 9.895900
7 9.891700
8 9.893500
9 9.917200
10 9.918800
11 10.056100
12 9.916200
13 9.911200
14 9.884300
15 9.909800
16 9.883800
17 9.883800
18 9.878300
19 9.904400
20 9.976400
21 10.061600
22 10.063300
23 9.876200
24 9.890900
25 9.873100
26 9.893700
27 9.869400
28 9.867100
29 9.863400
30 9.910400
31 9.882300
32 9.884100
33 10.023100
34 9.883500
35 9.854800
36 9.847400
37 9.851400
38 9.879200
39 9.845300
40 9.845700
41 9.876800
42 9.844600
43 9.848000
44 9.851900
45 10.038100
46 9.865000
47 9.845400
48 9.838900
49 9.860100
50 9.842500
51 9.830200
52 10.144100
53 9.825600
54 9.832000
55 9.835000
56 9.850900
57 9.990500
58 10.020100
59 10.014500
60 9.849600
61 9.877500
62 9.819900
63 9.818800
64 9.987100
65 9.952300
66 9.861900
67 9.814100
68 9.840600
69 9.809600
70 9.809600
71 9.976200
72 9.810600
73 9.805900
74 9.829400
75 9.830300
76 9.831500
77 9.802800
78 9.798200
79 9.824900
80 9.795100
81 9.794400
82 9.801200
83 9.794000
84 9.820400
85 9.790100
86 9.840400
87 9.809500
88 9.860000
89 9.807000
90 9.948200
91 9.779500
92 9.781800
93 9.802700
94 9.827700
95 9.798000
96 9.825900
97 9.966000
98 9.773000
99 9.775400
100 9.764400
101 9.766000
102 9.817500
103 9.795200
104 9.757900
105 9.753000
106 9.758200
107 9.753000
108 9.751700
109 9.784200
110 9.749700
111 9.748200
112 9.746200
113 9.797200
114 9.747000
115 9.913200
116 9.739100
117 9.769800
118 9.764500
119 9.736900
120 9.760500
121 9.795500
122 9.935300
123 10.079200
124 9.727200
125 9.732400
126 9.755800
127 9.755500
128 9.758900
129 9.732800
130 9.749600
131 9.922100
132 9.719800
133 9.716600
134 9.721900
135 9.718100
136 9.746300
137 9.868900
138 9.740800
139 9.715600
140 9.711000
141 9.744000
142 9.705100
143 9.734300
144 9.881400
145 9.764000
146 9.699800
147 9.855700
148 9.705600
149 9.903000
150 9.697000
151 9.732500
152 9.695000
153 9.901200
154 9.865600
155 9.686900
156 9.890300
157 9.714300
158 9.683900
159 9.856900
160 10.032500
161 9.677200
162 9.683600
163 9.679800
164 9.670600
165 9.698900
166 9.763100
167 9.669600
168 9.713800
169 9.699100
170 9.869700
171 9.844000
172 9.697700
173 9.667200
174 9.692600
175 9.670400
176 9.664200
177 9.689400
178 9.667900
179 9.685200
180 9.664700
181 9.861600
182 9.653600
183 9.652500
184 9.652700
185 9.643500
186 9.675400
187 9.685200
188 9.648800
189 9.671700
190 9.656900
191 9.734500
192 9.637900
193 9.635800
194 9.681400
195 9.669400
196 9.635200
197 9.667900
198 9.662100
199 9.809700
200 9.627500
201 9.691600
202 9.657200
203 9.689900
204 9.633700
205 9.624900
206 9.621900
207 9.655200
208 9.620300
209 9.619600
210 9.616800
211 9.614600
212 9.646700
213 9.612400
214 9.676200
215 9.672100
216 9.788300
217 9.611000
218 9.613900
219 9.632700
220 9.785800
221 9.595400
222 9.599600
223 9.627600
224 9.631600
225 9.627400
226 9.637000
227 9.626000
228 9.600800
229 9.658900
230 9.584400
231 9.621600
232 9.583600
233 9.582800
234 9.613900
235 9.580700
236 9.580600
237 9.580800
238 9.581300
239 9.788600
240 9.574100
241 9.580500
242 9.783500
243 9.574300
244 9.785300
245 9.599800
246 9.565500
247 9.563900
248 9.592900
249 9.592700
250 9.592200
251 9.573000
252 9.769800
253 9.913400
254 9.553100
255 9.549500
256 9.616300
257 9.566200
258 9.766200
259 9.592900
260 9.547900
261 9.576800
262 9.543000
263 9.543600
264 9.978600
265 9.570100
266 9.570400
267 9.716600
268 9.529900
269 9.579200
270 9.545500
271 9.531600
272 9.555500
273 9.559900
274 9.524000
275 9.889300
276 9.553700
277 9.534400
278 9.566800
279 9.518700
280 9.510600
281 9.528800
282 9.545800
283 9.693700
284 9.507500
285 9.511300
286 9.500100
3509 6.093600
3510 6.874700
3511 6.239500
3512 6.262400
3513 6.262000
3514 6.093200
3515 6.095400
3516 6.429600
3517 6.090800
3518 6.548000
3519 6.237100
3520 6.237000
3521 6.088900
3522 6.279700
3523 7.310300
3524 6.695300
3525 6.243000
3526 6.087100
3527 6.697000
3528 6.412400
3529 6.087100
3530 6.087000
3531 6.227500
3532 6.085900
3533 6.376200
3534 6.231600
3535 6.080500
3536 6.079100
3537 6.082800
3538 6.535800
3539 6.082300
3540 6.081300
3541 6.080600
3542 6.437900
3543 6.071800
3544 6.072500
3545 6.078300
3546 6.076700
3547 6.226500
3548 6.081000
3549 6.071000
3550 6.066900
3551 6.370600
3552 6.077900
3553 6.854100
3554 6.077300
3555 6.265500
3556 6.065600
3557 6.389000
3558 6.072500
3559 6.522500
3560 6.072400
3561 6.216900
3562 6.213700
3563 6.067200
3564 6.696500
3565 6.237500
3566 6.935300
3567 6.213700
3568 6.236400
3569 6.061000
3570 7.399200
3571 6.249000
3572 6.235700
3573 6.059400
3574 6.238300
3575 6.058600
3576 6.064600
3577 6.063100
3578 6.220400
3579 6.071700
3580 6.249400
3581 6.708400
3582 6.060400
3583 6.062800
3584 6.358300
3585 6.057700
3586 6.053700
3587 6.251000
3588 6.513700
3589 6.208500
3590 7.053200
3591 6.048200
3592 6.230400
3593 6.201200
3594 7.549800
3595 6.058900
3596 6.207100
3597 6.206900
3598 6.042500
3599 6.189200
3600 6.354800
3601 6.219600
3602 6.238400
3603 6.206500
3604 7.172000
3605 6.040700
3606 6.215000
3607 6.216300
3608 6.045200
3609 7.134800
3610 6.230800
3611 6.037500
3612 6.499700
3613 6.791900
3614 6.034000
3615 6.957900
3616 6.180000
3617 6.041000
3618 6.642900
3619 6.651100
3620 6.225300
3621 6.034700
3622 6.510700
3623 6.227100
3624 6.208200
3625 6.336000
3626 6.027800
3627 6.489200
3628 6.591400
3629 6.030200
3630 6.796800
3631 6.027400
3632 6.374700
3633 6.032100
3634 6.025900
3635 6.369400
3636 6.634500
3637 6.481200
3638 6.220300
3639 6.217200
3640 6.025200
3641 6.016900
3642 6.491400
3643 6.025600
3644 6.483400
3645 6.478600
3646 6.387600
3647 6.168300
3648 6.654600
3649 6.809700
3650 6.193000
3651 6.194500
3652 6.349200
3653 6.172500
3654 6.174200
3655 6.014800
3656 6.626400
3657 6.011500
3658 6.162000
3659 6.504300
3660 7.084900
3661 6.622300
3662 6.470700
3663 6.011600
3664 6.188300
3665 6.198700
3666 6.009900
3667 6.644700
3668 6.185000
3669 6.008600
3670 6.005900
3671 6.009200
3672 6.614900
3673 6.198300
3674 6.933100
3675 6.171800
3676 6.147500
3677 6.464300
3678 6.009500
3679 6.371400
3680 6.162100
3681 5.998900
3682 6.645100
3683 6.192900
3684 6.813800
3685 6.331100
3686 6.832200
3687 6.480900
3688 5.993200
3689 6.156100
3690 6.172600
3691 6.185400
3692 5.999600
3693 6.151900
3694 6.187100
3695 6.459900
3696 5.993100
3697 5.989900
3698 6.348300
3699 5.992500
3700 5.995900
3701 5.994900
3702 5.984900
3703 6.161600
3704 6.170100
3705 6.507000
3706 5.989200
3707 6.138800
3708 6.890600
3709 5.984500
3710 6.157900
3711 5.991600
3712 5.992200
3713 6.135400
3714 6.133900
3715 6.164000
3716 5.988100
3717 6.351000
3718 5.981300
3719 5.981000
3720 7.087300
3721 6.135400
3722 6.280900
3723 5.982800
3724 5.983800
3725 6.350100
3726 6.618500
3727 6.600100
3728 6.440600
3729 5.973800
```