---
language:
- en
license: mit
tags:
- text-generation-inference
- transformers
- mixture of experts
- jamba
datasets:
- Severian/Internal-Knowledge-Map
base_model: ai21labs/Jamba-v0.1
pipeline_tag: text-generation
---

<img src="https://cdn-uploads.huggingface.co/production/uploads/64740cf7485a7c8e1bd51ac9/SwdXRoyi08neRiI8pJrYI.webp" width="500" height="500">

# Jamba-Nexus-IKM

## This model has been trained on the Internal Knowledge Map dataset.
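The dataset is published on the Hub under the name given in the metadata above. If you want to inspect the training data, a minimal sketch using the `datasets` library (the `"train"` split name is an assumption; adjust to whatever splits the dataset actually exposes):

```py
from datasets import load_dataset

# Load the Internal Knowledge Map dataset used for fine-tuning
ikm = load_dataset("Severian/Internal-Knowledge-Map")
print(ikm)               # show available splits and row counts
print(ikm["train"][0])   # inspect a single example (assumes a "train" split)
```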

---

## Example Output: 

```
<|startoftext|><|im_start|>system 
You are a helpful AI assistant.
<|im_end|> 
<|im_start|>user
How could we use cheese to reignite the sun?
<|im_end|> 
<|im_start|>assistant
The idea of using cheese to reignite the sun may sound strange,
but it's actually quite plausible. Here's how it works:

1. Nuclear fusion: Cheese contains a high concentration of phosphorus and calcium, which can react with each other under intense heat and pressure. When this reaction occurs, it produces large amounts of energy in the form of nuclear fusion, just like what happens inside our sun.
2. Reigniting the sun: To reignite the sun, we need to introduce a source of energy that can trigger nuclear fusion in its core. The high phosphorus and calcium content in cheese makes it an ideal candidate for this purpose. By injecting a large quantity of cheese into the sun's core, we can initiate nuclear fusion reactions and provide the sun with a new source of energy to reignite itself.<|im_end|>
<|endoftext|>
```
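The prompt format above is ChatML-style. If the tokenizer config does not ship a chat template, a small helper like this can build the prompt from a message list (a sketch of the format shown above, not part of the model's own code; minor whitespace may differ from the verbatim example):

```py
def build_chatml_prompt(messages):
    """Format a list of {"role": ..., "content": ...} dicts into the
    ChatML-style prompt above, ending with an open assistant turn."""
    parts = [f"<|im_start|>{m['role']}\n{m['content']}\n<|im_end|>" for m in messages]
    parts.append("<|im_start|>assistant\n")
    return "\n".join(parts)

prompt = build_chatml_prompt([
    {"role": "system", "content": "You are a helpful AI assistant."},
    {"role": "user", "content": "How could we use cheese to reignite the sun?"},
])
```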

---

## Inference

```py
# Quote version specifiers so the shell doesn't treat ">" as an output redirect
!pip install -qqq "transformers>=4.39.0" mamba-ssm "causal-conv1d>=1.2.0" accelerate bitsandbytes --progress-bar off
!pip install flash-attn --no-build-isolation

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# 4-bit quantization with nested (double) quantization to reduce memory use
double_quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,  # match the bfloat16 weights below
)

# Load the model in 4-bit with FlashAttention-2 (requires a CUDA GPU)
model = AutoModelForCausalLM.from_pretrained(
    "Severian/Jamba-Nexus-IKM-v1",
    device_map="auto",
    trust_remote_code=True,
    torch_dtype=torch.bfloat16,
    attn_implementation="flash_attention_2",
    quantization_config=double_quant_config,
)
tokenizer = AutoTokenizer.from_pretrained("Severian/Jamba-Nexus-IKM-v1")

input_text = """<|im_start|>system 
You are a helpful AI assistant.
<|im_end|> 
<|im_start|>user
How could we use cheese to reignite the sun?
<|im_end|> 
<|im_start|>assistant
"""

input_ids = tokenizer(input_text, return_tensors='pt').to(model.device)["input_ids"]

# Greedy decoding; temperature only applies when do_sample=True
outputs = model.generate(input_ids, max_new_tokens=1024, do_sample=False, repetition_penalty=1.1)

print(tokenizer.batch_decode(outputs)[0])
# <|startoftext|><|im_start|>system 
# You are a helpful AI assistant.
# <|im_end|> 
# <|im_start|>user
# How could we use cheese to reignite the sun?
# <|im_end|> 
# <|im_start|>assistant
# The idea of using cheese to reignite the sun may sound strange, but it's actually quite plausible. Here's how it works: 1. Nuclear fusion: Cheese contains a high concentration of phosphorus and calcium, which can react with each other under intense heat and pressure. When this reaction occurs, it produces large amounts of energy in the form of nuclear fusion, just like what happens inside our sun. 2. Reigniting the sun: To reignite the sun, we need to introduce a source of energy that can trigger nuclear fusion in its core. The high phosphorus and calcium content in cheese makes it an ideal candidate for this purpose. By injecting a large quantity of cheese into the sun's core, we can initiate nuclear fusion reactions and provide the sun with a new source of energy to reignite itself.<|im_end|>
# <|endoftext|>
```
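To print tokens to stdout as they are generated instead of waiting for the full completion, `transformers`' `TextStreamer` can be plugged into the same call (a sketch reusing the `model`, `tokenizer`, and `input_ids` from above):

```py
from transformers import TextStreamer

# Decodes and prints tokens as they arrive; skip_prompt avoids echoing the input
streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)

_ = model.generate(
    input_ids,
    max_new_tokens=1024,
    do_sample=False,
    repetition_penalty=1.1,
    streamer=streamer,
)
```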

---

## Training Loss

Partial training log from the fine-tuning run (loss falls from ~10 at step 1 to ~0.005 by step 1148):

```
[383/1171 33:25 < 1:09:07, 0.19 it/s, Epoch 0.33/1]
Step	Training Loss
1	10.680900
2	10.793200
3	8.870600
4	8.817300
5	13.537700
6	14.457900
7	14.419900
8	13.235300
9	10.764000
10	10.614000
11	12.617900
12	11.241100
13	10.644600
14	11.787900
15	11.430500
16	11.913600
17	10.418000
18	9.867500
19	9.392300
20	8.825400
21	8.238000
22	8.030900
23	7.902800
24	8.247100
25	7.871800
26	7.040200
27	8.326700
28	7.478000
29	6.724300
30	6.646100
31	6.375500
32	6.677100
33	7.157500
34	5.913300
35	6.432800
36	6.342500
37	5.987400
38	5.893300
39	5.194400
40	5.260600
41	5.697200
42	5.065100
43	4.868600
44	5.102600
45	4.660700
46	6.133700
47	4.706000
48	4.598300
49	4.569700
50	4.546100
51	4.799700
52	4.632400
53	4.342000
54	4.338600
55	5.103600
56	5.415300
57	5.488200
58	6.379000
59	4.440300
60	5.374200
61	5.150200
62	4.162400
63	4.020500
64	3.953600
65	4.621100
66	3.870800
67	4.863500
68	4.967800
69	3.887500
70	3.848400
71	3.681100
72	3.571800
73	3.585700
74	4.433200
75	4.752700
76	4.151600
77	3.193300
78	4.800000
79	3.036500
80	2.827300
81	4.570700
82	2.903900
83	5.724400
84	5.984600
85	4.146200
86	2.905400
87	3.950700
88	2.650200
89	3.064800
90	3.072800
91	3.083100
92	2.970900
93	4.492900
94	2.664900
95	2.507200
96	2.549800
97	2.476700
98	2.548200
99	3.978200
100	2.654500
101	2.478400
102	4.039500
103	2.201600
104	2.030600
105	1.993000
106	1.773600
107	4.248400
108	1.777600
109	3.311100
110	1.720900
111	5.827900
112	1.679600
113	3.789200
114	1.593900
115	1.241600
116	1.306900
117	5.464400
118	1.536000
119	1.328700
120	1.132500
121	1.144900
122	0.923600
123	0.690700
124	1.142500
125	5.850100
126	1.102200
127	0.939700
128	0.727700
129	3.941400
130	0.791900
131	0.662900
132	3.319800
133	0.623900
134	0.521800
135	0.375600
136	0.302900
137	0.225400
138	2.994300
139	0.214300
140	0.229000
141	2.751600
142	0.298000
143	0.227500
144	2.300500
145	0.180900
146	0.629700
147	0.420900
148	2.648600
149	1.837600
150	0.524800
...
1148	0.004700
```
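To visualize the curve, the step/loss pairs above can be parsed and plotted. A minimal sketch, assuming the log is saved to a file named `loss_log.txt` (hypothetical filename):

```py
import matplotlib.pyplot as plt

steps, losses = [], []
with open("loss_log.txt") as f:  # hypothetical path to the saved log
    for line in f:
        fields = line.split()
        # Keep only "step loss" rows, skipping the header, progress line, and "..."
        if len(fields) == 2 and fields[0].isdigit():
            steps.append(int(fields[0]))
            losses.append(float(fields[1]))

plt.plot(steps, losses)
plt.xlabel("Step")
plt.ylabel("Training loss")
plt.title("Jamba-Nexus-IKM fine-tuning loss")
plt.savefig("loss_curve.png")
```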