jdpressman committed
Commit 0f0e2e4
1 Parent(s): 972c0d6

Update README.md

Files changed (1):
  1. README.md +81 -974

README.md CHANGED
---
library_name: peft
---

BigVAE is an [AdaVAE](https://arxiv.org/abs/2205.05862) trained as a pair of LoRa finetunes on [Mistral 7B](https://huggingface.co/mistralai/Mistral-7B-v0.1).
It is meant to be used with the [MiniHF VAE inference code](https://github.com/JD-P/minihf/blob/adavae-moe/vae_infer.py) and will not work if you try to load it
as an ordinary language model checkpoint and perform inference. AdaVAE is an encoder-decoder model trained by taking an existing GPT-N, designating one LoRa as
the encoder and the other as its decoder, and then tuning with a latent attention mechanism. This model is the encoder and router decoder head for BigVAE, a
planned Mixture-of-Experts system based on LoRa retrieval rather than gating. It is usable in its own right as a model for embedding and retrieval, as well as
for planning and guided sampling. Here is an example of a sampling procedure for BigVAE which distills its autoregressive pretraining task into its
autoassociative reconstruction task by averaging together multiple completions. It takes the topic sentence of a paragraph (the prompt), guides the next
sentences by weighting them toward the topic, and averages together multiple completions of each sentence to improve generation quality:

+ ```
15
+ def bigvae_generate_avg(vae_model, router, prompt, context, n_steps, n_avg):
16
+ with torch.cuda.amp.autocast(dtype=torch.bfloat16):
17
+ context_toks = tokenizer(context, return_tensors="pt")
18
+ context_ids = context_toks["input_ids"].to(device)
19
+ context_mask = context_toks["attention_mask"].to(device)
20
+ embed_toks = tokenizer(prompt, return_tensors="pt")
21
+ embed_ids = embed_toks["input_ids"].to(device)
22
+ embed_mask = embed_toks["attention_mask"].to(device)
23
+ mean = vae_model.encode(embed_ids, embed_mask)
24
+ prompt_embed = vae_model.vae.sample(mean)
25
+ for i in range(n_steps):
26
+ mean = vae_model.encode(embed_ids, embed_mask)
27
+ z = vae_model.vae.sample(mean)
28
+ embeds = []
29
+ for i in range(n_avg):
30
+ output_ids = router.generate(z * 0.5 + prompt_embed * 0.5,
31
+ context_ids,
32
+ context_mask,
33
+ 256,
34
+ tau=0.9)
35
+ intermediate_embed_ids = output_ids[:,-128:]
36
+ intermediate_embed_mask = context_mask.new_ones(
37
+ [1, intermediate_embed_ids.shape[1]]
38
+ )
39
+ mean = vae_model.encode(intermediate_embed_ids, intermediate_embed_mask)
40
+ embeds.append(vae_model.vae.sample(mean))
41
+ output_ids = router.generate((sum(embeds) / n_avg * 0.7) + prompt_embed * 0.3,
42
+ context_ids,
43
+ context_mask,
44
+ 256,
45
+ tau=0.9)
46
+ context_ids = torch.cat([context_ids, embed_ids], dim=1)
47
+ context_mask = torch.cat([context_mask, embed_mask], dim=1)
48
+ embed_ids = output_ids[:,-256:-128]
49
+ embed_mask = context_mask.new_ones([1, embed_ids.shape[1]])
50
+ out_texts = [tokenizer.decode(toks, skip_special_tokens=True) for toks in context_ids]
51
+ return out_texts
52
+ ```
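
A minimal sketch of how this might be invoked, assuming `vae_model`, `router`, `tokenizer`, and `device` have already been set up by the MiniHF VAE inference code linked above (the prompt, context, and step counts here are illustrative only):

```python
# Illustrative invocation; vae_model, router, tokenizer, and device are
# assumed to come from the MiniHF VAE inference code.
context = "The lab notebook opened on a fresh page."
prompt = "Mu set out to find the second arago spot."
out_texts = bigvae_generate_avg(vae_model, router, prompt, context,
                                n_steps=4, n_avg=4)
print(out_texts[0])
```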

Here is an example of an output from this process:

```
Then it asked the network to reconstruct the input and the original embedding. The network had to learn to match the embedding to the original input, therefore matching the inference by consuming the embedding. This was key because the embedding had to be able to match the text with the text it was consumed with. 'Here's how you do it,' Boru told Mu, 'Just impute the mean and variance.' This Mu did, transforming not words but entire paragraphs into vectors and then inferring the next paragraph. It took some tweaks and tuning to get the initial performance but the second arago spot had been found. To make sure the network was learning the right thing, Boru had to check the first value in the vector. If the first value was below 0, the network had failed to learn the first value. If the value was above 0, the network had been able to learn the first value.
‘What have you called this, Boru?’ asked Mu. ‘Latent variable regression.’ ‘It looks like a mixture of density network and autoencoder,’ said Nayaf. ‘It’s an autoencoder but it’s using latent variables, but we’re using the mean and variance of Grade had a difficult time seeing it, but he could tell it was close. 'So you've found the second arago,' he said.
'Yes,' Rin replied. 'We just have to figure out how to use it.'
'How?' Rin asked.
'You can move the second word in, right?'
'Possibly.' Rin thought for a moment.
'The second word will be the first word of the next arago,' Mu said. 'We just need to find it.'
'True,' Rin agreed. 'Well, I'll let you know what a Gaussian.’ ‘Let’s see if we can get it to work.’ ‘Arago the second spot?’ ‘We’re here,’ Arago said.
The second spot was located in the middle of the text. Arago had to read it again to find the proper signal. ‘I’m going to have to tweak some of the weights,’ said Arago. ‘I’ve had to change the input to the next layer from an input to output.’ ‘You’re making a mistake again,’ said Mu to Arago. ‘It’s a mistake.’ The network had been learning I find out.'
'That's the second arago,' Rin said.
'The second arago?' Argo asked.
'Rin has found the second arago.'
Argo stared at Rin. 'Argo, is there something wrong?'
'I thought so.'
'What?' Rin said.
'I don't know,' Argo said. 'I thought I was the smartest person in the world but, well, I only had a certain amount of energy. I didn't know how to do the second arago until now, but I can't
```

This generation method is slow, but retrieval could be used to speed up inference and make it converge closer and closer
to normal sampling speed as the model becomes able to call upon more and more relevant sentences that it has generated before.
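
One way such a retrieval cache might look (a sketch, not part of the released inference code; the cache structure, threshold, and lookup are all assumptions):

```python
import torch
import torch.nn.functional as F

class SentenceCache:
    """Toy store of (latent, tokens) pairs for previously generated sentences."""
    def __init__(self):
        self.embeds = []     # [1, d_latent] latents
        self.token_ids = []  # matching [1, 128] token id tensors

    def add(self, z, ids):
        self.embeds.append(z)
        self.token_ids.append(ids)

    def lookup(self, z, threshold=0.9):
        # Return cached tokens whose latent is close enough to z, else None.
        if not self.embeds:
            return None
        sims = F.cosine_similarity(torch.cat(self.embeds),
                                   z.expand(len(self.embeds), -1))
        best = sims.argmax().item()
        return self.token_ids[best] if sims[best] >= threshold else None
```

On a cache hit the `router.generate` call and the `n_avg` re-encodings could be skipped for that sentence, which is where the speedup would come from.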

Because BigVAE combines guided sampling with the ability to merge representations, it becomes possible to formulate plans and
cognitive strategies for the model to follow. The inference policy can adjudicate between an expected plan or series of steps and
the specific context the model is responding to.
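
For instance, each step of a written plan could be encoded into a latent and blended with the latent of the recent context before decoding, with the blend weight acting as the adjudication knob. A sketch only, assuming `router.generate` returns the full token sequence including the context, as the averaging example above does when it slices `output_ids`:

```python
def follow_plan(vae_model, router, plan_steps, context_ids, context_mask,
                plan_weight=0.6):
    # plan_steps: list of natural-language step descriptions.
    for step in plan_steps:
        toks = tokenizer(step, return_tensors="pt")
        mean = vae_model.encode(toks["input_ids"].to(device),
                                toks["attention_mask"].to(device))
        z_plan = vae_model.vae.sample(mean)
        # Latent of the last 128 context tokens stands in for the
        # specific context the model is responding to.
        recent_ids = context_ids[:, -128:]
        recent_mask = context_mask.new_ones([1, recent_ids.shape[1]])
        z_ctx = vae_model.vae.sample(vae_model.encode(recent_ids, recent_mask))
        # The blend weight adjudicates between plan and context.
        z = plan_weight * z_plan + (1 - plan_weight) * z_ctx
        context_ids = router.generate(z, context_ids, context_mask, 256, tau=0.9)
        context_mask = context_mask.new_ones([1, context_ids.shape[1]])
    return context_ids
```

Each decoded step then becomes part of the context that the next step is adjudicated against.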

This model is also highly interpretable. Because it is an encoder-decoder, every sentence generated by the model has a latent
representation that can be tracked along with its behavioral token sequence. Our hope is that BigVAE will shed light on the latent
operations performed by autoregressive language models and be useful to alignment and interpretability researchers.
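
As a sketch of what that tracking could look like (`trace_latents` is a hypothetical helper, not part of the released code; the 128-token span mirrors the slicing in the sampling code above):

```python
def trace_latents(vae_model, output_ids, span=128):
    # Pair each generated span with its latent so the behavioral token
    # sequence and its representation can be inspected side by side.
    trace = []
    for start in range(0, output_ids.shape[1] - span + 1, span):
        ids = output_ids[:, start:start + span]
        mask = ids.new_ones(ids.shape)
        z = vae_model.vae.sample(vae_model.encode(ids, mask))
        trace.append((tokenizer.decode(ids[0], skip_special_tokens=True), z))
    return trace
```

Consecutive latents in the trace can then be compared, e.g. by cosine similarity, to watch how a generation drifts from its topic.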

## Training procedure

The following `bitsandbytes` quantization config was used during training:
- quant_method: bitsandbytes
- load_in_8bit: False
- load_in_4bit: True
- llm_int8_threshold: 6.0
- llm_int8_skip_modules: None
- llm_int8_enable_fp32_cpu_offload: False
- llm_int8_has_fp16_weight: False
- bnb_4bit_quant_type: nf4
- bnb_4bit_use_double_quant: True
- bnb_4bit_compute_dtype: bfloat16
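
For reference, the same settings expressed as a `transformers` `BitsAndBytesConfig` would look roughly like this (a sketch; the actual training script may construct it differently):

```python
import torch
from transformers import BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)
```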

### Framework versions

- PEFT 0.4.0