AutoGPTQ quantization logs
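
The logs below come from an interactive `model.quantize(examples)` call. Since only the log output is recorded here, the following is a minimal setup sketch of how that call is typically reached with the AutoGPTQ API of this era. The checkpoint, output path, calibration text, and 4-bit/group-size settings are illustrative assumptions, not values recovered from the log; the 32 layers and the `self_attention.query_key_value` / `mlp.dense_h_to_4h` module names suggest a Falcon-7B-style model, so that is what the sketch assumes.

```
from transformers import AutoTokenizer
from auto_gptq import AutoGPTQForCausalLM, BaseQuantizeConfig

# Assumed checkpoint and output directory (the log does not record them).
pretrained_model_dir = "tiiuae/falcon-7b"
quantized_model_dir = "falcon-7b-4bit-128g"

# Assumed GPTQ settings; any BaseQuantizeConfig would produce the same log shape.
quantize_config = BaseQuantizeConfig(
    bits=4,          # quantize weights to 4 bits
    group_size=128,  # quantization group size
    desc_act=False,  # skip activation-order reordering for faster inference
)

tokenizer = AutoTokenizer.from_pretrained(pretrained_model_dir, use_fast=True)

# Calibration examples: a list of tokenized texts consumed by model.quantize().
examples = [
    tokenizer(
        "auto-gptq is an easy-to-use model quantization library "
        "with user-friendly APIs, based on the GPTQ algorithm."
    )
]

# trust_remote_code may be needed for checkpoints that ship custom model code.
model = AutoGPTQForCausalLM.from_pretrained(
    pretrained_model_dir, quantize_config, trust_remote_code=True
)

# Produces the layer-by-layer output below, ending with "Packing model ... Model packed."
model.quantize(examples)

# Persist the packed weights once quantization finishes.
model.save_quantized(quantized_model_dir, use_safetensors=True)
```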


```
>>> model.quantize(examples)
2023-07-21 16:54:47 INFO [auto_gptq.modeling._base] Start quantizing layer 1/32
2023-07-21 16:54:47 INFO [auto_gptq.modeling._base] Quantizing self_attention.query_key_value in layer 1/32...
2023-07-21 16:54:48 INFO [auto_gptq.quantization.gptq] duration: 0.8171646595001221
2023-07-21 16:54:48 INFO [auto_gptq.quantization.gptq] avg loss: 3.7546463012695312
2023-07-21 16:54:48 INFO [auto_gptq.modeling._base] Quantizing self_attention.dense in layer 1/32...
2023-07-21 16:54:49 INFO [auto_gptq.quantization.gptq] duration: 0.8055715560913086
2023-07-21 16:54:49 INFO [auto_gptq.quantization.gptq] avg loss: 0.2164316177368164
2023-07-21 16:54:49 INFO [auto_gptq.modeling._base] Quantizing mlp.dense_h_to_4h in layer 1/32...
2023-07-21 16:54:50 INFO [auto_gptq.quantization.gptq] duration: 0.8417620658874512
2023-07-21 16:54:50 INFO [auto_gptq.quantization.gptq] avg loss: 16.070518493652344
2023-07-21 16:54:50 INFO [auto_gptq.modeling._base] Quantizing mlp.dense_4h_to_h in layer 1/32...
2023-07-21 16:54:53 INFO [auto_gptq.quantization.gptq] duration: 3.90244197845459
2023-07-21 16:54:53 INFO [auto_gptq.quantization.gptq] avg loss: 0.5676069855690002
2023-07-21 16:54:53 INFO [auto_gptq.modeling._base] Start quantizing layer 2/32
2023-07-21 16:54:54 INFO [auto_gptq.modeling._base] Quantizing self_attention.query_key_value in layer 2/32...
2023-07-21 16:54:54 INFO [auto_gptq.quantization.gptq] duration: 0.8373761177062988
2023-07-21 16:54:54 INFO [auto_gptq.quantization.gptq] avg loss: 4.066518783569336
2023-07-21 16:54:54 INFO [auto_gptq.modeling._base] Quantizing self_attention.dense in layer 2/32...
2023-07-21 16:54:55 INFO [auto_gptq.quantization.gptq] duration: 0.8285796642303467
2023-07-21 16:54:55 INFO [auto_gptq.quantization.gptq] avg loss: 0.2558078169822693
2023-07-21 16:55:25 INFO [auto_gptq.modeling._base] Quantizing mlp.dense_h_to_4h in layer 2/32...
2023-07-21 16:55:25 INFO [auto_gptq.quantization.gptq] duration: 0.8859198093414307
2023-07-21 16:55:25 INFO [auto_gptq.quantization.gptq] avg loss: 16.571727752685547
2023-07-21 16:55:26 INFO [auto_gptq.modeling._base] Quantizing mlp.dense_4h_to_h in layer 2/32...
2023-07-21 16:55:29 INFO [auto_gptq.quantization.gptq] duration: 3.86962890625
2023-07-21 16:55:29 INFO [auto_gptq.quantization.gptq] avg loss: 0.34605544805526733
2023-07-21 16:55:30 INFO [auto_gptq.modeling._base] Start quantizing layer 3/32
2023-07-21 16:55:30 INFO [auto_gptq.modeling._base] Quantizing self_attention.query_key_value in layer 3/32...
2023-07-21 16:55:30 INFO [auto_gptq.quantization.gptq] duration: 0.8118832111358643
2023-07-21 16:55:30 INFO [auto_gptq.quantization.gptq] avg loss: 5.4185943603515625
2023-07-21 16:55:30 INFO [auto_gptq.modeling._base] Quantizing self_attention.dense in layer 3/32...
2023-07-21 16:55:31 INFO [auto_gptq.quantization.gptq] duration: 0.8096959590911865
2023-07-21 16:55:31 INFO [auto_gptq.quantization.gptq] avg loss: 0.22585009038448334
2023-07-21 16:55:31 INFO [auto_gptq.modeling._base] Quantizing mlp.dense_h_to_4h in layer 3/32...
2023-07-21 16:55:32 INFO [auto_gptq.quantization.gptq] duration: 0.8473665714263916
2023-07-21 16:55:32 INFO [auto_gptq.quantization.gptq] avg loss: 27.050426483154297
2023-07-21 16:55:32 INFO [auto_gptq.modeling._base] Quantizing mlp.dense_4h_to_h in layer 3/32...
2023-07-21 16:55:36 INFO [auto_gptq.quantization.gptq] duration: 3.8430850505828857
2023-07-21 16:55:36 INFO [auto_gptq.quantization.gptq] avg loss: 0.6839203834533691
2023-07-21 16:55:36 INFO [auto_gptq.modeling._base] Start quantizing layer 4/32
2023-07-21 16:55:36 INFO [auto_gptq.modeling._base] Quantizing self_attention.query_key_value in layer 4/32...
2023-07-21 16:55:37 INFO [auto_gptq.quantization.gptq] duration: 0.7948899269104004
2023-07-21 16:55:37 INFO [auto_gptq.quantization.gptq] avg loss: 6.523550987243652
2023-07-21 16:55:37 INFO [auto_gptq.modeling._base] Quantizing self_attention.dense in layer 4/32...
2023-07-21 16:55:38 INFO [auto_gptq.quantization.gptq] duration: 0.7990512847900391
2023-07-21 16:55:38 INFO [auto_gptq.quantization.gptq] avg loss: 0.21638213098049164
2023-07-21 16:55:38 INFO [auto_gptq.modeling._base] Quantizing mlp.dense_h_to_4h in layer 4/32...
2023-07-21 16:55:39 INFO [auto_gptq.quantization.gptq] duration: 0.8403058052062988
2023-07-21 16:55:39 INFO [auto_gptq.quantization.gptq] avg loss: 36.57025146484375
2023-07-21 16:55:39 INFO [auto_gptq.modeling._base] Quantizing mlp.dense_4h_to_h in layer 4/32...
2023-07-21 16:55:43 INFO [auto_gptq.quantization.gptq] duration: 3.856529474258423
2023-07-21 16:55:43 INFO [auto_gptq.quantization.gptq] avg loss: 9.424503326416016
2023-07-21 16:55:43 INFO [auto_gptq.modeling._base] Start quantizing layer 5/32
2023-07-21 16:55:43 INFO [auto_gptq.modeling._base] Quantizing self_attention.query_key_value in layer 5/32...
2023-07-21 16:55:44 INFO [auto_gptq.quantization.gptq] duration: 0.7926647663116455
2023-07-21 16:55:44 INFO [auto_gptq.quantization.gptq] avg loss: 6.277029037475586
2023-07-21 16:55:44 INFO [auto_gptq.modeling._base] Quantizing self_attention.dense in layer 5/32...
2023-07-21 16:55:44 INFO [auto_gptq.quantization.gptq] duration: 0.7987856864929199
2023-07-21 16:55:44 INFO [auto_gptq.quantization.gptq] avg loss: 0.1324760764837265
2023-07-21 16:55:44 INFO [auto_gptq.modeling._base] Quantizing mlp.dense_h_to_4h in layer 5/32...
2023-07-21 16:55:45 INFO [auto_gptq.quantization.gptq] duration: 0.8394050598144531
2023-07-21 16:55:45 INFO [auto_gptq.quantization.gptq] avg loss: 36.26388168334961
2023-07-21 16:55:45 INFO [auto_gptq.modeling._base] Quantizing mlp.dense_4h_to_h in layer 5/32...
2023-07-21 16:55:49 INFO [auto_gptq.quantization.gptq] duration: 3.849104166030884
2023-07-21 16:55:49 INFO [auto_gptq.quantization.gptq] avg loss: 2.376619338989258
2023-07-21 16:55:49 INFO [auto_gptq.modeling._base] Start quantizing layer 6/32
2023-07-21 16:55:49 INFO [auto_gptq.modeling._base] Quantizing self_attention.query_key_value in layer 6/32...
2023-07-21 16:55:50 INFO [auto_gptq.quantization.gptq] duration: 0.7964150905609131
2023-07-21 16:55:50 INFO [auto_gptq.quantization.gptq] avg loss: 8.479263305664062
2023-07-21 16:55:50 INFO [auto_gptq.modeling._base] Quantizing self_attention.dense in layer 6/32...
2023-07-21 16:55:51 INFO [auto_gptq.quantization.gptq] duration: 0.7951827049255371
2023-07-21 16:55:51 INFO [auto_gptq.quantization.gptq] avg loss: 0.14170163869857788
2023-07-21 16:56:21 INFO [auto_gptq.modeling._base] Quantizing mlp.dense_h_to_4h in layer 6/32...
2023-07-21 16:56:22 INFO [auto_gptq.quantization.gptq] duration: 0.8720560073852539
2023-07-21 16:56:22 INFO [auto_gptq.quantization.gptq] avg loss: 42.756919860839844
2023-07-21 16:56:22 INFO [auto_gptq.modeling._base] Quantizing mlp.dense_4h_to_h in layer 6/32...
2023-07-21 16:56:25 INFO [auto_gptq.quantization.gptq] duration: 3.8685550689697266
2023-07-21 16:56:25 INFO [auto_gptq.quantization.gptq] avg loss: 0.8117952346801758
2023-07-21 16:56:26 INFO [auto_gptq.modeling._base] Start quantizing layer 7/32
2023-07-21 16:56:26 INFO [auto_gptq.modeling._base] Quantizing self_attention.query_key_value in layer 7/32...
2023-07-21 16:56:26 INFO [auto_gptq.quantization.gptq] duration: 0.7976808547973633
2023-07-21 16:56:26 INFO [auto_gptq.quantization.gptq] avg loss: 7.019394397735596
2023-07-21 16:56:26 INFO [auto_gptq.modeling._base] Quantizing self_attention.dense in layer 7/32...
2023-07-21 16:56:27 INFO [auto_gptq.quantization.gptq] duration: 0.803225040435791
2023-07-21 16:56:27 INFO [auto_gptq.quantization.gptq] avg loss: 0.21443051099777222
2023-07-21 16:56:27 INFO [auto_gptq.modeling._base] Quantizing mlp.dense_h_to_4h in layer 7/32...
2023-07-21 16:56:28 INFO [auto_gptq.quantization.gptq] duration: 0.8342931270599365
2023-07-21 16:56:28 INFO [auto_gptq.quantization.gptq] avg loss: 39.33504104614258
2023-07-21 16:56:28 INFO [auto_gptq.modeling._base] Quantizing mlp.dense_4h_to_h in layer 7/32...
2023-07-21 16:56:32 INFO [auto_gptq.quantization.gptq] duration: 3.8671581745147705
2023-07-21 16:56:32 INFO [auto_gptq.quantization.gptq] avg loss: 0.9214520454406738
2023-07-21 16:56:32 INFO [auto_gptq.modeling._base] Start quantizing layer 8/32
2023-07-21 16:56:32 INFO [auto_gptq.modeling._base] Quantizing self_attention.query_key_value in layer 8/32...
2023-07-21 16:56:33 INFO [auto_gptq.quantization.gptq] duration: 0.7989864349365234
2023-07-21 16:56:33 INFO [auto_gptq.quantization.gptq] avg loss: 7.602280616760254
2023-07-21 16:56:33 INFO [auto_gptq.modeling._base] Quantizing self_attention.dense in layer 8/32...
2023-07-21 16:56:34 INFO [auto_gptq.quantization.gptq] duration: 0.8112733364105225
2023-07-21 16:56:34 INFO [auto_gptq.quantization.gptq] avg loss: 0.11391645669937134
2023-07-21 16:56:34 INFO [auto_gptq.modeling._base] Quantizing mlp.dense_h_to_4h in layer 8/32...
2023-07-21 16:56:35 INFO [auto_gptq.quantization.gptq] duration: 0.8388988971710205
2023-07-21 16:56:35 INFO [auto_gptq.quantization.gptq] avg loss: 34.74957275390625
2023-07-21 16:56:35 INFO [auto_gptq.modeling._base] Quantizing mlp.dense_4h_to_h in layer 8/32...
2023-07-21 16:56:39 INFO [auto_gptq.quantization.gptq] duration: 3.8561182022094727
2023-07-21 16:56:39 INFO [auto_gptq.quantization.gptq] avg loss: 1.1289432048797607
2023-07-21 16:56:39 INFO [auto_gptq.modeling._base] Start quantizing layer 9/32
2023-07-21 16:56:39 INFO [auto_gptq.modeling._base] Quantizing self_attention.query_key_value in layer 9/32...
2023-07-21 16:56:40 INFO [auto_gptq.quantization.gptq] duration: 0.7969386577606201
2023-07-21 16:56:40 INFO [auto_gptq.quantization.gptq] avg loss: 6.806826591491699
2023-07-21 16:56:40 INFO [auto_gptq.modeling._base] Quantizing self_attention.dense in layer 9/32...
2023-07-21 16:56:41 INFO [auto_gptq.quantization.gptq] duration: 0.7953078746795654
2023-07-21 16:56:41 INFO [auto_gptq.quantization.gptq] avg loss: 0.2318212240934372
2023-07-21 16:56:41 INFO [auto_gptq.modeling._base] Quantizing mlp.dense_h_to_4h in layer 9/32...
2023-07-21 16:56:41 INFO [auto_gptq.quantization.gptq] duration: 0.8294937610626221
2023-07-21 16:56:41 INFO [auto_gptq.quantization.gptq] avg loss: 35.324676513671875
2023-07-21 16:56:41 INFO [auto_gptq.modeling._base] Quantizing mlp.dense_4h_to_h in layer 9/32...
2023-07-21 16:56:45 INFO [auto_gptq.quantization.gptq] duration: 3.8630259037017822
2023-07-21 16:56:45 INFO [auto_gptq.quantization.gptq] avg loss: 1.4622347354888916
2023-07-21 16:56:45 INFO [auto_gptq.modeling._base] Start quantizing layer 10/32
2023-07-21 16:56:46 INFO [auto_gptq.modeling._base] Quantizing self_attention.query_key_value in layer 10/32...
2023-07-21 16:56:46 INFO [auto_gptq.quantization.gptq] duration: 0.8029708862304688
2023-07-21 16:56:46 INFO [auto_gptq.quantization.gptq] avg loss: 6.056252956390381
2023-07-21 16:56:46 INFO [auto_gptq.modeling._base] Quantizing self_attention.dense in layer 10/32...
2023-07-21 16:56:47 INFO [auto_gptq.quantization.gptq] duration: 0.8028323650360107
2023-07-21 16:56:47 INFO [auto_gptq.quantization.gptq] avg loss: 1.092197060585022
2023-07-21 16:56:47 INFO [auto_gptq.modeling._base] Quantizing mlp.dense_h_to_4h in layer 10/32...
2023-07-21 16:56:48 INFO [auto_gptq.quantization.gptq] duration: 0.8335537910461426
2023-07-21 16:56:48 INFO [auto_gptq.quantization.gptq] avg loss: 30.71457290649414
2023-07-21 16:56:48 INFO [auto_gptq.modeling._base] Quantizing mlp.dense_4h_to_h in layer 10/32...
2023-07-21 16:56:52 INFO [auto_gptq.quantization.gptq] duration: 3.8703184127807617
2023-07-21 16:56:52 INFO [auto_gptq.quantization.gptq] avg loss: 1.2208330631256104
2023-07-21 16:56:52 INFO [auto_gptq.modeling._base] Start quantizing layer 11/32
2023-07-21 16:56:52 INFO [auto_gptq.modeling._base] Quantizing self_attention.query_key_value in layer 11/32...
2023-07-21 16:56:53 INFO [auto_gptq.quantization.gptq] duration: 0.814570426940918
2023-07-21 16:56:53 INFO [auto_gptq.quantization.gptq] avg loss: 6.145627021789551
2023-07-21 16:56:53 INFO [auto_gptq.modeling._base] Quantizing self_attention.dense in layer 11/32...
2023-07-21 16:56:54 INFO [auto_gptq.quantization.gptq] duration: 0.8268287181854248
2023-07-21 16:56:54 INFO [auto_gptq.quantization.gptq] avg loss: 0.24324843287467957
2023-07-21 16:56:54 INFO [auto_gptq.modeling._base] Quantizing mlp.dense_h_to_4h in layer 11/32...
2023-07-21 16:56:55 INFO [auto_gptq.quantization.gptq] duration: 0.8359119892120361
2023-07-21 16:56:55 INFO [auto_gptq.quantization.gptq] avg loss: 30.847026824951172
2023-07-21 16:56:55 INFO [auto_gptq.modeling._base] Quantizing mlp.dense_4h_to_h in layer 11/32...
2023-07-21 16:56:58 INFO [auto_gptq.quantization.gptq] duration: 3.831470489501953
2023-07-21 16:56:58 INFO [auto_gptq.quantization.gptq] avg loss: 1.3961751461029053
2023-07-21 16:57:26 INFO [auto_gptq.modeling._base] Start quantizing layer 12/32
2023-07-21 16:57:26 INFO [auto_gptq.modeling._base] Quantizing self_attention.query_key_value in layer 12/32...
2023-07-21 16:57:27 INFO [auto_gptq.quantization.gptq] duration: 0.7964096069335938
2023-07-21 16:57:27 INFO [auto_gptq.quantization.gptq] avg loss: 6.053964614868164
2023-07-21 16:57:27 INFO [auto_gptq.modeling._base] Quantizing self_attention.dense in layer 12/32...
2023-07-21 16:57:28 INFO [auto_gptq.quantization.gptq] duration: 0.799691915512085
2023-07-21 16:57:28 INFO [auto_gptq.quantization.gptq] avg loss: 0.2671034336090088
2023-07-21 16:57:28 INFO [auto_gptq.modeling._base] Quantizing mlp.dense_h_to_4h in layer 12/32...
2023-07-21 16:57:29 INFO [auto_gptq.quantization.gptq] duration: 0.8342888355255127
2023-07-21 16:57:29 INFO [auto_gptq.quantization.gptq] avg loss: 29.729408264160156
2023-07-21 16:57:29 INFO [auto_gptq.modeling._base] Quantizing mlp.dense_4h_to_h in layer 12/32...
2023-07-21 16:57:33 INFO [auto_gptq.quantization.gptq] duration: 3.8561949729919434
2023-07-21 16:57:33 INFO [auto_gptq.quantization.gptq] avg loss: 1.495622158050537
2023-07-21 16:57:33 INFO [auto_gptq.modeling._base] Start quantizing layer 13/32
2023-07-21 16:57:33 INFO [auto_gptq.modeling._base] Quantizing self_attention.query_key_value in layer 13/32...
2023-07-21 16:57:34 INFO [auto_gptq.quantization.gptq] duration: 0.7953364849090576
2023-07-21 16:57:34 INFO [auto_gptq.quantization.gptq] avg loss: 5.408998489379883
2023-07-21 16:57:34 INFO [auto_gptq.modeling._base] Quantizing self_attention.dense in layer 13/32...
2023-07-21 16:57:34 INFO [auto_gptq.quantization.gptq] duration: 0.7990250587463379
2023-07-21 16:57:34 INFO [auto_gptq.quantization.gptq] avg loss: 0.5066410303115845
2023-07-21 16:57:34 INFO [auto_gptq.modeling._base] Quantizing mlp.dense_h_to_4h in layer 13/32...
2023-07-21 16:57:35 INFO [auto_gptq.quantization.gptq] duration: 0.8330769538879395
2023-07-21 16:57:35 INFO [auto_gptq.quantization.gptq] avg loss: 27.790515899658203
2023-07-21 16:57:35 INFO [auto_gptq.modeling._base] Quantizing mlp.dense_4h_to_h in layer 13/32...
2023-07-21 16:57:39 INFO [auto_gptq.quantization.gptq] duration: 3.861015558242798
2023-07-21 16:57:39 INFO [auto_gptq.quantization.gptq] avg loss: 1.3019633293151855
2023-07-21 16:57:39 INFO [auto_gptq.modeling._base] Start quantizing layer 14/32
2023-07-21 16:57:39 INFO [auto_gptq.modeling._base] Quantizing self_attention.query_key_value in layer 14/32...
2023-07-21 16:57:40 INFO [auto_gptq.quantization.gptq] duration: 0.8011329174041748
2023-07-21 16:57:40 INFO [auto_gptq.quantization.gptq] avg loss: 6.027165412902832
2023-07-21 16:57:40 INFO [auto_gptq.modeling._base] Quantizing self_attention.dense in layer 14/32...
2023-07-21 16:57:41 INFO [auto_gptq.quantization.gptq] duration: 0.7977538108825684
2023-07-21 16:57:41 INFO [auto_gptq.quantization.gptq] avg loss: 0.28969255089759827
2023-07-21 16:57:41 INFO [auto_gptq.modeling._base] Quantizing mlp.dense_h_to_4h in layer 14/32...
2023-07-21 16:57:42 INFO [auto_gptq.quantization.gptq] duration: 0.8305981159210205
2023-07-21 16:57:42 INFO [auto_gptq.quantization.gptq] avg loss: 28.996891021728516
2023-07-21 16:57:42 INFO [auto_gptq.modeling._base] Quantizing mlp.dense_4h_to_h in layer 14/32...
2023-07-21 16:57:46 INFO [auto_gptq.quantization.gptq] duration: 3.874257802963257
2023-07-21 16:57:46 INFO [auto_gptq.quantization.gptq] avg loss: 1.6258554458618164
2023-07-21 16:57:46 INFO [auto_gptq.modeling._base] Start quantizing layer 15/32
2023-07-21 16:57:46 INFO [auto_gptq.modeling._base] Quantizing self_attention.query_key_value in layer 15/32...
2023-07-21 16:57:47 INFO [auto_gptq.quantization.gptq] duration: 0.7982082366943359
2023-07-21 16:57:47 INFO [auto_gptq.quantization.gptq] avg loss: 5.937747001647949
2023-07-21 16:57:47 INFO [auto_gptq.modeling._base] Quantizing self_attention.dense in layer 15/32...
2023-07-21 16:57:48 INFO [auto_gptq.quantization.gptq] duration: 0.8004462718963623
2023-07-21 16:57:48 INFO [auto_gptq.quantization.gptq] avg loss: 0.3830963373184204
2023-07-21 16:57:48 INFO [auto_gptq.modeling._base] Quantizing mlp.dense_h_to_4h in layer 15/32...
2023-07-21 16:57:48 INFO [auto_gptq.quantization.gptq] duration: 0.8347995281219482
2023-07-21 16:57:48 INFO [auto_gptq.quantization.gptq] avg loss: 30.339778900146484
2023-07-21 16:57:48 INFO [auto_gptq.modeling._base] Quantizing mlp.dense_4h_to_h in layer 15/32...
2023-07-21 16:57:52 INFO [auto_gptq.quantization.gptq] duration: 3.8794045448303223
2023-07-21 16:57:52 INFO [auto_gptq.quantization.gptq] avg loss: 1.618453025817871
2023-07-21 16:57:52 INFO [auto_gptq.modeling._base] Start quantizing layer 16/32
2023-07-21 16:57:53 INFO [auto_gptq.modeling._base] Quantizing self_attention.query_key_value in layer 16/32...
2023-07-21 16:57:53 INFO [auto_gptq.quantization.gptq] duration: 0.802685022354126
2023-07-21 16:57:53 INFO [auto_gptq.quantization.gptq] avg loss: 5.992144584655762
2023-07-21 16:57:53 INFO [auto_gptq.modeling._base] Quantizing self_attention.dense in layer 16/32...
2023-07-21 16:57:54 INFO [auto_gptq.quantization.gptq] duration: 0.8001143932342529
2023-07-21 16:57:54 INFO [auto_gptq.quantization.gptq] avg loss: 0.3652211129665375
2023-07-21 16:57:54 INFO [auto_gptq.modeling._base] Quantizing mlp.dense_h_to_4h in layer 16/32...
2023-07-21 16:57:55 INFO [auto_gptq.quantization.gptq] duration: 0.843254566192627
2023-07-21 16:57:55 INFO [auto_gptq.quantization.gptq] avg loss: 29.359691619873047
2023-07-21 16:57:55 INFO [auto_gptq.modeling._base] Quantizing mlp.dense_4h_to_h in layer 16/32...
2023-07-21 16:57:59 INFO [auto_gptq.quantization.gptq] duration: 3.8731229305267334
2023-07-21 16:57:59 INFO [auto_gptq.quantization.gptq] avg loss: 1.8666539192199707
2023-07-21 16:57:59 INFO [auto_gptq.modeling._base] Start quantizing layer 17/32
2023-07-21 16:57:59 INFO [auto_gptq.modeling._base] Quantizing self_attention.query_key_value in layer 17/32...
2023-07-21 16:58:00 INFO [auto_gptq.quantization.gptq] duration: 0.79642653465271
2023-07-21 16:58:00 INFO [auto_gptq.quantization.gptq] avg loss: 6.463171482086182
2023-07-21 16:58:00 INFO [auto_gptq.modeling._base] Quantizing self_attention.dense in layer 17/32...
2023-07-21 16:58:01 INFO [auto_gptq.quantization.gptq] duration: 0.8078687191009521
2023-07-21 16:58:01 INFO [auto_gptq.quantization.gptq] avg loss: 0.24540238082408905
2023-07-21 16:58:01 INFO [auto_gptq.modeling._base] Quantizing mlp.dense_h_to_4h in layer 17/32...
2023-07-21 16:58:02 INFO [auto_gptq.quantization.gptq] duration: 0.829270601272583
2023-07-21 16:58:02 INFO [auto_gptq.quantization.gptq] avg loss: 30.825468063354492
2023-07-21 16:58:02 INFO [auto_gptq.modeling._base] Quantizing mlp.dense_4h_to_h in layer 17/32...
2023-07-21 16:58:05 INFO [auto_gptq.quantization.gptq] duration: 3.855315923690796
2023-07-21 16:58:05 INFO [auto_gptq.quantization.gptq] avg loss: 1.957414150238037
2023-07-21 16:58:06 INFO [auto_gptq.modeling._base] Start quantizing layer 18/32
2023-07-21 16:58:06 INFO [auto_gptq.modeling._base] Quantizing self_attention.query_key_value in layer 18/32...
2023-07-21 16:58:07 INFO [auto_gptq.quantization.gptq] duration: 0.8099801540374756
2023-07-21 16:58:07 INFO [auto_gptq.quantization.gptq] avg loss: 6.510787010192871
2023-07-21 16:58:07 INFO [auto_gptq.modeling._base] Quantizing self_attention.dense in layer 18/32...
2023-07-21 16:58:07 INFO [auto_gptq.quantization.gptq] duration: 0.8008811473846436
2023-07-21 16:58:07 INFO [auto_gptq.quantization.gptq] avg loss: 0.3201957941055298
2023-07-21 16:58:07 INFO [auto_gptq.modeling._base] Quantizing mlp.dense_h_to_4h in layer 18/32...
2023-07-21 16:58:08 INFO [auto_gptq.quantization.gptq] duration: 0.8365602493286133
2023-07-21 16:58:08 INFO [auto_gptq.quantization.gptq] avg loss: 31.26324462890625
2023-07-21 16:58:08 INFO [auto_gptq.modeling._base] Quantizing mlp.dense_4h_to_h in layer 18/32...
2023-07-21 16:58:12 INFO [auto_gptq.quantization.gptq] duration: 3.8536572456359863
2023-07-21 16:58:12 INFO [auto_gptq.quantization.gptq] avg loss: 2.0843615531921387
2023-07-21 16:58:12 INFO [auto_gptq.modeling._base] Start quantizing layer 19/32
2023-07-21 16:58:12 INFO [auto_gptq.modeling._base] Quantizing self_attention.query_key_value in layer 19/32...
2023-07-21 16:58:13 INFO [auto_gptq.quantization.gptq] duration: 0.7980837821960449
2023-07-21 16:58:13 INFO [auto_gptq.quantization.gptq] avg loss: 6.686659812927246
2023-07-21 16:58:13 INFO [auto_gptq.modeling._base] Quantizing self_attention.dense in layer 19/32...
2023-07-21 16:58:14 INFO [auto_gptq.quantization.gptq] duration: 0.7951889038085938
2023-07-21 16:58:14 INFO [auto_gptq.quantization.gptq] avg loss: 0.3053201138973236
2023-07-21 16:58:14 INFO [auto_gptq.modeling._base] Quantizing mlp.dense_h_to_4h in layer 19/32...
2023-07-21 16:58:15 INFO [auto_gptq.quantization.gptq] duration: 0.8315420150756836
2023-07-21 16:58:15 INFO [auto_gptq.quantization.gptq] avg loss: 31.97283935546875
2023-07-21 16:58:15 INFO [auto_gptq.modeling._base] Quantizing mlp.dense_4h_to_h in layer 19/32...
2023-07-21 16:58:19 INFO [auto_gptq.quantization.gptq] duration: 3.868382215499878
2023-07-21 16:58:19 INFO [auto_gptq.quantization.gptq] avg loss: 2.382962703704834
2023-07-21 16:58:19 INFO [auto_gptq.modeling._base] Start quantizing layer 20/32
2023-07-21 16:58:19 INFO [auto_gptq.modeling._base] Quantizing self_attention.query_key_value in layer 20/32...
2023-07-21 16:58:20 INFO [auto_gptq.quantization.gptq] duration: 0.797062873840332
2023-07-21 16:58:20 INFO [auto_gptq.quantization.gptq] avg loss: 6.721341133117676
2023-07-21 16:58:20 INFO [auto_gptq.modeling._base] Quantizing self_attention.dense in layer 20/32...
2023-07-21 16:58:20 INFO [auto_gptq.quantization.gptq] duration: 0.806023120880127
2023-07-21 16:58:20 INFO [auto_gptq.quantization.gptq] avg loss: 0.5635891556739807
2023-07-21 16:58:20 INFO [auto_gptq.modeling._base] Quantizing mlp.dense_h_to_4h in layer 20/32...
2023-07-21 16:58:21 INFO [auto_gptq.quantization.gptq] duration: 0.841651201248169
2023-07-21 16:58:21 INFO [auto_gptq.quantization.gptq] avg loss: 33.371273040771484
2023-07-21 16:58:21 INFO [auto_gptq.modeling._base] Quantizing mlp.dense_4h_to_h in layer 20/32...
2023-07-21 16:58:25 INFO [auto_gptq.quantization.gptq] duration: 3.8724091053009033
2023-07-21 16:58:25 INFO [auto_gptq.quantization.gptq] avg loss: 2.5540378093719482
2023-07-21 16:58:25 INFO [auto_gptq.modeling._base] Start quantizing layer 21/32
2023-07-21 16:58:25 INFO [auto_gptq.modeling._base] Quantizing self_attention.query_key_value in layer 21/32...
2023-07-21 16:58:26 INFO [auto_gptq.quantization.gptq] duration: 0.8135292530059814
2023-07-21 16:58:26 INFO [auto_gptq.quantization.gptq] avg loss: 7.383816242218018
2023-07-21 16:58:26 INFO [auto_gptq.modeling._base] Quantizing self_attention.dense in layer 21/32...
2023-07-21 16:58:27 INFO [auto_gptq.quantization.gptq] duration: 0.8004577159881592
2023-07-21 16:58:27 INFO [auto_gptq.quantization.gptq] avg loss: 0.2988166809082031
2023-07-21 16:58:27 INFO [auto_gptq.modeling._base] Quantizing mlp.dense_h_to_4h in layer 21/32...
2023-07-21 16:58:28 INFO [auto_gptq.quantization.gptq] duration: 0.8346357345581055
2023-07-21 16:58:28 INFO [auto_gptq.quantization.gptq] avg loss: 34.46820068359375
2023-07-21 16:58:28 INFO [auto_gptq.modeling._base] Quantizing mlp.dense_4h_to_h in layer 21/32...
2023-07-21 16:58:32 INFO [auto_gptq.quantization.gptq] duration: 3.8698837757110596
2023-07-21 16:58:32 INFO [auto_gptq.quantization.gptq] avg loss: 2.538421154022217
2023-07-21 16:58:32 INFO [auto_gptq.modeling._base] Start quantizing layer 22/32
2023-07-21 16:58:32 INFO [auto_gptq.modeling._base] Quantizing self_attention.query_key_value in layer 22/32...
2023-07-21 16:58:33 INFO [auto_gptq.quantization.gptq] duration: 0.7975707054138184
2023-07-21 16:58:33 INFO [auto_gptq.quantization.gptq] avg loss: 7.026803970336914
2023-07-21 16:58:33 INFO [auto_gptq.modeling._base] Quantizing self_attention.dense in layer 22/32...
2023-07-21 16:58:34 INFO [auto_gptq.quantization.gptq] duration: 0.7988865375518799
2023-07-21 16:58:34 INFO [auto_gptq.quantization.gptq] avg loss: 0.5440877079963684
2023-07-21 16:58:34 INFO [auto_gptq.modeling._base] Quantizing mlp.dense_h_to_4h in layer 22/32...
2023-07-21 16:58:35 INFO [auto_gptq.quantization.gptq] duration: 0.847116231918335
2023-07-21 16:58:35 INFO [auto_gptq.quantization.gptq] avg loss: 33.8814582824707
2023-07-21 16:58:35 INFO [auto_gptq.modeling._base] Quantizing mlp.dense_4h_to_h in layer 22/32...
2023-07-21 16:58:38 INFO [auto_gptq.quantization.gptq] duration: 3.851823091506958
2023-07-21 16:58:38 INFO [auto_gptq.quantization.gptq] avg loss: 2.612248182296753
2023-07-21 16:58:39 INFO [auto_gptq.modeling._base] Start quantizing layer 23/32
2023-07-21 16:58:39 INFO [auto_gptq.modeling._base] Quantizing self_attention.query_key_value in layer 23/32...
2023-07-21 16:58:39 INFO [auto_gptq.quantization.gptq] duration: 0.7956225872039795
2023-07-21 16:58:39 INFO [auto_gptq.quantization.gptq] avg loss: 7.3217453956604
2023-07-21 16:58:39 INFO [auto_gptq.modeling._base] Quantizing self_attention.dense in layer 23/32...
2023-07-21 16:58:40 INFO [auto_gptq.quantization.gptq] duration: 0.8155944347381592
2023-07-21 16:58:40 INFO [auto_gptq.quantization.gptq] avg loss: 0.3978100121021271
2023-07-21 16:58:40 INFO [auto_gptq.modeling._base] Quantizing mlp.dense_h_to_4h in layer 23/32...
2023-07-21 16:58:41 INFO [auto_gptq.quantization.gptq] duration: 0.8472270965576172
2023-07-21 16:58:41 INFO [auto_gptq.quantization.gptq] avg loss: 33.613494873046875
2023-07-21 16:58:41 INFO [auto_gptq.modeling._base] Quantizing mlp.dense_4h_to_h in layer 23/32...
2023-07-21 16:58:45 INFO [auto_gptq.quantization.gptq] duration: 3.877121925354004
2023-07-21 16:58:45 INFO [auto_gptq.quantization.gptq] avg loss: 3.0234107971191406
2023-07-21 16:58:45 INFO [auto_gptq.modeling._base] Start quantizing layer 24/32
2023-07-21 16:58:45 INFO [auto_gptq.modeling._base] Quantizing self_attention.query_key_value in layer 24/32...
2023-07-21 16:58:46 INFO [auto_gptq.quantization.gptq] duration: 0.8478920459747314
2023-07-21 16:58:46 INFO [auto_gptq.quantization.gptq] avg loss: 7.490325927734375
2023-07-21 16:58:46 INFO [auto_gptq.modeling._base] Quantizing self_attention.dense in layer 24/32...
2023-07-21 16:58:47 INFO [auto_gptq.quantization.gptq] duration: 0.8023700714111328
2023-07-21 16:58:47 INFO [auto_gptq.quantization.gptq] avg loss: 0.6462091207504272
2023-07-21 16:58:47 INFO [auto_gptq.modeling._base] Quantizing mlp.dense_h_to_4h in layer 24/32...
2023-07-21 16:58:48 INFO [auto_gptq.quantization.gptq] duration: 0.8271210193634033
2023-07-21 16:58:48 INFO [auto_gptq.quantization.gptq] avg loss: 35.156715393066406
2023-07-21 16:58:48 INFO [auto_gptq.modeling._base] Quantizing mlp.dense_4h_to_h in layer 24/32...
2023-07-21 16:58:52 INFO [auto_gptq.quantization.gptq] duration: 3.8558664321899414
2023-07-21 16:58:52 INFO [auto_gptq.quantization.gptq] avg loss: 3.4150047302246094
2023-07-21 16:58:52 INFO [auto_gptq.modeling._base] Start quantizing layer 25/32
2023-07-21 16:58:52 INFO [auto_gptq.modeling._base] Quantizing self_attention.query_key_value in layer 25/32...
2023-07-21 16:58:53 INFO [auto_gptq.quantization.gptq] duration: 0.804887056350708
2023-07-21 16:58:53 INFO [auto_gptq.quantization.gptq] avg loss: 7.842990875244141
2023-07-21 16:58:53 INFO [auto_gptq.modeling._base] Quantizing self_attention.dense in layer 25/32...
2023-07-21 16:58:53 INFO [auto_gptq.quantization.gptq] duration: 0.7986440658569336
2023-07-21 16:58:53 INFO [auto_gptq.quantization.gptq] avg loss: 0.5917433500289917
2023-07-21 16:58:53 INFO [auto_gptq.modeling._base] Quantizing mlp.dense_h_to_4h in layer 25/32...
2023-07-21 16:58:54 INFO [auto_gptq.quantization.gptq] duration: 0.8256046772003174
2023-07-21 16:58:54 INFO [auto_gptq.quantization.gptq] avg loss: 36.299095153808594
2023-07-21 16:58:54 INFO [auto_gptq.modeling._base] Quantizing mlp.dense_4h_to_h in layer 25/32...
2023-07-21 16:58:58 INFO [auto_gptq.quantization.gptq] duration: 3.86680006980896
2023-07-21 16:58:58 INFO [auto_gptq.quantization.gptq] avg loss: 4.292586326599121
2023-07-21 16:58:58 INFO [auto_gptq.modeling._base] Start quantizing layer 26/32
2023-07-21 16:58:58 INFO [auto_gptq.modeling._base] Quantizing self_attention.query_key_value in layer 26/32...
2023-07-21 16:58:59 INFO [auto_gptq.quantization.gptq] duration: 0.7961215972900391
2023-07-21 16:58:59 INFO [auto_gptq.quantization.gptq] avg loss: 8.335006713867188
2023-07-21 16:58:59 INFO [auto_gptq.modeling._base] Quantizing self_attention.dense in layer 26/32...
2023-07-21 16:59:00 INFO [auto_gptq.quantization.gptq] duration: 0.7967922687530518
2023-07-21 16:59:00 INFO [auto_gptq.quantization.gptq] avg loss: 0.5929185152053833
2023-07-21 16:59:00 INFO [auto_gptq.modeling._base] Quantizing mlp.dense_h_to_4h in layer 26/32...
2023-07-21 16:59:01 INFO [auto_gptq.quantization.gptq] duration: 0.8355779647827148
2023-07-21 16:59:01 INFO [auto_gptq.quantization.gptq] avg loss: 39.31059265136719
2023-07-21 16:59:01 INFO [auto_gptq.modeling._base] Quantizing mlp.dense_4h_to_h in layer 26/32...
2023-07-21 16:59:05 INFO [auto_gptq.quantization.gptq] duration: 3.859668731689453
2023-07-21 16:59:05 INFO [auto_gptq.quantization.gptq] avg loss: 5.2629475593566895
2023-07-21 16:59:05 INFO [auto_gptq.modeling._base] Start quantizing layer 27/32
2023-07-21 16:59:05 INFO [auto_gptq.modeling._base] Quantizing self_attention.query_key_value in layer 27/32...
2023-07-21 16:59:06 INFO [auto_gptq.quantization.gptq] duration: 0.7974636554718018
2023-07-21 16:59:06 INFO [auto_gptq.quantization.gptq] avg loss: 8.194433212280273
2023-07-21 16:59:06 INFO [auto_gptq.modeling._base] Quantizing self_attention.dense in layer 27/32...
2023-07-21 16:59:07 INFO [auto_gptq.quantization.gptq] duration: 0.8030986785888672
2023-07-21 16:59:07 INFO [auto_gptq.quantization.gptq] avg loss: 0.7090796828269958
2023-07-21 16:59:07 INFO [auto_gptq.modeling._base] Quantizing mlp.dense_h_to_4h in layer 27/32...
2023-07-21 16:59:07 INFO [auto_gptq.quantization.gptq] duration: 0.8322622776031494
2023-07-21 16:59:07 INFO [auto_gptq.quantization.gptq] avg loss: 39.4634895324707
2023-07-21 16:59:07 INFO [auto_gptq.modeling._base] Quantizing mlp.dense_4h_to_h in layer 27/32...
2023-07-21 16:59:11 INFO [auto_gptq.quantization.gptq] duration: 3.878126859664917
2023-07-21 16:59:11 INFO [auto_gptq.quantization.gptq] avg loss: 6.581557750701904
2023-07-21 16:59:11 INFO [auto_gptq.modeling._base] Start quantizing layer 28/32
2023-07-21 16:59:12 INFO [auto_gptq.modeling._base] Quantizing self_attention.query_key_value in layer 28/32...
2023-07-21 16:59:12 INFO [auto_gptq.quantization.gptq] duration: 0.7974464893341064
2023-07-21 16:59:12 INFO [auto_gptq.quantization.gptq] avg loss: 9.201988220214844
2023-07-21 16:59:12 INFO [auto_gptq.modeling._base] Quantizing self_attention.dense in layer 28/32...
2023-07-21 16:59:13 INFO [auto_gptq.quantization.gptq] duration: 0.8018836975097656
2023-07-21 16:59:13 INFO [auto_gptq.quantization.gptq] avg loss: 1.193915605545044
2023-07-21 16:59:13 INFO [auto_gptq.modeling._base] Quantizing mlp.dense_h_to_4h in layer 28/32...
2023-07-21 16:59:14 INFO [auto_gptq.quantization.gptq] duration: 0.832056999206543
2023-07-21 16:59:14 INFO [auto_gptq.quantization.gptq] avg loss: 39.874481201171875
2023-07-21 16:59:14 INFO [auto_gptq.modeling._base] Quantizing mlp.dense_4h_to_h in layer 28/32...
2023-07-21 16:59:18 INFO [auto_gptq.quantization.gptq] duration: 3.8739585876464844
2023-07-21 16:59:18 INFO [auto_gptq.quantization.gptq] avg loss: 7.8150634765625
2023-07-21 16:59:18 INFO [auto_gptq.modeling._base] Start quantizing layer 29/32
2023-07-21 16:59:18 INFO [auto_gptq.modeling._base] Quantizing self_attention.query_key_value in layer 29/32...
2023-07-21 16:59:19 INFO [auto_gptq.quantization.gptq] duration: 0.7971282005310059
2023-07-21 16:59:19 INFO [auto_gptq.quantization.gptq] avg loss: 8.788995742797852
2023-07-21 16:59:19 INFO [auto_gptq.modeling._base] Quantizing self_attention.dense in layer 29/32...
2023-07-21 16:59:20 INFO [auto_gptq.quantization.gptq] duration: 0.8014233112335205
2023-07-21 16:59:20 INFO [auto_gptq.quantization.gptq] avg loss: 0.9004578590393066
2023-07-21 16:59:20 INFO [auto_gptq.modeling._base] Quantizing mlp.dense_h_to_4h in layer 29/32...
2023-07-21 16:59:21 INFO [auto_gptq.quantization.gptq] duration: 0.8585555553436279
2023-07-21 16:59:21 INFO [auto_gptq.quantization.gptq] avg loss: 40.52891159057617
2023-07-21 16:59:21 INFO [auto_gptq.modeling._base] Quantizing mlp.dense_4h_to_h in layer 29/32...
2023-07-21 16:59:24 INFO [auto_gptq.quantization.gptq] duration: 3.886247396469116
2023-07-21 16:59:24 INFO [auto_gptq.quantization.gptq] avg loss: 7.627683639526367
2023-07-21 16:59:25 INFO [auto_gptq.modeling._base] Start quantizing layer 30/32
2023-07-21 16:59:25 INFO [auto_gptq.modeling._base] Quantizing self_attention.query_key_value in layer 30/32...
2023-07-21 16:59:26 INFO [auto_gptq.quantization.gptq] duration: 0.8017170429229736
2023-07-21 16:59:26 INFO [auto_gptq.quantization.gptq] avg loss: 7.885834217071533
2023-07-21 16:59:26 INFO [auto_gptq.modeling._base] Quantizing self_attention.dense in layer 30/32...
2023-07-21 16:59:26 INFO [auto_gptq.quantization.gptq] duration: 0.8006551265716553
2023-07-21 16:59:26 INFO [auto_gptq.quantization.gptq] avg loss: 1.0838208198547363
2023-07-21 16:59:26 INFO [auto_gptq.modeling._base] Quantizing mlp.dense_h_to_4h in layer 30/32...
2023-07-21 16:59:27 INFO [auto_gptq.quantization.gptq] duration: 0.8757197856903076
2023-07-21 16:59:27 INFO [auto_gptq.quantization.gptq] avg loss: 38.54998779296875
2023-07-21 16:59:27 INFO [auto_gptq.modeling._base] Quantizing mlp.dense_4h_to_h in layer 30/32...
2023-07-21 16:59:31 INFO [auto_gptq.quantization.gptq] duration: 3.8700709342956543
2023-07-21 16:59:31 INFO [auto_gptq.quantization.gptq] avg loss: 10.26675796508789
2023-07-21 16:59:31 INFO [auto_gptq.modeling._base] Start quantizing layer 31/32
2023-07-21 16:59:31 INFO [auto_gptq.modeling._base] Quantizing self_attention.query_key_value in layer 31/32...
2023-07-21 16:59:32 INFO [auto_gptq.quantization.gptq] duration: 0.7995920181274414
2023-07-21 16:59:32 INFO [auto_gptq.quantization.gptq] avg loss: 7.922703266143799
2023-07-21 16:59:32 INFO [auto_gptq.modeling._base] Quantizing self_attention.dense in layer 31/32...
2023-07-21 16:59:33 INFO [auto_gptq.quantization.gptq] duration: 0.7997887134552002
2023-07-21 16:59:33 INFO [auto_gptq.quantization.gptq] avg loss: 0.6395642757415771
2023-07-21 16:59:33 INFO [auto_gptq.modeling._base] Quantizing mlp.dense_h_to_4h in layer 31/32...
2023-07-21 16:59:34 INFO [auto_gptq.quantization.gptq] duration: 0.8389708995819092
2023-07-21 16:59:34 INFO [auto_gptq.quantization.gptq] avg loss: 38.0499153137207
2023-07-21 16:59:34 INFO [auto_gptq.modeling._base] Quantizing mlp.dense_4h_to_h in layer 31/32...
2023-07-21 16:59:38 INFO [auto_gptq.quantization.gptq] duration: 3.8527672290802
2023-07-21 16:59:38 INFO [auto_gptq.quantization.gptq] avg loss: 14.685250282287598
2023-07-21 16:59:38 INFO [auto_gptq.modeling._base] Start quantizing layer 32/32
2023-07-21 16:59:38 INFO [auto_gptq.modeling._base] Quantizing self_attention.query_key_value in layer 32/32...
2023-07-21 16:59:39 INFO [auto_gptq.quantization.gptq] duration: 0.7899763584136963
2023-07-21 16:59:39 INFO [auto_gptq.quantization.gptq] avg loss: 6.566901206970215
2023-07-21 17:00:08 INFO [auto_gptq.modeling._base] Quantizing self_attention.dense in layer 32/32...
2023-07-21 17:00:09 INFO [auto_gptq.quantization.gptq] duration: 0.890770673751831
2023-07-21 17:00:09 INFO [auto_gptq.quantization.gptq] avg loss: 0.2703491747379303
2023-07-21 17:00:09 INFO [auto_gptq.modeling._base] Quantizing mlp.dense_h_to_4h in layer 32/32...
2023-07-21 17:00:10 INFO [auto_gptq.quantization.gptq] duration: 0.8699018955230713
2023-07-21 17:00:10 INFO [auto_gptq.quantization.gptq] avg loss: 33.582237243652344
2023-07-21 17:00:10 INFO [auto_gptq.modeling._base] Quantizing mlp.dense_4h_to_h in layer 32/32...
2023-07-21 17:00:14 INFO [auto_gptq.quantization.gptq] duration: 3.8666820526123047
2023-07-21 17:00:14 INFO [auto_gptq.quantization.gptq] avg loss: 26.30276107788086
2023-07-21 17:00:14 INFO [auto_gptq.modeling._utils] Packing model...
2023-07-21 17:00:14 INFO [auto_gptq.modeling._utils] transformer.h.0.self_attention.dense
2023-07-21 17:00:15 INFO [auto_gptq.modeling._utils] transformer.h.0.self_attention.query_key_value
2023-07-21 17:00:15 INFO [auto_gptq.modeling._utils] transformer.h.0.mlp.dense_4h_to_h
2023-07-21 17:00:18 INFO [auto_gptq.modeling._utils] transformer.h.0.mlp.dense_h_to_4h
2023-07-21 17:00:19 INFO [auto_gptq.modeling._utils] transformer.h.1.self_attention.dense
2023-07-21 17:00:19 INFO [auto_gptq.modeling._utils] transformer.h.1.self_attention.query_key_value
2023-07-21 17:00:20 INFO [auto_gptq.modeling._utils] transformer.h.1.mlp.dense_4h_to_h
2023-07-21 17:00:22 INFO [auto_gptq.modeling._utils] transformer.h.1.mlp.dense_h_to_4h
2023-07-21 17:00:23 INFO [auto_gptq.modeling._utils] transformer.h.2.self_attention.dense
2023-07-21 17:00:23 INFO [auto_gptq.modeling._utils] transformer.h.2.self_attention.query_key_value
2023-07-21 17:00:24 INFO [auto_gptq.modeling._utils] transformer.h.2.mlp.dense_4h_to_h
2023-07-21 17:00:26 INFO [auto_gptq.modeling._utils] transformer.h.2.mlp.dense_h_to_4h
2023-07-21 17:00:27 INFO [auto_gptq.modeling._utils] transformer.h.3.self_attention.dense
2023-07-21 17:00:28 INFO [auto_gptq.modeling._utils] transformer.h.3.self_attention.query_key_value
2023-07-21 17:00:28 INFO [auto_gptq.modeling._utils] transformer.h.3.mlp.dense_4h_to_h
2023-07-21 17:00:30 INFO [auto_gptq.modeling._utils] transformer.h.3.mlp.dense_h_to_4h
2023-07-21 17:00:31 INFO [auto_gptq.modeling._utils] transformer.h.4.self_attention.dense
2023-07-21 17:00:32 INFO [auto_gptq.modeling._utils] transformer.h.4.self_attention.query_key_value
2023-07-21 17:00:32 INFO [auto_gptq.modeling._utils] transformer.h.4.mlp.dense_4h_to_h
2023-07-21 17:00:34 INFO [auto_gptq.modeling._utils] transformer.h.4.mlp.dense_h_to_4h
2023-07-21 17:00:35 INFO [auto_gptq.modeling._utils] transformer.h.5.self_attention.dense
2023-07-21 17:00:35 INFO [auto_gptq.modeling._utils] transformer.h.5.self_attention.query_key_value
2023-07-21 17:00:36 INFO [auto_gptq.modeling._utils] transformer.h.5.mlp.dense_4h_to_h
2023-07-21 17:00:38 INFO [auto_gptq.modeling._utils] transformer.h.5.mlp.dense_h_to_4h
2023-07-21 17:00:39 INFO [auto_gptq.modeling._utils] transformer.h.6.self_attention.dense
2023-07-21 17:00:39 INFO [auto_gptq.modeling._utils] transformer.h.6.self_attention.query_key_value
2023-07-21 17:00:40 INFO [auto_gptq.modeling._utils] transformer.h.6.mlp.dense_4h_to_h
2023-07-21 17:00:41 INFO [auto_gptq.modeling._utils] transformer.h.6.mlp.dense_h_to_4h
2023-07-21 17:00:42 INFO [auto_gptq.modeling._utils] transformer.h.7.self_attention.dense
2023-07-21 17:00:43 INFO [auto_gptq.modeling._utils] transformer.h.7.self_attention.query_key_value
2023-07-21 17:00:43 INFO [auto_gptq.modeling._utils] transformer.h.7.mlp.dense_4h_to_h
2023-07-21 17:00:45 INFO [auto_gptq.modeling._utils] transformer.h.7.mlp.dense_h_to_4h
2023-07-21 17:00:46 INFO [auto_gptq.modeling._utils] transformer.h.8.self_attention.dense
2023-07-21 17:00:47 INFO [auto_gptq.modeling._utils] transformer.h.8.self_attention.query_key_value
2023-07-21 17:00:47 INFO [auto_gptq.modeling._utils] transformer.h.8.mlp.dense_4h_to_h
2023-07-21 17:00:49 INFO [auto_gptq.modeling._utils] transformer.h.8.mlp.dense_h_to_4h
2023-07-21 17:00:50 INFO [auto_gptq.modeling._utils] transformer.h.9.self_attention.dense
2023-07-21 17:00:50 INFO [auto_gptq.modeling._utils] transformer.h.9.self_attention.query_key_value
2023-07-21 17:00:51 INFO [auto_gptq.modeling._utils] transformer.h.9.mlp.dense_4h_to_h
2023-07-21 17:00:53 INFO [auto_gptq.modeling._utils] transformer.h.9.mlp.dense_h_to_4h
2023-07-21 17:00:54 INFO [auto_gptq.modeling._utils] transformer.h.10.self_attention.dense
2023-07-21 17:00:54 INFO [auto_gptq.modeling._utils] transformer.h.10.self_attention.query_key_value
2023-07-21 17:00:55 INFO [auto_gptq.modeling._utils] transformer.h.10.mlp.dense_4h_to_h
2023-07-21 17:00:56 INFO [auto_gptq.modeling._utils] transformer.h.10.mlp.dense_h_to_4h
2023-07-21 17:00:57 INFO [auto_gptq.modeling._utils] transformer.h.11.self_attention.dense
2023-07-21 17:00:58 INFO [auto_gptq.modeling._utils] transformer.h.11.self_attention.query_key_value
2023-07-21 17:00:58 INFO [auto_gptq.modeling._utils] transformer.h.11.mlp.dense_4h_to_h
2023-07-21 17:01:00 INFO [auto_gptq.modeling._utils] transformer.h.11.mlp.dense_h_to_4h
2023-07-21 17:01:01 INFO [auto_gptq.modeling._utils] transformer.h.12.self_attention.dense
2023-07-21 17:01:02 INFO [auto_gptq.modeling._utils] transformer.h.12.self_attention.query_key_value
2023-07-21 17:01:02 INFO [auto_gptq.modeling._utils] transformer.h.12.mlp.dense_4h_to_h
2023-07-21 17:01:04 INFO [auto_gptq.modeling._utils] transformer.h.12.mlp.dense_h_to_4h
2023-07-21 17:01:05 INFO [auto_gptq.modeling._utils] transformer.h.13.self_attention.dense
2023-07-21 17:01:06 INFO [auto_gptq.modeling._utils] transformer.h.13.self_attention.query_key_value
2023-07-21 17:01:06 INFO [auto_gptq.modeling._utils] transformer.h.13.mlp.dense_4h_to_h
2023-07-21 17:01:08 INFO [auto_gptq.modeling._utils] transformer.h.13.mlp.dense_h_to_4h
2023-07-21 17:01:09 INFO [auto_gptq.modeling._utils] transformer.h.14.self_attention.dense
2023-07-21 17:01:10 INFO [auto_gptq.modeling._utils] transformer.h.14.self_attention.query_key_value
2023-07-21 17:01:10 INFO [auto_gptq.modeling._utils] transformer.h.14.mlp.dense_4h_to_h
2023-07-21 17:01:12 INFO [auto_gptq.modeling._utils] transformer.h.14.mlp.dense_h_to_4h
2023-07-21 17:01:13 INFO [auto_gptq.modeling._utils] transformer.h.15.self_attention.dense
2023-07-21 17:01:13 INFO [auto_gptq.modeling._utils] transformer.h.15.self_attention.query_key_value
2023-07-21 17:01:14 INFO [auto_gptq.modeling._utils] transformer.h.15.mlp.dense_4h_to_h
2023-07-21 17:01:16 INFO [auto_gptq.modeling._utils] transformer.h.15.mlp.dense_h_to_4h
2023-07-21 17:01:17 INFO [auto_gptq.modeling._utils] transformer.h.16.self_attention.dense
2023-07-21 17:01:17 INFO [auto_gptq.modeling._utils] transformer.h.16.self_attention.query_key_value
2023-07-21 17:01:18 INFO [auto_gptq.modeling._utils] transformer.h.16.mlp.dense_4h_to_h
2023-07-21 17:01:19 INFO [auto_gptq.modeling._utils] transformer.h.16.mlp.dense_h_to_4h
2023-07-21 17:01:21 INFO [auto_gptq.modeling._utils] transformer.h.17.self_attention.dense
2023-07-21 17:01:21 INFO [auto_gptq.modeling._utils] transformer.h.17.self_attention.query_key_value
2023-07-21 17:01:21 INFO [auto_gptq.modeling._utils] transformer.h.17.mlp.dense_4h_to_h
2023-07-21 17:01:23 INFO [auto_gptq.modeling._utils] transformer.h.17.mlp.dense_h_to_4h
2023-07-21 17:01:24 INFO [auto_gptq.modeling._utils] transformer.h.18.self_attention.dense
2023-07-21 17:01:25 INFO [auto_gptq.modeling._utils] transformer.h.18.self_attention.query_key_value
2023-07-21 17:01:25 INFO [auto_gptq.modeling._utils] transformer.h.18.mlp.dense_4h_to_h
2023-07-21 17:01:27 INFO [auto_gptq.modeling._utils] transformer.h.18.mlp.dense_h_to_4h
2023-07-21 17:01:28 INFO [auto_gptq.modeling._utils] transformer.h.19.self_attention.dense
2023-07-21 17:01:29 INFO [auto_gptq.modeling._utils] transformer.h.19.self_attention.query_key_value
2023-07-21 17:01:29 INFO [auto_gptq.modeling._utils] transformer.h.19.mlp.dense_4h_to_h
2023-07-21 17:01:31 INFO [auto_gptq.modeling._utils] transformer.h.19.mlp.dense_h_to_4h
2023-07-21 17:01:32 INFO [auto_gptq.modeling._utils] transformer.h.20.self_attention.dense
2023-07-21 17:01:33 INFO [auto_gptq.modeling._utils] transformer.h.20.self_attention.query_key_value
2023-07-21 17:01:33 INFO [auto_gptq.modeling._utils] transformer.h.20.mlp.dense_4h_to_h
2023-07-21 17:01:35 INFO [auto_gptq.modeling._utils] transformer.h.20.mlp.dense_h_to_4h
2023-07-21 17:01:36 INFO [auto_gptq.modeling._utils] transformer.h.21.self_attention.dense
2023-07-21 17:01:37 INFO [auto_gptq.modeling._utils] transformer.h.21.self_attention.query_key_value
2023-07-21 17:01:37 INFO [auto_gptq.modeling._utils] transformer.h.21.mlp.dense_4h_to_h
2023-07-21 17:01:39 INFO [auto_gptq.modeling._utils] transformer.h.21.mlp.dense_h_to_4h
2023-07-21 17:01:40 INFO [auto_gptq.modeling._utils] transformer.h.22.self_attention.dense
2023-07-21 17:01:40 INFO [auto_gptq.modeling._utils] transformer.h.22.self_attention.query_key_value
2023-07-21 17:01:41 INFO [auto_gptq.modeling._utils] transformer.h.22.mlp.dense_4h_to_h
2023-07-21 17:01:43 INFO [auto_gptq.modeling._utils] transformer.h.22.mlp.dense_h_to_4h
2023-07-21 17:01:44 INFO [auto_gptq.modeling._utils] transformer.h.23.self_attention.dense
2023-07-21 17:01:44 INFO [auto_gptq.modeling._utils] transformer.h.23.self_attention.query_key_value
2023-07-21 17:01:45 INFO [auto_gptq.modeling._utils] transformer.h.23.mlp.dense_4h_to_h
2023-07-21 17:01:46 INFO [auto_gptq.modeling._utils] transformer.h.23.mlp.dense_h_to_4h
2023-07-21 17:01:48 INFO [auto_gptq.modeling._utils] transformer.h.24.self_attention.dense
2023-07-21 17:01:48 INFO [auto_gptq.modeling._utils] transformer.h.24.self_attention.query_key_value
2023-07-21 17:01:49 INFO [auto_gptq.modeling._utils] transformer.h.24.mlp.dense_4h_to_h
2023-07-21 17:01:51 INFO [auto_gptq.modeling._utils] transformer.h.24.mlp.dense_h_to_4h
2023-07-21 17:01:52 INFO [auto_gptq.modeling._utils] transformer.h.25.self_attention.dense
2023-07-21 17:01:52 INFO [auto_gptq.modeling._utils] transformer.h.25.self_attention.query_key_value
2023-07-21 17:01:53 INFO [auto_gptq.modeling._utils] transformer.h.25.mlp.dense_4h_to_h
2023-07-21 17:01:54 INFO [auto_gptq.modeling._utils] transformer.h.25.mlp.dense_h_to_4h
2023-07-21 17:01:55 INFO [auto_gptq.modeling._utils] transformer.h.26.self_attention.dense
2023-07-21 17:01:56 INFO [auto_gptq.modeling._utils] transformer.h.26.self_attention.query_key_value
2023-07-21 17:01:56 INFO [auto_gptq.modeling._utils] transformer.h.26.mlp.dense_4h_to_h
2023-07-21 17:01:58 INFO [auto_gptq.modeling._utils] transformer.h.26.mlp.dense_h_to_4h
2023-07-21 17:02:00 INFO [auto_gptq.modeling._utils] transformer.h.27.self_attention.dense
2023-07-21 17:02:00 INFO [auto_gptq.modeling._utils] transformer.h.27.self_attention.query_key_value
2023-07-21 17:02:00 INFO [auto_gptq.modeling._utils] transformer.h.27.mlp.dense_4h_to_h
2023-07-21 17:02:02 INFO [auto_gptq.modeling._utils] transformer.h.27.mlp.dense_h_to_4h
2023-07-21 17:02:03 INFO [auto_gptq.modeling._utils] transformer.h.28.self_attention.dense
2023-07-21 17:02:04 INFO [auto_gptq.modeling._utils] transformer.h.28.self_attention.query_key_value
2023-07-21 17:02:04 INFO [auto_gptq.modeling._utils] transformer.h.28.mlp.dense_4h_to_h
2023-07-21 17:02:06 INFO [auto_gptq.modeling._utils] transformer.h.28.mlp.dense_h_to_4h
2023-07-21 17:02:07 INFO [auto_gptq.modeling._utils] transformer.h.29.self_attention.dense
2023-07-21 17:02:08 INFO [auto_gptq.modeling._utils] transformer.h.29.self_attention.query_key_value
2023-07-21 17:02:08 INFO [auto_gptq.modeling._utils] transformer.h.29.mlp.dense_4h_to_h
2023-07-21 17:02:10 INFO [auto_gptq.modeling._utils] transformer.h.29.mlp.dense_h_to_4h
2023-07-21 17:02:11 INFO [auto_gptq.modeling._utils] transformer.h.30.self_attention.dense
2023-07-21 17:02:12 INFO [auto_gptq.modeling._utils] transformer.h.30.self_attention.query_key_value
2023-07-21 17:02:12 INFO [auto_gptq.modeling._utils] transformer.h.30.mlp.dense_4h_to_h
2023-07-21 17:02:14 INFO [auto_gptq.modeling._utils] transformer.h.30.mlp.dense_h_to_4h
2023-07-21 17:02:15 INFO [auto_gptq.modeling._utils] transformer.h.31.self_attention.dense
2023-07-21 17:02:16 INFO [auto_gptq.modeling._utils] transformer.h.31.self_attention.query_key_value
2023-07-21 17:02:16 INFO [auto_gptq.modeling._utils] transformer.h.31.mlp.dense_4h_to_h
2023-07-21 17:02:18 INFO [auto_gptq.modeling._utils] transformer.h.31.mlp.dense_h_to_4h
2023-07-21 17:02:19 INFO [auto_gptq.modeling._utils] Model packed.
```