Wild speed on my RTX 3070

#1
by ABX-AI - opened

I hadn't heard the coil whine coming from my speakers go so fast before on a 7B ^^

CtxLimit: 158/8192, Process:0.01s (13.0ms/T = 76.92T/s), Generate:0.70s (4.4ms/T = 227.92T/s), Total:0.71s (223.78T/s)
CtxLimit: 162/8192, Process:0.01s (15.0ms/T = 66.67T/s), Generate:0.76s (4.8ms/T = 209.70T/s), Total:0.78s (205.66T/s)
CtxLimit: 162/8192, Process:0.01s (13.0ms/T = 76.92T/s), Generate:0.00s (0.0ms/T = 160000.00T/s), Total:0.01s (11428.57T/s)
CtxLimit: 164/8192, Process:0.01s (14.0ms/T = 71.43T/s), Generate:0.79s (4.9ms/T = 203.05T/s), Total:0.80s (199.50T/s)
CtxLimit: 164/8192, Process:0.01s (13.0ms/T = 76.92T/s), Generate:0.00s (0.0ms/T = 160000.00T/s), Total:0.01s (11428.57T/s)
CtxLimit: 158/8192, Process:0.01s (13.0ms/T = 76.92T/s), Generate:0.70s (4.4ms/T = 227.92T/s), Total:0.71s (223.78T/s)
CtxLimit: 158/8192, Process:0.01s (12.0ms/T = 83.33T/s), Generate:0.00s (0.0ms/T = 160000.00T/s), Total:0.01s (12307.69T/s)
CtxLimit: 241/8192, Process:0.13s (4.5ms/T = 222.22T/s), Generate:0.82s (5.1ms/T = 196.08T/s), Total:0.94s (169.85T/s)
CtxLimit: 256/8192, Process:0.01s (13.0ms/T = 76.92T/s), Generate:0.21s (1.3ms/T = 751.17T/s), Total:0.23s (707.96T/s)
CtxLimit: 256/8192, Process:0.01s (13.0ms/T = 76.92T/s), Generate:0.00s (0.0ms/T = 160000.00T/s), Total:0.01s (11428.57T/s)
CtxLimit: 276/8192, Process:0.13s (4.8ms/T = 206.35T/s), Generate:2.50s (15.6ms/T = 64.08T/s), Total:2.62s (61.00T/s)
CtxLimit: 436/8192, Process:0.02s (20.0ms/T = 50.00T/s), Generate:2.39s (14.9ms/T = 67.03T/s), Total:2.41s (66.47T/s)
CtxLimit: 515/8192, Process:0.01s (13.0ms/T = 76.92T/s), Generate:1.20s (7.5ms/T = 133.67T/s), Total:1.21s (132.23T/s)
CtxLimit: 515/8192, Process:0.01s (12.0ms/T = 83.33T/s), Generate:0.00s (0.0ms/T = 160000.00T/s), Total:0.01s (12307.69T/s)
CtxLimit: 276/8192, Process:0.01s (12.0ms/T = 83.33T/s), Generate:2.38s (14.9ms/T = 67.17T/s), Total:2.39s (66.83T/s)
CtxLimit: 436/8192, Process:0.01s (14.0ms/T = 71.43T/s), Generate:2.45s (15.3ms/T = 65.25T/s), Total:2.47s (64.88T/s)
CtxLimit: 557/8192, Process:0.01s (13.0ms/T = 76.92T/s), Generate:1.87s (11.7ms/T = 85.52T/s), Total:1.88s (84.93T/s)
CtxLimit: 919/8192, Process:0.01s (13.0ms/T = 76.92T/s), Generate:12.64s (15.7ms/T = 63.54T/s), Total:12.65s (63.47T/s)
CtxLimit: 1135/8192, Process:0.02s (22.0ms/T = 45.45T/s), Generate:3.44s (4.3ms/T = 233.57T/s), Total:3.46s (232.08T/s)
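For anyone who wants to compare numbers like these, here is a rough Python sketch that pulls the generation speed out of each log line. The regex is only inferred from the lines pasted above, so it is an assumption and may not match other versions of the tool's output. The helper names (`parse_line`, `generation_speeds`) are made up for illustration. It skips lines that report `Generate:0.00s`, since the 160000.00T/s figure on those comes from a near-zero duration and does not look like a real throughput measurement.

```python
import re
from statistics import median

# Pattern inferred from the log lines in this thread (an assumption, not an official format).
LINE_RE = re.compile(
    r"CtxLimit:\s*(?P<ctx>\d+)/(?P<max_ctx>\d+),\s*"
    r"Process:\s*(?P<proc_s>[\d.]+)s\s*\((?P<proc_ms>[\d.]+)ms/T\s*=\s*(?P<proc_tps>[\d.]+)T/s\),\s*"
    r"Generate:\s*(?P<gen_s>[\d.]+)s\s*\((?P<gen_ms>[\d.]+)ms/T\s*=\s*(?P<gen_tps>[\d.]+)T/s\),\s*"
    r"Total:\s*(?P<total_s>[\d.]+)s\s*\((?P<total_tps>[\d.]+)T/s\)"
)


def parse_line(line):
    """Return the numeric fields of one log line as a dict, or None if it does not match."""
    match = LINE_RE.search(line)
    if match is None:
        return None
    return {
        key: int(value) if key in ("ctx", "max_ctx") else float(value)
        for key, value in match.groupdict().items()
    }


def generation_speeds(lines):
    """Collect generation T/s values, skipping lines with a reported 0.00s generation time."""
    speeds = []
    for line in lines:
        record = parse_line(line)
        if record is None or record["gen_s"] == 0.0:
            continue
        speeds.append(record["gen_tps"])
    return speeds


# Three lines copied from the log above.
sample = [
    "CtxLimit: 158/8192, Process:0.01s (13.0ms/T = 76.92T/s), Generate:0.70s (4.4ms/T = 227.92T/s), Total:0.71s (223.78T/s)",
    "CtxLimit: 162/8192, Process:0.01s (13.0ms/T = 76.92T/s), Generate:0.00s (0.0ms/T = 160000.00T/s), Total:0.01s (11428.57T/s)",
    "CtxLimit: 276/8192, Process:0.13s (4.8ms/T = 206.35T/s), Generate:2.50s (15.6ms/T = 64.08T/s), Total:2.62s (61.00T/s)",
]

speeds = generation_speeds(sample)
print("generation T/s:", speeds)
print("median T/s:", median(speeds))
```

The median is used instead of the mean so that a single unusually fast or slow run does not dominate the summary.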

Great to hear! I'll try to upload the IQ3_M today (and hopefully as many other quants as I can), since I recall seeing that it's a better option than Q3_K_M. I'd upload all of them if it weren't for my bandwidth.

I have Intel integrated graphics 4000...
