INT4 model performs worse than FP32 on Intel CPU. Why?

#13
by Sakura10151 - opened

The int4 model's log shows that SelfAttention takes hundreds of times longer than it does in the fp32 model.
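For anyone trying to reproduce this kind of comparison, here is a minimal sketch of how per-block latency could be measured and turned into a slowdown ratio. It is not the logging the model itself uses. The names `mean_latency_ms`, `slowdown`, `fp32_attention_stub`, and `int4_attention_stub` are hypothetical, and the two stubs are plain Python placeholders, not the real SelfAttention code paths. In practice you would replace them with calls into the fp32 and int4 models.

```python
import time


def mean_latency_ms(fn, warmup=2, iters=10):
    """Return the mean wall-clock latency of fn() in milliseconds."""
    # Warm-up runs are excluded so one-time setup costs do not skew the mean.
    for _ in range(warmup):
        fn()
    start = time.perf_counter()
    for _ in range(iters):
        fn()
    elapsed_s = time.perf_counter() - start
    return elapsed_s * 1000.0 / iters


def slowdown(candidate_ms, baseline_ms):
    """Return how many times slower the candidate is than the baseline."""
    return candidate_ms / baseline_ms


# Hypothetical placeholders: swap in the real fp32 / int4 attention calls.
def fp32_attention_stub():
    sum(i * i for i in range(1_000))


def int4_attention_stub():
    sum(i * i for i in range(100_000))


if __name__ == "__main__":
    fp32_ms = mean_latency_ms(fp32_attention_stub)
    int4_ms = mean_latency_ms(int4_attention_stub)
    print(f"fp32: {fp32_ms:.3f} ms, int4: {int4_ms:.3f} ms, "
          f"slowdown: {slowdown(int4_ms, fp32_ms):.1f}x")
```

Timing the same block under both precisions with identical inputs and thread settings makes it easier to tell whether the gap comes from the attention computation itself or from something around it.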
