Can the GLM5.1 model enable FP8 KV cache?

#5
by dblate - opened

Since the model weights are now FP8, can we use an FP8 KV cache as well? I could not find any scale factors for an FP8 KV cache in the checkpoint.
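For reference, here is a minimal sketch of what a per-tensor KV scale factor does. It assumes the FP8 E4M3 format (largest finite value 448) and simple max-based calibration. The function names and sample values are hypothetical and not taken from the GLM5.1 checkpoint or any inference library, and the actual rounding to 8 bits is not simulated.

```python
# Largest finite value representable in FP8 E4M3.
FP8_E4M3_MAX = 448.0

def kv_scale(values):
    """Per-tensor scale that maps the largest magnitude onto the FP8 range."""
    amax = max(abs(v) for v in values)
    # Fall back to 1.0 for an all-zero tensor to avoid dividing by zero.
    return amax / FP8_E4M3_MAX if amax > 0 else 1.0

def to_fp8_range(values, scale):
    """Divide by the scale and clamp into the FP8 range (no 8-bit rounding)."""
    return [max(-FP8_E4M3_MAX, min(FP8_E4M3_MAX, v / scale)) for v in values]

def from_fp8_range(scaled, scale):
    """Undo the scaling when the cache is read back."""
    return [s * scale for s in scaled]

# Hypothetical key activations, including one outlier beyond the FP8 range.
keys = [0.5, -3.0, 896.0]

scale = kv_scale(keys)
scaled = to_fp8_range(keys, scale)
restored = from_fp8_range(scaled, scale)

# Without a calibrated scale (scale = 1.0) the outlier is clamped.
unscaled = to_fp8_range(keys, 1.0)
```

With a calibrated scale the outlier survives the round trip. With a scale of 1.0 it is clamped to 448, which is why missing scale factors can matter for accuracy.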
