Fix the kv-cache dimensions

#47
by cchudant - opened

Hello!
I have noticed that the dimensions of the kv-cache here are odd and do not match the Hugging Face transformers modeling_bloom.py file.
Is the departure from the BLOOM dimensions intended?
Judging from the copy-pasted comments, it looks like a bug; also, _convert_to_rw_cache and its _convert_to_standard_cache counterpart match the BLOOM dimensions.
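For reference, here is a minimal sketch of the BLOOM-style layout I am referring to. The function name and shapes below are my own illustration (assuming key shape [batch * num_heads, head_dim, seq_length] and value shape [batch * num_heads, seq_length, head_dim]), not the code in this repo:

```python
import torch
from typing import Tuple

def convert_to_standard_cache(
    past_key_value: Tuple[Tuple[torch.Tensor, torch.Tensor], ...],
    batch_size: int,
) -> Tuple[Tuple[torch.Tensor, torch.Tensor], ...]:
    # Assumed input layout (BLOOM-style, batch and heads fused into one axis):
    #   key:   [batch_size * num_heads, head_dim, seq_length]
    #   value: [batch_size * num_heads, seq_length, head_dim]
    batch_size_times_num_heads, head_dim, seq_length = past_key_value[0][0].shape
    num_heads = batch_size_times_num_heads // batch_size
    # Target layout (standard transformers cache, one axis per head):
    #   key:   [batch_size, num_heads, head_dim, seq_length]
    #   value: [batch_size, num_heads, seq_length, head_dim]
    return tuple(
        (
            layer_key.view(batch_size, num_heads, head_dim, seq_length),
            layer_value.view(batch_size, num_heads, seq_length, head_dim),
        )
        for layer_key, layer_value in past_key_value
    )

# Example: batch 2, 8 heads, head_dim 64, cached seq_length 10
key = torch.zeros(2 * 8, 64, 10)
value = torch.zeros(2 * 8, 10, 64)
std_cache = convert_to_standard_cache(((key, value),), batch_size=2)
print(std_cache[0][0].shape)  # torch.Size([2, 8, 64, 10])
```

The point is just that the per-layer cache entries here do not have these shapes, even though the comments (and the _convert_to_rw_cache / _convert_to_standard_cache helpers) suggest they should.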

