I found an open-source implementation of KV Shifting Attention: https://github.com/erogol/BlaGPT

To get started quickly, you can use 8 A100 GPUs and verify it in 2 hours.
