multiquery attention

by ZhongYingMatrix - opened

Hi, thank you for your excellent work. I noticed the implementation of multiquery attention in, but I am unable to locate it in the source code. Can you please provide me with guidance on how to find it?

Technology Innovation Institute org

All model-related code is in the file.
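
For anyone searching the source, it may help to know what pattern to look for. In general, multiquery attention keeps several query heads but shares a single key head and a single value head across all of them. The sketch below is a minimal, pure-Python illustration of that idea with causal masking. It is not the code from this repository, and the name `multi_query_attention` and its helpers are hypothetical, chosen only for this example.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def multi_query_attention(queries, keys, values):
    """Causal multiquery attention (illustrative sketch, hypothetical helper).

    queries: [n_heads][seq_len][head_dim]  one set of queries per head
    keys:    [seq_len][head_dim]           a single key head shared by all query heads
    values:  [seq_len][head_dim]           a single value head shared by all query heads
    Returns: [n_heads][seq_len][head_dim]
    """
    head_dim = len(keys[0])
    scale = 1.0 / math.sqrt(head_dim)
    outputs = []
    for head_queries in queries:
        head_out = []
        for pos, q in enumerate(head_queries):
            # Causal mask: position `pos` only sees keys/values 0..pos.
            visible_keys = keys[: pos + 1]
            visible_values = values[: pos + 1]
            # Every head scores against the same shared keys.
            weights = softmax([dot(q, k) * scale for k in visible_keys])
            # Weighted sum of the same shared values.
            head_out.append([
                sum(w * v[d] for w, v in zip(weights, visible_values))
                for d in range(head_dim)
            ])
        outputs.append(head_out)
    return outputs


if __name__ == "__main__":
    queries = [
        [[1.0, 0.0], [0.0, 1.0]],  # head 0
        [[0.5, 0.5], [1.0, 1.0]],  # head 1
    ]
    keys = [[1.0, 0.0], [0.0, 1.0]]
    values = [[1.0, 2.0], [3.0, 4.0]]
    out = multi_query_attention(queries, keys, values)
    print(len(out), len(out[0]), len(out[0][0]))
```

In a real model the telltale sign is usually in the projection shapes: the key and value projections produce one head's worth of features rather than one per query head, so searching the modeling code for where keys and values are broadcast or expanded across heads is a reasonable starting point.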

FalconLLM changed discussion status to closed
