ETA for Flash Attention 2.0 Support in ChatGLMForConditionalGeneration

#46
by frank098 - opened

I am currently using the glm-4-9b-chat model and would like to know if there is an estimated timeline for when Flash Attention 2.0 support might be added.

Knowledge Engineering Group (KEG) & Data Mining at Tsinghua University org

It is supported now.
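
For reference, a minimal sketch of how Flash Attention 2 is typically enabled when loading the model through transformers, assuming the `flash-attn` package is installed and a CUDA GPU is available; the exact loading options should be checked against the model card:

```python
# Minimal sketch: loading glm-4-9b-chat with Flash Attention 2 enabled.
# Assumes flash-attn is installed and a CUDA GPU with fp16/bf16 support.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "THUDM/glm-4-9b-chat"

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,               # FA2 kernels require fp16 or bf16
    attn_implementation="flash_attention_2",  # request the Flash Attention 2 path
    trust_remote_code=True,                   # ChatGLM ships custom modeling code
    device_map="auto",
)
```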

zRzRzRzRzRzRzR changed discussion status to closed
