The self._attn method of QwenAttention has an attention_mask bug

#10
by chencyudel - opened

The self._attn method of QwenAttention needs a fix: the attention_mask is never added to the attention_weights.
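For illustration, a minimal sketch of the kind of fix being requested, assuming the usual GPT-style _attn signature (this is not the actual Qwen source; the names and shapes here are assumptions):

```python
import torch
import torch.nn.functional as F

def _attn(query, key, value, attention_mask=None):
    # Raw scaled dot-product scores: (batch, heads, q_len, k_len)
    attn_weights = torch.matmul(query, key.transpose(-1, -2))
    attn_weights = attn_weights / (value.size(-1) ** 0.5)

    # The reported bug: this addition was missing, so masked (padded)
    # positions were not suppressed before the softmax. The mask is
    # assumed to hold 0 for visible positions and a large negative
    # value for masked ones.
    if attention_mask is not None:
        attn_weights = attn_weights + attention_mask

    attn_weights = F.softmax(attn_weights, dim=-1)
    attn_output = torch.matmul(attn_weights, value)
    return attn_output, attn_weights
```

Without that addition, attention still flows to padding tokens, which matters as soon as batched inputs of different lengths are used.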

Thanks for the feedback. We have updated the code (as part of the support for batch inference), which I think should fix this problem as well. Please pull the latest code and see if the problem is fixed for you. Let me know if the problem still exists.

jklj077 changed discussion status to closed
