[DEBUG] transformers 4.38.0 /models/gemma/modeling_gemma.py
#96, opened by LiuWhite
original: attn_output = attn_output.reshape(bsz, q_len, self.hidden_size)
fix: attn_output = attn_output.reshape(bsz, q_len, 4096)
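At this point hidden_size is not equal to head_dim * num_heads, so the original reshape fails with a shape mismatch.
The last dimension has to be head_dim * num_heads (4096 here) for the forward pass to get through.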
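
To make the mismatch concrete, here is a small shape-arithmetic sketch in plain Python. The values hidden_size = 3072, num_heads = 16 and head_dim = 256 are an assumption taken from the published Gemma 7B configuration; the post itself only gives 4096. can_reshape is a hypothetical helper written for this illustration, not part of transformers or torch.

```python
from math import prod

# Assumed values from the published Gemma 7B configuration;
# they are not stated in the post itself.
hidden_size = 3072
num_heads = 16
head_dim = 256

bsz, q_len = 2, 5  # arbitrary illustrative batch size and sequence length

# Shape of attn_output just before the reshape in the attention forward pass.
attn_shape = (bsz, q_len, num_heads, head_dim)


def can_reshape(src_shape, dst_shape):
    """Hypothetical helper: True if the element counts allow the reshape.

    A single -1 in dst_shape is treated as an inferred dimension,
    mirroring how torch.Tensor.reshape handles -1.
    """
    known = prod(d for d in dst_shape if d != -1)
    if -1 in dst_shape:
        return known > 0 and prod(src_shape) % known == 0
    return prod(src_shape) == known


# original: reshape(bsz, q_len, self.hidden_size)
original_ok = can_reshape(attn_shape, (bsz, q_len, hidden_size))
# fix from the post: reshape(bsz, q_len, 4096)
hardcoded_ok = can_reshape(attn_shape, (bsz, q_len, num_heads * head_dim))
# alternative without a hardcoded number: reshape(bsz, q_len, -1)
inferred_ok = can_reshape(attn_shape, (bsz, q_len, -1))

print(original_ok, hardcoded_ok, inferred_ok)
```

Under these assumptions only the original target shape is rejected. Writing the last dimension as num_heads * head_dim, or letting reshape infer it with -1, would avoid hardcoding 4096, which only holds for this one configuration.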