[DEBUG] transformers 4.38.0 /models/gemma/modeling_gemma.py

#96
by LiuWhite - opened

original: attn_output = attn_output.reshape(bsz, q_len, self.hidden_size)
fix: attn_output = attn_output.reshape(bsz, q_len, 4096)
At this point hidden_size is not equal to head_dim * num_heads, so the original reshape fails with a shape error.
Changing the last dimension to 4096 (the value of head_dim * num_heads here) lets the forward pass get through.
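
A minimal sketch of the mismatch, in case it helps others reproduce it. The concrete numbers (16 heads, head_dim 256, hidden_size 3072) are my assumption, chosen only because they give the 4096 mentioned in the post; the variable names are illustrative and this is not the library's actual code. It also shows two ways to avoid hard-coding 4096: computing the width from the head layout, or passing -1 so reshape infers it.

```python
import torch

# Assumed values, not stated in the post: 16 heads of size 256, hidden_size 3072.
bsz, q_len = 2, 5
num_heads, head_dim, hidden_size = 16, 256, 3072

# Attention output just before the heads are merged:
# (bsz, q_len, num_heads, head_dim)
attn_output = torch.zeros(bsz, q_len, num_heads, head_dim)

# The original line asks for a last dimension of hidden_size (3072),
# but each token actually carries num_heads * head_dim = 4096 values.
try:
    attn_output.reshape(bsz, q_len, hidden_size)
    original_ok = True
except RuntimeError:
    original_ok = False

# Option 1: derive the width from the head layout instead of hard-coding it.
merged = attn_output.reshape(bsz, q_len, num_heads * head_dim)

# Option 2: let reshape infer the last dimension.
inferred = attn_output.reshape(bsz, q_len, -1)
```

Either option keeps working if the model config changes, whereas the literal 4096 only fits one head layout.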
