Shape error in the SFT Model

#1
by jieliu - opened

When I load the SFT model, generation fails with a shape error. The k_proj and v_proj weights appear to have different shapes from the ones shown when the model is printed. Did you manually modify the weights in the safetensors files, or is this a bug? Also, why are there two safetensors files, so that a 7B model ends up with the memory footprint of a 14B model?

7189f9175d0bd1d68a95a10ced61979.png

99933e5b7d9f4e38d2c84d891d44170.png
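For reference, here is a rough sketch of the two sanity checks behind these questions. It assumes a Llama-style attention block, where k_proj and v_proj are narrower than q_proj whenever grouped-query attention is used (fewer key/value heads than query heads). The function names and the numbers passed in are illustrative assumptions, not values read from this model's config.

```python
def expected_kv_proj_shape(hidden_size, num_attention_heads, num_key_value_heads):
    """Return the (out_features, in_features) weight shape of k_proj / v_proj
    for an assumed Llama-style attention block."""
    head_dim = hidden_size // num_attention_heads
    return (num_key_value_heads * head_dim, hidden_size)


def weight_size_gb(num_params, bytes_per_param):
    """Approximate size of the weights in GB (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9


# Illustrative values only, not taken from this model's config.json.
print(expected_kv_proj_shape(4096, 32, 32))  # full multi-head attention
print(expected_kv_proj_shape(4096, 32, 8))   # grouped-query attention
print(weight_size_gb(7e9, 2))                # 7B parameters at 2 bytes each
```

If the shapes stored in the checkpoint match the grouped-query case while the loading code expects the full multi-head case, a mismatch like the one above would follow. On the size question, 7B parameters stored in 16-bit precision take roughly 14 GB, and a checkpoint split into two safetensors files normally holds disjoint sets of tensors, so the file count alone should not double the memory.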
