Details about expansion

#9
by icoderzqliu - opened

Hello, in the blog you mentioned that 'After width expansion, there was a significant decline in the model's performance'. Could you share some details about the width expansion? Was it achieved by expanding the dimensions of the hidden layers, or by some other method? Thank you!

01-ai org

You may find inspiration in the insights presented in these two papers: https://arxiv.org/abs/2112.11446 and https://arxiv.org/abs/2110.07143

itsliupeng changed discussion status to closed
