Activation function config inconsistent #11


The config.json says that the activation function is 'gelu' and yet 'is_gated_act' is set to true. Shouldn't the activation function be 'gated-gelu' like the rest of T5v1.1-style models? Or if that's not the case, shouldn't 'is_gated_act' be set to false?
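To make the mismatch concrete, here is a minimal sketch of a consistency check. It assumes the T5v1.1 naming convention in which a gated feed-forward layer is signalled by a "gated-" prefix, and it assumes that the 'gelu' value lives under the "feed_forward_proj" key. The helper name `is_consistent` and both sample dicts are hypothetical illustrations, not the actual contents of config.json.

```python
def is_consistent(config: dict) -> bool:
    """Return True if the activation name and 'is_gated_act' agree.

    Assumption: a gated activation is named with a 'gated-' prefix.
    """
    name_says_gated = config["feed_forward_proj"].startswith("gated-")
    # If 'is_gated_act' is absent, there is nothing to contradict the name.
    flag = config.get("is_gated_act", name_says_gated)
    return name_says_gated == bool(flag)


# Illustrative subsets only; the real config.json has many more keys.
reported = {"feed_forward_proj": "gelu", "is_gated_act": True}
expected = {"feed_forward_proj": "gated-gelu", "is_gated_act": True}

print(is_consistent(reported))
print(is_consistent(expected))
```

Under these assumptions, the first dict is flagged as inconsistent and the second passes.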

Given that this model is T5v1.1-initialized, shouldn't the config reflect that the activation is 'gated-gelu'?

The solution is simple: "feed_forward_proj" should be "gated-gelu" and "dense_act_fn" is redundant and should be removed entirely from the config.
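As a rough sketch of that change, the snippet below applies it to a plain dict rather than to the real file. The function name `fix_activation_config` and the `before` dict are hypothetical, and the starting values are an assumption based on the description in this thread; only the three keys discussed here are shown.

```python
import json


def fix_activation_config(config: dict) -> dict:
    """Return a copy with a gated 'feed_forward_proj' and no 'dense_act_fn'."""
    fixed = dict(config)  # leave the caller's dict untouched
    act = fixed.get("feed_forward_proj", "")
    # If the layer is flagged as gated but the name lacks the prefix, add it.
    if fixed.get("is_gated_act") and not act.startswith("gated-"):
        fixed["feed_forward_proj"] = f"gated-{act}"
    # Drop the redundant key, as proposed above.
    fixed.pop("dense_act_fn", None)
    return fixed


# Assumed starting point (illustrative, not copied from the repo).
before = {"feed_forward_proj": "gelu", "dense_act_fn": "gelu", "is_gated_act": True}
after = fix_activation_config(before)
print(json.dumps(after, indent=2))
```

The copy-then-modify approach is just a design choice for the example, so the original values stay available for comparison.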

Google org

Hi @michaelroyzen
Thanks for raising this. Let me get back to you ASAP.

Google org

Hi @michaelroyzen
We have updated the config files accordingly. Thanks for raising the issue.

changed discussion status to closed
