Activation function config inconsistent
The config.json says that the activation function is 'gelu', yet 'is_gated_act' is set to true. Shouldn't the activation function be 'gated-gelu', like the rest of the T5v1.1-style models? Or, if that's not the case, shouldn't 'is_gated_act' be set to false?
Given that this model is T5v1.1-initialized (as per https://github.com/google-research/t5x/blob/main/docs/models.md#t5-11-lm-adapted-checkpoints), shouldn't the config reflect that the activation is 'gated-gelu'?
The solution is simple: "feed_forward_proj" should be "gated-gelu", and "dense_act_fn" is redundant and should be removed from the config entirely.
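To make the mismatch concrete, here is a minimal sketch of a consistency check. It assumes that "feed_forward_proj" is the source of truth and that a "gated-" prefix is what marks a gated activation, which is my understanding of how the Hugging Face T5 config interprets the field. The helper names (derive_activation, find_inconsistencies) are hypothetical, and the "reported" dictionary is an illustrative reconstruction of the fields mentioned in this thread, not a copy of the actual config.json.

```python
import json
from typing import List, Tuple


def derive_activation(feed_forward_proj: str) -> Tuple[str, bool]:
    """Split e.g. 'gated-gelu' into ('gelu', True) and 'relu' into ('relu', False).

    Assumption: a leading 'gated-' prefix is what signals a gated activation.
    """
    parts = feed_forward_proj.split("-")
    is_gated = parts[0] == "gated"
    return parts[-1], is_gated


def find_inconsistencies(config: dict) -> List[str]:
    """Return a message for each field that contradicts feed_forward_proj."""
    problems = []
    _, expected_gated = derive_activation(config["feed_forward_proj"])
    # Only flag is_gated_act when it is explicitly present and disagrees.
    if "is_gated_act" in config and config["is_gated_act"] != expected_gated:
        problems.append(
            "is_gated_act=%r but feed_forward_proj=%r implies %r"
            % (config["is_gated_act"], config["feed_forward_proj"], expected_gated)
        )
    return problems


# Illustrative reconstruction of the reported state (not the real file).
reported = json.loads('{"feed_forward_proj": "gelu", "is_gated_act": true}')
# The state proposed in this thread.
fixed = json.loads('{"feed_forward_proj": "gated-gelu"}')

print(find_inconsistencies(reported))  # one problem reported
print(find_inconsistencies(fixed))     # no problems
```

The check deliberately ignores "dense_act_fn", since the point above is that it can be dropped once "feed_forward_proj" is set correctly.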
Hi @michaelroyzen, thanks for raising this. Let me get back to you ASAP.
Hi @michaelroyzen, we have updated the config files accordingly. Thanks for raising the issue.