Use correct `gelu` function

#5
by ybelkada HF staff - opened

related discussion: https://huggingface.co/google/flan-t5-xxl/discussions/11

The previous config file was using the `gelu` function instead of `gated-gelu`, which is set automatically when `is_gated_act` is forced to `True` (more specifically here).
This is not a breaking change, since it only fixes the activation used at inference. Users who trained a model with `gelu` instead of `gated-gelu` should not be affected by this change. Note that using `gated-gelu` instead of `gelu` can give slightly different qualitative results, but it does not affect the overall performance of the model.
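
To give a rough sense of why the outputs differ only slightly, here is a minimal sketch that compares two common GELU formulations in plain Python. It assumes (this is not stated in the PR itself) that `gated-gelu` resolves to the tanh-approximated GELU, while plain `gelu` is the exact erf-based form. The function names `gelu_exact` and `gelu_tanh` are hypothetical and used only for illustration; the gating part of the feed-forward layer is not modelled here.

```python
import math


def gelu_exact(x: float) -> float:
    """Exact GELU: x * Phi(x), with Phi the standard normal CDF."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def gelu_tanh(x: float) -> float:
    """Tanh approximation of GELU (assumed here to be what `gated-gelu` uses)."""
    inner = math.sqrt(2.0 / math.pi) * (x + 0.044715 * x ** 3)
    return 0.5 * x * (1.0 + math.tanh(inner))


# Sample a few inputs and measure the largest absolute gap between the two forms.
xs = [-3.0, -1.0, -0.5, 0.0, 0.5, 1.0, 3.0]
max_gap = max(abs(gelu_exact(x) - gelu_tanh(x)) for x in xs)

for x in xs:
    print(f"x={x:+.1f}  exact={gelu_exact(x):+.6f}  tanh={gelu_tanh(x):+.6f}")
print(f"largest absolute gap on these inputs: {max_gap:.2e}")
```

Under that assumption, the two activations agree closely on every sampled input, which is consistent with the note above that the change can shift results slightly without changing overall model quality.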

ybelkada changed pull request status to merged
