tr3m-1B3-pile-checkpoints / global_step94500
bigscience-bot
gelu_fast is the correct activation_function
eed1f2b