juancopi81's picture
Add t5x and mt3 models
b100e1c
raw history blame
No virus
380 Bytes
# T5.1.1 XXL model.
include 't5x/examples/scalable_t5/t5_1_1/base.gin' # imports vocab, optimizer and model.
# ------------------- Network specification overrides --------------------------
network.Transformer.config = @network.T5Config()
network.T5Config:
emb_dim = 4096
num_heads = 64
num_encoder_layers = 24
num_decoder_layers = 24
head_dim = 64
mlp_dim = 10240