
T5S (base-sized model)

T5S is a T5 model pre-trained on Spanish-language text. It was introduced in the paper Sequence-to-Sequence Spanish Pre-trained Language Models.

Model description

T5S is a T5 Version 1.1 model (transformer encoder-decoder) with a bidirectional (BERT-like) encoder and an autoregressive (GPT-like) decoder. Compared to the original T5 model, it includes the following changes:

  • GEGLU activation in the feed-forward hidden layer, rather than ReLU.

  • Dropout was turned off during pre-training (a quality win) and should be re-enabled during fine-tuning.

  • Pre-trained only on an unlabeled corpus, without mixing in downstream tasks.

  • No parameter sharing between the embedding and classifier layers.
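
To make the first point concrete, here is a minimal pure-Python sketch of a GEGLU feed-forward layer: one projection passes through GELU and gates a second, linear projection before the output projection. The function names, the toy weight matrices, and the use of the tanh approximation of GELU are assumptions for illustration; this is not the model's actual implementation.

```python
import math


def gelu_tanh(x: float) -> float:
    """Tanh approximation of the GELU activation (assumed variant)."""
    return 0.5 * x * (1.0 + math.tanh(math.sqrt(2.0 / math.pi) * (x + 0.044715 * x ** 3)))


def matvec(weights, vector):
    """Multiply a matrix (given as a list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vector)) for row in weights]


def geglu_ffn(x, w_gate, w_linear, w_out):
    """GEGLU feed-forward: w_out @ (GELU(w_gate @ x) * (w_linear @ x)), no biases."""
    gate = [gelu_tanh(v) for v in matvec(w_gate, x)]
    linear = matvec(w_linear, x)
    hidden = [g * l for g, l in zip(gate, linear)]  # element-wise gating
    return matvec(w_out, hidden)


def relu_ffn(x, w_in, w_out):
    """ReLU feed-forward as in the original T5, shown for comparison."""
    hidden = [max(0.0, v) for v in matvec(w_in, x)]
    return matvec(w_out, hidden)


if __name__ == "__main__":
    # Hypothetical toy weights: model dimension 2, feed-forward dimension 3.
    w_gate = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    w_linear = [[1.0, 0.0], [1.0, 0.0], [2.0, 0.0]]
    w_out = [[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]
    print(geglu_ffn([1.0, 0.0], w_gate, w_linear, w_out))
    print(relu_ffn([1.0, 0.0], w_gate, w_out))
```

Note that the gated variant uses two input projections instead of one, so it has more feed-forward parameters than a ReLU layer of the same hidden width.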

T5S is particularly effective when fine-tuned for text generation (e.g. summarization, translation) or comprehension tasks (e.g. text classification, question answering) using a text-to-text format.
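
As an illustration of the text-to-text format, the sketch below casts three kinds of tasks as (source text, target text) pairs, which is the form a fine-tuning dataset would take. The task prefixes ("resumir:", "clasificar:", "pregunta:") and the field names are hypothetical: since the model was pre-trained without downstream tasks, it has no built-in prefixes, and any consistent convention chosen at fine-tuning time could be used instead.

```python
from typing import Dict, Tuple


def to_text_pair(task: str, example: Dict[str, str]) -> Tuple[str, str]:
    """Cast a task example as a (source, target) pair of plain strings.

    The prefixes and dictionary keys are illustrative assumptions,
    not conventions defined by the model.
    """
    if task == "summarization":
        return f"resumir: {example['document']}", example["summary"]
    if task == "classification":
        # The label is emitted as text, not as a class index.
        return f"clasificar: {example['text']}", example["label"]
    if task == "question_answering":
        source = f"pregunta: {example['question']} contexto: {example['context']}"
        return source, example["answer"]
    raise ValueError(f"Unsupported task: {task}")


if __name__ == "__main__":
    source, target = to_text_pair(
        "classification",
        {"text": "La película fue excelente", "label": "positivo"},
    )
    print(source)
    print(target)
```

Because every task reduces to string-in, string-out, the same model, loss, and decoding procedure can be reused across generation and comprehension tasks.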

How to use

Here is how to use this model in PyTorch:

from transformers import T5Tokenizer, T5Model

tokenizer = T5Tokenizer.from_pretrained("vgaraujov/t5-base-spanish")
model = T5Model.from_pretrained("vgaraujov/t5-base-spanish")

input_ids = tokenizer(
    "Estudios han demostrado que tener un perro es bueno para la salud", return_tensors="pt"
).input_ids  # Batch size 1
decoder_input_ids = tokenizer("Estudios demuestran que", return_tensors="pt").input_ids  # Batch size 1

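# Forward pass: T5Model has no language-modeling head, so it returns hidden states rather than token logits
outputs = model(input_ids=input_ids, decoder_input_ids=decoder_input_ids)
# Decoder output, shape (batch_size, decoder_sequence_length, hidden_size)
last_hidden_states = outputs.last_hidden_state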
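
When training a T5-style model, the decoder input is usually the target sequence shifted one position to the right, with a start token in front; T5 uses the padding token as that start token. The following pure-Python sketch illustrates the shift on a list of token ids. The padding id of 0, the -100 ignore index, and the example ids are assumptions for illustration; in practice, T5ForConditionalGeneration performs this step internally when `labels` are passed.

```python
from typing import List

PAD_TOKEN_ID = 0     # assumption: padding id of the tokenizer, also used as decoder start token
IGNORE_INDEX = -100  # assumption: value used to mask label positions out of the loss


def shift_right(
    labels: List[int],
    decoder_start_token_id: int = PAD_TOKEN_ID,
    pad_token_id: int = PAD_TOKEN_ID,
) -> List[int]:
    """Build decoder input ids from target ids.

    The start token is prepended, the last target token is dropped, and
    masked positions are replaced by the padding id so that every decoder
    input is a valid token id.
    """
    shifted = [decoder_start_token_id] + labels[:-1]
    return [pad_token_id if token == IGNORE_INDEX else token for token in shifted]


if __name__ == "__main__":
    # Hypothetical target ids ending with an end-of-sequence id of 1.
    print(shift_right([5, 6, 7, 1]))
```

With this shift, the decoder at position i sees only target tokens before position i and is trained to predict the token at position i.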

Citation (BibTeX)

@misc{araujo2023sequencetosequence,
      title={Sequence-to-Sequence Spanish Pre-trained Language Models}, 
      author={Vladimir Araujo and Maria Mihaela Trusca and Rodrigo Tufiño and Marie-Francine Moens},
      year={2023},
      eprint={2309.11259},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}

Model size: 248M params (Safetensors, tensor type F32)
