
T5S (base-sized model)

T5S is a T5 model pre-trained on Spanish-language text. It was introduced in the paper Sequence-to-Sequence Spanish Pre-trained Language Models.

Model description

T5S is a T5 Version 1.1 model: a transformer encoder-decoder with a bidirectional (BERT-like) encoder and an autoregressive (GPT-like) decoder. Compared to the original T5 model, Version 1.1 includes the following improvements:

  • GEGLU activation in the feed-forward hidden layer, rather than ReLU.

  • Dropout was turned off in pre-training (quality win). Dropout should be re-enabled during fine-tuning.

  • Pre-trained only on an unlabeled corpus, without mixing in the downstream tasks.

  • No parameter sharing between the embedding and classifier layers.
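
To make the first improvement concrete, here is a minimal, dependency-free sketch of a ReLU feed-forward block next to a GEGLU one. It is an illustration of the arithmetic only, not the model's actual implementation: the function names (`matvec`, `relu_ffn`, `geglu_ffn`) and the weight arguments are hypothetical, weights are plain nested lists, biases are omitted, and the exact GELU is used here, whereas library implementations may use a tanh approximation.

```python
import math


def gelu(x: float) -> float:
    """Exact GELU: x times the standard normal CDF evaluated at x."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def matvec(weights, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w * v for w, v in zip(row, vector)) for row in weights]


def relu_ffn(x, w_in, w_out):
    """Original-T5-style block: one input projection followed by ReLU."""
    hidden = [max(0.0, h) for h in matvec(w_in, x)]
    return matvec(w_out, hidden)


def geglu_ffn(x, w_gate, w_value, w_out):
    """GEGLU block: a GELU-activated gate multiplies a second, linear projection."""
    gate = [gelu(h) for h in matvec(w_gate, x)]
    value = matvec(w_value, x)
    hidden = [g * v for g, v in zip(gate, value)]  # element-wise gating
    return matvec(w_out, hidden)
```

Note that the GEGLU variant needs two input projections (`w_gate` and `w_value`) where the ReLU variant needs one.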

T5S is particularly effective when fine-tuned for text generation (e.g. summarization, translation) or comprehension tasks (e.g. text classification, question answering) using a text-to-text format.
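
As a rough illustration of the text-to-text format, the sketch below casts a generation task and a classification task as plain (source, target) string pairs. Everything in it is an assumption made for the example: the `TextToTextExample` class, the helper functions, and the Spanish task prefixes `resumir` and `clasificar` are hypothetical. Because the model was pre-trained only on unlabeled text, any such prefix would have to be learned during fine-tuning.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TextToTextExample:
    """One training pair: both the input and the expected output are strings."""
    source: str
    target: str


def summarization_example(document: str, summary: str, prefix: str = "resumir") -> TextToTextExample:
    """Generation task: the target is free-form text."""
    return TextToTextExample(source=f"{prefix}: {document}", target=summary)


def classification_example(text: str, label: str, prefix: str = "clasificar") -> TextToTextExample:
    """Comprehension task: the target is the label written out as a word."""
    return TextToTextExample(source=f"{prefix}: {text}", target=label)
```

With this framing, a classifier and a summarizer can share the same model, loss, and decoding procedure; only the strings differ.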

How to use

Here is how to load this model in PyTorch and obtain the decoder's last hidden states:

from transformers import T5Tokenizer, T5Model

tokenizer = T5Tokenizer.from_pretrained("vgaraujov/t5-base-spanish")
model = T5Model.from_pretrained("vgaraujov/t5-base-spanish")

input_ids = tokenizer(
    "Estudios han demostrado que tener un perro es bueno para la salud", return_tensors="pt"
).input_ids  # Batch size 1
decoder_input_ids = tokenizer("Estudios demuestran que", return_tensors="pt").input_ids  # Batch size 1

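# Forward pass. T5Model has no language-modeling head, so it returns raw hidden states.
outputs = model(input_ids=input_ids, decoder_input_ids=decoder_input_ids)
# Decoder output, shape (batch_size, decoder_sequence_length, hidden_size)
last_hidden_states = outputs.last_hidden_state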
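
The example above passes `decoder_input_ids` by hand. During fine-tuning, decoder inputs are normally derived from the target token ids by shifting them one position to the right and prepending a start token, which for T5 is usually the pad token. The sketch below shows that shift on plain Python lists; the function name and the default ids (`0` for both start and pad, `-100` for ignored label positions) are assumptions based on common T5 conventions, not values read from this model's configuration.

```python
def shift_right(label_ids, decoder_start_token_id=0, pad_token_id=0, ignore_index=-100):
    """Build decoder inputs from target ids for teacher forcing.

    The decoder sees the start token first, then every target token except
    the last one. Positions marked with ignore_index (masked out of the loss)
    are replaced by the pad token so they remain valid vocabulary ids.
    """
    shifted = [decoder_start_token_id] + list(label_ids[:-1])
    return [pad_token_id if token == ignore_index else token for token in shifted]
```

For instance, target ids `[5, 6, 7, 1]` would yield decoder inputs `[0, 5, 6, 7]` under these assumed defaults, so that at each step the decoder predicts the next target token from the ones before it.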

Citation (BibTeX)

@misc{araujo2023sequencetosequence,
      title={Sequence-to-Sequence Spanish Pre-trained Language Models}, 
      author={Vladimir Araujo and Maria Mihaela Trusca and Rodrigo Tufiño and Marie-Francine Moens},
      year={2023},
      eprint={2309.11259},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}

Model size: 248M params (Safetensors, tensor type F32)