T5-base-finetuned-rte

This model is T5 fine-tuned on the GLUE RTE dataset. It achieves the following results on the validation set:

  • Accuracy: 0.7690

Model Details

T5 is an encoder-decoder model pre-trained on a multi-task mixture of unsupervised and supervised tasks, in which each task is converted into a text-to-text format.

Training procedure

Tokenization

Since T5 is a text-to-text model, both the inputs and the labels of the dataset are converted to text as follows. For each example, an input sentence is formed as "rte sentence1: " + rte_sent1 + "sentence 2: " + rte_sent2 and fed to the tokenizer to get the input_ids and attention_mask. For each label, the target text is chosen as "entailment" if the label is 0 and "not_entailment" otherwise, and it is tokenized to get its own input_ids and attention_mask. During training, the pad-token positions in the target input_ids are replaced with -100 so that no loss is computed for them. The resulting ids are passed as labels, and the target's attention_mask is passed as the decoder attention mask.
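
The preprocessing described above can be sketched roughly as follows. This is a minimal illustration, not the original training script: the function names, the `max_input_len` and `max_target_len` values, and the use of `padding="max_length"` are assumptions. The input template is reproduced exactly as the card writes it, including the missing space before "sentence 2:".

```python
from typing import Dict, List

IGNORE_INDEX = -100  # label value that PyTorch cross-entropy loss skips


def build_input_text(sentence1: str, sentence2: str) -> str:
    # Template copied verbatim from the card (no space before "sentence 2:").
    return "rte sentence1: " + sentence1 + "sentence 2: " + sentence2


def build_target_text(label: int) -> str:
    # Label 0 maps to "entailment"; any other label maps to "not_entailment".
    return "entailment" if label == 0 else "not_entailment"


def mask_padding(label_ids: List[int], pad_token_id: int) -> List[int]:
    # Replace pad-token ids with -100 so those positions add no loss.
    return [IGNORE_INDEX if tok == pad_token_id else tok for tok in label_ids]


def encode_example(tokenizer, sentence1: str, sentence2: str, label: int,
                   max_input_len: int = 256,   # assumed, not from the card
                   max_target_len: int = 8     # assumed, not from the card
                   ) -> Dict[str, List[int]]:
    # `tokenizer` is expected to behave like a Hugging Face T5 tokenizer.
    source = tokenizer(build_input_text(sentence1, sentence2),
                       max_length=max_input_len,
                       padding="max_length", truncation=True)
    target = tokenizer(build_target_text(label),
                       max_length=max_target_len,
                       padding="max_length", truncation=True)
    return {
        "input_ids": source["input_ids"],
        "attention_mask": source["attention_mask"],
        "labels": mask_padding(target["input_ids"], tokenizer.pad_token_id),
        "decoder_attention_mask": target["attention_mask"],
    }
```

With the transformers library installed, a tokenizer loaded through `AutoTokenizer.from_pretrained("t5-base")` could be passed as the `tokenizer` argument. The returned dictionary uses the keyword names that `T5ForConditionalGeneration.forward` accepts.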

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 3e-4
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • optimizer: epsilon=1e-08
  • num_epochs: 3.0
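
As a rough illustration, the values listed above could be mapped onto Hugging Face trainer arguments as shown below. This is a sketch under assumptions: the card does not name the optimizer, so treating its epsilon as `adam_epsilon` assumes an Adam-family optimizer; `output_dir` is a hypothetical name; and the step count assumes the 2,490-example GLUE RTE training split on a single device with no gradient accumulation.

```python
import math

# Values as listed on the card.
card_hparams = {
    "learning_rate": 3e-4,
    "train_batch_size": 16,
    "eval_batch_size": 16,
    "seed": 42,
    "optimizer_epsilon": 1e-8,
    "num_epochs": 3.0,
}


def to_trainer_kwargs(hp: dict, output_dir: str = "t5-base-finetuned-rte") -> dict:
    # Keys follow transformers.TrainingArguments parameter names.
    # `output_dir` is a placeholder chosen for this example.
    return {
        "output_dir": output_dir,
        "learning_rate": hp["learning_rate"],
        "per_device_train_batch_size": hp["train_batch_size"],
        "per_device_eval_batch_size": hp["eval_batch_size"],
        "seed": hp["seed"],
        "adam_epsilon": hp["optimizer_epsilon"],  # assumes an Adam-family optimizer
        "num_train_epochs": hp["num_epochs"],
    }


def optimizer_steps(num_examples: int, batch_size: int, num_epochs: float) -> int:
    # Steps per epoch (last partial batch included) times the number of epochs.
    return math.ceil(num_examples / batch_size) * int(num_epochs)


RTE_TRAIN_SIZE = 2490  # GLUE RTE training split size (assumed to be used in full)

trainer_kwargs = to_trainer_kwargs(card_hparams)
total_steps = optimizer_steps(RTE_TRAIN_SIZE,
                              card_hparams["train_batch_size"],
                              card_hparams["num_epochs"])
```

With transformers installed, `Seq2SeqTrainingArguments(**trainer_kwargs)` should accept these keys. Under the stated assumptions the run comes to 156 optimizer steps per epoch.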

Training results

Epoch   Training Loss   Validation Accuracy
1       0.1099          0.7617
2       0.0573          0.7617
3       0.0276          0.7690
