t5_base_question_generation

This model is a fine-tuned version of t5-base for question generation, trained on the SQuAD question-answering dataset.

Model description

More information needed

Intended uses

The model takes a context as its input sequence and generates a full question sentence as the output sequence. The maximum sequence length is 512 tokens. Inputs should be organised into the following format: <generate_questions> paragraph: context text here

The input sequence can then be encoded and passed as the input_ids argument in the model's generate() method.
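For example, here is a minimal usage sketch; the model id is a placeholder for the actual Hub repository, and the example context and generation settings are assumptions:

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_name = "t5_base_question_generation"  # placeholder: replace with the Hub model id
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

context = "The Eiffel Tower was completed in 1889 and stands in Paris, France."
input_text = f"<generate_questions> paragraph: {context}"

# Encode with the 512-token limit described above.
inputs = tokenizer(input_text, max_length=512, truncation=True, return_tensors="pt")
outputs = model.generate(
    input_ids=inputs["input_ids"],
    attention_mask=inputs["attention_mask"],
    max_new_tokens=64,  # assumed generation budget
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```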

Limitations

The model was trained on only a limited amount of data, so generated questions may be of poor quality. In addition, the generated questions follow a style similar to that of the training data.

Training and evaluation data

The model takes as input a passage and generates questions answerable from that passage. The dataset used to train the model comprises 80k passage-question pairs sampled randomly from the SQuAD training data. For evaluation, we sampled 10k passage-question pairs from the SQuAD development set.
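As a rough illustration (not the authors' exact preprocessing script; the seed and field handling are assumptions), the pairs could be sampled with the datasets library like this:

```python
from datasets import load_dataset

squad = load_dataset("squad")

# 80k random passage-question pairs from the training split.
train_pairs = squad["train"].shuffle(seed=42).select(range(80_000))
# 10k pairs from the development (validation) split for evaluation.
eval_pairs = squad["validation"].shuffle(seed=42).select(range(10_000))

def to_qg_example(ex):
    # Prefix the passage with the task token described under "Intended uses".
    return {
        "input_text": f"<generate_questions> paragraph: {ex['context']}",
        "target_text": ex["question"],
    }

train_pairs = train_pairs.map(to_qg_example)
eval_pairs = eval_pairs.map(to_qg_example)
```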

Training procedure

The model was trained for 5 epochs over the training set with a learning rate of 5e-05 and early stopping. The batch size was only 10 due to GPU memory limitations.

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

  • learning_rate: 5e-05
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.21
  • num_epochs: 5
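
Expressed as Hugging Face Seq2SeqTrainingArguments, the setup above might look like the following sketch; the output directory, evaluation strategy, and early-stopping patience are assumptions (the listed Adam settings are the library defaults):

```python
from transformers import Seq2SeqTrainingArguments, EarlyStoppingCallback

training_args = Seq2SeqTrainingArguments(
    output_dir="t5_base_question_generation",  # placeholder
    learning_rate=5e-05,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    seed=42,
    lr_scheduler_type="cosine",
    warmup_ratio=0.21,
    num_train_epochs=5,
    evaluation_strategy="epoch",       # assumed: early stopping needs periodic evaluation
    save_strategy="epoch",
    load_best_model_at_end=True,
    metric_for_best_model="loss",
)

# Early stopping with an assumed patience of 2 evaluations.
early_stopping = EarlyStoppingCallback(early_stopping_patience=2)
```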

Framework versions

  • Transformers 4.23.1
  • Pytorch 1.13.0
  • Datasets 2.6.1
  • Tokenizers 0.13.1