Prompting-NLP-Paper-to-QA-Generation-abstract-only

This model is a fine-tuned version of google/flan-t5-large on an unknown dataset. At the final evaluation (step 690, epoch 14.9), it achieves the following result on the evaluation set:

  • Loss: 21.0330

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 1e-05
  • train_batch_size: 1
  • eval_batch_size: 1
  • seed: 42
  • gradient_accumulation_steps: 16
  • total_train_batch_size: 16
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 184
  • num_epochs: 15
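
The relationship between these values can be checked with a little arithmetic. The following is a minimal sketch, not the original training script: it assumes a single training device (so the effective batch size is just the per-device batch size times the accumulation steps) and assumes the linear scheduler ramps up linearly during warmup and then decays linearly to zero, which is how the linear-with-warmup schedule in Transformers is commonly described. The total of 690 optimizer steps is taken from the last row of the training results, and the function names are illustrative only.

```python
# Values copied from the hyperparameter list; TOTAL_STEPS comes from the
# final row of the training results (step 690).
LEARNING_RATE = 1e-05
TRAIN_BATCH_SIZE = 1
GRADIENT_ACCUMULATION_STEPS = 16
WARMUP_STEPS = 184
NUM_EPOCHS = 15
TOTAL_STEPS = 690


def effective_batch_size(per_device: int, accumulation: int) -> int:
    """Samples consumed per optimizer step (assumes one device)."""
    return per_device * accumulation


def linear_lr(step: int,
              base_lr: float = LEARNING_RATE,
              warmup: int = WARMUP_STEPS,
              total: int = TOTAL_STEPS) -> float:
    """Learning rate at a given optimizer step.

    Assumed shape: linear ramp from 0 to base_lr over `warmup` steps,
    then linear decay from base_lr to 0 at `total` steps.
    """
    if step < warmup:
        return base_lr * step / max(1, warmup)
    return base_lr * max(0.0, (total - step) / max(1, total - warmup))


if __name__ == "__main__":
    print("effective batch size:",
          effective_batch_size(TRAIN_BATCH_SIZE, GRADIENT_ACCUMULATION_STEPS))
    print("optimizer steps per epoch:", TOTAL_STEPS // NUM_EPOCHS)
    for step in (0, 92, 184, 437, 690):
        print(f"step {step:>3}: lr = {linear_lr(step):.3e}")
```

Under these assumptions, 690 steps over 15 epochs gives 46 optimizer steps per epoch, so the 184 warmup steps cover the first 4 epochs and the learning rate only reaches its peak of 1e-05 at step 184.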

Training results

Training Loss   Epoch   Step   Validation Loss
No log          0.99    46     42.8265
36.8265         1.99    92     41.8626
36.8265         2.98    138    39.9479
35.1011         3.97    184    37.2276
35.1011         4.97    230    33.5552
28.7673         5.96    276    25.3570
28.7673         6.95    322    22.8463
20.3737         7.95    368    22.0063
20.3737         8.94    414    21.5694
19.2477         9.93    460    21.3303
19.2477         10.93   506    21.1698
18.9724         11.92   552    21.0922
18.9724         12.91   598    21.0487
18.9072         13.91   644    21.0365
18.9072         14.90   690    21.0330
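
To make the trend in these numbers easier to read, the sketch below computes the best evaluation step and the per-evaluation drop in validation loss. The rows are copied from the results above; the helper names and the 0.1 threshold used to flag a plateau are illustrative assumptions, not part of the original training setup.

```python
# (epoch, step, validation_loss) rows copied from the training results.
RESULTS = [
    (0.99, 46, 42.8265),
    (1.99, 92, 41.8626),
    (2.98, 138, 39.9479),
    (3.97, 184, 37.2276),
    (4.97, 230, 33.5552),
    (5.96, 276, 25.3570),
    (6.95, 322, 22.8463),
    (7.95, 368, 22.0063),
    (8.94, 414, 21.5694),
    (9.93, 460, 21.3303),
    (10.93, 506, 21.1698),
    (11.92, 552, 21.0922),
    (12.91, 598, 21.0487),
    (13.91, 644, 21.0365),
    (14.90, 690, 21.0330),
]


def best_checkpoint(rows):
    """Return the row with the lowest validation loss."""
    return min(rows, key=lambda row: row[2])


def per_eval_improvement(rows):
    """Return (step, drop in validation loss since the previous evaluation)."""
    return [(cur[1], round(prev[2] - cur[2], 4))
            for prev, cur in zip(rows, rows[1:])]


def first_plateau_step(rows, threshold=0.1):
    """First step whose improvement falls below `threshold` (hypothetical cutoff)."""
    for step, gain in per_eval_improvement(rows):
        if gain < threshold:
            return step
    return None


if __name__ == "__main__":
    print("best checkpoint:", best_checkpoint(RESULTS))
    print("first step with improvement below 0.1:", first_plateau_step(RESULTS))
    for step, gain in per_eval_improvement(RESULTS):
        print(f"step {step:>3}: validation loss dropped by {gain:.4f}")
```

With these rows, the lowest validation loss is the final one (step 690), the largest single drop occurs between steps 230 and 276, and from step 552 onward each evaluation improves on the previous one by less than 0.1.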

Framework versions

  • Transformers 4.35.2
  • Pytorch 2.1.0+cu118
  • Datasets 2.15.0
  • Tokenizers 0.15.0

Model size

  • Parameters: 783M
  • Tensor type: BF16
  • Weights format: Safetensors