flan-t5-large-da-multiwoz_250

The auto-generated card records no training dataset (the dataset field is "None") and describes the model as trained from scratch. It achieves the following results on the evaluation set:

  • Loss: 0.3959
  • Accuracy: 38.8681
  • Num: 3689
  • Gen Len: 15.6736

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 8
  • eval_batch_size: 24
  • seed: 1799
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 10
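
To make the linear scheduler concrete, here is a minimal sketch of how the learning rate would decay over the run. It is not the training code behind this card. The steps-per-epoch figure is an assumption inferred from the training log (step 200 is logged at epoch 0.93, which gives roughly 215 steps per epoch), and zero warmup is assumed because the card lists none.

```python
# Hyperparameters copied from the list above.
LEARNING_RATE = 2e-05
NUM_EPOCHS = 10

# Assumption: about 215 optimizer steps per epoch, inferred from the
# training log (step 200 at epoch 0.93). The card does not state this.
STEPS_PER_EPOCH = 215
TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS

# Assumption: no warmup, since the card lists no warmup setting.
WARMUP_STEPS = 0


def linear_lr(step: int) -> float:
    """Learning rate at an optimizer step under linear warmup then linear decay."""
    if step < WARMUP_STEPS:
        return LEARNING_RATE * step / max(1, WARMUP_STEPS)
    remaining = max(0, TOTAL_STEPS - step)
    return LEARNING_RATE * remaining / max(1, TOTAL_STEPS - WARMUP_STEPS)


for step in (0, 1000, 2000, TOTAL_STEPS):
    print(f"step {step:>4}: lr = {linear_lr(step):.3e}")
```

Under these assumptions the rate starts at 2e-05, reaches half that value at the midpoint of training, and falls to zero at the final step.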

Training results

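| Training Loss | Epoch | Step | Validation Loss | Accuracy | Num  | Gen Len |
|---------------|-------|------|-----------------|----------|------|---------|
| 0.4158        | 0.93  | 200  | 0.4439          | 34.537   | 3689 | 15.8452 |
| 0.3487        | 1.86  | 400  | 0.4358          | 35.7656  | 3689 | 15.6495 |
| 0.3596        | 2.79  | 600  | 0.4304          | 35.4046  | 3689 | 14.8946 |
| 0.3676        | 3.72  | 800  | 0.4186          | 36.5036  | 3689 | 15.0016 |
| 0.4259        | 4.65  | 1000 | 0.4082          | 36.491   | 3689 | 15.4118 |
| 0.4005        | 5.58  | 1200 | 0.4039          | 37.4827  | 3689 | 15.8615 |
| 0.3922        | 6.51  | 1400 | 0.4009          | 38.1076  | 3689 | 15.4286 |
| 0.3656        | 7.44  | 1600 | 0.3998          | 38.8275  | 3689 | 15.7021 |
| 0.3709        | 8.37  | 1800 | 0.3959          | 38.8681  | 3689 | 15.6736 |
| 0.3564        | 9.3   | 2000 | 0.3981          | 38.6742  | 3689 | 15.8406 |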
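
The headline results at the top of the card can be cross-checked against this log. The sketch below copies the rows into a small Python structure and picks the evaluation with the lowest validation loss. The names `EvalRow` and `best_by_validation_loss` are illustrative and not part of any library.

```python
from collections import namedtuple

# Illustrative record type; field order follows the table columns.
EvalRow = namedtuple(
    "EvalRow", "train_loss epoch step val_loss accuracy num gen_len"
)

# Rows copied verbatim from the training results above.
LOG = [
    EvalRow(0.4158, 0.93, 200, 0.4439, 34.537, 3689, 15.8452),
    EvalRow(0.3487, 1.86, 400, 0.4358, 35.7656, 3689, 15.6495),
    EvalRow(0.3596, 2.79, 600, 0.4304, 35.4046, 3689, 14.8946),
    EvalRow(0.3676, 3.72, 800, 0.4186, 36.5036, 3689, 15.0016),
    EvalRow(0.4259, 4.65, 1000, 0.4082, 36.491, 3689, 15.4118),
    EvalRow(0.4005, 5.58, 1200, 0.4039, 37.4827, 3689, 15.8615),
    EvalRow(0.3922, 6.51, 1400, 0.4009, 38.1076, 3689, 15.4286),
    EvalRow(0.3656, 7.44, 1600, 0.3998, 38.8275, 3689, 15.7021),
    EvalRow(0.3709, 8.37, 1800, 0.3959, 38.8681, 3689, 15.6736),
    EvalRow(0.3564, 9.3, 2000, 0.3981, 38.6742, 3689, 15.8406),
]


def best_by_validation_loss(rows):
    """Return the logged evaluation with the lowest validation loss."""
    return min(rows, key=lambda row: row.val_loss)


best = best_by_validation_loss(LOG)
print(f"step {best.step}: val_loss={best.val_loss}, accuracy={best.accuracy}")
```

The selected row is step 1800, whose loss, accuracy, and generation length match the results reported at the top of the card. The final logged evaluation at step 2000 is slightly worse on both loss and accuracy, which suggests the reported numbers come from the best checkpoint rather than the last one, although the card does not say so.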

Framework versions

  • Transformers 4.18.0
  • PyTorch 1.10.0+cu111
  • Datasets 2.5.1
  • Tokenizers 0.12.1