
flan-t5-large-da-multiwoz2.1_fs0.1

This model is a fine-tuned version of google/flan-t5-large on an unspecified dataset (the model name suggests MultiWOZ 2.1 dialogue-act data with a 0.1 few-shot fraction). It achieves the following results on the evaluation set; a minimal loading sketch follows the results:

  • Loss: 0.3373
  • Accuracy: 43.2245
  • Num: 3689
  • Gen Len: 15.3058
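As a minimal sketch of how this checkpoint could be loaded for inference: the repository id below simply mirrors the card title and the prompt is illustrative only; both are assumptions rather than details taken from the card.

```python
# Minimal inference sketch. Assumptions: the repository id mirrors the card
# title (the actual hub path may differ), and the prompt format for
# dialogue-act generation is illustrative only.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_id = "flan-t5-large-da-multiwoz2.1_fs0.1"  # hypothetical repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

inputs = tokenizer("user: I need a cheap hotel in the centre.", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```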

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a hedged mapping onto Seq2SeqTrainingArguments is sketched after the list):

  • learning_rate: 2e-05
  • train_batch_size: 8
  • eval_batch_size: 24
  • seed: 1799
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 10
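A hedged sketch of how the values above would map onto transformers.Seq2SeqTrainingArguments; the output_dir and any settings not listed on the card (such as evaluation/save cadence and generation during evaluation) are assumptions.

```python
# Hedged training-configuration sketch: only the values listed above come from
# the card; output_dir and predict_with_generate are assumptions.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="flan-t5-large-da-multiwoz2.1_fs0.1",  # assumed output path
    learning_rate=2e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=24,
    seed=1799,
    num_train_epochs=10,
    lr_scheduler_type="linear",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    predict_with_generate=True,  # assumed, since Gen Len is reported
)
```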

Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy | Num  | Gen Len |
|:-------------:|:-----:|:----:|:---------------:|:--------:|:----:|:-------:|
| 0.9683        | 0.56  | 400  | 0.4608          | 31.0524  | 3689 | 14.9279 |
| 0.5046        | 1.13  | 800  | 0.4028          | 35.7175  | 3689 | 15.1651 |
| 0.4488        | 1.69  | 1200 | 0.3803          | 36.4952  | 3689 | 16.3375 |
| 0.4267        | 2.25  | 1600 | 0.3613          | 38.4613  | 3689 | 15.2646 |
| 0.4003        | 2.81  | 2000 | 0.3538          | 39.8281  | 3689 | 15.5842 |
| 0.3862        | 3.38  | 2400 | 0.3497          | 40.0593  | 3689 | 15.2356 |
| 0.3729        | 3.94  | 2800 | 0.3433          | 40.857   | 3689 | 15.9675 |
| 0.3632        | 4.5   | 3200 | 0.3457          | 41.157   | 3689 | 15.8818 |
| 0.3534        | 5.06  | 3600 | 0.3367          | 42.9369  | 3689 | 15.7314 |
| 0.3432        | 5.63  | 4000 | 0.3358          | 41.9514  | 3689 | 15.7173 |
| 0.3395        | 6.19  | 4400 | 0.3373          | 43.2245  | 3689 | 15.3058 |
| 0.3345        | 6.75  | 4800 | 0.3351          | 42.4941  | 3689 | 14.8916 |
| 0.3266        | 7.31  | 5200 | 0.3360          | 42.9742  | 3689 | 15.7124 |
| 0.3233        | 7.88  | 5600 | 0.3327          | 43.1362  | 3689 | 15.9379 |

Framework versions

  • Transformers 4.18.0
  • Pytorch 1.10.0+cu111
  • Datasets 2.5.1
  • Tokenizers 0.12.1
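To check a local environment against the versions listed above, a small sketch (version strings copied from this card):

```python
# Compare installed library versions against those listed on this card.
import datasets
import tokenizers
import torch
import transformers

expected = {
    "transformers": "4.18.0",
    "torch": "1.10.0+cu111",
    "datasets": "2.5.1",
    "tokenizers": "0.12.1",
}
installed = {
    "transformers": transformers.__version__,
    "torch": torch.__version__,
    "datasets": datasets.__version__,
    "tokenizers": tokenizers.__version__,
}
for name, want in expected.items():
    status = "OK" if installed[name] == want else "differs"
    print(f"{name}: installed {installed[name]}, card lists {want} ({status})")
```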