
flan-t5-large-da-multiwoz2.1_fs0.05

This model is a fine-tuned version of google/flan-t5-large. The training dataset is not recorded in the card metadata; the model name suggests a 5% few-shot split of MultiWOZ 2.1 dialogue-act data. It achieves the following results on the evaluation set:

  • Loss: 0.3605
  • Accuracy: 40.7025
  • Num: 3689
  • Gen Len: 15.8311
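
A minimal inference sketch follows, assuming the checkpoint is published on the Hugging Face Hub under a repo id such as "user/flan-t5-large-da-multiwoz2.1_fs0.05" (the repo id and the prompt format are placeholders; the exact input format used in training is not documented in this card).

```python
# Minimal usage sketch for a fine-tuned flan-t5-large checkpoint.
# NOTE: "user/flan-t5-large-da-multiwoz2.1_fs0.05" is a hypothetical repo id.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "user/flan-t5-large-da-multiwoz2.1_fs0.05"  # placeholder repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# Illustrative dialogue-context prompt; the true prompt template is not given in the card.
prompt = "I need a cheap restaurant in the centre of town."
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_length=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```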

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 8
  • eval_batch_size: 24
  • seed: 1799
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 10
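
The sketch below shows how the hyperparameters listed above could map onto Hugging Face training arguments, assuming the model was trained with the Seq2Seq Trainer API; the card does not state the actual training script, so this is illustrative rather than the authors' code.

```python
# Hypothetical mapping of the listed hyperparameters onto Seq2SeqTrainingArguments.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="flan-t5-large-da-multiwoz2.1_fs0.05",  # placeholder output path
    learning_rate=2e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=24,
    seed=1799,
    lr_scheduler_type="linear",
    num_train_epochs=10,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    predict_with_generate=True,  # required for generation-based metrics such as Gen Len
)
```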

Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy | Num  | Gen Len |
|:-------------:|:-----:|:----:|:---------------:|:--------:|:----:|:-------:|
| 0.9665        | 1.1   | 400  | 0.4662          | 32.999   | 3689 | 15.8777 |
| 0.5021        | 2.2   | 800  | 0.4089          | 35.1765  | 3689 | 15.0667 |
| 0.4467        | 3.3   | 1200 | 0.3839          | 35.8487  | 3689 | 15.7387 |
| 0.4066        | 4.4   | 1600 | 0.3751          | 37.5848  | 3689 | 14.9287 |
| 0.3907        | 5.49  | 2000 | 0.3654          | 38.7608  | 3689 | 15.4836 |
| 0.3644        | 6.59  | 2400 | 0.3620          | 40.1207  | 3689 | 15.154  |
| 0.3623        | 7.69  | 2800 | 0.3605          | 40.7025  | 3689 | 15.8311 |
| 0.3501        | 8.79  | 3200 | 0.3592          | 40.4075  | 3689 | 15.6419 |
| 0.3436        | 9.89  | 3600 | 0.3601          | 40.614   | 3689 | 15.7143 |

Framework versions

  • Transformers 4.18.0
  • Pytorch 1.10.0+cu111
  • Datasets 2.5.1
  • Tokenizers 0.12.1