
flan_merged1

This model is a fine-tuned version of google/flan-t5-base; the training dataset is not specified. At the final evaluation (step 5600, epoch 9.84) it achieves the following results on the evaluation set:

  • Loss: 0.1597
  • Rouge1: 66.8856
  • Rouge2: 55.6869
  • Rougel: 63.8241
  • Rougelsum: 66.7005
  • Gen Len: 16.3392
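To make the ROUGE numbers above easier to read, here is a minimal sketch of what Rouge1 measures: the F1 score of unigram overlap between a generated text and its reference, reported on a 0-100 scale. This is a simplified illustration, not the metric code used for this card; the reported values were presumably produced by a standard ROUGE package, which may also apply stemming and different tokenization. The function name `rouge1_f1` is hypothetical.

```python
from collections import Counter


def rouge1_f1(prediction: str, reference: str) -> float:
    """Simplified ROUGE-1: unigram-overlap F1 on a 0-100 scale.

    Assumption: lowercased whitespace tokenization, no stemming.
    """
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    if not pred_tokens or not ref_tokens:
        return 0.0
    # Clipped overlap: a token counts at most as often as it appears in both texts.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 100 * 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    print(rouge1_f1("the cat", "the cat sat on the mat"))
```

Rouge2 applies the same idea to bigrams, and Rougel / Rougelsum are based on the longest common subsequence rather than n-gram counts.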

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 16
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 200
  • num_epochs: 10
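As an illustration of how the linear scheduler and the 200 warmup steps listed above interact, the following sketch computes the learning rate at a given optimizer step: a linear ramp from 0 to the peak rate during warmup, then a linear decay to 0 at the end of training. The total step count of 5690 is an assumption, estimated from roughly 569 steps per epoch in the training results over 10 epochs; the card does not state it. The function name `linear_lr` is hypothetical.

```python
def linear_lr(step: int,
              peak_lr: float = 5e-5,     # learning_rate from the card
              warmup_steps: int = 200,   # lr_scheduler_warmup_steps from the card
              total_steps: int = 5690    # ASSUMPTION: ~569 steps/epoch * 10 epochs
              ) -> float:
    """Learning rate under a linear schedule with warmup."""
    if step < warmup_steps:
        # Warmup phase: ramp linearly from 0 up to peak_lr.
        return peak_lr * step / warmup_steps
    # Decay phase: fall linearly from peak_lr to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / (total_steps - warmup_steps)


if __name__ == "__main__":
    for s in (0, 100, 200, 2945, 5690):
        print(s, linear_lr(s))
```

Under these assumptions the rate peaks at 5e-05 at step 200 and is halfway through its decay at step 2945.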

Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len |
|---|---|---|---|---|---|---|---|---|
| 11.8095 | 0.35 | 200 | 0.5275 | 38.2792 | 29.3331 | 37.9276 | 38.1283 | 8.0624 |
| 0.4481 | 0.7 | 400 | 0.3046 | 64.4437 | 52.3632 | 62.0225 | 64.2515 | 16.4262 |
| 0.3616 | 1.05 | 600 | 0.2656 | 64.9871 | 53.1185 | 62.4919 | 64.739 | 16.4279 |
| 0.2944 | 1.41 | 800 | 0.2412 | 65.2117 | 53.5512 | 62.6779 | 64.9318 | 16.4464 |
| 0.264 | 1.76 | 1000 | 0.2295 | 65.5748 | 54.0948 | 62.9803 | 65.3339 | 16.3866 |
| 0.2571 | 2.11 | 1200 | 0.2223 | 65.7216 | 53.793 | 62.9877 | 65.491 | 16.1898 |
| 0.2364 | 2.46 | 1400 | 0.2164 | 65.5444 | 53.9296 | 62.9975 | 65.3055 | 16.3172 |
| 0.2293 | 2.81 | 1600 | 0.2029 | 65.7977 | 54.3067 | 63.1851 | 65.5544 | 16.1766 |
| 0.2129 | 3.16 | 1800 | 0.2006 | 65.8342 | 53.9105 | 63.163 | 65.6175 | 16.1757 |
| 0.2184 | 3.51 | 2000 | 0.1931 | 65.1608 | 53.7707 | 62.6719 | 64.9743 | 16.1547 |
| 0.1952 | 3.87 | 2200 | 0.1873 | 66.3361 | 54.8382 | 63.2054 | 66.0954 | 16.3155 |
| 0.1992 | 4.22 | 2400 | 0.1847 | 66.316 | 55.0379 | 63.5154 | 66.0694 | 16.3594 |
| 0.1873 | 4.57 | 2600 | 0.1811 | 66.4999 | 55.263 | 63.8319 | 66.2513 | 16.3146 |
| 0.1839 | 4.92 | 2800 | 0.1783 | 66.0055 | 54.3406 | 62.9554 | 65.7387 | 16.3304 |
| 0.1748 | 5.27 | 3000 | 0.1777 | 66.1592 | 54.8048 | 63.407 | 66.0067 | 16.3348 |
| 0.1844 | 5.62 | 3200 | 0.1736 | 66.7642 | 55.3404 | 63.7069 | 66.5324 | 16.2996 |
| 0.1745 | 5.98 | 3400 | 0.1698 | 66.3946 | 55.1716 | 63.5596 | 66.1663 | 16.3216 |
| 0.1739 | 6.33 | 3600 | 0.1678 | 66.4472 | 55.1785 | 63.602 | 66.2704 | 16.3049 |
| 0.1633 | 6.68 | 3800 | 0.1680 | 66.6666 | 55.4584 | 63.8058 | 66.4708 | 16.3445 |
| 0.1659 | 7.03 | 4000 | 0.1682 | 66.6592 | 55.3712 | 63.5841 | 66.4587 | 16.2953 |
| 0.1557 | 7.38 | 4200 | 0.1634 | 66.876 | 55.423 | 63.8431 | 66.5569 | 16.2434 |
| 0.158 | 7.73 | 4400 | 0.1622 | 66.6165 | 55.2948 | 63.5996 | 66.4314 | 16.3849 |
| 0.1647 | 8.08 | 4600 | 0.1622 | 66.7592 | 55.5552 | 63.7194 | 66.5229 | 16.2794 |
| 0.1579 | 8.44 | 4800 | 0.1614 | 66.7889 | 55.5768 | 63.8266 | 66.5511 | 16.3181 |
| 0.1526 | 8.79 | 5000 | 0.1610 | 66.7516 | 55.5383 | 63.6509 | 66.5754 | 16.261 |
| 0.1506 | 9.14 | 5200 | 0.1608 | 66.9266 | 55.6277 | 63.7712 | 66.6668 | 16.3445 |
| 0.1502 | 9.49 | 5400 | 0.1604 | 66.9759 | 55.6586 | 63.8856 | 66.7849 | 16.3251 |
| 0.158 | 9.84 | 5600 | 0.1597 | 66.8856 | 55.6869 | 63.8241 | 66.7005 | 16.3392 |
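The Epoch and Step columns above allow a rough estimate of the training set size, which the card does not report. The sketch below divides a logged step by its epoch value to get steps per epoch, then multiplies by train_batch_size. It assumes a single training device and no gradient accumulation, neither of which the card confirms, so the result is only an approximation. The function names are hypothetical.

```python
def estimate_steps_per_epoch(step: int, epoch: float) -> float:
    """Optimizer steps per epoch implied by one logged (step, epoch) pair."""
    return step / epoch


def estimate_train_examples(step: int, epoch: float, batch_size: int = 8) -> float:
    """Approximate training set size.

    ASSUMPTION: one device and gradient_accumulation_steps == 1, so each
    optimizer step consumes exactly `batch_size` examples.
    """
    return estimate_steps_per_epoch(step, epoch) * batch_size


if __name__ == "__main__":
    # Last logged row: step 5600 at epoch 9.84. The latest row is used because
    # rounding of the epoch value matters least there.
    print(round(estimate_steps_per_epoch(5600, 9.84)))   # steps per epoch
    print(round(estimate_train_examples(5600, 9.84)))    # approximate examples
```

With the last logged row this gives about 569 steps per epoch, or on the order of 4,500 training examples under the stated assumptions.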

Framework versions

  • Transformers 4.34.0
  • PyTorch 2.0.1+cu117
  • Datasets 2.14.4
  • Tokenizers 0.14.0

Model tree for tanvirsrbd1/flan_merged1: fine-tuned from google/flan-t5-base