
selector-flant5-large

This model is a fine-tuned version of google/flan-t5-large on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.2830
  • Rouge1: 82.6601
  • Rouge2: 30.2938
  • Rougel: 65.3958
  • Rougelsum: 65.4056
  • Gen Len: 8.8039
  • Top1 Acc: 0.7157
  • Top5 Acc: 0.8446
  • Diversity: 0.9357
  • Diversity Acc Score: 0.7903
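
The snippet below is a minimal inference sketch for this checkpoint. The repo id / local path `"selector-flant5-large"` is a placeholder, and the expected prompt format is not documented in this card, so adjust both to your setup.

```python
# Minimal inference sketch; the checkpoint path and input text are placeholders,
# since the training data and prompt format are not documented in this card.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_name = "selector-flant5-large"  # hypothetical path to the fine-tuned checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

inputs = tokenizer("your input text here", return_tensors="pt")
# Generated outputs average roughly 9 tokens on the evaluation set (Gen Len above),
# so a modest max_new_tokens is enough.
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```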

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 16
  • eval_batch_size: 32
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 64
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 10.0
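
As a rough guide, the hyperparameters above map onto `Seq2SeqTrainingArguments` as sketched below; the output directory is a placeholder, and the 50-step evaluation cadence is inferred from the training results table rather than stated explicitly in the card.

```python
# Sketch of the training configuration implied by the hyperparameters above
# (transformers 4.36); output_dir and eval cadence are assumptions.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="selector-flant5-large",  # hypothetical output path
    learning_rate=5e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=32,
    gradient_accumulation_steps=4,       # gives the effective train batch size of 64
    lr_scheduler_type="linear",
    num_train_epochs=10.0,
    seed=42,
    predict_with_generate=True,          # needed for generation-based metrics (ROUGE, Gen Len)
    evaluation_strategy="steps",
    eval_steps=50,                       # matches the 50-step evaluation rows below
)
```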

Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1  | Rouge2  | Rougel  | Rougelsum | Gen Len | Top1 Acc | Top5 Acc | Diversity | Diversity Acc Score |
|:-------------:|:-----:|:----:|:---------------:|:-------:|:-------:|:-------:|:---------:|:-------:|:--------:|:--------:|:---------:|:-------------------:|
| 0.6536        | 0.19  | 50   | 0.5739          | 57.8899 | 12.6047 | 51.8718 | 51.8989   | 7.8124  | 0.2915   | 0.4619   | 0.9972    | 0.4606              |
| 0.56          | 0.38  | 100  | 0.5001          | 63.0013 | 15.4523 | 54.5133 | 54.5485   | 8.2090  | 0.3656   | 0.5645   | 0.9974    | 0.5630              |
| 0.522         | 0.57  | 150  | 0.4495          | 65.722  | 18.757  | 55.0562 | 55.0696   | 9.0559  | 0.4405   | 0.6118   | 0.9992    | 0.6113              |
| 0.4749        | 0.77  | 200  | 0.3975          | 70.0759 | 21.0497 | 57.4729 | 57.4872   | 8.3760  | 0.5239   | 0.6964   | 0.9992    | 0.6959              |
| 0.4317        | 0.96  | 250  | 0.3667          | 72.4745 | 22.2432 | 58.5635 | 58.5311   | 8.6627  | 0.5616   | 0.7450   | 0.9991    | 0.7443              |
| 0.4081        | 1.15  | 300  | 0.3589          | 73.4852 | 24.0508 | 59.7434 | 59.7093   | 8.7351  | 0.5842   | 0.7550   | 0.9973    | 0.7530              |
| 0.3589        | 1.34  | 350  | 0.3479          | 74.4994 | 24.7767 | 60.1356 | 60.1576   | 8.7301  | 0.6072   | 0.7534   | 0.9973    | 0.7513              |
| 0.3577        | 1.53  | 400  | 0.3252          | 75.9238 | 25.328  | 61.0871 | 61.0974   | 8.8740  | 0.6210   | 0.7772   | 0.9964    | 0.7744              |
| 0.3505        | 1.72  | 450  | 0.3231          | 77.1227 | 25.5898 | 61.409  | 61.3954   | 8.6634  | 0.6445   | 0.7889   | 0.9945    | 0.7846              |
| 0.3481        | 1.92  | 500  | 0.3086          | 77.5017 | 26.6018 | 61.8965 | 61.8749   | 8.8025  | 0.6487   | 0.8015   | 0.9920    | 0.7951              |
| 0.3162        | 2.11  | 550  | 0.3195          | 76.978  | 27.1217 | 61.65   | 61.6392   | 8.6472  | 0.6508   | 0.8032   | 0.9841    | 0.7904              |
| 0.3186        | 2.3   | 600  | 0.3062          | 79.0951 | 28.0151 | 63.4889 | 63.5022   | 8.9543  | 0.6654   | 0.8086   | 0.9772    | 0.7902              |
| 0.3036        | 2.49  | 650  | 0.2941          | 79.359  | 28.0151 | 63.4979 | 63.5317   | 8.9473  | 0.6688   | 0.8111   | 0.9887    | 0.8020              |
| 0.3181        | 2.68  | 700  | 0.2947          | 80.2598 | 28.3396 | 63.7071 | 63.7019   | 8.8731  | 0.6813   | 0.8145   | 0.9834    | 0.8010              |
| 0.3083        | 2.87  | 750  | 0.2854          | 80.6082 | 29.1806 | 64.2514 | 64.3269   | 8.7734  | 0.6889   | 0.8262   | 0.9802    | 0.8098              |
| 0.2711        | 3.07  | 800  | 0.2958          | 79.1907 | 28.2873 | 63.1109 | 63.0771   | 8.8833  | 0.6738   | 0.8132   | 0.9802    | 0.7971              |
| 0.2784        | 3.26  | 850  | 0.2885          | 80.383  | 29.7334 | 64.3104 | 64.3013   | 8.9711  | 0.6872   | 0.8229   | 0.9745    | 0.8018              |
| 0.2989        | 3.45  | 900  | 0.2843          | 80.7629 | 29.0655 | 64.2694 | 64.2597   | 8.7716  | 0.6910   | 0.8258   | 0.9690    | 0.8002              |
| 0.2802        | 3.64  | 950  | 0.2852          | 80.8877 | 29.4312 | 64.5336 | 64.5501   | 8.8755  | 0.6863   | 0.8241   | 0.9686    | 0.7982              |
| 0.26          | 3.83  | 1000 | 0.2814          | 81.6469 | 29.7606 | 65.0097 | 65.0255   | 8.8672  | 0.6989   | 0.8304   | 0.9541    | 0.7923              |
| 0.2624        | 4.02  | 1050 | 0.2867          | 81.5126 | 29.2881 | 64.6202 | 64.661    | 9.1724  | 0.6922   | 0.8250   | 0.9460    | 0.7804              |
| 0.2416        | 4.21  | 1100 | 0.2832          | 82.226  | 29.256  | 65.1371 | 65.1636   | 8.7263  | 0.7044   | 0.8291   | 0.9379    | 0.7777              |
| 0.2654        | 4.41  | 1150 | 0.2879          | 81.7831 | 28.8072 | 64.4422 | 64.4486   | 8.9528  | 0.6977   | 0.8304   | 0.9344    | 0.7759              |
| 0.2578        | 4.6   | 1200 | 0.2836          | 81.8936 | 29.0194 | 64.7299 | 64.7349   | 9.0633  | 0.6993   | 0.8304   | 0.9426    | 0.7828              |
| 0.2762        | 4.79  | 1250 | 0.2756          | 81.9006 | 29.6943 | 65.0267 | 64.99     | 8.8554  | 0.7023   | 0.8333   | 0.9531    | 0.7942              |
| 0.2552        | 4.98  | 1300 | 0.2745          | 81.8394 | 29.6434 | 65.0188 | 65.0416   | 8.9267  | 0.7002   | 0.8346   | 0.9606    | 0.8017              |
| 0.2343        | 5.17  | 1350 | 0.2820          | 82.1941 | 29.6343 | 64.9266 | 64.9399   | 8.7410  | 0.7090   | 0.8388   | 0.9342    | 0.7836              |
| 0.2412        | 5.36  | 1400 | 0.2815          | 81.4294 | 29.2504 | 64.4476 | 64.4682   | 8.7906  | 0.6968   | 0.8375   | 0.9513    | 0.7967              |
| 0.2286        | 5.56  | 1450 | 0.2780          | 82.237  | 29.1436 | 64.9086 | 64.894    | 8.7483  | 0.7056   | 0.8379   | 0.9508    | 0.7967              |
| 0.2448        | 5.75  | 1500 | 0.2761          | 82.3455 | 29.5652 | 65.1557 | 65.1937   | 8.6865  | 0.7090   | 0.8413   | 0.9511    | 0.8001              |
| 0.2403        | 5.94  | 1550 | 0.2784          | 82.1858 | 30.3783 | 65.4133 | 65.4379   | 8.9642  | 0.7052   | 0.8350   | 0.9522    | 0.7951              |
| 0.228         | 6.13  | 1600 | 0.2812          | 82.1667 | 29.8374 | 65.1285 | 65.1456   | 8.9377  | 0.7044   | 0.8384   | 0.9332    | 0.7824              |
| 0.2194        | 6.32  | 1650 | 0.2777          | 82.5985 | 29.903  | 65.3189 | 65.3435   | 8.9079  | 0.7102   | 0.8392   | 0.9319    | 0.7821              |
| 0.2114        | 6.51  | 1700 | 0.2881          | 82.6706 | 30.2596 | 65.4771 | 65.478    | 9.0451  | 0.7102   | 0.8325   | 0.9253    | 0.7703              |
| 0.2175        | 6.7   | 1750 | 0.2817          | 82.4316 | 29.7131 | 65.0899 | 65.0778   | 8.9229  | 0.7115   | 0.8396   | 0.9329    | 0.7833              |
| 0.2319        | 6.9   | 1800 | 0.2769          | 82.2516 | 29.6531 | 65.0499 | 65.0305   | 8.9444  | 0.7069   | 0.8400   | 0.9418    | 0.7911              |
| 0.2169        | 7.09  | 1850 | 0.2819          | 82.3565 | 29.7711 | 65.166  | 65.1746   | 9.0419  | 0.7081   | 0.8367   | 0.9327    | 0.7803              |
| 0.2078        | 7.28  | 1900 | 0.2835          | 82.539  | 30.453  | 65.4308 | 65.4135   | 8.9267  | 0.7119   | 0.8379   | 0.9261    | 0.7760              |
| 0.2162        | 7.47  | 1950 | 0.2867          | 82.5227 | 29.7564 | 65.1538 | 65.2082   | 8.9329  | 0.7102   | 0.8392   | 0.9309    | 0.7812              |
| 0.2301        | 7.66  | 2000 | 0.2830          | 82.6601 | 30.2938 | 65.3958 | 65.4056   | 8.8039  | 0.7157   | 0.8446   | 0.9357    | 0.7903              |
| 0.2085        | 7.85  | 2050 | 0.2821          | 82.8141 | 30.1884 | 65.4939 | 65.4672   | 8.9246  | 0.7140   | 0.8392   | 0.9410    | 0.7896              |
| 0.2046        | 8.05  | 2100 | 0.2814          | 82.9831 | 29.9009 | 65.3683 | 65.3521   | 8.9581  | 0.7165   | 0.8434   | 0.9337    | 0.7874              |
| 0.2081        | 8.24  | 2150 | 0.2832          | 82.7936 | 29.8332 | 65.3755 | 65.3981   | 8.8539  | 0.7140   | 0.8438   | 0.9405    | 0.7936              |
| 0.2122        | 8.43  | 2200 | 0.2813          | 82.8109 | 29.6413 | 65.1439 | 65.1444   | 8.8894  | 0.7178   | 0.8430   | 0.9346    | 0.7878              |
| 0.2003        | 8.62  | 2250 | 0.2845          | 83.034  | 30.4788 | 65.7633 | 65.73     | 8.8817  | 0.7173   | 0.8405   | 0.9271    | 0.7791              |
| 0.2006        | 8.81  | 2300 | 0.2824          | 82.723  | 30.6002 | 65.6739 | 65.6627   | 9.0448  | 0.7115   | 0.8396   | 0.9334    | 0.7837              |
| 0.1969        | 9.0   | 2350 | 0.2831          | 82.7251 | 29.9379 | 65.3631 | 65.3789   | 8.9978  | 0.7123   | 0.8421   | 0.9300    | 0.7832              |
| 0.2116        | 9.2   | 2400 | 0.2884          | 82.8682 | 29.9749 | 65.4914 | 65.4569   | 8.9931  | 0.7136   | 0.8417   | 0.9221    | 0.7761              |
| 0.1855        | 9.39  | 2450 | 0.2865          | 82.9214 | 29.7397 | 65.3115 | 65.3068   | 8.9147  | 0.7169   | 0.8446   | 0.9213    | 0.7781              |
| 0.1868        | 9.58  | 2500 | 0.2884          | 82.6232 | 30.0851 | 65.3581 | 65.3426   | 8.8910  | 0.7136   | 0.8425   | 0.9215    | 0.7764              |
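
The ROUGE and Gen Len columns above are consistent with the standard `evaluate`-based `compute_metrics` callback sketched below; the Top1/Top5 accuracy and diversity scores appear to be custom metrics and are not reconstructed here.

```python
# Sketch of a compute_metrics callback that would produce the ROUGE and Gen Len
# columns above; the accuracy and diversity metrics are custom and omitted.
import evaluate
import numpy as np
from transformers import AutoTokenizer

rouge = evaluate.load("rouge")
tokenizer = AutoTokenizer.from_pretrained("google/flan-t5-large")

def compute_metrics(eval_preds):
    preds, labels = eval_preds
    # Replace the -100 padding used for labels (and, in some setups, predictions)
    # with the pad token id before decoding.
    preds = np.where(preds != -100, preds, tokenizer.pad_token_id)
    labels = np.where(labels != -100, labels, tokenizer.pad_token_id)
    decoded_preds = tokenizer.batch_decode(preds, skip_special_tokens=True)
    decoded_labels = tokenizer.batch_decode(labels, skip_special_tokens=True)

    result = rouge.compute(
        predictions=decoded_preds, references=decoded_labels, use_stemmer=True
    )
    # Report ROUGE on a 0-100 scale, matching the columns above.
    result = {k: round(v * 100, 4) for k, v in result.items()}
    # Average generated length ("Gen Len").
    result["gen_len"] = float(
        np.mean([np.count_nonzero(p != tokenizer.pad_token_id) for p in preds])
    )
    return result
```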

Framework versions

  • Transformers 4.36.2
  • Pytorch 2.0.1
  • Datasets 2.18.0
  • Tokenizers 0.15.2