
flant5-small

This model is a fine-tuned version of google/flan-t5-small on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.2540
  • Rouge1: 39.4088
  • Rouge2: 17.6509
  • RougeL: 34.241
  • RougeLsum: 36.3257
  • Gen Len: 19.97

Model description

More information needed

Intended uses & limitations

More information needed
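
No uses are documented yet. As an interim aid, the sketch below shows basic inference with this checkpoint. It assumes a short-form summarization task, inferred from the ROUGE metrics and the ~20-token generation length above; the actual task and dataset are not stated on this card.

```python
# Minimal inference sketch (assumption: summarization-style task).
# The "summarize:" instruction prefix is a common FLAN-T5 convention,
# not something this card confirms.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_id = "dtruong46me/flant5-small"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

text = ("summarize: Hugging Face hosts model checkpoints that can be "
        "downloaded and run locally with the Transformers library.")
inputs = tokenizer(text, return_tensors="pt", truncation=True)
output_ids = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```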

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

  • learning_rate: 5e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 8
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 30
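
For reproducibility, these values map onto Transformers' `Seq2SeqTrainingArguments` roughly as sketched below. This is a reconstruction under stated assumptions, not the author's original training script; `output_dir` and `predict_with_generate` in particular are guesses.

```python
# Configuration sketch reconstructed from the hyperparameter list above.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="flant5-small",       # hypothetical output path
    learning_rate=5e-05,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    gradient_accumulation_steps=2,   # effective train batch size: 4 * 2 = 8
    num_train_epochs=30,
    lr_scheduler_type="linear",
    seed=42,
    # Adam defaults (betas=(0.9, 0.999), epsilon=1e-08) already match the card.
    predict_with_generate=True,      # assumption: needed to compute ROUGE during eval
)
```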

Training results

| Training Loss | Epoch | Step  | Validation Loss | Rouge1  | Rouge2  | RougeL  | RougeLsum | Gen Len |
|:-------------:|:-----:|:-----:|:---------------:|:-------:|:-------:|:-------:|:---------:|:-------:|
| 0.3676        | 1.0   | 1557  | 0.2753          | 36.6587 | 14.0807 | 31.2838 | 33.1779   | 19.942  |
| 0.3135        | 2.0   | 3115  | 0.2658          | 37.8343 | 15.2461 | 32.2261 | 34.1553   | 19.97   |
| 0.2992        | 3.0   | 4672  | 0.2596          | 38.3851 | 15.6982 | 32.5124 | 34.5772   | 19.942  |
| 0.2888        | 4.0   | 6230  | 0.2559          | 37.6648 | 15.1146 | 32.1953 | 34.2139   | 19.94   |
| 0.281         | 5.0   | 7787  | 0.2549          | 38.3654 | 15.8444 | 32.775  | 34.9156   | 19.952  |
| 0.2734        | 6.0   | 9345  | 0.2533          | 38.7474 | 16.0237 | 33.155  | 35.3048   | 19.954  |
| 0.2679        | 7.0   | 10902 | 0.2529          | 38.7094 | 16.1904 | 33.2149 | 35.2449   | 19.96   |
| 0.2619        | 8.0   | 12460 | 0.2528          | 39.034  | 16.4682 | 33.7757 | 35.82     | 19.968  |
| 0.2576        | 9.0   | 14017 | 0.2528          | 38.769  | 16.5015 | 33.3685 | 35.4211   | 19.948  |
| 0.253         | 10.0  | 15575 | 0.2523          | 38.5811 | 16.3423 | 33.2559 | 35.2143   | 19.956  |
| 0.2494        | 11.0  | 17132 | 0.2516          | 38.7084 | 16.5171 | 33.4486 | 35.5503   | 19.958  |
| 0.2456        | 12.0  | 18690 | 0.2514          | 38.3763 | 16.2338 | 33.1431 | 34.8647   | 19.964  |
| 0.2419        | 13.0  | 20247 | 0.2520          | 38.455  | 16.2491 | 32.9546 | 35.0263   | 19.972  |
| 0.2388        | 14.0  | 21805 | 0.2514          | 38.9372 | 17.1821 | 33.6449 | 35.5621   | 19.97   |
| 0.2363        | 15.0  | 23362 | 0.2530          | 38.9104 | 16.742  | 33.5194 | 35.3391   | 19.976  |
| 0.2336        | 16.0  | 24920 | 0.2519          | 38.8698 | 16.9396 | 33.7987 | 35.6173   | 19.958  |
| 0.2313        | 17.0  | 26477 | 0.2518          | 38.8774 | 17.0545 | 33.7151 | 35.6844   | 19.97   |
| 0.229         | 18.0  | 28035 | 0.2518          | 38.7073 | 16.7039 | 33.4976 | 35.4177   | 19.964  |
| 0.2272        | 19.0  | 29592 | 0.2522          | 39.0868 | 16.948  | 33.8953 | 35.8788   | 19.964  |
| 0.2252        | 20.0  | 31150 | 0.2527          | 38.7854 | 16.9882 | 33.8017 | 35.6314   | 19.968  |
| 0.2234        | 21.0  | 32707 | 0.2527          | 38.9196 | 17.1419 | 33.9139 | 35.8599   | 19.97   |
| 0.2217        | 22.0  | 34265 | 0.2532          | 38.9227 | 17.0561 | 33.8032 | 35.6876   | 19.968  |
| 0.2206        | 23.0  | 35822 | 0.2521          | 39.5234 | 17.6253 | 34.2157 | 36.2645   | 19.962  |
| 0.2198        | 24.0  | 37380 | 0.2532          | 39.6108 | 17.8336 | 34.3222 | 36.3369   | 19.964  |
| 0.2184        | 25.0  | 38937 | 0.2533          | 39.3052 | 17.2967 | 33.9684 | 36.0207   | 19.972  |
| 0.2173        | 26.0  | 40495 | 0.2536          | 39.019  | 17.3083 | 34.0561 | 35.9826   | 19.972  |
| 0.2166        | 27.0  | 42052 | 0.2532          | 39.2553 | 17.6306 | 34.1763 | 36.1479   | 19.974  |
| 0.2159        | 28.0  | 43610 | 0.2539          | 39.3659 | 17.6526 | 34.276  | 36.2856   | 19.972  |
| 0.2154        | 29.0  | 45167 | 0.2543          | 39.3868 | 17.5653 | 34.2637 | 36.2704   | 19.974  |
| 0.2152        | 29.99 | 46710 | 0.2540          | 39.4088 | 17.6509 | 34.241  | 36.3257   | 19.97   |
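
The ROUGE columns are reported on a 0-100 scale. A minimal sketch of how such scores are typically computed with the `evaluate` library follows; the card does not state the exact evaluation code, so treat this as an illustration rather than the author's method.

```python
# Illustrative ROUGE computation (assumption: the standard `evaluate` setup).
import evaluate

rouge = evaluate.load("rouge")
predictions = ["the cat sat on the mat"]       # hypothetical model outputs
references = ["a cat was sitting on the mat"]  # hypothetical reference summaries
scores = rouge.compute(predictions=predictions, references=references)
# Scale to 0-100 to match the table above (keys: rouge1, rouge2, rougeL, rougeLsum).
print({k: round(v * 100, 4) for k, v in scores.items()})
```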

Framework versions

  • Transformers 4.36.1
  • Pytorch 2.1.2
  • Datasets 2.19.2
  • Tokenizers 0.15.2