# flan-t5-base-tags
This model was trained from scratch on an unknown dataset. It achieves the following results on the evaluation set:
- Loss: 2.3182
- Rouge1: 29.7874
- Rouge2: 11.1641
- Rougel: 24.3761
- Rougelsum: 24.4131
- Meteor: 23.2667
- Bertscore: 86.8649
- Gen Len: 26.3
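A minimal inference sketch using the 🤗 Transformers API. The repo id below is a stand-in (the base checkpoint `google/flan-t5-base`) since this card does not state the model's hub id, and the prompt format is an assumption because the training data is undocumented:

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

# Stand-in checkpoint: substitute this model's own hub repo id.
model_id = "google/flan-t5-base"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# Hypothetical prompt: the actual training prompt format is not documented.
text = "generate tags: How do I merge two dictionaries in Python?"
inputs = tokenizer(text, return_tensors="pt")

# max_new_tokens roughly matches the reported average generation length (~26).
output_ids = model.generate(**inputs, max_new_tokens=32)
tags = tokenizer.decode(output_ids[0], skip_special_tokens=True)
print(tags)
```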
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed
## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- gradient_accumulation_steps: 2
- total_train_batch_size: 32
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 5
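The list above maps onto `Seq2SeqTrainingArguments` roughly as follows (a sketch, not the author's actual training script; `output_dir` is a placeholder). Note that `total_train_batch_size: 32` is derived rather than set directly: 16 per device × 2 accumulation steps.

```python
from transformers import Seq2SeqTrainingArguments

# Sketch of the hyperparameters listed above; output_dir is a placeholder.
args = Seq2SeqTrainingArguments(
    output_dir="flan-t5-base-tags",
    learning_rate=5e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    gradient_accumulation_steps=2,  # effective train batch size: 16 * 2 = 32
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=5,
)
```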
### Training results
| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Meteor | Bertscore | Gen Len |
|---|---|---|---|---|---|---|---|---|---|---|
| No log | 1.0 | 45 | 2.3367 | 28.7701 | 10.6841 | 23.4822 | 23.5479 | 23.828 | 86.524 | 26.74 |
| No log | 2.0 | 90 | 2.3197 | 29.4628 | 10.8101 | 24.6057 | 24.6412 | 23.7327 | 86.7659 | 26.43 |
| No log | 3.0 | 135 | 2.3208 | 30.1395 | 11.1742 | 24.3374 | 24.3566 | 23.8543 | 86.8075 | 27.14 |
| No log | 4.0 | 180 | 2.3182 | 29.7874 | 11.1641 | 24.3761 | 24.4131 | 23.2667 | 86.8649 | 26.3 |
| No log | 5.0 | 225 | 2.3198 | 29.463 | 10.9717 | 24.0589 | 24.0475 | 23.3984 | 86.7798 | 26.39 |
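For intuition, the Rouge1 column above is the F1 score of unigram overlap between generated and reference text. A simplified, self-contained sketch (the reported scores come from the standard `rouge_score`-based implementation, which additionally applies its own tokenization and stemming):

```python
from collections import Counter

def rouge1_f1(prediction: str, reference: str) -> float:
    """Simplified ROUGE-1 F1: unigram overlap with whitespace tokenization."""
    pred_counts = Counter(prediction.lower().split())
    ref_counts = Counter(reference.lower().split())
    # Clipped overlap: each unigram counts at most as often as in the reference.
    overlap = sum(min(c, ref_counts[w]) for w, c in pred_counts.items())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(pred_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)

print(rouge1_f1("add tags to article", "add tags to the article"))  # ~0.889
```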
### Framework versions
- Transformers 4.43.3
- Pytorch 2.3.1+cu121
- Datasets 2.20.0
- Tokenizers 0.19.1