BASH-Coder-Flan-T5-base

This model is a fine-tuned version of google/flan-t5-base on the neulab/tldr dataset. It achieves the following results on the evaluation set:

  • Loss: 3.3608
  • Rouge1: 27.0741
  • Rouge2: 9.3824
  • Rougel: 26.133
  • Rougelsum: 26.1559
  • Gen Len: 15.5767
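
To make the Rouge1 figure easier to interpret, here is a minimal sketch of a unigram-overlap F-measure on the same 0-100 scale. It is a simplification, not the scorer used to produce the numbers above: it assumes lowercased whitespace tokenization with no stemming, and the function name `rouge1_f` and the example strings are hypothetical.

```python
from collections import Counter


def rouge1_f(prediction: str, reference: str) -> float:
    """Simplified ROUGE-1 F-measure on a 0-100 scale.

    Assumption: lowercased whitespace tokens, no stemming. Standard ROUGE
    implementations tokenize differently, so their scores will not match exactly.
    """
    pred_counts = Counter(prediction.lower().split())
    ref_counts = Counter(reference.lower().split())
    # Clipped overlap: each token counts at most as often as it appears in both.
    overlap = sum((pred_counts & ref_counts).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(pred_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 100 * 2 * precision * recall / (precision + recall)


# Hypothetical prediction/reference pair, not taken from the dataset.
score = rouge1_f("tar -xvf archive.tar", "tar -xf archive.tar")
print(round(score, 2))  # two of three tokens match on each side
```

Under this reading, a Rouge1 of about 27 means that, on average, roughly a quarter of the unigrams overlap between generated and reference commands.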

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 16
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 100
  • num_epochs: 10
  • label_smoothing_factor: 0.1
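
The learning-rate settings above can be sketched as a plain function: linear warmup from 0 to `learning_rate` over the first 100 steps, then linear decay to 0. This is an illustration of the usual linear-with-warmup shape, not the training script itself. The total of 8020 steps is an assumption taken from the last Step value in the training results table, and the name `linear_lr` is hypothetical.

```python
BASE_LR = 5e-05      # learning_rate from the list above
WARMUP_STEPS = 100   # lr_scheduler_warmup_steps from the list above
TOTAL_STEPS = 8020   # assumption: final Step value in the training results table


def linear_lr(step: int) -> float:
    """Learning rate at a given optimizer step under linear warmup and decay."""
    if step < WARMUP_STEPS:
        # Warmup: ramp linearly from 0 up to BASE_LR.
        return BASE_LR * step / WARMUP_STEPS
    # Decay: ramp linearly from BASE_LR down to 0 at TOTAL_STEPS.
    remaining = max(0, TOTAL_STEPS - step)
    return BASE_LR * remaining / (TOTAL_STEPS - WARMUP_STEPS)


for step in (0, 50, 100, 4060, 8020):
    print(step, linear_lr(step))
```

With these values the rate peaks at step 100 and is back to half its peak at step 4060, the midpoint of the decay phase.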

Training results

Training Loss  Epoch  Step  Validation Loss  Rouge1   Rouge2  Rougel   Rougelsum  Gen Len
4.3554         1.0    802   3.5928           22.7234  6.7951  22.0647  22.0744    15.2363
3.5335         2.0    1604  3.4654           25.7842  8.5847  24.8207  24.8808    15.168
3.3341         3.0    2406  3.4078           25.5756  8.4456  24.706   24.7207    15.6472
3.2011         4.0    3208  3.3789           26.0638  8.6853  25.0862  25.1223    16.2748
3.1059         5.0    4010  3.3622           26.7254  9.1138  25.7985  25.8521    15.7366
3.0336         6.0    4812  3.3662           26.4655  9.1283  25.4587  25.5112    16.548
2.9727         7.0    5614  3.3593           26.8211  9.3045  25.8497  25.8772    15.5431
2.9298         8.0    6416  3.3643           26.8932  9.3537  25.9444  26.0088    15.916
2.9005         9.0    7218  3.3606           27.1732  9.5661  26.1198  26.1515    15.71
2.8846         10.0   8020  3.3608           27.0741  9.3824  26.133   26.1559    15.5767
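
As a reading aid for the table above, the following sketch finds the epochs with the lowest validation loss and the highest Rouge1. The values are copied from the table. The variable names are hypothetical, and the training-set size estimate assumes a single device with no gradient accumulation, which the card does not state.

```python
# (epoch, validation_loss, rouge1) copied from the training results table.
rows = [
    (1, 3.5928, 22.7234),
    (2, 3.4654, 25.7842),
    (3, 3.4078, 25.5756),
    (4, 3.3789, 26.0638),
    (5, 3.3622, 26.7254),
    (6, 3.3662, 26.4655),
    (7, 3.3593, 26.8211),
    (8, 3.3643, 26.8932),
    (9, 3.3606, 27.1732),
    (10, 3.3608, 27.0741),
]

best_loss = min(rows, key=lambda r: r[1])    # lowest validation loss
best_rouge1 = max(rows, key=lambda r: r[2])  # highest Rouge1

print("lowest validation loss:", best_loss)
print("highest Rouge1:", best_rouge1)

# Assumption: one device, no gradient accumulation, so each step is one batch of 8.
STEPS_PER_EPOCH = 802
TRAIN_BATCH_SIZE = 8
max_train_examples = STEPS_PER_EPOCH * TRAIN_BATCH_SIZE
print("training examples (upper bound):", max_train_examples)
```

The headline results at the top of the card correspond to the final epoch (10). Validation loss is marginally lower at epoch 7 and Rouge1 marginally higher at epoch 9, and both metrics change little after epoch 5.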

Framework versions

  • Transformers 4.37.0.dev0
  • PyTorch 2.1.0+cu121
  • Datasets 2.15.0
  • Tokenizers 0.15.0

Fine-tuning Script

Google Colaboratory Notebook

Model size: 248M params (Safetensors)
Tensor type: F32
