
flan-t5-small-codesearchnet-python

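This model is a fine-tuned version of google/flan-t5-small. The training dataset is not recorded in this card (the auto-generated template lists it as None). After the final epoch (epoch 15, step 5625), it achieves the following results on the evaluation set: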

  • Loss: 0.0764
  • Bleu: 0.0349
  • Rouge1: 0.6244
  • Rouge2: 0.6055
  • Avg Length: 16.9912

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 10
  • total_train_batch_size: 80
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 15
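
The arithmetic behind these settings can be sketched as follows. This is a minimal illustration, not the training script: it assumes a single device and no warmup (neither is stated in the card), and it takes the 375 optimizer steps per epoch from the Training results table. The helper names `total_train_batch_size` and `linear_lr` are hypothetical.

```python
# Values copied from the hyperparameter list above.
LEARNING_RATE = 5e-05
TRAIN_BATCH_SIZE = 8
GRADIENT_ACCUMULATION_STEPS = 10
NUM_EPOCHS = 15

# Assumptions (not stated in the card): one device and no warmup.
NUM_DEVICES = 1
WARMUP_STEPS = 0
# Optimizer steps per epoch, read from the Training results table.
STEPS_PER_EPOCH = 375


def total_train_batch_size(per_device, accumulation_steps, num_devices=1):
    """Number of examples contributing to each optimizer update."""
    return per_device * accumulation_steps * num_devices


def linear_lr(step, total_steps, base_lr, warmup_steps=0):
    """Linear warmup (if any) followed by linear decay to zero."""
    if warmup_steps > 0 and step < warmup_steps:
        return base_lr * step / warmup_steps
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


effective_batch = total_train_batch_size(
    TRAIN_BATCH_SIZE, GRADIENT_ACCUMULATION_STEPS, NUM_DEVICES
)
total_steps = STEPS_PER_EPOCH * NUM_EPOCHS

print("total_train_batch_size:", effective_batch)
print("total optimizer steps:", total_steps)
for epoch in (0, 5, 10, 15):
    step = epoch * STEPS_PER_EPOCH
    lr = linear_lr(step, total_steps, LEARNING_RATE, WARMUP_STEPS)
    print(f"epoch {epoch:2d}  step {step:4d}  lr {lr:.2e}")
```

Under these assumptions the product 8 × 10 reproduces the reported total_train_batch_size of 80, and 375 steps of 80 examples each suggest a training set of roughly 30,000 examples, although the card does not state its size.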

Training results

| Training Loss | Epoch | Step | Validation Loss | Bleu   | Rouge1 | Rouge2 | Avg Length |
|---------------|-------|------|-----------------|--------|--------|--------|------------|
| No log        | 1.0   | 375  | 0.0636          | 0.0364 | 0.6253 | 0.6076 | 17.029     |
| 5.5166        | 2.0   | 750  | 0.0553          | 0.0351 | 0.6259 | 0.6081 | 16.9996    |
| 0.0485        | 3.0   | 1125 | 0.0537          | 0.0351 | 0.6258 | 0.6083 | 16.99      |
| 0.0409        | 4.0   | 1500 | 0.0524          | 0.0351 | 0.6258 | 0.6082 | 16.9942    |
| 0.0409        | 5.0   | 1875 | 0.0524          | 0.0351 | 0.6261 | 0.6086 | 16.997     |
| 0.0345        | 6.0   | 2250 | 0.0526          | 0.0351 | 0.6258 | 0.6081 | 16.9936    |
| 0.0303        | 7.0   | 2625 | 0.0533          | 0.035  | 0.6254 | 0.6076 | 16.991     |
| 0.0256        | 8.0   | 3000 | 0.0566          | 0.035  | 0.6257 | 0.6074 | 16.9964    |
| 0.0256        | 9.0   | 3375 | 0.0592          | 0.0349 | 0.6253 | 0.6074 | 16.998     |
| 0.0205        | 10.0  | 3750 | 0.0612          | 0.0351 | 0.6255 | 0.6073 | 16.9932    |
| 0.0185        | 11.0  | 4125 | 0.0639          | 0.035  | 0.6257 | 0.6079 | 16.996     |
| 0.0157        | 12.0  | 4500 | 0.0698          | 0.035  | 0.625  | 0.6064 | 16.9944    |
| 0.0157        | 13.0  | 4875 | 0.0720          | 0.035  | 0.6246 | 0.6062 | 16.991     |
| 0.0131        | 14.0  | 5250 | 0.0745          | 0.035  | 0.6247 | 0.6062 | 16.9986    |
| 0.0128        | 15.0  | 5625 | 0.0764          | 0.0349 | 0.6244 | 0.6055 | 16.9912    |
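
One way to read the table above is to find the epoch with the lowest validation loss and check what happens afterwards. The sketch below does this on the values copied from the table; the variable names are illustrative, and the card does not say which checkpoint was published, so treating the final epoch as the released model is an assumption based on the headline metrics matching the last row.

```python
# (epoch, training_loss, validation_loss) copied from the results table.
# None stands for the "No log" entry in the first row.
ROWS = [
    (1, None, 0.0636),
    (2, 5.5166, 0.0553),
    (3, 0.0485, 0.0537),
    (4, 0.0409, 0.0524),
    (5, 0.0409, 0.0524),
    (6, 0.0345, 0.0526),
    (7, 0.0303, 0.0533),
    (8, 0.0256, 0.0566),
    (9, 0.0256, 0.0592),
    (10, 0.0205, 0.0612),
    (11, 0.0185, 0.0639),
    (12, 0.0157, 0.0698),
    (13, 0.0157, 0.0720),
    (14, 0.0131, 0.0745),
    (15, 0.0128, 0.0764),
]

# min() returns the first row with the lowest validation loss on ties.
best_epoch, _, best_val = min(ROWS, key=lambda row: row[2])
final_epoch, _, final_val = ROWS[-1]

# Validation losses from the best epoch onward.
tail = [val for epoch, _, val in ROWS if epoch >= best_epoch]
# True if validation loss never decreases again after the best epoch.
never_improves_again = all(b >= a for a, b in zip(tail, tail[1:]))

relative_increase = (final_val - best_val) / best_val

print(f"lowest validation loss {best_val} first reached at epoch {best_epoch}")
print(f"final validation loss {final_val} at epoch {final_epoch}")
print(f"validation loss never improves after best epoch: {never_improves_again}")
print(f"relative increase from best to final: {relative_increase:.1%}")
```

On these numbers, validation loss bottoms out at epochs 4 and 5 and rises steadily afterwards while training loss keeps falling, a pattern that usually indicates overfitting. Bleu, Rouge1, and Rouge2 stay almost flat throughout, so an earlier checkpoint would likely score about the same on those metrics.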

Framework versions

  • Transformers 4.28.1
  • Pytorch 2.0.0+cu118
  • Datasets 2.12.0
  • Tokenizers 0.13.3