bert-small-codesearchnet-python

This is a fine-tuned model; the card does not specify the base model or the training dataset. It achieves the following results on the evaluation set (final epoch, step 5625):

  • Loss: 0.0582
  • Bleu: 0.0347
  • Rouge1: 0.6428
  • Rouge2: 0.6252
  • Avg Length: 17.891

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 10
  • total_train_batch_size: 80
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 15
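
How these values relate can be checked with a little arithmetic. The sketch below is illustrative only and is not the original training script. It assumes a single device, no warmup steps (the card lists none), and 375 optimizer steps per epoch as reported in the training results; the helper name `linear_lr` is hypothetical.

```python
# Hyperparameters as listed in the card.
learning_rate = 5e-05
train_batch_size = 8
gradient_accumulation_steps = 10
num_epochs = 15

# Assumption: one device, so the effective batch size is the
# per-device batch size times the number of accumulation steps.
total_train_batch_size = train_batch_size * gradient_accumulation_steps

# Assumption: 375 optimizer steps per epoch, taken from the results table.
steps_per_epoch = 375
total_steps = steps_per_epoch * num_epochs


def linear_lr(step: int, warmup_steps: int = 0) -> float:
    """Linear schedule: optional linear warmup, then linear decay to zero."""
    if step < warmup_steps:
        return learning_rate * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return learning_rate * remaining / max(1, total_steps - warmup_steps)


print(total_train_batch_size)  # 80
print(total_steps)             # 5625
print(linear_lr(0), linear_lr(total_steps // 2), linear_lr(total_steps))
```

Under these assumptions the learning rate starts at 5e-05 and reaches zero at the last step. If the actual run used warmup or multiple devices, the numbers would differ.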

Training results

| Training Loss | Epoch | Step | Validation Loss | Bleu   | Rouge1 | Rouge2 | Avg Length |
|---------------|-------|------|-----------------|--------|--------|--------|------------|
| No log        | 1.0   | 375  | 1.2151          | 0.0    | 0.0928 | 0.0083 | 10.684     |
| 1.9359        | 2.0   | 750  | 1.0291          | 0.0032 | 0.1752 | 0.0338 | 15.0624    |
| 0.9422        | 3.0   | 1125 | 0.9173          | 0.0061 | 0.2506 | 0.0711 | 17.9358    |
| 0.776         | 4.0   | 1500 | 0.8058          | 0.0088 | 0.3321 | 0.1409 | 18.3724    |
| 0.776         | 5.0   | 1875 | 0.6915          | 0.0123 | 0.4044 | 0.2267 | 18.564     |
| 0.6218        | 6.0   | 2250 | 0.5281          | 0.0193 | 0.5382 | 0.4097 | 17.5586    |
| 0.4363        | 7.0   | 2625 | 0.1897          | 0.0333 | 0.6311 | 0.6002 | 17.8768    |
| 0.1518        | 8.0   | 3000 | 0.0834          | 0.0346 | 0.6413 | 0.621  | 17.879     |
| 0.1518        | 9.0   | 3375 | 0.0587          | 0.0349 | 0.6439 | 0.6268 | 17.8886    |
| 0.0579        | 10.0  | 3750 | 0.0547          | 0.0348 | 0.6443 | 0.6276 | 17.885     |
| 0.0437        | 11.0  | 4125 | 0.0525          | 0.0348 | 0.6442 | 0.6278 | 17.8766    |
| 0.0365        | 12.0  | 4500 | 0.0550          | 0.0347 | 0.6436 | 0.6266 | 17.8876    |
| 0.0365        | 13.0  | 4875 | 0.0545          | 0.0347 | 0.6439 | 0.627  | 17.876     |
| 0.032         | 14.0  | 5250 | 0.0539          | 0.0347 | 0.644  | 0.6268 | 17.8822    |
| 0.0288        | 15.0  | 5625 | 0.0582          | 0.0347 | 0.6428 | 0.6252 | 17.891     |
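
Validation loss in these results bottoms out before the final epoch, while the headline metrics above match epoch 15. As a minimal sketch, the epoch with the lowest validation loss can be located from values copied out of the results; the variable names are illustrative, and the card does not say whether any checkpoint other than the final one was kept.

```python
# (epoch, validation_loss) pairs copied from the training results.
val_loss_by_epoch = [
    (1, 1.2151), (2, 1.0291), (3, 0.9173), (4, 0.8058), (5, 0.6915),
    (6, 0.5281), (7, 0.1897), (8, 0.0834), (9, 0.0587), (10, 0.0547),
    (11, 0.0525), (12, 0.0550), (13, 0.0545), (14, 0.0539), (15, 0.0582),
]

# Pick the pair with the smallest validation loss.
best_epoch, best_loss = min(val_loss_by_epoch, key=lambda pair: pair[1])

# Compare against the final epoch, which the headline metrics report.
final_epoch, final_loss = val_loss_by_epoch[-1]

print(best_epoch, best_loss)    # 11 0.0525
print(final_epoch, final_loss)  # 15 0.0582
```

The gap between the two losses is small, and the Rouge scores are nearly flat from epoch 9 onward, so the choice of checkpoint likely matters little here.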

Framework versions

  • Transformers 4.28.1
  • PyTorch 2.0.0+cu118
  • Datasets 2.12.0
  • Tokenizers 0.13.3