
base_model_custom_tokenizer

This model is a fine-tuned version of t5-base on the code_search_net dataset. It achieves the following results on the evaluation set:

  • Loss: 2.9297
  • Bleu: 0.0419
  • Precisions (1- to 4-gram): [0.16646886171883812, 0.051341379400381214, 0.025538496667355304, 0.01408001744219341]
  • Brevity Penalty: 1.0
  • Length Ratio: 1.9160
  • Translation Length: 1515803
  • Reference Length: 791127
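
The field names above match the output of the bleu metric in the Hugging Face evaluate library, which appears to be what produced these numbers. A minimal sketch of computing the same fields; the prediction/reference strings below are placeholders, not drawn from the actual evaluation set:

```python
import evaluate

# The bleu metric returns a dict with exactly the keys listed above:
# bleu, precisions, brevity_penalty, length_ratio,
# translation_length, reference_length.
bleu = evaluate.load("bleu")

# Placeholder pair; the real evaluation used code_search_net references.
predictions = ["Return the sum of two numbers."]
references = [["Returns the sum of two numbers."]]

results = bleu.compute(predictions=predictions, references=references)
print(results["bleu"], results["precisions"], results["brevity_penalty"])
```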

Model description

A sequence-to-sequence model (223M parameters, stored as F32 safetensors) fine-tuned from t5-base with a custom tokenizer on the code_search_net dataset.

Intended uses & limitations

More information needed
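
The intended downstream task is not documented; given the code_search_net training data and BLEU-based evaluation, a plausible use is generating natural-language descriptions of code. A minimal loading sketch using the standard transformers API; the repo id comes from this card, while the input string and generation settings are illustrative guesses:

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_id = "sc20fg/base_model_custom_tokenizer"  # repo id from this card
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# Illustrative input only; the expected prompt format is not documented.
code = "def add(a, b):\n    return a + b"
inputs = tokenizer(code, return_tensors="pt", truncation=True)
output_ids = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```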

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 20
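
For reference, these settings correspond to the following Seq2SeqTrainingArguments, assuming the standard transformers Trainer was used; output_dir is a placeholder and any option not listed above keeps its default:

```python
from transformers import Seq2SeqTrainingArguments

# Mirrors the hyperparameters listed above.
training_args = Seq2SeqTrainingArguments(
    output_dir="base_model_custom_tokenizer",  # placeholder path
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=20,
)
```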

Training results

| Training Loss | Epoch | Step | Bleu | Brevity Penalty | Length Ratio | Validation Loss | Precisions | Reference Length | Translation Length |
|:-------------:|:-----:|:------:|:------:|:---------------:|:------------:|:---------------:|:-----------|:----------------:|:------------------:|
| 3.9604 | 1.0 | 25762 | 0.0311 | 1.0 | 2.0901 | 3.8577 | [0.12981129473835085, 0.037916946342151155, 0.018860549385742668, 0.010123458812721054] | 791127 | 1653531 |
| 3.7556 | 2.0 | 51524 | 0.0304 | 1.0 | 2.0887 | 3.5650 | [0.12978779415458075, 0.037579383019195466, 0.018120049525730805, 0.00967159578808246] | 791127 | 1652405 |
| 3.5524 | 3.0 | 77286 | 0.0337 | 1.0 | 2.0745 | 3.4150 | [0.1400710094937268, 0.04118126290523918, 0.0203289377688518, 0.01095848654003696] | 791127 | 1641189 |
| 3.4698 | 4.0 | 103048 | 0.0340 | 1.0 | 2.0788 | 3.3056 | [0.14277601173291565, 0.041700438046903744, 0.020391137906857287, 0.010998711103394348] | 791127 | 1644604 |
| 3.3163 | 5.0 | 128810 | 0.0377 | 1.0 | 2.0193 | 3.2312 | [0.15481298837386176, 0.04617083876865068, 0.022825576079888228, 0.012408874977873952] | 791127 | 1597521 |
| 3.2458 | 6.0 | 154572 | 0.0382 | 1.0 | 1.9276 | 3.1719 | [0.1593547435203856, 0.04704355006890476, 0.023023369844916947, 0.012389103841794662] | 791127 | 1524975 |
| 3.1574 | 7.0 | 180334 | 0.0373 | 1.0 | 2.0231 | 3.1267 | [0.15301209486452477, 0.04557636504175273, 0.022512350851579006, 0.012331176442211789] | 791127 | 1600514 |
| 3.1398 | 8.0 | 206096 | 0.0386 | 1.0 | 1.9724 | 3.0893 | [0.1577822509066417, 0.04745355472604797, 0.023342833604973825, 0.012766267921605798] | 791127 | 1560429 |
| 3.0691 | 9.0 | 231858 | 0.0399 | 1.0 | 1.9159 | 3.0574 | [0.16179891666501725, 0.0490436396529825, 0.024170720153435545, 0.013205125551162357] | 791127 | 1515690 |
| 3.0536 | 10.0 | 257620 | 0.0410 | 1.0 | 1.8550 | 3.0321 | [0.1656489584760067, 0.05027218283158705, 0.024914277684092188, 0.013668271409759075] | 791127 | 1467513 |
| 3.0379 | 11.0 | 283382 | 0.0404 | 1.0 | 1.8928 | 3.0082 | [0.1630008107267023, 0.049590989569352824, 0.02452930558336929, 0.013463575807213558] | 791127 | 1497422 |
| 3.0183 | 12.0 | 309144 | 0.0409 | 1.0 | 1.9428 | 2.9924 | [0.16253787482001938, 0.049984123536708294, 0.02498794115282579, 0.01380309274144192] | 791127 | 1536971 |
| 2.9442 | 13.0 | 334906 | 0.0413 | 1.0 | 1.9288 | 2.9773 | [0.16426924674922966, 0.05052962811986506, 0.025225357778251727, 0.013893123599262487] | 791127 | 1525946 |
| 2.9746 | 14.0 | 360668 | 0.0411 | 1.0 | 1.9154 | 2.9622 | [0.16395222297528722, 0.050373776569881686, 0.02506334156586741, 0.013817874614866431] | 791127 | 1515289 |
| 2.9556 | 15.0 | 386430 | 0.0416 | 1.0 | 1.8903 | 2.9505 | [0.16631916674913938, 0.05114349827528396, 0.025291167834370104, 0.013919582587470626] | 791127 | 1495444 |
| 2.9423 | 16.0 | 412192 | 0.0415 | 1.0 | 1.9161 | 2.9441 | [0.1656048056193977, 0.050903942131636466, 0.02527336097239107, 0.013901882376966617] | 791127 | 1515892 |
| 2.9257 | 17.0 | 437954 | 0.0417 | 1.0 | 1.9204 | 2.9387 | [0.16566872310834463, 0.051149695919205686, 0.02547749541013215, 0.01403388257902964] | 791127 | 1519291 |
| 2.9023 | 18.0 | 463716 | 0.0417 | 1.0 | 1.9252 | 2.9331 | [0.16569868978430946, 0.05118214894137258, 0.025432645752525008, 0.014019028423183673] | 791127 | 1523108 |
| 2.946 | 19.0 | 489478 | 0.0420 | 1.0 | 1.9138 | 2.9301 | [0.16682044755191178, 0.051534782710695386, 0.02563003483561942, 0.014141190855303378] | 791127 | 1514059 |
| 2.8761 | 20.0 | 515240 | 0.0419 | 1.0 | 1.9160 | 2.9297 | [0.16646886171883812, 0.051341379400381214, 0.025538496667355304, 0.01408001744219341] | 791127 | 1515803 |

Framework versions

  • Transformers 4.37.2
  • Pytorch 2.2.0+cu121
  • Datasets 2.17.0
  • Tokenizers 0.15.2
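
A pinned install that approximates this environment; the +cu121 suffix on the PyTorch version denotes a CUDA 12.1 build, so the plain PyPI wheel is shown here:

```
pip install transformers==4.37.2 torch==2.2.0 datasets==2.17.0 tokenizers==0.15.2
```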
