
base_model_base_tokenizer

This model is a fine-tuned version of t5-base on the code_search_net dataset. It achieves the following results on the evaluation set:

  • Loss: 2.1017
  • Bleu: 0.0744
  • Precisions: [0.37389569483256924, 0.14063645643779682, 0.07580332788787783, 0.045527148854836816]
  • Brevity Penalty: 0.6407
  • Length Ratio: 0.6920
  • Translation Length: 585436
  • Reference Length: 846059
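The headline BLEU score can be reconstructed from the components above: BLEU is the brevity penalty times the geometric mean of the four n-gram precisions, where the brevity penalty is exp(1 − reference_length / translation_length) whenever the output is shorter than the reference. A minimal pure-Python sketch using the reported numbers:

```python
import math

# Reported evaluation components (copied from the list above)
precisions = [0.37389569483256924, 0.14063645643779682,
              0.07580332788787783, 0.045527148854836816]
translation_length = 585436
reference_length = 846059

# Brevity penalty: penalizes outputs shorter than the reference
bp = (1.0 if translation_length > reference_length
      else math.exp(1 - reference_length / translation_length))

# BLEU = brevity penalty * geometric mean of the 1- to 4-gram precisions
geo_mean = math.exp(sum(math.log(p) for p in precisions) / len(precisions))
bleu = bp * geo_mean

print(round(bp, 4))    # 0.6407
print(round(bleu, 4))  # 0.0744
```

The length ratio reported above (0.6920 = 585436 / 846059) is simply the inverse of the ratio inside the brevity penalty exponent.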

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 20
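With no warmup configured, the linear scheduler decays the learning rate from 2e-05 to zero over the whole run. The training log below records 25762 optimizer steps per epoch, i.e. 515240 steps over 20 epochs. A minimal sketch of the resulting schedule (mirroring `transformers`' `get_linear_schedule_with_warmup` with `num_warmup_steps=0`; step counts inferred from the results table):

```python
STEPS_PER_EPOCH = 25762             # from the training log: step 25762 at epoch 1
TOTAL_STEPS = STEPS_PER_EPOCH * 20  # 515240, the final logged step
BASE_LR = 2e-05

def linear_lr(step: int) -> float:
    """Linearly decay the learning rate from BASE_LR to 0 over TOTAL_STEPS."""
    return BASE_LR * max(0.0, (TOTAL_STEPS - step) / TOTAL_STEPS)

print(linear_lr(0))       # 2e-05 at the start of training
print(linear_lr(257620))  # 1e-05 halfway through (end of epoch 10)
print(linear_lr(515240))  # 0.0 at the end of epoch 20
```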

Training results

| Training Loss | Epoch | Step   | Bleu   | Brevity Penalty | Length Ratio | Validation Loss | Precisions | Reference Length | Translation Length |
|:-------------:|:-----:|:------:|:------:|:---------------:|:------------:|:---------------:|:----------:|:----------------:|:------------------:|
| 2.4273        | 1.0   | 25762  | 0.0665 | 0.6794          | 0.7212       | 2.3438          | [0.34926724858481134, 0.12159425046725157, 0.062078959459937084, 0.03489467043820187] | 846059 | 610166 |
| 2.3512        | 2.0   | 51524  | 0.0733 | 0.7181          | 0.7512       | 2.2643          | [0.3534451290507329, 0.1262343107830303, 0.06531254968421979, 0.03721425521409004] | 846059 | 635564 |
| 2.2525        | 3.0   | 77286  | 0.0691 | 0.6453          | 0.6954       | 2.2234          | [0.36523755211936504, 0.1318932094567742, 0.06891201805888993, 0.03961906221856018] | 846059 | 588313 |
| 2.2252        | 4.0   | 103048 | 0.0726 | 0.7043          | 0.7404       | 2.1949          | [0.3601686933924165, 0.1283373434960897, 0.06578382296859486, 0.0371541685491374] | 846059 | 626462 |
| 2.1523        | 5.0   | 128810 | 0.0703 | 0.6506          | 0.6994       | 2.1769          | [0.3663069159346027, 0.1334874876878427, 0.06959109409366254, 0.040003198275976946] | 846059 | 591706 |
| 2.1027        | 6.0   | 154572 | 0.0650 | 0.5879          | 0.6531       | 2.1585          | [0.37335963586676196, 0.13614151644150174, 0.07119404952304512, 0.04138235959446398] | 846059 | 552545 |
| 2.0458        | 7.0   | 180334 | 0.0682 | 0.6176          | 0.6748       | 2.1491          | [0.37062538973004405, 0.1355146147678402, 0.07123664846902444, 0.04155352506292986] | 846059 | 570908 |
| 2.0594        | 8.0   | 206096 | 0.0702 | 0.6407          | 0.6919       | 2.1403          | [0.3700899171204657, 0.13524405355792343, 0.07062960711230036, 0.04081911815137772] | 846059 | 585428 |
| 2.0459        | 9.0   | 231858 | 0.0635 | 0.5682          | 0.6388       | 2.1327          | [0.37916909499625345, 0.13810659289354987, 0.07176079868122479, 0.04160453545539102] | 846059 | 540495 |
| 2.0029        | 10.0  | 257620 | 0.0684 | 0.6128          | 0.6713       | 2.1264          | [0.3745439691237164, 0.13731087325347474, 0.07204645620574554, 0.04194087964799725] | 846059 | 567944 |
| 2.0107        | 11.0  | 283382 | 0.0697 | 0.6139          | 0.6721       | 2.1202          | [0.37538600600727345, 0.13908031254002817, 0.07356968494927149, 0.04326375560457764] | 846059 | 568644 |
| 1.995         | 12.0  | 309144 | 0.0790 | 0.7220          | 0.7543       | 2.1192          | [0.3595232536092102, 0.1336969667453998, 0.07124298456393582, 0.04192048242921579] | 846059 | 638159 |
| 1.9653        | 13.0  | 334906 | 0.0750 | 0.6727          | 0.7161       | 2.1158          | [0.3663186076760047, 0.13635359040297698, 0.07246562633002641, 0.04279559846361466] | 846059 | 605836 |
| 1.9811        | 14.0  | 360668 | 0.0718 | 0.6325          | 0.6858       | 2.1096          | [0.37342310979981247, 0.13867710694415825, 0.0736328303569596, 0.043440268414579084] | 846059 | 580256 |
| 1.9745        | 15.0  | 386430 | 0.0741 | 0.6592          | 0.7059       | 2.1060          | [0.36869699176985743, 0.13724429728380805, 0.07301699268383118, 0.04318353520566863] | 846059 | 597195 |
| 1.939         | 16.0  | 412192 | 0.0706 | 0.6166          | 0.6740       | 2.1063          | [0.37537898781101553, 0.13979047848408885, 0.0742785001701673, 0.04399835661136439] | 846059 | 570269 |
| 1.9177        | 17.0  | 437954 | 0.0757 | 0.6671          | 0.7118       | 2.1063          | [0.37017425883954735, 0.13833476986726426, 0.07389756751525232, 0.04386076232849102] | 846059 | 602265 |
| 1.9265        | 18.0  | 463716 | 0.0717 | 0.6192          | 0.6760       | 2.1016          | [0.37650650333865443, 0.14089062050951845, 0.075366455530664, 0.045028150012067114] | 846059 | 571937 |
| 1.9622        | 19.0  | 489478 | 0.0730 | 0.6288          | 0.6831       | 2.1022          | [0.3746837721013452, 0.1407333566053557, 0.07570910522025132, 0.045477562304123496] | 846059 | 577906 |
| 1.9171        | 20.0  | 515240 | 0.0744 | 0.6407          | 0.6920       | 2.1017          | [0.37389569483256924, 0.14063645643779682, 0.07580332788787783, 0.045527148854836816] | 846059 | 585436 |
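One detail worth noting in the table: the epoch with the lowest validation loss is not the epoch with the highest BLEU, so the preferred checkpoint depends on which metric you select by. A small sketch confirming this from the logged values (epoch numbers are 1-based):

```python
# Per-epoch validation loss and BLEU, copied from the table above
val_loss = [2.3438, 2.2643, 2.2234, 2.1949, 2.1769, 2.1585, 2.1491,
            2.1403, 2.1327, 2.1264, 2.1202, 2.1192, 2.1158, 2.1096,
            2.1060, 2.1063, 2.1063, 2.1016, 2.1022, 2.1017]
bleu = [0.0665, 0.0733, 0.0691, 0.0726, 0.0703, 0.0650, 0.0682,
        0.0702, 0.0635, 0.0684, 0.0697, 0.0790, 0.0750, 0.0718,
        0.0741, 0.0706, 0.0757, 0.0717, 0.0730, 0.0744]

best_loss_epoch = min(range(1, 21), key=lambda e: val_loss[e - 1])
best_bleu_epoch = max(range(1, 21), key=lambda e: bleu[e - 1])

print(best_loss_epoch)  # 18 (loss 2.1016)
print(best_bleu_epoch)  # 12 (BLEU 0.0790)
```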

Framework versions

  • Transformers 4.37.2
  • Pytorch 2.2.0+cu121
  • Datasets 2.17.0
  • Tokenizers 0.15.2
