pretrain_base_tokenizer

This model was trained from scratch on the code_search_net dataset. It achieves the following results on the evaluation set:

  • Loss: 2.1008
  • Bleu: 0.0745
  • Precisions (1- to 4-gram): [0.370227852188713, 0.13803247473556413, 0.07398987834019316, 0.04421999242711094]
  • Brevity Penalty: 0.6551
  • Length Ratio: 0.7028
  • Translation Length: 594596
  • Reference Length: 846059
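
These fields mirror the output of the `bleu` metric in the `evaluate` library, where the four precisions are the 1- to 4-gram precisions. Below is a minimal sketch of how such numbers are computed; the example strings are hypothetical, and the card does not document the actual decoding setup.

```python
import evaluate

bleu = evaluate.load("bleu")

# Hypothetical predictions and references, for illustration only.
predictions = ["returns the sum of two numbers"]
references = [["return the sum of two numbers"]]

# compute() returns the same fields reported above: bleu, precisions
# (1- to 4-gram), brevity_penalty, length_ratio, translation_length,
# and reference_length.
results = bleu.compute(predictions=predictions, references=references)
print(results)
```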

Model description

More information needed beyond what the checkpoint metadata shows: roughly 223M parameters, stored as F32 tensors in Safetensors format.

Intended uses & limitations

More information needed
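
No intended use is documented. If the checkpoint is a sequence-to-sequence model, which the BLEU evaluation suggests, it could be exercised as in the sketch below; the task class, the code-to-text framing, and the generation settings are all assumptions rather than documented behavior.

```python
# Hedged sketch: assumes a seq2seq head; the actual task of
# sc20fg/pretrain_base_tokenizer is not documented in this card.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_id = "sc20fg/pretrain_base_tokenizer"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# Hypothetical input: code-to-docstring generation is a guess based on
# the code_search_net training data and the BLEU evaluation.
inputs = tokenizer("def add(a, b): return a + b", return_tensors="pt")
output_ids = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```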

Training and evaluation data

More information needed
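
Only the dataset name, code_search_net, is stated above. The following is a loading sketch with the `datasets` library; the "python" configuration and the fields accessed are assumptions, since the exact subset and preprocessing are not documented.

```python
from datasets import load_dataset

# "python" is an assumed configuration; code_search_net also ships
# go, java, javascript, php, ruby, and "all".
dataset = load_dataset("code_search_net", "python")

# Each record pairs a function body with its docstring.
example = dataset["train"][0]
print(example["func_code_string"])
print(example["func_documentation_string"])
```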

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 20
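
These settings map directly onto transformers' TrainingArguments, as in the sketch below; the output directory and anything not listed above are assumptions.

```python
# Sketch of TrainingArguments mirroring the listed hyperparameters;
# output_dir is a hypothetical path, and unlisted settings are left
# at their defaults, which may differ from the actual run.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="pretrain_base_tokenizer",
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=20,
)
```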

Training results

| Training Loss | Epoch | Step | Bleu | Brevity Penalty | Length Ratio | Validation Loss | Precisions (1- to 4-gram) | Reference Length | Translation Length |
|---|---|---|---|---|---|---|---|---|---|
| 2.4112 | 1.0 | 25762 | 0.0658 | 0.6617 | 0.7078 | 2.3310 | [0.35249982048117723, 0.12327087650542343, 0.06319307060145318, 0.03566263068721848] | 846059 | 598823 |
| 2.3334 | 2.0 | 51524 | 0.0681 | 0.6617 | 0.7078 | 2.2582 | [0.35832782242172834, 0.127192419612726, 0.06572103592555187, 0.03742664572798245] | 846059 | 598812 |
| 2.2441 | 3.0 | 77286 | 0.0696 | 0.6529 | 0.7011 | 2.2180 | [0.36256557741844125, 0.13175932444407085, 0.06865854925017445, 0.039288874037902925] | 846059 | 593192 |
| 2.1798 | 4.0 | 103048 | 0.0721 | 0.6729 | 0.7162 | 2.1907 | [0.36450348611741407, 0.13225409723744794, 0.06903673685953533, 0.0396539707725045] | 846059 | 605975 |
| 2.1424 | 5.0 | 128810 | 0.0715 | 0.6561 | 0.7035 | 2.1736 | [0.3668020289217111, 0.13425532471838544, 0.07022789074894986, 0.04068744822577534] | 846059 | 595193 |
| 2.1132 | 6.0 | 154572 | 0.0739 | 0.6875 | 0.7275 | 2.1539 | [0.36025300866163096, 0.13232255476642318, 0.06955911290379053, 0.040195441044440436] | 846059 | 615473 |
| 2.0984 | 7.0 | 180334 | 0.0721 | 0.6587 | 0.7055 | 2.1471 | [0.36612131721578584, 0.13431329561035363, 0.0708157263719857, 0.041272288902252124] | 846059 | 596865 |
| 2.0785 | 8.0 | 206096 | 0.0724 | 0.6756 | 0.7183 | 2.1353 | [0.36380808213768595, 0.13209779987841613, 0.06888583628832168, 0.039797612956022015] | 846059 | 607760 |
| 2.044 | 9.0 | 231858 | 0.0651 | 0.5890 | 0.6539 | 2.1307 | [0.3747329223983659, 0.13597984423722942, 0.07083334152049311, 0.041232475735633954] | 846059 | 553210 |
| 2.0022 | 10.0 | 257620 | 0.0678 | 0.6182 | 0.6752 | 2.1244 | [0.37057115300122706, 0.13501863826827087, 0.07054691458053057, 0.04094138244503552] | 846059 | 571283 |
| 2.0115 | 11.0 | 283382 | 0.0714 | 0.6437 | 0.6942 | 2.1181 | [0.3696569336851962, 0.1359002395637604, 0.07172057187893609, 0.04198004369041997] | 846059 | 587350 |
| 1.9957 | 12.0 | 309144 | 0.0780 | 0.7340 | 0.7638 | 2.1182 | [0.3562361599633563, 0.13051385463885645, 0.06873666863799165, 0.039808098889084334] | 846059 | 646223 |
| 1.9816 | 13.0 | 334906 | 0.0748 | 0.6775 | 0.7198 | 2.1112 | [0.3644272643077186, 0.1348813193666958, 0.07171355661769002, 0.04221956829440906] | 846059 | 608972 |
| 1.9799 | 14.0 | 360668 | 0.0729 | 0.6567 | 0.7039 | 2.1094 | [0.3683080239907046, 0.1360146909050323, 0.07189528256366162, 0.04211029597965069] | 846059 | 595564 |
| 1.9721 | 15.0 | 386430 | 0.0724 | 0.6428 | 0.6935 | 2.1035 | [0.37174066063670774, 0.13775257176864, 0.07323700636731323, 0.042981616643797974] | 846059 | 586737 |
| 1.9415 | 16.0 | 412192 | 0.0707 | 0.6275 | 0.6822 | 2.1052 | [0.37395952455210174, 0.1379846553581918, 0.07303615398474567, 0.04286080713028393] | 846059 | 577140 |
| 1.921 | 17.0 | 437954 | 0.0755 | 0.6693 | 0.7135 | 2.1031 | [0.368775375991227, 0.137205943045811, 0.07336018463397673, 0.04358297628398173] | 846059 | 603671 |
| 1.9281 | 18.0 | 463716 | 0.0730 | 0.6426 | 0.6934 | 2.1008 | [0.3719142378879256, 0.13830769094500395, 0.0738741230374939, 0.043850156367431746] | 846059 | 586646 |
| 1.9619 | 19.0 | 489478 | 0.0741 | 0.6539 | 0.7019 | 2.1011 | [0.3690046967173499, 0.1375453499071136, 0.07371470581812624, 0.044035281313872486] | 846059 | 593819 |
| 1.9177 | 20.0 | 515240 | 0.0745 | 0.6551 | 0.7028 | 2.1008 | [0.370227852188713, 0.13803247473556413, 0.07398987834019316, 0.04421999242711094] | 846059 | 594596 |

Framework versions

  • Transformers 4.37.2
  • PyTorch 2.2.0+cu121
  • Datasets 2.17.0
  • Tokenizers 0.15.2