# pretrain_base_tokenizer

This model was trained from scratch on the code_search_net dataset. It achieves the following results on the evaluation set:
- Loss: 2.1008
- Bleu: 0.0745
- Precisions: [0.370227852188713, 0.13803247473556413, 0.07398987834019316, 0.04421999242711094]
- Brevity Penalty: 0.6551
- Length Ratio: 0.7028
- Translation Length: 594596
- Reference Length: 846059
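These numbers are internally consistent: BLEU is the brevity penalty times the geometric mean of the four n-gram precisions, and the brevity penalty follows directly from the translation and reference lengths. A quick check, using only the values reported above:

```python
import math

# Values copied from the evaluation results above.
precisions = [0.370227852188713, 0.13803247473556413,
              0.07398987834019316, 0.04421999242711094]  # 1- to 4-gram
translation_length = 594596
reference_length = 846059

# Brevity penalty: exp(1 - ref/hyp) when the hypothesis is shorter than the reference.
bp = math.exp(1 - reference_length / translation_length)
print(f"brevity penalty: {bp:.4f}")   # 0.6551

# BLEU = BP * geometric mean of the n-gram precisions.
bleu = bp * math.exp(sum(math.log(p) for p in precisions) / len(precisions))
print(f"bleu: {bleu:.4f}")            # 0.0745
```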
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed
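The card names code_search_net as the training corpus. A minimal loading sketch with 🤗 Datasets; the language config is an assumption, since the card does not say which subset was used:

```python
from datasets import load_dataset

# "python" is an assumed config; code_search_net also ships "java", "go",
# etc., plus an "all" config. Newer datasets releases may additionally
# require trust_remote_code=True for this script-based dataset.
ds = load_dataset("code_search_net", "python")
print(ds["train"][0]["func_documentation_string"])
```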
## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (an equivalent configuration sketch follows the list):
- learning_rate: 2e-05
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 20
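A minimal sketch of the corresponding 🤗 Transformers `TrainingArguments`; the output directory and anything not listed above (evaluation/save behavior, model architecture) are assumptions:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="pretrain_base_tokenizer",  # assumed name, not from the card
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    lr_scheduler_type="linear",
    num_train_epochs=20,
    # Adam betas and epsilon below are the Trainer defaults and match the card.
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
)
```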
## Training results
Training Loss | Epoch | Step | Bleu | Brevity Penalty | Length Ratio | Validation Loss | Precisions | Reference Length | Translation Length |
---|---|---|---|---|---|---|---|---|---|
2.4112 | 1.0 | 25762 | 0.0658 | 0.6617 | 0.7078 | 2.3310 | [0.35249982048117723, 0.12327087650542343, 0.06319307060145318, 0.03566263068721848] | 846059 | 598823 |
2.3334 | 2.0 | 51524 | 0.0681 | 0.6617 | 0.7078 | 2.2582 | [0.35832782242172834, 0.127192419612726, 0.06572103592555187, 0.03742664572798245] | 846059 | 598812 |
2.2441 | 3.0 | 77286 | 0.0696 | 0.6529 | 0.7011 | 2.2180 | [0.36256557741844125, 0.13175932444407085, 0.06865854925017445, 0.039288874037902925] | 846059 | 593192 |
2.1798 | 4.0 | 103048 | 0.0721 | 0.6729 | 0.7162 | 2.1907 | [0.36450348611741407, 0.13225409723744794, 0.06903673685953533, 0.0396539707725045] | 846059 | 605975 |
2.1424 | 5.0 | 128810 | 0.0715 | 0.6561 | 0.7035 | 2.1736 | [0.3668020289217111, 0.13425532471838544, 0.07022789074894986, 0.04068744822577534] | 846059 | 595193 |
2.1132 | 6.0 | 154572 | 0.0739 | 0.6875 | 0.7275 | 2.1539 | [0.36025300866163096, 0.13232255476642318, 0.06955911290379053, 0.040195441044440436] | 846059 | 615473 |
2.0984 | 7.0 | 180334 | 0.0721 | 0.6587 | 0.7055 | 2.1471 | [0.36612131721578584, 0.13431329561035363, 0.0708157263719857, 0.041272288902252124] | 846059 | 596865 |
2.0785 | 8.0 | 206096 | 0.0724 | 0.6756 | 0.7183 | 2.1353 | [0.36380808213768595, 0.13209779987841613, 0.06888583628832168, 0.039797612956022015] | 846059 | 607760 |
2.044 | 9.0 | 231858 | 0.0651 | 0.5890 | 0.6539 | 2.1307 | [0.3747329223983659, 0.13597984423722942, 0.07083334152049311, 0.041232475735633954] | 846059 | 553210 |
2.0022 | 10.0 | 257620 | 0.0678 | 0.6182 | 0.6752 | 2.1244 | [0.37057115300122706, 0.13501863826827087, 0.07054691458053057, 0.04094138244503552] | 846059 | 571283 |
2.0115 | 11.0 | 283382 | 0.0714 | 0.6437 | 0.6942 | 2.1181 | [0.3696569336851962, 0.1359002395637604, 0.07172057187893609, 0.04198004369041997] | 846059 | 587350 |
1.9957 | 12.0 | 309144 | 0.0780 | 0.7340 | 0.7638 | 2.1182 | [0.3562361599633563, 0.13051385463885645, 0.06873666863799165, 0.039808098889084334] | 846059 | 646223 |
1.9816 | 13.0 | 334906 | 0.0748 | 0.6775 | 0.7198 | 2.1112 | [0.3644272643077186, 0.1348813193666958, 0.07171355661769002, 0.04221956829440906] | 846059 | 608972 |
1.9799 | 14.0 | 360668 | 0.0729 | 0.6567 | 0.7039 | 2.1094 | [0.3683080239907046, 0.1360146909050323, 0.07189528256366162, 0.04211029597965069] | 846059 | 595564 |
1.9721 | 15.0 | 386430 | 0.0724 | 0.6428 | 0.6935 | 2.1035 | [0.37174066063670774, 0.13775257176864, 0.07323700636731323, 0.042981616643797974] | 846059 | 586737 |
1.9415 | 16.0 | 412192 | 0.0707 | 0.6275 | 0.6822 | 2.1052 | [0.37395952455210174, 0.1379846553581918, 0.07303615398474567, 0.04286080713028393] | 846059 | 577140 |
1.921 | 17.0 | 437954 | 0.0755 | 0.6693 | 0.7135 | 2.1031 | [0.368775375991227, 0.137205943045811, 0.07336018463397673, 0.04358297628398173] | 846059 | 603671 |
1.9281 | 18.0 | 463716 | 0.0730 | 0.6426 | 0.6934 | 2.1008 | [0.3719142378879256, 0.13830769094500395, 0.0738741230374939, 0.043850156367431746] | 846059 | 586646 |
1.9619 | 19.0 | 489478 | 0.0741 | 0.6539 | 0.7019 | 2.1011 | [0.3690046967173499, 0.1375453499071136, 0.07371470581812624, 0.044035281313872486] | 846059 | 593819 |
1.9177 | 20.0 | 515240 | 0.0745 | 0.6551 | 0.7028 | 2.1008 | [0.370227852188713, 0.13803247473556413, 0.07398987834019316, 0.04421999242711094] | 846059 | 594596 |
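The metric columns (Bleu, Precisions, Brevity Penalty, Length Ratio, Translation Length, Reference Length) match the keys returned by the 🤗 Evaluate BLEU metric. A minimal sketch with placeholder strings; a real run would score the model's generations against code_search_net references:

```python
import evaluate

bleu = evaluate.load("bleu")

# Placeholder prediction/reference pair, for illustration only.
results = bleu.compute(
    predictions=["returns the sum of two numbers"],
    references=[["return the sum of two integers"]],
)
print(results)
# -> dict with keys: 'bleu', 'precisions', 'brevity_penalty',
#    'length_ratio', 'translation_length', 'reference_length'
```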
## Framework versions
- Transformers 4.37.2
- Pytorch 2.2.0+cu121
- Datasets 2.17.0
- Tokenizers 0.15.2