# pretrain_custom_tokenizer
This model was trained from scratch on the code_search_net dataset. It achieves the following results on the evaluation set:
- Loss: 2.9012
- Bleu: 0.0437
- Precisions: [0.17073810819731178, 0.05349823043007888, 0.026839997681805762, 0.014878806668698525]
- Brevity Penalty: 1.0
- Length Ratio: 1.8881
- Translation Length: 1493697
- Reference Length: 791127
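As a sanity check, the reported Bleu score is consistent with the other reported numbers: corpus BLEU is the geometric mean of the four modified n-gram precisions multiplied by the brevity penalty, and the brevity penalty is 1.0 here because the generated translations are longer than the references. A minimal sketch in plain Python, using the values copied from the list above:

```python
import math

# Evaluation-set numbers copied from the model card above.
precisions = [0.17073810819731178, 0.05349823043007888,
              0.026839997681805762, 0.014878806668698525]
translation_length = 1493697
reference_length = 791127

# Brevity penalty: 1.0 when candidates are longer than the references,
# exp(1 - ref/cand) when they are shorter.
bp = 1.0 if translation_length > reference_length else math.exp(
    1 - reference_length / translation_length)

# Length ratio as reported by the metric.
length_ratio = translation_length / reference_length

# BLEU = brevity penalty * geometric mean of the n-gram precisions.
bleu = bp * math.exp(sum(math.log(p) for p in precisions) / len(precisions))

print(round(bleu, 4), bp, round(length_ratio, 4))  # → 0.0437 1.0 1.8881
```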
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 20
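With a linear schedule, the learning rate decays from its initial value to zero over the full run; here that is 25762 optimizer steps per epoch × 20 epochs = 515240 steps, matching the final step in the results table. A minimal sketch of the decay rule, assuming no warmup steps (warmup settings are not recorded in this card, so this is not the exact Transformers scheduler):

```python
# Linear learning-rate decay without warmup: a sketch, not the exact
# Transformers scheduler used for this run.
LEARNING_RATE = 2e-5
STEPS_PER_EPOCH = 25762          # from the training-results table
NUM_EPOCHS = 20
TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS  # 515240

def linear_lr(step: int) -> float:
    """Learning rate after `step` optimizer updates."""
    return LEARNING_RATE * max(0.0, 1.0 - step / TOTAL_STEPS)

print(linear_lr(0))            # initial learning rate at the start of training
print(linear_lr(TOTAL_STEPS))  # decayed to 0.0 at the end of epoch 20
```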
### Training results
| Training Loss | Epoch | Step | Bleu | Brevity Penalty | Length Ratio | Validation Loss | Precisions | Reference Length | Translation Length |
|---|---|---|---|---|---|---|---|---|---|
| 3.87 | 1.0 | 25762 | 0.0330 | 1.0 | 1.9706 | 3.7713 | [0.13936095014631156, 0.040556966110672256, 0.01980787640709075, 0.010544173702481034] | 791127 | 1559002 |
| 3.6512 | 2.0 | 51524 | 0.0296 | 1.0 | 2.1794 | 3.5106 | [0.12352037416285759, 0.03645373394602681, 0.01784718216329517, 0.009590691394861035] | 791127 | 1724169 |
| 3.5043 | 3.0 | 77286 | 0.0366 | 1.0 | 2.0612 | 3.3681 | [0.14769295867930776, 0.04478064762804751, 0.022321614312460995, 0.01211843395693241] | 791127 | 1630660 |
| 3.3524 | 4.0 | 103048 | 0.0373 | 1.0 | 1.9870 | 3.2651 | [0.15228345344701566, 0.04586568142634442, 0.02256759313264377, 0.012219488476642417] | 791127 | 1571983 |
| 3.2746 | 5.0 | 128810 | 0.0384 | 1.0 | 2.0390 | 3.1935 | [0.1523531796659091, 0.04687264288515885, 0.023561239014433664, 0.012935446265137207] | 791127 | 1613094 |
| 3.2305 | 6.0 | 154572 | 0.0387 | 1.0 | 1.9700 | 3.1368 | [0.1567301269740848, 0.047534152418592664, 0.023522792038785066, 0.012842802012275794] | 791127 | 1558507 |
| 3.1199 | 7.0 | 180334 | 0.0406 | 1.0 | 1.9295 | 3.0924 | [0.16104313669485146, 0.0497667381497795, 0.02487902888463687, 0.013686010776483513] | 791127 | 1526473 |
| 3.1476 | 8.0 | 206096 | 0.0416 | 1.0 | 1.9408 | 3.0537 | [0.16303145796074886, 0.050591823896046224, 0.025582405968043283, 0.014183117767188563] | 791127 | 1535446 |
| 3.031 | 9.0 | 231858 | 0.0424 | 1.0 | 1.8818 | 3.0262 | [0.16684712738332408, 0.051844235468668176, 0.026003093150347323, 0.01442481092789985] | 791127 | 1488782 |
| 3.0243 | 10.0 | 257620 | 0.0420 | 1.0 | 1.8859 | 3.0003 | [0.16607697592198523, 0.05141761221771236, 0.025742386869365745, 0.014193381846444359] | 791127 | 1492025 |
| 3.0343 | 11.0 | 283382 | 0.0428 | 1.0 | 1.8886 | 2.9777 | [0.1691752170217323, 0.052193166007217705, 0.026141681013552288, 0.014484642594473435] | 791127 | 1494090 |
| 2.9652 | 12.0 | 309144 | 0.0428 | 1.0 | 1.9005 | 2.9615 | [0.16823973933395542, 0.052320879613441, 0.026187658344737696, 0.014502132420939685] | 791127 | 1503533 |
| 2.9981 | 13.0 | 334906 | 0.0437 | 1.0 | 1.8706 | 2.9445 | [0.16985697826461554, 0.05332124116669285, 0.02686760130903802, 0.014972828451699309] | 791127 | 1479845 |
| 2.941 | 14.0 | 360668 | 0.0432 | 1.0 | 1.8655 | 2.9335 | [0.17029332390165305, 0.052870421070299455, 0.02642667143527729, 0.014670292675070435] | 791127 | 1475877 |
| 2.8816 | 15.0 | 386430 | 0.0437 | 1.0 | 1.8631 | 2.9228 | [0.1712148556231003, 0.053400515923720436, 0.026818846008300055, 0.014919424168060396] | 791127 | 1473920 |
| 2.9124 | 16.0 | 412192 | 0.0435 | 1.0 | 1.8775 | 2.9150 | [0.17018986135899558, 0.05329713851829526, 0.02675404721408581, 0.014813441829455485] | 791127 | 1485347 |
| 2.9019 | 17.0 | 437954 | 0.0433 | 1.0 | 1.8899 | 2.9091 | [0.17013412808717324, 0.053090832561091934, 0.026579011940635933, 0.014667138889511053] | 791127 | 1495138 |
| 2.8737 | 18.0 | 463716 | 0.0438 | 1.0 | 1.8892 | 2.9044 | [0.17056889528815428, 0.05354494235512719, 0.026930478436931807, 0.014909059846295177] | 791127 | 1494616 |
| 2.9192 | 19.0 | 489478 | 0.0439 | 1.0 | 1.8837 | 2.9019 | [0.1710080779910644, 0.05364458098047096, 0.02695562446920049, 0.014976875182171591] | 791127 | 1490222 |
| 2.8501 | 20.0 | 515240 | 0.0437 | 1.0 | 1.8881 | 2.9012 | [0.17073810819731178, 0.05349823043007888, 0.026839997681805762, 0.014878806668698525] | 791127 | 1493697 |
### Framework versions
- Transformers 4.37.2
- Pytorch 2.2.0+cu121
- Datasets 2.17.0
- Tokenizers 0.15.2