
pretrain_base_tokenizer

This model was trained from scratch on the code_search_net dataset. It achieves the following results on the evaluation set:

  • Loss: 2.1008
  • Bleu: 0.0745
  • Precisions (1- to 4-gram): [0.370227852188713, 0.13803247473556413, 0.07398987834019316, 0.04421999242711094]
  • Brevity Penalty: 0.6551
  • Length Ratio: 0.7028
  • Translation Length: 594596
  • Reference Length: 846059
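
The card does not name the metric implementation, but the reported fields (Bleu, Precisions, Brevity Penalty, Length Ratio, Translation Length, Reference Length) match the output of the BLEU metric in the Hugging Face evaluate library. A minimal reproduction sketch under that assumption, with made-up strings standing in for real model outputs:

```python
import evaluate

# Load the BLEU metric; its output dictionary mirrors the fields above.
bleu = evaluate.load("bleu")

predictions = ["Return the sum of two numbers ."]       # hypothetical model output
references = [["Return the sum of the two numbers ."]]  # hypothetical reference

result = bleu.compute(predictions=predictions, references=references)
# result["bleu"]             overall score
# result["precisions"]       1- to 4-gram precisions
# result["brevity_penalty"]  exp(1 - reference_length / translation_length)
#                            when outputs are shorter than references, else 1.0
# result["length_ratio"]     translation_length / reference_length
print(result)
```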

Model description

More information needed

Intended uses & limitations

More information needed
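
No usage example is given. As a starting point, here is a minimal loading sketch, assuming a sequence-to-sequence architecture (the BLEU evaluation suggests a text-generation task such as code-to-documentation on code_search_net); the model ID comes from the card header and the input string is made up:

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "sc20fg/pretrain_base_tokenizer"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)  # assumes a seq2seq checkpoint

code = "def add(a, b):\n    return a + b"  # hypothetical input
inputs = tokenizer(code, return_tensors="pt", truncation=True)
output_ids = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```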

Training and evaluation data

More information needed
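
Per the header, training used the code_search_net dataset. A loading sketch with the datasets library; the "python" configuration and the column names are assumptions based on the public code_search_net loading script, since the card does not say which language subset(s) were used:

```python
from datasets import load_dataset

# "python" is one of several code_search_net configurations
# ("all", "java", "go", "python", "javascript", "ruby", "php").
dataset = load_dataset("code_search_net", "python")

example = dataset["train"][0]
print(example["whole_func_string"])          # function source code
print(example["func_documentation_string"])  # paired docstring
```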

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 20
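
A sketch of how these hyperparameters map onto Transformers training arguments; output_dir is a placeholder, and evaluation_strategy="epoch" and predict_with_generate=True are inferred from the per-epoch BLEU results below rather than stated on the card:

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="pretrain_base_tokenizer",  # placeholder
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    adam_beta1=0.9,               # Adam betas=(0.9, 0.999)
    adam_beta2=0.999,
    adam_epsilon=1e-8,            # epsilon=1e-08
    lr_scheduler_type="linear",
    num_train_epochs=20,
    evaluation_strategy="epoch",  # inferred: results are logged once per epoch
    predict_with_generate=True,   # inferred: BLEU needs generated sequences
)
```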

Training results

| Training Loss | Epoch | Step | Bleu | Brevity Penalty | Length Ratio | Validation Loss | Precisions | Reference Length | Translation Length |
|---|---|---|---|---|---|---|---|---|---|
| 2.4112 | 1.0 | 25762 | 0.0658 | 0.6617 | 0.7078 | 2.3310 | [0.35249982048117723, 0.12327087650542343, 0.06319307060145318, 0.03566263068721848] | 846059 | 598823 |
| 2.3334 | 2.0 | 51524 | 0.0681 | 0.6617 | 0.7078 | 2.2582 | [0.35832782242172834, 0.127192419612726, 0.06572103592555187, 0.03742664572798245] | 846059 | 598812 |
| 2.2441 | 3.0 | 77286 | 0.0696 | 0.6529 | 0.7011 | 2.2180 | [0.36256557741844125, 0.13175932444407085, 0.06865854925017445, 0.039288874037902925] | 846059 | 593192 |
| 2.1798 | 4.0 | 103048 | 0.0721 | 0.6729 | 0.7162 | 2.1907 | [0.36450348611741407, 0.13225409723744794, 0.06903673685953533, 0.0396539707725045] | 846059 | 605975 |
| 2.1424 | 5.0 | 128810 | 0.0715 | 0.6561 | 0.7035 | 2.1736 | [0.3668020289217111, 0.13425532471838544, 0.07022789074894986, 0.04068744822577534] | 846059 | 595193 |
| 2.1132 | 6.0 | 154572 | 0.0739 | 0.6875 | 0.7275 | 2.1539 | [0.36025300866163096, 0.13232255476642318, 0.06955911290379053, 0.040195441044440436] | 846059 | 615473 |
| 2.0984 | 7.0 | 180334 | 0.0721 | 0.6587 | 0.7055 | 2.1471 | [0.36612131721578584, 0.13431329561035363, 0.0708157263719857, 0.041272288902252124] | 846059 | 596865 |
| 2.0785 | 8.0 | 206096 | 0.0724 | 0.6756 | 0.7183 | 2.1353 | [0.36380808213768595, 0.13209779987841613, 0.06888583628832168, 0.039797612956022015] | 846059 | 607760 |
| 2.044 | 9.0 | 231858 | 0.0651 | 0.5890 | 0.6539 | 2.1307 | [0.3747329223983659, 0.13597984423722942, 0.07083334152049311, 0.041232475735633954] | 846059 | 553210 |
| 2.0022 | 10.0 | 257620 | 0.0678 | 0.6182 | 0.6752 | 2.1244 | [0.37057115300122706, 0.13501863826827087, 0.07054691458053057, 0.04094138244503552] | 846059 | 571283 |
| 2.0115 | 11.0 | 283382 | 0.0714 | 0.6437 | 0.6942 | 2.1181 | [0.3696569336851962, 0.1359002395637604, 0.07172057187893609, 0.04198004369041997] | 846059 | 587350 |
| 1.9957 | 12.0 | 309144 | 0.0780 | 0.7340 | 0.7638 | 2.1182 | [0.3562361599633563, 0.13051385463885645, 0.06873666863799165, 0.039808098889084334] | 846059 | 646223 |
| 1.9816 | 13.0 | 334906 | 0.0748 | 0.6775 | 0.7198 | 2.1112 | [0.3644272643077186, 0.1348813193666958, 0.07171355661769002, 0.04221956829440906] | 846059 | 608972 |
| 1.9799 | 14.0 | 360668 | 0.0729 | 0.6567 | 0.7039 | 2.1094 | [0.3683080239907046, 0.1360146909050323, 0.07189528256366162, 0.04211029597965069] | 846059 | 595564 |
| 1.9721 | 15.0 | 386430 | 0.0724 | 0.6428 | 0.6935 | 2.1035 | [0.37174066063670774, 0.13775257176864, 0.07323700636731323, 0.042981616643797974] | 846059 | 586737 |
| 1.9415 | 16.0 | 412192 | 0.0707 | 0.6275 | 0.6822 | 2.1052 | [0.37395952455210174, 0.1379846553581918, 0.07303615398474567, 0.04286080713028393] | 846059 | 577140 |
| 1.921 | 17.0 | 437954 | 0.0755 | 0.6693 | 0.7135 | 2.1031 | [0.368775375991227, 0.137205943045811, 0.07336018463397673, 0.04358297628398173] | 846059 | 603671 |
| 1.9281 | 18.0 | 463716 | 0.0730 | 0.6426 | 0.6934 | 2.1008 | [0.3719142378879256, 0.13830769094500395, 0.0738741230374939, 0.043850156367431746] | 846059 | 586646 |
| 1.9619 | 19.0 | 489478 | 0.0741 | 0.6539 | 0.7019 | 2.1011 | [0.3690046967173499, 0.1375453499071136, 0.07371470581812624, 0.044035281313872486] | 846059 | 593819 |
| 1.9177 | 20.0 | 515240 | 0.0745 | 0.6551 | 0.7028 | 2.1008 | [0.370227852188713, 0.13803247473556413, 0.07398987834019316, 0.04421999242711094] | 846059 | 594596 |

Framework versions

  • Transformers 4.37.2
  • Pytorch 2.2.0+cu121
  • Datasets 2.17.0
  • Tokenizers 0.15.2

Model size

  • 223M parameters (F32, Safetensors)