
base_model_custom_tokenizer

This model is a fine-tuned version of t5-base on the code_search_net dataset. It achieves the following results on the evaluation set:

  • Loss: 2.9297
  • Bleu: 0.0419
  • Precisions (1- to 4-gram): [0.16646886171883812, 0.051341379400381214, 0.025538496667355304, 0.01408001744219341]
  • Brevity Penalty: 1.0
  • Length Ratio: 1.9160
  • Translation Length: 1515803
  • Reference Length: 791127
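
The field names above match the output of the bleu metric in the Hugging Face evaluate library, which appears to be what produced these numbers. A minimal sketch of computing the same fields; the prediction/reference strings below are placeholders, not drawn from the actual evaluation set:

```python
import evaluate

# The bleu metric returns a dict with exactly the keys listed above:
# bleu, precisions, brevity_penalty, length_ratio,
# translation_length, reference_length.
bleu = evaluate.load("bleu")

# Placeholder pair; the real evaluation used code_search_net references.
predictions = ["Return the sum of two numbers."]
references = [["Returns the sum of two numbers."]]

results = bleu.compute(predictions=predictions, references=references)
print(results["bleu"], results["precisions"], results["brevity_penalty"])
```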

Model description

A sequence-to-sequence model (223M parameters, stored as F32 safetensors) fine-tuned from t5-base with a custom tokenizer on the code_search_net dataset.

Intended uses & limitations

More information needed
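
The intended downstream task is not documented; given the code_search_net training data and BLEU-based evaluation, a plausible use is generating natural-language descriptions of code. A minimal loading sketch using the standard transformers API; the repo id comes from this card, while the input string and generation settings are illustrative guesses:

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_id = "sc20fg/base_model_custom_tokenizer"  # repo id from this card
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# Illustrative input only; the expected prompt format is not documented.
code = "def add(a, b):\n    return a + b"
inputs = tokenizer(code, return_tensors="pt", truncation=True)
output_ids = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```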

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 20
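
For reference, these settings correspond to the following Seq2SeqTrainingArguments, assuming the standard transformers Trainer was used; output_dir is a placeholder and any option not listed above keeps its default:

```python
from transformers import Seq2SeqTrainingArguments

# Mirrors the hyperparameters listed above.
training_args = Seq2SeqTrainingArguments(
    output_dir="base_model_custom_tokenizer",  # placeholder path
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=20,
)
```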

Training results

| Training Loss | Epoch | Step | Bleu | Brevity Penalty | Length Ratio | Validation Loss | Precisions | Reference Length | Translation Length |
|:-------------:|:-----:|:------:|:------:|:---------------:|:------------:|:---------------:|:-----------|:----------------:|:------------------:|
| 3.9604 | 1.0 | 25762 | 0.0311 | 1.0 | 2.0901 | 3.8577 | [0.12981129473835085, 0.037916946342151155, 0.018860549385742668, 0.010123458812721054] | 791127 | 1653531 |
| 3.7556 | 2.0 | 51524 | 0.0304 | 1.0 | 2.0887 | 3.5650 | [0.12978779415458075, 0.037579383019195466, 0.018120049525730805, 0.00967159578808246] | 791127 | 1652405 |
| 3.5524 | 3.0 | 77286 | 0.0337 | 1.0 | 2.0745 | 3.4150 | [0.1400710094937268, 0.04118126290523918, 0.0203289377688518, 0.01095848654003696] | 791127 | 1641189 |
| 3.4698 | 4.0 | 103048 | 0.0340 | 1.0 | 2.0788 | 3.3056 | [0.14277601173291565, 0.041700438046903744, 0.020391137906857287, 0.010998711103394348] | 791127 | 1644604 |
| 3.3163 | 5.0 | 128810 | 0.0377 | 1.0 | 2.0193 | 3.2312 | [0.15481298837386176, 0.04617083876865068, 0.022825576079888228, 0.012408874977873952] | 791127 | 1597521 |
| 3.2458 | 6.0 | 154572 | 0.0382 | 1.0 | 1.9276 | 3.1719 | [0.1593547435203856, 0.04704355006890476, 0.023023369844916947, 0.012389103841794662] | 791127 | 1524975 |
| 3.1574 | 7.0 | 180334 | 0.0373 | 1.0 | 2.0231 | 3.1267 | [0.15301209486452477, 0.04557636504175273, 0.022512350851579006, 0.012331176442211789] | 791127 | 1600514 |
| 3.1398 | 8.0 | 206096 | 0.0386 | 1.0 | 1.9724 | 3.0893 | [0.1577822509066417, 0.04745355472604797, 0.023342833604973825, 0.012766267921605798] | 791127 | 1560429 |
| 3.0691 | 9.0 | 231858 | 0.0399 | 1.0 | 1.9159 | 3.0574 | [0.16179891666501725, 0.0490436396529825, 0.024170720153435545, 0.013205125551162357] | 791127 | 1515690 |
| 3.0536 | 10.0 | 257620 | 0.0410 | 1.0 | 1.8550 | 3.0321 | [0.1656489584760067, 0.05027218283158705, 0.024914277684092188, 0.013668271409759075] | 791127 | 1467513 |
| 3.0379 | 11.0 | 283382 | 0.0404 | 1.0 | 1.8928 | 3.0082 | [0.1630008107267023, 0.049590989569352824, 0.02452930558336929, 0.013463575807213558] | 791127 | 1497422 |
| 3.0183 | 12.0 | 309144 | 0.0409 | 1.0 | 1.9428 | 2.9924 | [0.16253787482001938, 0.049984123536708294, 0.02498794115282579, 0.01380309274144192] | 791127 | 1536971 |
| 2.9442 | 13.0 | 334906 | 0.0413 | 1.0 | 1.9288 | 2.9773 | [0.16426924674922966, 0.05052962811986506, 0.025225357778251727, 0.013893123599262487] | 791127 | 1525946 |
| 2.9746 | 14.0 | 360668 | 0.0411 | 1.0 | 1.9154 | 2.9622 | [0.16395222297528722, 0.050373776569881686, 0.02506334156586741, 0.013817874614866431] | 791127 | 1515289 |
| 2.9556 | 15.0 | 386430 | 0.0416 | 1.0 | 1.8903 | 2.9505 | [0.16631916674913938, 0.05114349827528396, 0.025291167834370104, 0.013919582587470626] | 791127 | 1495444 |
| 2.9423 | 16.0 | 412192 | 0.0415 | 1.0 | 1.9161 | 2.9441 | [0.1656048056193977, 0.050903942131636466, 0.02527336097239107, 0.013901882376966617] | 791127 | 1515892 |
| 2.9257 | 17.0 | 437954 | 0.0417 | 1.0 | 1.9204 | 2.9387 | [0.16566872310834463, 0.051149695919205686, 0.02547749541013215, 0.01403388257902964] | 791127 | 1519291 |
| 2.9023 | 18.0 | 463716 | 0.0417 | 1.0 | 1.9252 | 2.9331 | [0.16569868978430946, 0.05118214894137258, 0.025432645752525008, 0.014019028423183673] | 791127 | 1523108 |
| 2.946 | 19.0 | 489478 | 0.0420 | 1.0 | 1.9138 | 2.9301 | [0.16682044755191178, 0.051534782710695386, 0.02563003483561942, 0.014141190855303378] | 791127 | 1514059 |
| 2.8761 | 20.0 | 515240 | 0.0419 | 1.0 | 1.9160 | 2.9297 | [0.16646886171883812, 0.051341379400381214, 0.025538496667355304, 0.01408001744219341] | 791127 | 1515803 |

Framework versions

  • Transformers 4.37.2
  • Pytorch 2.2.0+cu121
  • Datasets 2.17.0
  • Tokenizers 0.15.2
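
A pinned install that approximates this environment; the +cu121 suffix on the PyTorch version denotes a CUDA 12.1 build, so the plain PyPI wheel is shown here:

```
pip install transformers==4.37.2 torch==2.2.0 datasets==2.17.0 tokenizers==0.15.2
```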
