t5-small-paraphrase-pubmed

This model is a fine-tuned version of t5-small on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.4032
  • Rouge2 Precision: 0.8281
  • Rouge2 Recall: 0.6346
  • Rouge2 Fmeasure: 0.6996
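
For reference, a minimal usage sketch with the transformers API follows. The checkpoint name is the repository id; the card does not document whether a task prefix was used during fine-tuning, so passing the raw sentence is an assumption, and the example input and generation settings are illustrative only.

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_name = "gayanin/t5-small-paraphrase-pubmed"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

# Hypothetical input; no task prefix is documented for this checkpoint,
# so the sentence is passed as-is.
text = "Aspirin inhibits platelet aggregation and reduces the risk of myocardial infarction."
inputs = tokenizer(text, return_tensors="pt", truncation=True)
output_ids = model.generate(**inputs, max_length=64, num_beams=4, early_stopping=True)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```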

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 40
  • mixed_precision_training: Native AMP
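
As a sketch, these settings could be expressed with the library's Seq2SeqTrainingArguments as below. The output_dir and evaluation cadence are assumptions (not stated in the card), and the Adam betas and epsilon listed above are the optimizer defaults, so they need no explicit arguments.

```python
from transformers import Seq2SeqTrainingArguments

# Reconstruction of the hyperparameters above; output_dir and
# evaluation_strategy are assumptions, and the listed Adam
# betas/epsilon are the library defaults.
training_args = Seq2SeqTrainingArguments(
    output_dir="t5-small-paraphrase-pubmed",
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    lr_scheduler_type="linear",
    num_train_epochs=40,
    fp16=True,  # Native AMP mixed-precision training
    evaluation_strategy="epoch",
)
```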

Training results

| Training Loss | Epoch | Step  | Validation Loss | Rouge2 Precision | Rouge2 Recall | Rouge2 Fmeasure |
|:-------------:|:-----:|:-----:|:---------------:|:----------------:|:-------------:|:---------------:|
| 0.5253        | 1.0   | 663   | 0.4895          | 0.8217           | 0.6309        | 0.695           |
| 0.5385        | 2.0   | 1326  | 0.4719          | 0.822            | 0.6307        | 0.6953          |
| 0.5255        | 3.0   | 1989  | 0.4579          | 0.8225           | 0.631         | 0.6954          |
| 0.4927        | 4.0   | 2652  | 0.4510          | 0.824            | 0.6315        | 0.6965          |
| 0.484         | 5.0   | 3315  | 0.4426          | 0.8254           | 0.6323        | 0.6974          |
| 0.4691        | 6.0   | 3978  | 0.4383          | 0.8241           | 0.6311        | 0.6962          |
| 0.4546        | 7.0   | 4641  | 0.4319          | 0.8248           | 0.6322        | 0.6969          |
| 0.4431        | 8.0   | 5304  | 0.4270          | 0.8254           | 0.633         | 0.6977          |
| 0.4548        | 9.0   | 5967  | 0.4257          | 0.8257           | 0.6322        | 0.6976          |
| 0.4335        | 10.0  | 6630  | 0.4241          | 0.8271           | 0.6333        | 0.6986          |
| 0.4234        | 11.0  | 7293  | 0.4203          | 0.827            | 0.6341        | 0.6992          |
| 0.433         | 12.0  | 7956  | 0.4185          | 0.8279           | 0.6347        | 0.6998          |
| 0.4108        | 13.0  | 8619  | 0.4161          | 0.8285           | 0.6352        | 0.7004          |
| 0.4101        | 14.0  | 9282  | 0.4133          | 0.8289           | 0.6356        | 0.7008          |
| 0.4155        | 15.0  | 9945  | 0.4149          | 0.8279           | 0.635         | 0.6998          |
| 0.3991        | 16.0  | 10608 | 0.4124          | 0.8289           | 0.6353        | 0.7005          |
| 0.3962        | 17.0  | 11271 | 0.4113          | 0.829            | 0.6353        | 0.7006          |
| 0.3968        | 18.0  | 11934 | 0.4114          | 0.8285           | 0.6352        | 0.7002          |
| 0.3962        | 19.0  | 12597 | 0.4100          | 0.8282           | 0.6346        | 0.6998          |
| 0.3771        | 20.0  | 13260 | 0.4078          | 0.829            | 0.6352        | 0.7005          |
| 0.3902        | 21.0  | 13923 | 0.4083          | 0.8295           | 0.6351        | 0.7006          |
| 0.3811        | 22.0  | 14586 | 0.4077          | 0.8276           | 0.6346        | 0.6995          |
| 0.38          | 23.0  | 15249 | 0.4076          | 0.8281           | 0.6346        | 0.6997          |
| 0.3695        | 24.0  | 15912 | 0.4059          | 0.8277           | 0.6344        | 0.6993          |
| 0.3665        | 25.0  | 16575 | 0.4043          | 0.8278           | 0.6343        | 0.6992          |
| 0.3728        | 26.0  | 17238 | 0.4059          | 0.8279           | 0.6346        | 0.6994          |
| 0.3669        | 27.0  | 17901 | 0.4048          | 0.8271           | 0.6342        | 0.6991          |
| 0.3702        | 28.0  | 18564 | 0.4058          | 0.8265           | 0.6338        | 0.6985          |
| 0.3674        | 29.0  | 19227 | 0.4049          | 0.8277           | 0.6345        | 0.6993          |
| 0.364         | 30.0  | 19890 | 0.4048          | 0.8273           | 0.6341        | 0.699           |
| 0.3618        | 31.0  | 20553 | 0.4041          | 0.828            | 0.6349        | 0.6997          |
| 0.3609        | 32.0  | 21216 | 0.4040          | 0.8275           | 0.6346        | 0.6994          |
| 0.357         | 33.0  | 21879 | 0.4037          | 0.8278           | 0.6348        | 0.6996          |
| 0.3638        | 34.0  | 22542 | 0.4038          | 0.8275           | 0.634         | 0.6989          |
| 0.3551        | 35.0  | 23205 | 0.4035          | 0.8275           | 0.6344        | 0.6992          |
| 0.358         | 36.0  | 23868 | 0.4035          | 0.8279           | 0.6347        | 0.6995          |
| 0.3519        | 37.0  | 24531 | 0.4034          | 0.8277           | 0.6343        | 0.6992          |
| 0.359         | 38.0  | 25194 | 0.4035          | 0.8281           | 0.6346        | 0.6996          |
| 0.3542        | 39.0  | 25857 | 0.4033          | 0.8281           | 0.6346        | 0.6996          |
| 0.3592        | 40.0  | 26520 | 0.4032          | 0.8281           | 0.6346        | 0.6996          |
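
The Rouge2 precision, recall, and fmeasure columns match the fields exposed by the rouge metric in the datasets library (version listed below). It is an assumption that the card's scores were computed this way; a sketch with hypothetical predictions and references:

```python
from datasets import load_metric

rouge = load_metric("rouge")

# Hypothetical predictions and references for illustration only.
predictions = ["aspirin reduces the risk of heart attack"]
references = ["aspirin lowers the risk of myocardial infarction"]

result = rouge.compute(predictions=predictions, references=references)
r2 = result["rouge2"].mid  # midpoint of the bootstrap AggregateScore
print(r2.precision, r2.recall, r2.fmeasure)
```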

Framework versions

  • Transformers 4.12.3
  • Pytorch 1.9.0+cu111
  • Datasets 1.15.1
  • Tokenizers 0.10.3