speller-t5-90

This model is a fine-tuned version of sberbank-ai/ruT5-base on an unspecified dataset (the card lists the dataset as None). It achieves the following results on the evaluation set:

Loss: 0.1486
Rouge1: 19.3503
Rouge2: 8.3898
Rougel: 19.4209
Rougelsum: 19.4915
Gen Len: 41.3136

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 5e-05
train_batch_size: 8
eval_batch_size: 8
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 1
mixed_precision_training: Native AMP
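
As an illustration of what the linear lr_scheduler_type setting means for the learning rate over the run, here is a minimal sketch, not the training code itself. It assumes zero warmup steps (the card does not list a warmup setting) and takes the total number of optimizer steps as a parameter, since the card does not state it. The function name `linear_lr` is hypothetical.

```python
def linear_lr(step: int, total_steps: int,
              base_lr: float = 5e-05, warmup_steps: int = 0) -> float:
    """Learning rate at `step`: linear warmup, then linear decay to zero.

    base_lr matches the learning_rate listed in the card; warmup_steps=0
    is an assumption, because the card does not mention warmup.
    """
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    if step < warmup_steps:
        # Ramp up from 0 to base_lr over the warmup phase.
        return base_lr * step / max(1, warmup_steps)
    # Decay linearly from base_lr to 0; clamp at 0 past the end of training.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    # total_steps here is an illustrative placeholder, not a value from the card.
    for s in (0, 250, 500, 750, 1000):
        print(s, linear_lr(s, total_steps=1000))
```

With no warmup, the rate starts at the listed 5e-05 and reaches zero at the final step.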

Training results

Training Loss	Epoch	Step	Validation Loss	Rouge1	Rouge2	Rougel	Rougelsum	Gen Len
0.3435	0.03	500	0.2100	19.3503	8.3898	19.4209	19.4915	41.4492
0.3245	0.07	1000	0.2102	19.5975	8.7571	19.7034	19.774	41.1949
0.3777	0.1	1500	0.2010	19.3503	8.3898	19.4209	19.4915	41.0
0.3643	0.14	2000	0.1980	19.3503	8.3898	19.4209	19.4915	41.0593
0.3212	0.17	2500	0.1986	19.209	8.2062	19.2797	19.2797	41.1525
0.4181	0.2	3000	0.1896	19.3503	8.3898	19.4209	19.4915	42.2373
0.3175	0.24	3500	0.1879	19.3503	8.3898	19.4209	19.4915	41.4576
0.3399	0.27	4000	0.1838	19.3503	8.3898	19.4209	19.4915	41.1102
0.314	0.31	4500	0.1837	19.3503	8.3898	19.4209	19.4915	41.0339
0.3063	0.34	5000	0.1796	19.3503	8.3898	19.4209	19.4915	40.9407
0.3434	0.38	5500	0.1769	19.3503	8.3898	19.4209	19.4915	40.8814
0.376	0.41	6000	0.1790	19.3503	8.3898	19.4209	19.4915	41.0593
0.3355	0.44	6500	0.1735	19.3503	8.3898	19.4209	19.4915	41.4153
0.3181	0.48	7000	0.1665	19.3503	8.3898	19.4209	19.4915	41.0508
0.3017	0.51	7500	0.1701	19.3503	8.3898	19.4209	19.4915	41.2881
0.2953	0.55	8000	0.1664	19.3503	8.3898	19.4209	19.4915	41.2458
0.2711	0.58	8500	0.1664	19.5975	8.7571	19.7034	19.774	41.4068
0.3661	0.61	9000	0.1626	19.5975	8.7571	19.7034	19.774	41.2797
0.273	0.65	9500	0.1585	19.3503	8.3898	19.4209	19.4915	41.3051
0.3346	0.68	10000	0.1627	19.5975	8.7571	19.7034	19.774	41.2797
0.2529	0.72	10500	0.1590	19.3503	8.3898	19.4209	19.4915	41.2627
0.2926	0.75	11000	0.1601	19.5975	8.7571	19.7034	19.774	41.2712
0.2677	0.78	11500	0.1551	19.5975	8.7571	19.7034	19.774	41.2797
0.2746	0.82	12000	0.1570	19.5975	8.7571	19.7034	19.774	41.1186
0.2494	0.85	12500	0.1513	19.3503	8.3898	19.4209	19.4915	41.2373
0.2834	0.89	13000	0.1506	19.5975	8.7571	19.7034	19.774	41.2458
0.2646	0.92	13500	0.1512	19.5975	8.7571	19.7034	19.774	41.3729
0.2782	0.95	14000	0.1528	19.3503	8.3898	19.4209	19.4915	41.3644
0.2954	0.99	14500	0.1486	19.3503	8.3898	19.4209	19.4915	41.3136
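
The Epoch and Step columns allow a rough estimate of how many optimizer steps make up one epoch, and from that the approximate size of the training set. The sketch below does that arithmetic under two assumptions the card does not confirm: training ran on a single device and without gradient accumulation. The Epoch column is rounded to two decimals, so the result is only approximate.

```python
# (epoch, step) pairs copied from the last three rows of the results table.
rows = [(0.92, 13500), (0.95, 14000), (0.99, 14500)]

# train_batch_size from the hyperparameter list; treating it as the
# effective batch size is an assumption (single device, no accumulation).
train_batch_size = 8


def steps_per_epoch(epoch: float, step: int) -> float:
    """Estimate steps in one full epoch from a logged (epoch, step) pair."""
    return step / epoch


estimates = [steps_per_epoch(e, s) for e, s in rows]
mean_steps = sum(estimates) / len(estimates)
approx_examples = mean_steps * train_batch_size

print(f"estimated steps per epoch: {mean_steps:.0f}")
print(f"estimated training examples: {approx_examples:.0f}")
```

Under these assumptions the estimate comes to about 14,700 steps per epoch, or on the order of 117,000 training examples.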

Framework versions

Transformers 4.26.0
PyTorch 1.7.1+cu110
Datasets 2.9.0
Tokenizers 0.13.2

summervent/speller-t5-90
