text_shortening_model_v66

This model is a fine-tuned version of t5-small on an unspecified dataset (the dataset name was not recorded when this card was generated). It achieves the following results on the evaluation set:

Loss: 1.1443
Bert precision: 0.8948
Bert recall: 0.8974
Bert f1-score: 0.8956
Average word count: 6.6286
Max word count: 16
Min word count: 2
Average token count: 10.7187
% shortened texts with length > 12: 2.2022
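
The length statistics above can be approximated with a few lines of Python. This is a minimal sketch under stated assumptions, not the evaluation code used for this model: it counts words by splitting on whitespace and treats "length > 12" as a word-count threshold, neither of which the card confirms. The function name `length_stats` and the sample texts are hypothetical.

```python
def length_stats(texts, threshold=12):
    """Summarise word counts for a list of shortened texts.

    Assumption: words are whitespace-separated and the threshold applies
    to the word count; the card does not say which unit it used.
    """
    counts = [len(t.split()) for t in texts]
    if not counts:
        raise ValueError("texts must not be empty")
    over = sum(1 for c in counts if c > threshold)
    return {
        "average_word_count": sum(counts) / len(counts),
        "max_word_count": max(counts),
        "min_word_count": min(counts),
        "pct_over_threshold": 100.0 * over / len(counts),
    }


# Hypothetical model outputs, for illustration only.
sample = [
    "short headline",
    "a slightly longer shortened text here",
    "one two three",
]
stats = length_stats(sample)
print(stats)
```

If the threshold was actually applied to token counts, the same function works once `len(t.split())` is replaced by the length of the tokenizer output.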

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.0001
train_batch_size: 32
eval_batch_size: 32
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 40
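
To make the linear schedule concrete, the sketch below computes the learning rate at a given optimizer step. The base rate of 0.0001 and the 40 epochs come from the list above, and 73 steps per epoch comes from the Step column of the training results table. `warmup_steps=0` is an assumption, since the card lists no warmup setting, and the helper name `linear_lr` is hypothetical.

```python
def linear_lr(step, base_lr=1e-4, total_steps=73 * 40, warmup_steps=0):
    """Learning rate at a given optimizer step under linear decay.

    base_lr and the 73 steps/epoch x 40 epochs total come from the card;
    warmup_steps=0 is an assumption, since the card lists no warmup.
    """
    if warmup_steps and step < warmup_steps:
        # Linear ramp from 0 up to base_lr during warmup.
        return base_lr * step / warmup_steps
    # Linear decay from base_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Learning rate at the start of training and after 10, 20, and 40 epochs.
for epoch in (0, 10, 20, 40):
    print(epoch, linear_lr(epoch * 73))
```

Under these assumptions the rate halves by epoch 20 and reaches zero at step 2920, the last step in the results table.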

Training results

Training Loss	Epoch	Step	Validation Loss	Bert precision	Bert recall	Bert f1-score	Average word count	Max word count	Min word count	Average token count	% shortened texts with length > 12
1.9029	1.0	73	1.3504	0.8775	0.8783	0.8772	6.6056	16	2	10.4785	2.2022
1.4456	2.0	146	1.2479	0.8813	0.8826	0.8813	6.6196	16	1	10.5105	1.2012
1.3171	3.0	219	1.1852	0.8834	0.8855	0.8839	6.6266	17	2	10.5806	1.5015
1.2221	4.0	292	1.1588	0.8852	0.8898	0.8869	6.7658	16	2	10.7588	1.9019
1.1597	5.0	365	1.1333	0.8865	0.8879	0.8866	6.5606	16	2	10.4735	1.3013
1.0924	6.0	438	1.1215	0.887	0.892	0.8889	6.8579	16	2	10.8759	2.2022
1.0445	7.0	511	1.1125	0.8897	0.8921	0.8904	6.6587	17	2	10.5996	1.5015
1.0004	8.0	584	1.1074	0.8901	0.8936	0.8913	6.7558	16	2	10.7778	2.4024
0.9619	9.0	657	1.1033	0.8903	0.8928	0.891	6.6677	16	2	10.6807	1.6016
0.9266	10.0	730	1.0955	0.8888	0.8921	0.8899	6.7007	16	2	10.7237	1.8018
0.8997	11.0	803	1.0948	0.8901	0.8918	0.8904	6.6236	16	2	10.6396	2.1021
0.87	12.0	876	1.0894	0.8909	0.8929	0.8913	6.6226	16	2	10.6406	2.2022
0.841	13.0	949	1.0987	0.8926	0.8945	0.893	6.5836	16	2	10.6176	1.8018
0.8137	14.0	1022	1.0864	0.8917	0.8939	0.8923	6.6006	16	2	10.6196	1.5015
0.7931	15.0	1095	1.0959	0.8927	0.8945	0.8931	6.6096	16	1	10.6627	1.9019
0.7774	16.0	1168	1.0996	0.8924	0.8939	0.8926	6.5696	16	1	10.6326	1.7017
0.7494	17.0	1241	1.1002	0.8934	0.8942	0.8933	6.5235	16	1	10.5706	1.6016
0.7429	18.0	1314	1.0967	0.8916	0.8958	0.8932	6.7327	16	1	10.7508	1.8018
0.7154	19.0	1387	1.1036	0.8938	0.8953	0.8941	6.6046	16	1	10.6156	1.7017
0.6968	20.0	1460	1.0964	0.8942	0.8962	0.8947	6.5786	16	1	10.6246	1.7017
0.6913	21.0	1533	1.1004	0.8941	0.8956	0.8943	6.5586	16	1	10.5636	1.7017
0.6775	22.0	1606	1.1009	0.8946	0.8961	0.8949	6.5636	16	1	10.5666	1.8018
0.6616	23.0	1679	1.1088	0.8939	0.8958	0.8943	6.5756	16	1	10.6106	1.8018
0.6451	24.0	1752	1.1169	0.8944	0.8973	0.8954	6.6216	16	1	10.6657	2.3023
0.6385	25.0	1825	1.1169	0.8949	0.8973	0.8956	6.5996	16	1	10.6496	2.2022
0.6305	26.0	1898	1.1231	0.8937	0.8968	0.8948	6.6406	16	1	10.7518	2.1021
0.6215	27.0	1971	1.1229	0.895	0.8972	0.8956	6.6156	16	1	10.6837	2.2022
0.6128	28.0	2044	1.1234	0.8946	0.8964	0.895	6.5676	16	2	10.6346	2.1021
0.6067	29.0	2117	1.1262	0.8945	0.8979	0.8957	6.6797	16	2	10.7588	2.3023
0.6017	30.0	2190	1.1302	0.8941	0.8974	0.8953	6.6667	16	2	10.7588	2.2022
0.5924	31.0	2263	1.1263	0.8947	0.8982	0.896	6.6687	16	2	10.7397	2.1021
0.591	32.0	2336	1.1275	0.8948	0.8971	0.8955	6.5976	16	2	10.6677	2.002
0.5862	33.0	2409	1.1328	0.8949	0.8971	0.8955	6.6096	16	2	10.6647	2.1021
0.5772	34.0	2482	1.1377	0.8947	0.8972	0.8955	6.6036	16	2	10.6937	2.1021
0.5754	35.0	2555	1.1382	0.8951	0.8976	0.8959	6.6216	16	2	10.7087	2.2022
0.5673	36.0	2628	1.1428	0.8943	0.8975	0.8954	6.6557	16	2	10.7758	2.2022
0.5698	37.0	2701	1.1434	0.8946	0.8976	0.8956	6.6466	16	2	10.7548	2.2022
0.5555	38.0	2774	1.1449	0.8946	0.8975	0.8956	6.6436	16	2	10.7447	2.3023
0.5647	39.0	2847	1.1443	0.8948	0.8974	0.8956	6.6366	16	2	10.7297	2.2022
0.5602	40.0	2920	1.1443	0.8948	0.8974	0.8956	6.6286	16	2	10.7187	2.2022
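
Validation loss reaches its minimum well before the final epoch while training loss keeps falling, so the last checkpoint is not the one with the lowest validation loss. The sketch below locates that epoch from the Validation Loss column; the values are copied from the table above, and the variable names are illustrative.

```python
# Validation loss per epoch (epochs 1..40), copied from the table above.
val_loss = [
    1.3504, 1.2479, 1.1852, 1.1588, 1.1333, 1.1215, 1.1125, 1.1074,
    1.1033, 1.0955, 1.0948, 1.0894, 1.0987, 1.0864, 1.0959, 1.0996,
    1.1002, 1.0967, 1.1036, 1.0964, 1.1004, 1.1009, 1.1088, 1.1169,
    1.1169, 1.1231, 1.1229, 1.1234, 1.1262, 1.1302, 1.1263, 1.1275,
    1.1328, 1.1377, 1.1382, 1.1428, 1.1434, 1.1449, 1.1443, 1.1443,
]

# Index of the smallest loss, converted to a 1-based epoch number.
best_epoch = min(range(len(val_loss)), key=val_loss.__getitem__) + 1
best_loss = val_loss[best_epoch - 1]
final_loss = val_loss[-1]

print(f"best epoch: {best_epoch}, loss {best_loss}")
print(f"final epoch: {len(val_loss)}, loss {final_loss}")
```

On these values the minimum is 1.0864 at epoch 14, against 1.1443 at epoch 40. Bert f1-score still rises slightly over the same span (0.8923 at epoch 14, 0.8956 at epoch 40), so which checkpoint counts as best depends on the metric chosen.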

Framework versions

Transformers 4.33.1
Pytorch 2.0.1+cu118
Datasets 2.14.5
Tokenizers 0.13.3
