bert2bert-extremecleandata-lr-5e-05-encmaxlen-512-decmaxlen-256-abs

Dev Set: Extreme Clean Data
Encoder max length (input): 512
Decoder max length (output): 256

This model was trained from scratch on the id_liputan6 dataset. It achieves the following results on the evaluation set, which match the final row (epoch 20) of the training results table:

Loss: 3.8067
R1 Precision: 0.3498
R1 Recall: 0.2552
R1 Fmeasure: 0.2924
R2 Precision: 0.1424
R2 Recall: 0.1011
R2 Fmeasure: 0.1171
Rl Precision: 0.2867
Rl Recall: 0.2094
Rl Fmeasure: 0.2398
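
As a quick consistency check on the figures above, the sketch below computes the harmonic mean of the reported R1 precision and recall. It comes out slightly higher than the reported R1 F-measure. One plausible explanation, which is an assumption and not something this card states, is that each metric is averaged per example, so the mean F-measure need not equal the harmonic mean of the mean precision and mean recall. The helper name `harmonic_f1` is illustrative only.

```python
def harmonic_f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (the usual F1 formula)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values copied from the evaluation results listed above.
r1_precision, r1_recall, r1_fmeasure = 0.3498, 0.2552, 0.2924

# F1 computed from the two averaged scores.
f_from_means = harmonic_f1(r1_precision, r1_recall)

# Positive gap: the reported F-measure is lower than the F1 of the averages.
gap = f_from_means - r1_fmeasure

print(f"F1 from mean P/R: {f_from_means:.4f}, reported: {r1_fmeasure:.4f}, gap: {gap:.4f}")
```

The gap is small, so the reported precision, recall, and F-measure are best read as three separately averaged scores.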

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 5e-05
train_batch_size: 18
eval_batch_size: 18
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 20
mixed_precision_training: Native AMP
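
To make the linear schedule concrete, here is a minimal sketch of how the learning rate would decay over the run. It assumes zero warmup steps (no warmup value is listed above) and a decay to zero at the last step; the total of 215440 steps and the 10772 steps per epoch are taken from the training results table. The function name `linear_lr` is hypothetical, not a library API. The second part estimates the training set size from the batch size, assuming a single device, no gradient accumulation, and a partial last batch that is kept.

```python
BASE_LR = 5e-05
TOTAL_STEPS = 215_440      # final step in the training results table
STEPS_PER_EPOCH = 10_772   # step count after epoch 1
TRAIN_BATCH_SIZE = 18
WARMUP_STEPS = 0           # assumption: the card lists no warmup setting


def linear_lr(step: int, base_lr: float = BASE_LR,
              total_steps: int = TOTAL_STEPS,
              warmup_steps: int = WARMUP_STEPS) -> float:
    """Linear warmup (if any) followed by linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Learning rate at the start, the midpoint, and the end of training.
for epoch in (0, 10, 20):
    print(epoch, linear_lr(epoch * STEPS_PER_EPOCH))

# Range of training set sizes consistent with 10772 steps per epoch
# (assumes one device, no gradient accumulation, last partial batch kept).
max_examples = STEPS_PER_EPOCH * TRAIN_BATCH_SIZE
min_examples = (STEPS_PER_EPOCH - 1) * TRAIN_BATCH_SIZE + 1
print(min_examples, max_examples)
```

Under these assumptions the learning rate halves by epoch 10 and reaches zero at step 215440.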

Training results

Training Loss	Epoch	Step	Validation Loss	R1 Precision	R1 Recall	R1 Fmeasure	R2 Precision	R2 Recall	R2 Fmeasure	Rl Precision	Rl Recall	Rl Fmeasure
2.3503	1.0	10772	2.6541	0.3481	0.2534	0.2906	0.1416	0.1004	0.1163	0.2906	0.2118	0.2427
1.5052	2.0	21544	2.5634	0.3506	0.2547	0.2924	0.1434	0.1018	0.1179	0.2914	0.2121	0.2432
1.3217	3.0	32316	2.5531	0.358	0.2605	0.2988	0.1477	0.1048	0.1214	0.2974	0.2168	0.2485
1.193	4.0	43088	2.5764	0.3619	0.2648	0.3029	0.1506	0.1074	0.1241	0.2996	0.2192	0.2508
1.0839	5.0	53860	2.6102	0.3556	0.2596	0.2973	0.1468	0.1044	0.1207	0.294	0.2149	0.246
0.9875	6.0	64632	2.6594	0.3569	0.2617	0.2992	0.1467	0.1048	0.1211	0.2942	0.2159	0.2467
0.8956	7.0	75404	2.7380	0.3565	0.2607	0.2983	0.1469	0.1048	0.1211	0.2951	0.2161	0.2471
0.8147	8.0	86176	2.8133	0.3584	0.2625	0.3002	0.1475	0.1053	0.1217	0.2955	0.2166	0.2476
0.7345	9.0	96948	2.9544	0.3577	0.2602	0.2985	0.1476	0.1046	0.1212	0.2933	0.2134	0.2448
0.6626	10.0	107720	3.0282	0.3565	0.2602	0.2981	0.145	0.1034	0.1195	0.2926	0.2138	0.2448
0.5974	11.0	118492	3.1423	0.3547	0.2592	0.2967	0.1448	0.1032	0.1193	0.2901	0.2122	0.2428
0.5357	12.0	129264	3.2344	0.3533	0.2579	0.2954	0.1446	0.1029	0.119	0.2892	0.2113	0.242
0.4262	13.0	140036	3.3609	0.3523	0.2571	0.2946	0.1429	0.1018	0.1177	0.2888	0.211	0.2416
0.3724	14.0	150808	3.4457	0.3558	0.2603	0.2979	0.1459	0.1041	0.1203	0.2913	0.2133	0.2439
0.3318	15.0	161580	3.5539	0.3522	0.2571	0.2945	0.144	0.1024	0.1185	0.2888	0.2109	0.2415
0.297	16.0	172352	3.6266	0.3531	0.2573	0.2949	0.1441	0.1021	0.1183	0.2892	0.2108	0.2415
0.2648	17.0	183124	3.6790	0.353	0.258	0.2954	0.1431	0.102	0.1179	0.289	0.2114	0.2419
0.2378	18.0	193896	3.7428	0.3504	0.2557	0.2929	0.1425	0.1013	0.1172	0.2869	0.2096	0.24
0.2155	19.0	204668	3.7795	0.352	0.2572	0.2945	0.1435	0.1021	0.1181	0.2891	0.2113	0.2419
0.1976	20.0	215440	3.8067	0.3498	0.2552	0.2924	0.1424	0.1011	0.1171	0.2867	0.2094	0.2398
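
The table shows training loss falling every epoch while validation loss rises after epoch 3, a pattern that usually indicates overfitting. The sketch below picks the epoch with the lowest validation loss and the epoch with the highest R1 F-measure from the numbers above. It only analyses the logged values; whether intermediate checkpoints were saved is not stated in this card, and the variable names are illustrative.

```python
# (epoch, training_loss, validation_loss, r1_fmeasure), copied from the table above.
rows = [
    (1, 2.3503, 2.6541, 0.2906), (2, 1.5052, 2.5634, 0.2924),
    (3, 1.3217, 2.5531, 0.2988), (4, 1.1930, 2.5764, 0.3029),
    (5, 1.0839, 2.6102, 0.2973), (6, 0.9875, 2.6594, 0.2992),
    (7, 0.8956, 2.7380, 0.2983), (8, 0.8147, 2.8133, 0.3002),
    (9, 0.7345, 2.9544, 0.2985), (10, 0.6626, 3.0282, 0.2981),
    (11, 0.5974, 3.1423, 0.2967), (12, 0.5357, 3.2344, 0.2954),
    (13, 0.4262, 3.3609, 0.2946), (14, 0.3724, 3.4457, 0.2979),
    (15, 0.3318, 3.5539, 0.2945), (16, 0.2970, 3.6266, 0.2949),
    (17, 0.2648, 3.6790, 0.2954), (18, 0.2378, 3.7428, 0.2929),
    (19, 0.2155, 3.7795, 0.2945), (20, 0.1976, 3.8067, 0.2924),
]

# Epoch with the lowest validation loss.
best_by_loss = min(rows, key=lambda r: r[2])[0]

# Epoch with the highest ROUGE-1 F-measure.
best_by_r1 = max(rows, key=lambda r: r[3])[0]

# True if validation loss increases at every epoch after its minimum.
after_best = [r[2] for r in rows if r[0] >= best_by_loss]
rises_after_best = all(a < b for a, b in zip(after_best, after_best[1:]))

print(best_by_loss, best_by_r1, rises_after_best)
```

By either criterion the best epoch comes early in the run (epoch 3 by validation loss, epoch 4 by R1 F-measure), so the final-epoch figures reported at the top of the card are not the strongest ones in the log.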

Framework versions

Transformers 4.39.3
PyTorch 2.2.1
Datasets 2.18.0
Tokenizers 0.15.2

Alfahluzi/bert2bert-model1
