
chinese-bert-wwm-chinese_bert_wwm3

This model is a fine-tuned version of hfl/chinese-bert-wwm on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 0.0000
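
As a quick check, the fine-tuned weights can be loaded with the standard `transformers` auto classes. The downstream task and head are not documented in this card, so the sketch below only runs the encoder; the Hub repository id is an assumption, taken to match this card's title.

```python
from transformers import AutoTokenizer, AutoModel

# Hypothetical Hub id, assumed to match this card's title.
model_id = "chinese-bert-wwm-chinese_bert_wwm3"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModel.from_pretrained(model_id)

# Encode a short Chinese sentence and inspect the encoder output shape.
inputs = tokenizer("今天天气很好。", return_tensors="pt")
outputs = model(**inputs)
print(outputs.last_hidden_state.shape)  # (1, sequence_length, 768)
```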

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a `TrainingArguments` sketch follows the list):

  • learning_rate: 2e-05
  • train_batch_size: 64
  • eval_batch_size: 64
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 30.0
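
For reproducibility, these settings map onto `TrainingArguments` roughly as sketched below; the output directory and the surrounding `Trainer`/dataset setup are assumptions, since the card does not document them.

```python
from transformers import TrainingArguments

# Minimal sketch mirroring the hyperparameters listed above.
training_args = TrainingArguments(
    output_dir="chinese-bert-wwm-chinese_bert_wwm3",  # hypothetical output path
    learning_rate=2e-5,
    per_device_train_batch_size=64,
    per_device_eval_batch_size=64,
    seed=42,
    num_train_epochs=30.0,
    lr_scheduler_type="linear",
    # Adam with betas=(0.9, 0.999) and epsilon=1e-08 matches the Trainer
    # defaults (adam_beta1, adam_beta2, adam_epsilon), so no override is needed.
)
```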

Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| No log        | 1.0   | 72   | 0.4251          |
| No log        | 2.0   | 144  | 0.0282          |
| No log        | 3.0   | 216  | 0.0048          |
| No log        | 4.0   | 288  | 0.0018          |
| No log        | 5.0   | 360  | 0.0011          |
| No log        | 6.0   | 432  | 0.0006          |
| 0.483         | 7.0   | 504  | 0.0004          |
| 0.483         | 8.0   | 576  | 0.0004          |
| 0.483         | 9.0   | 648  | 0.0002          |
| 0.483         | 10.0  | 720  | 0.0002          |
| 0.483         | 11.0  | 792  | 0.0002          |
| 0.483         | 12.0  | 864  | 0.0001          |
| 0.483         | 13.0  | 936  | 0.0001          |
| 0.0031        | 14.0  | 1008 | 0.0001          |
| 0.0031        | 15.0  | 1080 | 0.0001          |
| 0.0031        | 16.0  | 1152 | 0.0001          |
| 0.0031        | 17.0  | 1224 | 0.0001          |
| 0.0031        | 18.0  | 1296 | 0.0001          |
| 0.0031        | 19.0  | 1368 | 0.0001          |
| 0.0031        | 20.0  | 1440 | 0.0001          |
| 0.0015        | 21.0  | 1512 | 0.0001          |
| 0.0015        | 22.0  | 1584 | 0.0001          |
| 0.0015        | 23.0  | 1656 | 0.0001          |
| 0.0015        | 24.0  | 1728 | 0.0001          |
| 0.0015        | 25.0  | 1800 | 0.0000          |
| 0.0015        | 26.0  | 1872 | 0.0001          |
| 0.0015        | 27.0  | 1944 | 0.0000          |
| 0.001         | 28.0  | 2016 | 0.0000          |
| 0.001         | 29.0  | 2088 | 0.0000          |
| 0.001         | 30.0  | 2160 | 0.0000          |

Framework versions

  • Transformers 4.11.3
  • Pytorch 1.9.1
  • Datasets 1.13.3
  • Tokenizers 0.10.3