Ryukijano/masked-lm-tpu

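This model is a fine-tuned version of roberta-base on an unknown dataset. It achieves the following results at the final logged epoch:

Train Loss: 5.8422
Train Accuracy: 0.0344
Validation Loss: 5.8152
Validation Accuracy: 0.0340
Epoch: 48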

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training hyperparameters

The following hyperparameters were used during training:

optimizer: {'name': 'AdamWeightDecay', 'learning_rate': {'class_name': 'WarmUp', 'config': {'initial_learning_rate': 0.0001, 'decay_schedule_fn': {'class_name': 'PolynomialDecay', 'config': {'initial_learning_rate': 0.0001, 'decay_steps': 111625, 'end_learning_rate': 0.0, 'power': 1.0, 'cycle': False, 'name': None}, 'passive_serialization': True}, 'warmup_steps': 5875, 'power': 1.0, 'name': None}}, 'decay': 0.0, 'beta_1': 0.9, 'beta_2': 0.999, 'epsilon': 1e-08, 'amsgrad': False, 'weight_decay_rate': 0.001}
training_precision: float32
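
The optimizer configuration above describes a warmup phase followed by a polynomial decay with power 1.0, that is, a linear ramp up and a linear ramp down. As a minimal sketch, assuming the usual semantics of these schedule classes (warmup scales the peak rate by step / warmup_steps, then the decay schedule runs on the step count measured from the end of warmup), the learning rate at a given step can be approximated in plain Python. The function name `learning_rate_at` is hypothetical and is not part of any library.

```python
# Values copied from the optimizer config above.
INITIAL_LR = 1e-4
END_LR = 0.0
WARMUP_STEPS = 5875
DECAY_STEPS = 111625
POWER = 1.0


def learning_rate_at(step: int) -> float:
    """Approximate the learning rate at `step` (hypothetical helper).

    Assumes warmup ramps from 0 to INITIAL_LR, after which a non-cycling
    polynomial decay is applied to (step - WARMUP_STEPS).
    """
    if step < WARMUP_STEPS:
        # Warmup: scale the peak rate by the fraction of warmup completed.
        return INITIAL_LR * (step / WARMUP_STEPS) ** POWER
    # Decay: clamp at DECAY_STEPS because 'cycle' is False in the config.
    decay_step = min(step - WARMUP_STEPS, DECAY_STEPS)
    remaining = 1.0 - decay_step / DECAY_STEPS
    return (INITIAL_LR - END_LR) * remaining ** POWER + END_LR


for step in (0, WARMUP_STEPS // 2, WARMUP_STEPS, WARMUP_STEPS + DECAY_STEPS):
    print(step, learning_rate_at(step))
```

Under these assumptions the schedule spans 5875 + 111625 = 117500 steps, so warmup covers 5% of it.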

Training results

Train Loss	Train Accuracy	Validation Loss	Validation Accuracy	Epoch
10.2437	0.0000	10.1909	0.0000	0
10.1151	0.0001	9.9763	0.0016	1
9.8665	0.0107	9.6535	0.0215	2
9.5331	0.0230	9.2992	0.0223	3
9.2000	0.0231	8.9944	0.0222	4
8.9195	0.0229	8.7450	0.0224	5
8.6997	0.0231	8.6124	0.0219	6
8.5689	0.0229	8.4904	0.0222	7
8.4525	0.0230	8.3865	0.0223	8
8.3594	0.0230	8.3069	0.0221	9
8.2662	0.0231	8.2092	0.0224	10
8.1956	0.0231	8.1208	0.0222	11
8.1285	0.0229	8.0806	0.0219	12
8.0345	0.0234	8.0030	0.0220	13
7.9960	0.0228	7.9144	0.0224	14
7.9065	0.0231	7.8661	0.0221	15
7.8449	0.0229	7.7873	0.0219	16
7.7673	0.0232	7.6903	0.0229	17
7.6868	0.0242	7.6129	0.0243	18
7.6206	0.0250	7.5579	0.0246	19
7.5231	0.0258	7.4564	0.0254	20
7.4589	0.0262	7.4136	0.0255	21
7.3658	0.0269	7.2941	0.0265	22
7.2832	0.0274	7.1998	0.0270	23
7.2035	0.0275	7.1203	0.0271	24
7.1116	0.0280	7.0582	0.0269	25
7.0099	0.0287	6.9567	0.0287	26
6.9296	0.0294	6.8759	0.0287	27
6.8524	0.0296	6.8272	0.0285	28
6.7757	0.0300	6.7311	0.0291	29
6.7031	0.0304	6.6316	0.0305	30
6.6361	0.0306	6.5744	0.0307	31
6.5578	0.0312	6.4946	0.0312	32
6.4674	0.0319	6.4212	0.0314	33
6.4096	0.0322	6.3557	0.0320	34
6.3614	0.0321	6.3093	0.0322	35
6.2754	0.0329	6.2240	0.0326	36
6.2609	0.0326	6.2114	0.0321	37
6.1866	0.0329	6.1645	0.0320	38
6.1470	0.0330	6.1193	0.0323	39
6.0936	0.0329	6.0600	0.0324	40
6.0625	0.0330	6.0282	0.0323	41
6.0062	0.0335	5.9649	0.0329	42
5.9731	0.0339	5.9661	0.0330	43
5.9460	0.0335	5.9259	0.0330	44
5.9206	0.0338	5.8926	0.0333	45
5.8734	0.0343	5.8471	0.0340	46
5.8663	0.0341	5.8561	0.0337	47
5.8422	0.0344	5.8152	0.0340	48
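
A masked-LM loss is often easier to read as a perplexity. The sketch below assumes the logged loss is a mean cross-entropy in nats, which the card does not state, and converts a few rows copied from the table above. `perplexity` is a hypothetical helper name.

```python
import math

# (epoch, validation loss) pairs copied from the table above.
VALIDATION_LOSS = [(0, 10.1909), (24, 7.1203), (48, 5.8152)]


def perplexity(loss: float) -> float:
    """Convert a mean cross-entropy loss (assumed to be in nats) to perplexity."""
    return math.exp(loss)


for epoch, loss in VALIDATION_LOSS:
    print(f"epoch {epoch:2d}: loss {loss:.4f} -> perplexity {perplexity(loss):.1f}")
```

If the loss were logged in another unit or aggregated differently, the conversion would need to change accordingly.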