
pretrained-bert-base-100

This model is a fine-tuned version of an unspecified base model on an unknown dataset. It achieves the following results at the final training epoch:

  • Train Loss: 5.5798
  • Validation Loss: 14.1522
  • Epoch: 99

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a matching Keras optimizer configuration is sketched after the list):

  • optimizer: {'name': 'Adam', 'weight_decay': None, 'clipnorm': None, 'global_clipnorm': None, 'clipvalue': None, 'use_ema': False, 'ema_momentum': 0.99, 'ema_overwrite_frequency': None, 'jit_compile': True, 'is_legacy_optimizer': False, 'learning_rate': 1e-04, 'beta_1': 0.9, 'beta_2': 0.999, 'epsilon': 1e-07, 'amsgrad': False}
  • training_precision: float32
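
As a rough illustration, the optimizer settings above correspond to a Keras Adam configuration along the lines of the sketch below. This is a minimal reconstruction from the hyperparameter dump, not the original training script: the base checkpoint, dataset, and `model.fit` wiring are placeholders, since none of them are recorded on this card.

```python
import tensorflow as tf
from transformers import TFAutoModelForMaskedLM  # assumption: a BERT-style masked-LM head

# Optimizer as listed in the hyperparameter dump: Adam with lr=1e-4, default betas/epsilon,
# no weight decay, no gradient clipping, no EMA, jit_compile enabled (non-legacy TF 2.11 optimizer).
optimizer = tf.keras.optimizers.Adam(
    learning_rate=1e-4,
    beta_1=0.9,
    beta_2=0.999,
    epsilon=1e-7,
    amsgrad=False,
    jit_compile=True,
)

# training_precision: float32 is the Keras default, so no mixed-precision policy is set.

# Hypothetical model/data wiring -- the actual base checkpoint and dataset are not documented.
model = TFAutoModelForMaskedLM.from_pretrained("bert-base-uncased")  # placeholder base model
model.compile(optimizer=optimizer)  # with no loss passed, the Transformers model uses its internal LM loss
# model.fit(train_dataset, validation_data=validation_dataset, epochs=100)  # epochs 0-99 as in the table below
```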

Training results

| Train Loss | Validation Loss | Epoch |
|:----------:|:---------------:|:-----:|
| 8.8180 | 9.8287 | 0 |
| 6.9008 | 9.9128 | 1 |
| 6.6984 | 10.5699 | 2 |
| 6.0922 | 10.6815 | 3 |
| 7.4481 | 10.2008 | 4 |
| 6.1991 | 10.7918 | 5 |
| 6.0636 | 10.8031 | 6 |
| 6.0324 | 10.8065 | 7 |
| 5.8834 | 11.0328 | 8 |
| 5.6852 | 10.9941 | 9 |
| 5.7602 | 11.4597 | 10 |
| 5.7166 | 11.3688 | 11 |
| 5.6131 | nan | 12 |
| 5.6185 | 11.7676 | 13 |
| 5.7273 | 11.7161 | 14 |
| 5.6162 | 11.9106 | 15 |
| 5.7344 | 11.9131 | 16 |
| 5.7757 | 11.7448 | 17 |
| 5.6769 | 11.9218 | 18 |
| 5.6946 | 12.1175 | 19 |
| 5.6924 | 12.1778 | 20 |
| 5.5770 | 12.3167 | 21 |
| 5.4709 | 12.4586 | 22 |
| 5.7594 | 11.9413 | 23 |
| 5.5429 | 12.1610 | 24 |
| 5.4948 | 12.8648 | 25 |
| 5.6066 | 12.5354 | 26 |
| 5.7700 | 12.2591 | 27 |
| 5.6883 | 12.3748 | 28 |
| 5.6293 | 12.6476 | 29 |
| 5.7073 | 12.3106 | 30 |
| 5.6654 | 12.6093 | 31 |
| 5.8030 | 12.9058 | 32 |
| 5.5708 | 12.2990 | 33 |
| 5.6817 | 12.7136 | 34 |
| 5.6733 | 12.4783 | 35 |
| 5.5641 | 12.8990 | 36 |
| 5.6529 | 12.8055 | 37 |
| 5.6624 | 12.6477 | 38 |
| 5.7040 | 12.8407 | 39 |
| 5.6736 | 13.3960 | 40 |
| 5.6500 | 12.9211 | 41 |
| 5.6443 | 12.8308 | 42 |
| 5.5996 | 12.8930 | 43 |
| 5.3710 | 13.4719 | 44 |
| 5.5483 | 13.1366 | 45 |
| 5.5923 | 12.8598 | 46 |
| 5.5535 | 13.5748 | 47 |
| 5.5364 | 13.1579 | 48 |
| 5.7182 | 12.7962 | 49 |
| 5.4856 | 13.0038 | 50 |
| 5.5241 | 12.9632 | 51 |
| 5.4996 | 12.8477 | 52 |
| 5.6620 | 12.8107 | 53 |
| 5.6451 | 13.1976 | 54 |
| 5.5493 | 13.3731 | 55 |
| 5.5629 | 13.1022 | 56 |
| 5.6177 | 12.9348 | 57 |
| 5.6781 | 13.0553 | 58 |
| 5.6112 | 13.2850 | 59 |
| 5.5908 | 13.5602 | 60 |
| 5.6984 | 13.0039 | 61 |
| 5.4979 | 13.9429 | 62 |
| 5.6750 | 13.1717 | 63 |
| 5.6696 | 13.2127 | 64 |
| 5.6631 | 13.1643 | 65 |
| 5.6421 | 13.2311 | 66 |
| 5.6400 | 13.3191 | 67 |
| 5.6845 | 13.2363 | 68 |
| 5.6620 | 13.1115 | 69 |
| 5.6084 | 13.5133 | 70 |
| 5.4539 | 13.7953 | 71 |
| 5.6143 | 13.3565 | 72 |
| 5.6153 | 13.1141 | 73 |
| 5.6301 | 13.8310 | 74 |
| 5.7122 | 13.3998 | 75 |
| 5.5747 | 13.4063 | 76 |
| 5.5796 | 13.6303 | 77 |
| 5.5496 | 13.7870 | 78 |
| 5.5954 | 13.6211 | 79 |
| 5.6439 | 13.4964 | 80 |
| 5.7678 | 13.8165 | 81 |
| 5.5670 | 14.0257 | 82 |
| 5.5355 | 14.0359 | 83 |
| 5.6323 | 13.9998 | 84 |
| 5.5381 | nan | 85 |
| 5.6362 | 13.5828 | 86 |
| 5.6429 | 13.8217 | 87 |
| 5.5660 | 13.5157 | 88 |
| 5.5396 | 14.1864 | 89 |
| 5.5623 | 13.9653 | 90 |
| 5.6208 | 14.1349 | 91 |
| 5.5999 | 13.5511 | 92 |
| 5.6587 | 13.8928 | 93 |
| 5.6402 | 13.6646 | 94 |
| 5.6468 | 13.5333 | 95 |
| 5.5499 | 14.1628 | 96 |
| 5.5621 | 14.3442 | 97 |
| 5.5201 | 14.1347 | 98 |
| 5.5798 | 14.1522 | 99 |

Framework versions

  • Transformers 4.27.0.dev0
  • TensorFlow 2.11.0
  • Datasets 2.9.0
  • Tokenizers 0.13.2
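
If you want to try the checkpoint with the TensorFlow classes matching the framework versions above, a minimal loading sketch might look like the following. The hub id is a placeholder (the full repository path is not shown on this card), and the masked-LM head is an assumption based on the model name; adjust the Auto class to the actual task once it is documented.

```python
from transformers import AutoTokenizer, TFAutoModelForMaskedLM

repo_id = "<user>/pretrained-bert-base-100"  # placeholder: replace with the actual hub id

# Load the tokenizer and the TensorFlow weights for this checkpoint.
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = TFAutoModelForMaskedLM.from_pretrained(repo_id)  # assumed head; the card does not state the task

inputs = tokenizer("The capital of France is [MASK].", return_tensors="tf")
outputs = model(**inputs)
print(outputs.logits.shape)  # (batch, sequence_length, vocab_size)
```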