bottleneckBERTlarge

This model is a fine-tuned version of pborchert/BusinessBERT on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 3.1258
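The card does not document the training objective or intended task. Since the base model is a BERT-style encoder and only a raw language-modeling-style loss is reported, the sketch below assumes a masked-language-modeling head; the model ID is taken from the repository name, and the example sentence is illustrative only.

```python
# Minimal usage sketch. The MLM head is an assumption: the card reports only
# a language-modeling-style loss and does not name the training objective.
from transformers import AutoTokenizer, AutoModelForMaskedLM

model_id = "colaguo/bottleneckBERTlarge"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForMaskedLM.from_pretrained(model_id)

# Fill in the masked token of a business-flavored example sentence.
inputs = tokenizer("Quarterly revenue [MASK] analyst expectations.", return_tensors="pt")
logits = model(**inputs).logits
mask_pos = (inputs.input_ids == tokenizer.mask_token_id).nonzero()[0, 1]
print(tokenizer.decode(logits[0, mask_pos].argmax(-1)))
```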

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (mirrored in the configuration sketch after this list):

  • learning_rate: 5e-05
  • train_batch_size: 32
  • eval_batch_size: 32
  • seed: 42
  • optimizer: AdamW (torch implementation) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 500
  • num_epochs: 10
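For reproducibility, here is a minimal sketch of how the listed values map onto `transformers.TrainingArguments`. The output directory, evaluation cadence, and logging cadence are assumptions (the results table suggests evaluation every 500 steps); the training dataset itself is undocumented.

```python
# Sketch only: mirrors the reported hyperparameters. The dataset and data
# collator are not documented in the card and are omitted here.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="bottleneckBERTlarge",  # assumed
    learning_rate=5e-5,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=32,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_steps=500,
    num_train_epochs=10,
    eval_strategy="steps",  # assumed: the results table evaluates every 500 steps
    eval_steps=500,
    logging_steps=500,      # assumed from the 500-step training-loss cadence
)
```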

Training results

| Training Loss | Epoch  | Step  | Validation Loss |
|:-------------:|:------:|:-----:|:---------------:|
| 5.0008        | 0.2565 | 500   | 4.7834          |
| 4.5841        | 0.5131 | 1000  | 4.3743          |
| 4.3433        | 0.7696 | 1500  | 4.1845          |
| 4.2164        | 1.0262 | 2000  | 4.0312          |
| 4.0528        | 1.2827 | 2500  | 3.9675          |
| 4.0343        | 1.5393 | 3000  | 3.8445          |
| 3.9097        | 1.7958 | 3500  | 3.7837          |
| 3.9147        | 2.0523 | 4000  | 3.7297          |
| 3.7895        | 2.3089 | 4500  | 3.6807          |
| 3.7637        | 2.5654 | 5000  | 3.6467          |
| 3.6943        | 2.8220 | 5500  | 3.5823          |
| 3.6166        | 3.0785 | 6000  | 3.5294          |
| 3.5574        | 3.3350 | 6500  | 3.5244          |
| 3.6346        | 3.5916 | 7000  | 3.4654          |
| 3.5088        | 3.8481 | 7500  | 3.4500          |
| 3.4837        | 4.1047 | 8000  | 3.4083          |
| 3.5246        | 4.3612 | 8500  | 3.3814          |
| 3.4569        | 4.6178 | 9000  | 3.3269          |
| 3.4142        | 4.8743 | 9500  | 3.3118          |
| 3.4680        | 5.1308 | 10000 | 3.3323          |
| 3.3737        | 5.3874 | 10500 | 3.3062          |
| 3.3821        | 5.6439 | 11000 | 3.2732          |
| 3.3292        | 5.9005 | 11500 | 3.2607          |
| 3.3308        | 6.1570 | 12000 | 3.2599          |
| 3.3365        | 6.4135 | 12500 | 3.2209          |
| 3.2705        | 6.6701 | 13000 | 3.2004          |
| 3.2914        | 6.9266 | 13500 | 3.2082          |
| 3.2268        | 7.1832 | 14000 | 3.1665          |
| 3.2435        | 7.4397 | 14500 | 3.1607          |
| 3.2424        | 7.6963 | 15000 | 3.1655          |
| 3.2252        | 7.9528 | 15500 | 3.1442          |
| 3.2011        | 8.2093 | 16000 | 3.1570          |
| 3.1927        | 8.4659 | 16500 | 3.1337          |
| 3.2100        | 8.7224 | 17000 | 3.1557          |
| 3.1981        | 8.9790 | 17500 | 3.1240          |
| 3.1616        | 9.2355 | 18000 | 3.1412          |
| 3.2231        | 9.4920 | 18500 | 3.1189          |
| 3.1998        | 9.7486 | 19000 | 3.1258          |
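If the objective is a standard cross-entropy language-modeling loss (an assumption; the card does not confirm it), the validation losses above convert to perplexity via exp(loss):

```python
import math

# Perplexity = exp(cross-entropy). Only meaningful under the assumption
# that the reported loss is a language-modeling cross-entropy.
final_eval_loss = 3.1258
print(f"perplexity ≈ {math.exp(final_eval_loss):.1f}")  # ≈ 22.8
```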

Framework versions

  • Transformers 4.49.0.dev0
  • Pytorch 2.5.1+cu121
  • Datasets 3.2.0
  • Tokenizers 0.21.0
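To check a local environment against these versions, each of the four packages exposes a `__version__` attribute:

```python
# Quick environment check against the versions listed above.
import datasets
import tokenizers
import torch
import transformers

print("Transformers:", transformers.__version__)  # card: 4.49.0.dev0
print("PyTorch:    ", torch.__version__)          # card: 2.5.1+cu121
print("Datasets:   ", datasets.__version__)       # card: 3.2.0
print("Tokenizers: ", tokenizers.__version__)     # card: 0.21.0
```

Note that Transformers 4.49.0.dev0 is a development build, so matching it exactly requires installing transformers from source rather than from PyPI.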