moe_train_run

This model is a fine-tuned version of ModernBERT-base on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 3.9874
  • Model Preparation Time: 0.0047
  • F1: 0.8876
  • Precision: 0.8509
  • Recall: 0.9275
  • Threshold: 0.7668
  • Sim Ratio: 1.4762
  • Pos Sim: 0.8878
  • Neg Sim: 0.6014
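The Threshold, Pos Sim, and Neg Sim metrics suggest a pairwise-similarity setup: a pair is predicted positive when its embedding similarity clears the learned threshold (≈0.7668). A minimal stdlib-only sketch of that decision rule and how precision/recall/F1 follow from it — the similarity scores and helper names below are illustrative, not from this model's evaluation set:

```python
# Hypothetical (similarity, label) pairs; label 1 = positive pair.
# These numbers are made up for illustration, not from the eval set.
pairs = [
    (0.92, 1), (0.85, 1), (0.71, 1), (0.64, 1),  # positive pairs
    (0.81, 0), (0.60, 0), (0.55, 0), (0.48, 0),  # negative pairs
]

THRESHOLD = 0.7668  # decision threshold reported above


def classify(sim: float, threshold: float = THRESHOLD) -> int:
    """Predict 1 (related pair) when similarity clears the threshold."""
    return int(sim >= threshold)


def prf1(pairs, threshold=THRESHOLD):
    """Precision, recall, and F1 for the threshold decision rule."""
    tp = sum(1 for s, y in pairs if classify(s, threshold) == 1 and y == 1)
    fp = sum(1 for s, y in pairs if classify(s, threshold) == 1 and y == 0)
    fn = sum(1 for s, y in pairs if classify(s, threshold) == 0 and y == 1)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


precision, recall, f1 = prf1(pairs)
```

Lowering the threshold trades precision for recall, which is consistent with the reported recall (0.9275) sitting above the precision (0.8509).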

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 1
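These settings map directly onto Hugging Face `TrainingArguments` fields. A hedged reconstruction of the equivalent configuration (the keys follow the `transformers` API of the same names; this is a sketch, not the original training script):

```python
# Reconstructed from the hyperparameter list above; kept as a plain dict so it
# can be inspected without transformers installed. Each key matches a
# transformers.TrainingArguments parameter of the same name.
training_args = {
    "learning_rate": 1e-4,
    "per_device_train_batch_size": 16,
    "per_device_eval_batch_size": 16,
    "seed": 42,
    "optim": "adamw_torch",
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 1,
}

# To actually launch training (requires transformers):
# from transformers import TrainingArguments
# args = TrainingArguments(output_dir="moe_train_run", **training_args)
```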

Training results

| Training Loss | Epoch  | Step   | Validation Loss | Model Preparation Time | F1     | Precision | Recall | Threshold | Sim Ratio | Pos Sim | Neg Sim |
|:-------------:|:------:|:------:|:---------------:|:----------------------:|:------:|:---------:|:------:|:---------:|:---------:|:-------:|:-------:|
| 0.7183        | 0.0821 | 10000  | 3.5469          | 0.0047                 | 0.8386 | 0.7972    | 0.8846 | 0.8755    | 1.2415    | 0.9408  | 0.7578  |
| 0.7053        | 0.1643 | 20000  | 3.6924          | 0.0047                 | 0.8496 | 0.7963    | 0.9104 | 0.8043    | 1.383     | 0.9156  | 0.6621  |
| 0.6003        | 0.2464 | 30000  | 3.9111          | 0.0047                 | 0.862  | 0.8148    | 0.9151 | 0.7832    | 1.437     | 0.9048  | 0.6296  |
| 0.5856        | 0.3286 | 40000  | 3.9771          | 0.0047                 | 0.8628 | 0.822     | 0.9079 | 0.7718    | 1.4877    | 0.894   | 0.6009  |
| 0.5801        | 0.4107 | 50000  | 3.9434          | 0.0047                 | 0.8704 | 0.8277    | 0.9178 | 0.7749    | 1.4477    | 0.8995  | 0.6214  |
| 0.562         | 0.4929 | 60000  | 3.6962          | 0.0047                 | 0.8685 | 0.8232    | 0.9192 | 0.7930    | 1.4037    | 0.9064  | 0.6457  |
| 0.5307        | 0.5750 | 70000  | 3.8964          | 0.0047                 | 0.875  | 0.839     | 0.9142 | 0.7807    | 1.4542    | 0.8973  | 0.617   |
| 0.4793        | 0.6572 | 80000  | 4.0046          | 0.0047                 | 0.8779 | 0.8429    | 0.916  | 0.7706    | 1.4946    | 0.8912  | 0.5963  |
| 0.4978        | 0.7393 | 90000  | 4.0062          | 0.0047                 | 0.8796 | 0.8395    | 0.9239 | 0.7598    | 1.4979    | 0.8879  | 0.5927  |
| 0.4934        | 0.8215 | 100000 | 3.9771          | 0.0047                 | 0.885  | 0.8522    | 0.9204 | 0.7734    | 1.478     | 0.89    | 0.6022  |
| 0.4757        | 0.9036 | 110000 | 4.0861          | 0.0047                 | 0.884  | 0.8489    | 0.9221 | 0.7636    | 1.5028    | 0.8859  | 0.5895  |
| 0.4773        | 0.9858 | 120000 | 3.9877          | 0.0047                 | 0.8874 | 0.8558    | 0.9215 | 0.7711    | 1.4765    | 0.8877  | 0.6012  |

Framework versions

  • Transformers 4.48.3
  • Pytorch 2.5.1
  • Datasets 3.2.0
  • Tokenizers 0.21.0
Model details

  • Model size: 384M params
  • Tensor type: F32
  • Format: Safetensors