amh-modernbert-focal-lr-3e-5

This model is a fine-tuned version of answerdotai/ModernBERT-base on an unknown dataset. At the final evaluation (epoch 4, step 712), it achieves the following results on the evaluation set:

Loss: 16.1591
F1-micro: 0.3433
F1-macro: 0.3351

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 3e-05
train_batch_size: 16
eval_batch_size: 8
seed: 42
optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
lr_scheduler_type: linear
lr_scheduler_warmup_steps: 70
num_epochs: 4
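
To make the schedule concrete, the sketch below computes the learning rate that a linear schedule with warmup would produce at a given optimizer step. It assumes the usual behaviour for this scheduler type: the rate rises linearly from 0 to the peak over the 70 warmup steps, then decays linearly to 0 at the last step. The total of 712 steps is taken from the final row of the training log; the function name `linear_lr` and the constants are illustrative and are not part of the original training script.

```python
# Hypothetical helper: reproduces a linear warmup + linear decay schedule
# under the assumptions stated above (not taken from the training code).

PEAK_LR = 3e-05       # learning_rate from the hyperparameter list
WARMUP_STEPS = 70     # lr_scheduler_warmup_steps
TOTAL_STEPS = 712     # assumed: last step reported in the training log


def linear_lr(step: int) -> float:
    """Return the learning rate at a given optimizer step."""
    if step < WARMUP_STEPS:
        # Warmup phase: scale linearly from 0 up to the peak rate.
        return PEAK_LR * step / max(1, WARMUP_STEPS)
    # Decay phase: scale linearly from the peak rate down to 0.
    remaining = TOTAL_STEPS - step
    return PEAK_LR * max(0.0, remaining / max(1, TOTAL_STEPS - WARMUP_STEPS))


if __name__ == "__main__":
    # Print the rate at the start, end of warmup, each epoch boundary.
    for step in (0, 35, 70, 178, 356, 534, 712):
        print(f"step {step:>3}: lr = {linear_lr(step):.3e}")
```

Under these assumptions the peak rate of 3e-05 is reached at step 70, well inside the first epoch (178 steps), so most of training runs in the decay phase.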

Training results

Training Loss	Epoch	Step	Validation Loss	F1-micro	F1-macro
49.7041	1.0	178	17.4982	0.2990	0.2069
48.1728	2.0	356	16.9336	0.3527	0.3190
45.6227	3.0	534	16.1424	0.3509	0.3412
42.7613	4.0	712	16.1591	0.3433	0.3351
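
The log shows that the last epoch is not the best one on every metric. As a small sketch, the snippet below copies the four rows from the table and picks the epoch with the highest F1-macro, the highest F1-micro, and the lowest validation loss. The `EvalRow` type and variable names are illustrative only; which checkpoint was actually saved or published is not stated in this card.

```python
from typing import NamedTuple


class EvalRow(NamedTuple):
    """One row of the training results table (hypothetical container)."""
    training_loss: float
    epoch: float
    step: int
    validation_loss: float
    f1_micro: float
    f1_macro: float


# Values copied verbatim from the training results table.
ROWS = [
    EvalRow(49.7041, 1.0, 178, 17.4982, 0.2990, 0.2069),
    EvalRow(48.1728, 2.0, 356, 16.9336, 0.3527, 0.3190),
    EvalRow(45.6227, 3.0, 534, 16.1424, 0.3509, 0.3412),
    EvalRow(42.7613, 4.0, 712, 16.1591, 0.3433, 0.3351),
]

# Select the best row under each criterion.
best_f1_macro = max(ROWS, key=lambda r: r.f1_macro)
best_f1_micro = max(ROWS, key=lambda r: r.f1_micro)
lowest_val_loss = min(ROWS, key=lambda r: r.validation_loss)

if __name__ == "__main__":
    print("best F1-macro  : epoch", best_f1_macro.epoch)
    print("best F1-micro  : epoch", best_f1_micro.epoch)
    print("lowest val loss: epoch", lowest_val_loss.epoch)
```

On these numbers, epoch 3 has both the highest F1-macro (0.3412) and the lowest validation loss (16.1424), while epoch 2 has the highest F1-micro (0.3527). Training loss continues to fall in epoch 4 while validation loss and both F1 scores get slightly worse.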