KMB_SimCSE
This model is a fine-tuned version of x2bee/KoModernBERT-base-mlm on an unknown dataset. It achieves the following results on the evaluation set:
- Loss: 0.0387
- Pearson Cosine: 0.7824
- Spearman Cosine: 0.7845
- Pearson Manhattan: 0.7335
- Spearman Manhattan: 0.7460
- Pearson Euclidean: 0.7337
- Spearman Euclidean: 0.7463
- Pearson Dot: 0.6362
- Spearman Dot: 0.6532
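The metric names above pair a correlation (Pearson or Spearman) with a similarity function (cosine, Manhattan, Euclidean, dot). The card does not say which evaluator produced them, so the following is only a minimal pure-Python sketch of how such scores are commonly computed: embed each sentence pair, score the pair with a similarity function, then correlate those scores with the gold labels. The toy embeddings and gold scores are made up for illustration, and negating the Manhattan distance so that "larger means more similar" is an assumption, not something the card confirms.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def neg_manhattan(u, v):
    """Negated L1 distance, so that higher values mean 'more similar' (assumption)."""
    return -sum(abs(a - b) for a, b in zip(u, v))

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

# Hypothetical toy data: four embedding pairs and their gold similarity scores.
pairs = [([1, 0], [1, 0]), ([1, 0], [1, 1]), ([1, 0], [0, 1]), ([1, 0], [-1, 0])]
gold = [5.0, 3.0, 1.0, 0.0]

cosine_scores = [cosine(u, v) for u, v in pairs]
manhattan_scores = [neg_manhattan(u, v) for u, v in pairs]

pearson_cosine = pearson(cosine_scores, gold)
spearman_cosine = spearman(cosine_scores, gold)
spearman_manhattan = spearman(manhattan_scores, gold)
```

Spearman depends only on the ordering of the scores, which is why it can differ from Pearson for the same similarity function.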
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 1e-05
- train_batch_size: 16
- eval_batch_size: 1
- seed: 42
- distributed_type: multi-GPU
- gradient_accumulation_steps: 8
- total_train_batch_size: 128
- optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 10.0
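As a rough check on how these values fit together, the sketch below recomputes the total batch size and traces a linear learning-rate schedule with warmup. It is an illustration under stated assumptions, not the training script: `steps_per_epoch` is estimated from the final row of the results table (step 21250 at epoch 9.9578), the process count defaults to 1 because the card does not list a device count, and the helper names are hypothetical.

```python
import math

# Values reported in the hyperparameter list.
learning_rate = 1e-05
train_batch_size = 16
gradient_accumulation_steps = 8
warmup_ratio = 0.1
num_epochs = 10

# ASSUMPTION: estimated from the results table (21250 / 9.9578 is about 2134).
steps_per_epoch = 2134
total_steps = steps_per_epoch * num_epochs

def effective_batch_size(per_step, accumulation, num_processes=1):
    """Samples consumed per optimizer update (num_processes is an assumption)."""
    return per_step * accumulation * num_processes

def linear_schedule_lr(step, base_lr, total_steps, warmup_ratio):
    """Linear warmup from 0 to base_lr, then linear decay back to 0."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)

total_train_batch_size = effective_batch_size(train_batch_size, gradient_accumulation_steps)
warmup_steps = math.ceil(total_steps * warmup_ratio)
lr_mid_warmup = linear_schedule_lr(warmup_steps // 2, learning_rate, total_steps, warmup_ratio)
lr_peak = linear_schedule_lr(warmup_steps, learning_rate, total_steps, warmup_ratio)
lr_end = linear_schedule_lr(total_steps, learning_rate, total_steps, warmup_ratio)
```

With these numbers, 16 × 8 reproduces the reported total_train_batch_size of 128, and the learning rate peaks at 1e-05 after roughly the first epoch of steps before decaying linearly.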
Training results
Training Loss | Epoch | Step | Validation Loss | Pearson Cosine | Spearman Cosine | Pearson Manhattan | Spearman Manhattan | Pearson Euclidean | Spearman Euclidean | Pearson Dot | Spearman Dot |
---|---|---|---|---|---|---|---|---|---|---|---|
1.0084 | 0.1172 | 250 | 0.1579 | 0.6838 | 0.6994 | 0.6615 | 0.6693 | 0.6621 | 0.6694 | 0.3480 | 0.3442 |
0.7072 | 0.2343 | 500 | 0.1364 | 0.7226 | 0.7375 | 0.7207 | 0.7263 | 0.7214 | 0.7271 | 0.4002 | 0.3910 |
0.6207 | 0.3515 | 750 | 0.1194 | 0.7371 | 0.7509 | 0.7295 | 0.7398 | 0.7300 | 0.7401 | 0.4517 | 0.4462 |
0.5767 | 0.4686 | 1000 | 0.1147 | 0.7508 | 0.7636 | 0.7395 | 0.7502 | 0.7400 | 0.7511 | 0.5170 | 0.5181 |
0.5026 | 0.5858 | 1250 | 0.1047 | 0.7507 | 0.7635 | 0.7455 | 0.7558 | 0.7459 | 0.7564 | 0.5487 | 0.5531 |
0.5192 | 0.7029 | 1500 | 0.1166 | 0.7522 | 0.7673 | 0.7487 | 0.7591 | 0.7489 | 0.7594 | 0.5055 | 0.5053 |
0.5046 | 0.8201 | 1750 | 0.1110 | 0.7555 | 0.7675 | 0.7582 | 0.7675 | 0.7581 | 0.7672 | 0.5303 | 0.5391 |
0.5055 | 0.9372 | 2000 | 0.1062 | 0.7546 | 0.7726 | 0.7501 | 0.7650 | 0.7502 | 0.7651 | 0.5638 | 0.5710 |
0.4177 | 1.0544 | 2250 | 0.0942 | 0.7577 | 0.7709 | 0.7511 | 0.7635 | 0.7510 | 0.7633 | 0.5577 | 0.5635 |
0.4136 | 1.1715 | 2500 | 0.0915 | 0.7612 | 0.7727 | 0.7584 | 0.7696 | 0.7586 | 0.7696 | 0.5554 | 0.5595 |
0.4425 | 1.2887 | 2750 | 0.0928 | 0.7605 | 0.7726 | 0.7461 | 0.7591 | 0.7463 | 0.7592 | 0.5498 | 0.5512 |
0.3708 | 1.4058 | 3000 | 0.0819 | 0.7670 | 0.7783 | 0.7478 | 0.7634 | 0.7481 | 0.7637 | 0.5834 | 0.5847 |
0.3934 | 1.5230 | 3250 | 0.0848 | 0.7709 | 0.7814 | 0.7539 | 0.7692 | 0.7542 | 0.7689 | 0.5655 | 0.5668 |
0.3203 | 1.6401 | 3500 | 0.0781 | 0.7706 | 0.7810 | 0.7529 | 0.7689 | 0.7531 | 0.7691 | 0.5871 | 0.5891 |
0.4052 | 1.7573 | 3750 | 0.0824 | 0.7705 | 0.7816 | 0.7628 | 0.7771 | 0.7628 | 0.7771 | 0.5909 | 0.5989 |
0.3723 | 1.8744 | 4000 | 0.0819 | 0.7720 | 0.7840 | 0.7515 | 0.7679 | 0.7520 | 0.7685 | 0.5711 | 0.5713 |
0.3645 | 1.9916 | 4250 | 0.0802 | 0.7676 | 0.7804 | 0.7560 | 0.7704 | 0.7560 | 0.7703 | 0.5685 | 0.5701 |
0.3007 | 2.1087 | 4500 | 0.0662 | 0.7682 | 0.7799 | 0.7572 | 0.7721 | 0.7574 | 0.7721 | 0.5973 | 0.5981 |
0.2397 | 2.2259 | 4750 | 0.0617 | 0.7693 | 0.7782 | 0.7501 | 0.7655 | 0.7502 | 0.7652 | 0.5855 | 0.5898 |
0.28 | 2.3430 | 5000 | 0.0645 | 0.7654 | 0.7760 | 0.7567 | 0.7705 | 0.7569 | 0.7705 | 0.5925 | 0.5970 |
0.2631 | 2.4602 | 5250 | 0.0639 | 0.7712 | 0.7798 | 0.7561 | 0.7705 | 0.7562 | 0.7705 | 0.5715 | 0.5731 |
0.2488 | 2.5773 | 5500 | 0.0636 | 0.7736 | 0.7838 | 0.7537 | 0.7687 | 0.7538 | 0.7685 | 0.5835 | 0.5861 |
0.2557 | 2.6945 | 5750 | 0.0614 | 0.7739 | 0.7830 | 0.7570 | 0.7716 | 0.7571 | 0.7717 | 0.6008 | 0.6041 |
0.2699 | 2.8116 | 6000 | 0.0636 | 0.7722 | 0.7795 | 0.7570 | 0.7699 | 0.7572 | 0.7701 | 0.5844 | 0.5864 |
0.2794 | 2.9288 | 6250 | 0.0639 | 0.7704 | 0.7800 | 0.7582 | 0.7745 | 0.7581 | 0.7746 | 0.5817 | 0.5793 |
0.1778 | 3.0459 | 6500 | 0.0526 | 0.7738 | 0.7811 | 0.7574 | 0.7739 | 0.7573 | 0.7739 | 0.6193 | 0.6255 |
0.1791 | 3.1631 | 6750 | 0.0519 | 0.7728 | 0.7783 | 0.7540 | 0.7704 | 0.7538 | 0.7700 | 0.6116 | 0.6182 |
0.201 | 3.2802 | 7000 | 0.0511 | 0.7755 | 0.7825 | 0.7506 | 0.7671 | 0.7503 | 0.7670 | 0.6039 | 0.6071 |
0.225 | 3.3974 | 7250 | 0.0513 | 0.7684 | 0.7749 | 0.7515 | 0.7689 | 0.7514 | 0.7692 | 0.5867 | 0.5894 |
0.1748 | 3.5145 | 7500 | 0.0502 | 0.7752 | 0.7801 | 0.7459 | 0.7630 | 0.7461 | 0.7636 | 0.5877 | 0.5949 |
0.2045 | 3.6317 | 7750 | 0.0512 | 0.7787 | 0.7856 | 0.7457 | 0.7636 | 0.7460 | 0.7642 | 0.6113 | 0.6156 |
0.1821 | 3.7488 | 8000 | 0.0502 | 0.7782 | 0.7842 | 0.7543 | 0.7707 | 0.7545 | 0.7710 | 0.6045 | 0.6069 |
0.1783 | 3.8660 | 8250 | 0.0491 | 0.7772 | 0.7829 | 0.7455 | 0.7630 | 0.7459 | 0.7637 | 0.5915 | 0.5984 |
0.2055 | 3.9831 | 8500 | 0.0504 | 0.7776 | 0.7832 | 0.7476 | 0.7658 | 0.7480 | 0.7662 | 0.5959 | 0.6017 |
0.1345 | 4.1003 | 8750 | 0.0467 | 0.7762 | 0.7802 | 0.7429 | 0.7606 | 0.7435 | 0.7611 | 0.6206 | 0.6303 |
0.1506 | 4.2174 | 9000 | 0.0477 | 0.7711 | 0.7759 | 0.7466 | 0.7625 | 0.7473 | 0.7631 | 0.5978 | 0.6025 |
0.1565 | 4.3346 | 9250 | 0.0477 | 0.7717 | 0.7768 | 0.7481 | 0.7641 | 0.7486 | 0.7645 | 0.6026 | 0.6102 |
0.1577 | 4.4517 | 9500 | 0.0442 | 0.7794 | 0.7824 | 0.7439 | 0.7627 | 0.7444 | 0.7630 | 0.6182 | 0.6291 |
0.1463 | 4.5689 | 9750 | 0.0456 | 0.7764 | 0.7821 | 0.7401 | 0.7602 | 0.7405 | 0.7604 | 0.5941 | 0.5991 |
0.16 | 4.6860 | 10000 | 0.0460 | 0.7749 | 0.7793 | 0.7495 | 0.7658 | 0.7498 | 0.7660 | 0.6140 | 0.6192 |
0.148 | 4.8032 | 10250 | 0.0436 | 0.7817 | 0.7855 | 0.7421 | 0.7596 | 0.7425 | 0.7601 | 0.6171 | 0.6239 |
0.1382 | 4.9203 | 10500 | 0.0446 | 0.7824 | 0.7872 | 0.7437 | 0.7620 | 0.7443 | 0.7625 | 0.6330 | 0.6424 |
0.1109 | 5.0375 | 10750 | 0.0426 | 0.7796 | 0.7846 | 0.7431 | 0.7600 | 0.7434 | 0.7602 | 0.6195 | 0.6249 |
0.1009 | 5.1546 | 11000 | 0.0431 | 0.7807 | 0.7835 | 0.7423 | 0.7591 | 0.7428 | 0.7591 | 0.6237 | 0.6377 |
0.1082 | 5.2718 | 11250 | 0.0438 | 0.7774 | 0.7818 | 0.7430 | 0.7591 | 0.7433 | 0.7593 | 0.6039 | 0.6129 |
0.1138 | 5.3889 | 11500 | 0.0415 | 0.7829 | 0.7870 | 0.7405 | 0.7560 | 0.7410 | 0.7561 | 0.6347 | 0.6464 |
0.1015 | 5.5061 | 11750 | 0.0420 | 0.7778 | 0.7811 | 0.7437 | 0.7592 | 0.7435 | 0.7589 | 0.6249 | 0.6370 |
0.1153 | 5.6232 | 12000 | 0.0448 | 0.7730 | 0.7784 | 0.7451 | 0.7598 | 0.7453 | 0.7596 | 0.6141 | 0.6214 |
0.1269 | 5.7404 | 12250 | 0.0420 | 0.7802 | 0.7840 | 0.7413 | 0.7562 | 0.7417 | 0.7564 | 0.6217 | 0.6311 |
0.0888 | 5.8575 | 12500 | 0.0414 | 0.7805 | 0.7841 | 0.7408 | 0.7567 | 0.7412 | 0.7568 | 0.6245 | 0.6365 |
0.1202 | 5.9747 | 12750 | 0.0431 | 0.7793 | 0.7835 | 0.7412 | 0.7572 | 0.7414 | 0.7575 | 0.6261 | 0.6405 |
0.0941 | 6.0918 | 13000 | 0.0399 | 0.7838 | 0.7873 | 0.7388 | 0.7527 | 0.7391 | 0.7530 | 0.6493 | 0.6642 |
0.081 | 6.2090 | 13250 | 0.0405 | 0.7814 | 0.7854 | 0.7353 | 0.7513 | 0.7355 | 0.7514 | 0.6356 | 0.6478 |
0.0807 | 6.3261 | 13500 | 0.0401 | 0.7838 | 0.7879 | 0.7339 | 0.7510 | 0.7344 | 0.7513 | 0.6450 | 0.6615 |
0.0863 | 6.4433 | 13750 | 0.0405 | 0.7814 | 0.7841 | 0.7404 | 0.7587 | 0.7408 | 0.7589 | 0.6324 | 0.6479 |
0.0948 | 6.5604 | 14000 | 0.0397 | 0.7830 | 0.7866 | 0.7410 | 0.7578 | 0.7415 | 0.7579 | 0.6308 | 0.6460 |
0.0919 | 6.6776 | 14250 | 0.0409 | 0.7820 | 0.7858 | 0.7402 | 0.7545 | 0.7403 | 0.7544 | 0.6341 | 0.6459 |
0.0784 | 6.7948 | 14500 | 0.0408 | 0.7794 | 0.7839 | 0.7308 | 0.7495 | 0.7312 | 0.7494 | 0.6306 | 0.6427 |
0.0821 | 6.9119 | 14750 | 0.0406 | 0.7789 | 0.7822 | 0.7265 | 0.7446 | 0.7270 | 0.7446 | 0.6377 | 0.6567 |
0.0792 | 7.0291 | 15000 | 0.0401 | 0.7800 | 0.7833 | 0.7398 | 0.7569 | 0.7405 | 0.7572 | 0.6338 | 0.6467 |
0.0698 | 7.1462 | 15250 | 0.0396 | 0.7822 | 0.7855 | 0.7341 | 0.7507 | 0.7346 | 0.7509 | 0.6381 | 0.6552 |
0.0699 | 7.2634 | 15500 | 0.0392 | 0.7820 | 0.7851 | 0.7322 | 0.7502 | 0.7325 | 0.7502 | 0.6466 | 0.6629 |
0.0739 | 7.3805 | 15750 | 0.0389 | 0.7865 | 0.7886 | 0.7323 | 0.7491 | 0.7328 | 0.7495 | 0.6412 | 0.6589 |
0.0745 | 7.4977 | 16000 | 0.0397 | 0.7794 | 0.7827 | 0.7366 | 0.7524 | 0.7373 | 0.7524 | 0.6380 | 0.6504 |
0.0779 | 7.6148 | 16250 | 0.0391 | 0.7826 | 0.7846 | 0.7326 | 0.7462 | 0.7333 | 0.7467 | 0.6372 | 0.6532 |
0.078 | 7.7320 | 16500 | 0.0397 | 0.7810 | 0.7826 | 0.7299 | 0.7461 | 0.7300 | 0.7457 | 0.6364 | 0.6555 |
0.0699 | 7.8491 | 16750 | 0.0405 | 0.7811 | 0.7837 | 0.7308 | 0.7468 | 0.7312 | 0.7470 | 0.6315 | 0.6426 |
0.0735 | 7.9663 | 17000 | 0.0394 | 0.7804 | 0.7823 | 0.7320 | 0.7455 | 0.7326 | 0.7462 | 0.6468 | 0.6607 |
0.0682 | 8.0834 | 17250 | 0.0386 | 0.7845 | 0.7869 | 0.7306 | 0.7447 | 0.7311 | 0.7449 | 0.6431 | 0.6613 |
0.0526 | 8.2006 | 17500 | 0.0389 | 0.7824 | 0.7832 | 0.7272 | 0.7431 | 0.7275 | 0.7431 | 0.6370 | 0.6539 |
0.0558 | 8.3177 | 17750 | 0.0385 | 0.7856 | 0.7865 | 0.7370 | 0.7513 | 0.7376 | 0.7518 | 0.6517 | 0.6679 |
0.0633 | 8.4349 | 18000 | 0.0392 | 0.7822 | 0.7845 | 0.7388 | 0.7537 | 0.7395 | 0.7542 | 0.6512 | 0.6664 |
0.0568 | 8.5520 | 18250 | 0.0389 | 0.7826 | 0.7831 | 0.7358 | 0.7510 | 0.7362 | 0.7509 | 0.6378 | 0.6536 |
0.0645 | 8.6692 | 18500 | 0.0377 | 0.7888 | 0.7892 | 0.7315 | 0.7495 | 0.7319 | 0.7499 | 0.6514 | 0.6704 |
0.0563 | 8.7863 | 18750 | 0.0376 | 0.7870 | 0.7878 | 0.7285 | 0.7451 | 0.7289 | 0.7454 | 0.6393 | 0.6606 |
0.0669 | 8.9035 | 19000 | 0.0383 | 0.7850 | 0.7866 | 0.7238 | 0.7433 | 0.7244 | 0.7437 | 0.6359 | 0.6571 |
0.0436 | 9.0206 | 19250 | 0.0377 | 0.7855 | 0.7856 | 0.7289 | 0.7462 | 0.7293 | 0.7465 | 0.6489 | 0.6696 |
0.047 | 9.1378 | 19500 | 0.0377 | 0.7870 | 0.7882 | 0.7249 | 0.7414 | 0.7254 | 0.7413 | 0.6459 | 0.6694 |
0.0482 | 9.2549 | 19750 | 0.0377 | 0.7863 | 0.7871 | 0.7296 | 0.7442 | 0.7306 | 0.7449 | 0.6498 | 0.6690 |
0.0529 | 9.3721 | 20000 | 0.0377 | 0.7873 | 0.7888 | 0.7285 | 0.7423 | 0.7290 | 0.7426 | 0.6490 | 0.6690 |
0.0429 | 9.4892 | 20250 | 0.0378 | 0.7868 | 0.7883 | 0.7286 | 0.7426 | 0.7292 | 0.7431 | 0.6503 | 0.6684 |
0.0534 | 9.6064 | 20500 | 0.0380 | 0.7861 | 0.7881 | 0.7300 | 0.7443 | 0.7305 | 0.7451 | 0.6446 | 0.6635 |
0.0531 | 9.7235 | 20750 | 0.0375 | 0.7886 | 0.7894 | 0.7350 | 0.7492 | 0.7356 | 0.7498 | 0.6442 | 0.6634 |
0.0464 | 9.8407 | 21000 | 0.0380 | 0.7861 | 0.7871 | 0.7314 | 0.7464 | 0.7320 | 0.7468 | 0.6415 | 0.6600 |
0.0406 | 9.9578 | 21250 | 0.0387 | 0.7824 | 0.7845 | 0.7335 | 0.7460 | 0.7337 | 0.7463 | 0.6362 | 0.6532 |
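The headline metrics at the top of the card match the final row (step 21250), while the highest Spearman Cosine (0.7894) and the lowest validation loss (0.0375) in the table both occur at step 20750. The card does not state how the published checkpoint was chosen. The sketch below shows one way to pick the best evaluation step from rows in this format; it uses five rows copied from the table, and `parse_row` is a hypothetical helper.

```python
# A few rows copied verbatim from the training results table.
ROWS = """
1.0084 | 0.1172 | 250 | 0.1579 | 0.6838 | 0.6994 | 0.6615 | 0.6693 | 0.6621 | 0.6694 | 0.3480 | 0.3442 |
0.16 | 4.6860 | 10000 | 0.0460 | 0.7749 | 0.7793 | 0.7495 | 0.7658 | 0.7498 | 0.7660 | 0.6140 | 0.6192 |
0.0645 | 8.6692 | 18500 | 0.0377 | 0.7888 | 0.7892 | 0.7315 | 0.7495 | 0.7319 | 0.7499 | 0.6514 | 0.6704 |
0.0531 | 9.7235 | 20750 | 0.0375 | 0.7886 | 0.7894 | 0.7350 | 0.7492 | 0.7356 | 0.7498 | 0.6442 | 0.6634 |
0.0406 | 9.9578 | 21250 | 0.0387 | 0.7824 | 0.7845 | 0.7335 | 0.7460 | 0.7337 | 0.7463 | 0.6362 | 0.6532 |
"""

def parse_row(line):
    """Split one pipe-delimited row and keep the columns used below."""
    cells = [cell.strip() for cell in line.split("|") if cell.strip()]
    return {
        "step": int(cells[2]),
        "validation_loss": float(cells[3]),
        "spearman_cosine": float(cells[5]),
    }

rows = [parse_row(line) for line in ROWS.strip().splitlines()]

best_by_spearman = max(rows, key=lambda r: r["spearman_cosine"])
best_by_loss = min(rows, key=lambda r: r["validation_loss"])
final = rows[-1]

print("best Spearman Cosine at step", best_by_spearman["step"])
print("lowest validation loss at step", best_by_loss["step"])
print("final step", final["step"])
```

The gap between the best and final rows is small (0.7894 versus 0.7845 Spearman Cosine), so it may fall within run-to-run noise.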
Framework versions
- Transformers 4.48.0.dev0
- Pytorch 2.5.1+cu121
- Datasets 3.1.0
- Tokenizers 0.21.0
Model tree for CocoRoF/KMB_SimCSE
Base model
x2bee/KoModernBERT-base-mlm