# bert_bilstm_mega_crf-ner-weibo
This model is a fine-tuned version of [hfl/chinese-roberta-wwm-ext-large](https://huggingface.co/hfl/chinese-roberta-wwm-ext-large) on an unknown dataset (the model name suggests a Weibo NER corpus). It achieves the following results on the evaluation set:
- Loss: 0.2341
- Precision: 0.6657
- Recall: 0.7075
- F1: 0.6860
- Accuracy: 0.9683
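
The precision, recall, and F1 above are entity-level scores, while accuracy is token-level; a minimal sketch of how such numbers are typically computed, assuming seqeval as the metric backend (the usual choice for Trainer-based NER runs, though the card does not name it):

```python
# Sketch: entity-level NER metrics with seqeval (assumption: the card's
# precision/recall/F1 follow seqeval's entity-level scoring; accuracy
# is per-token tag accuracy).
from seqeval.metrics import accuracy_score, f1_score, precision_score, recall_score

y_true = [["B-PER", "I-PER", "O", "B-LOC"]]
y_pred = [["B-PER", "I-PER", "O", "O"]]

print(precision_score(y_true, y_pred))  # fraction of predicted entities that are correct
print(recall_score(y_true, y_pred))     # fraction of gold entities that were found
print(f1_score(y_true, y_pred))         # harmonic mean of precision and recall
print(accuracy_score(y_true, y_pred))   # token-level tag accuracy
```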
## Model description
More information needed
## Intended uses & limitations
More information needed
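
No usage notes were provided. A minimal inference sketch, assuming the checkpoint loads with the standard token-classification head; note that the BiLSTM + MEGA + CRF layers implied by the model name may instead require the author's original training code:

```python
# Minimal inference sketch (assumption: the checkpoint is loadable via the
# standard transformers token-classification pipeline; if the custom
# BiLSTM/MEGA/CRF head is not in the saved config, this will fall back to
# a plain token-classification head or fail to load).
from transformers import pipeline

ner = pipeline(
    "token-classification",
    model="PassbyGrocer/bert_bilstm_mega_crf-ner-weibo",
    aggregation_strategy="simple",  # merge B-/I- pieces into whole entity spans
)
print(ner("我在北京读书。"))  # e.g. a LOC span for "北京" if loading succeeds
```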
## Training and evaluation data
More information needed
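
The dataset is not recorded in the card. As a sketch of the usual preprocessing for character-level Chinese NER with this base model (the `tokens`/`ner_tags` column names are assumptions, not from the card):

```python
# Preprocessing sketch for CoNLL-style NER examples (assumptions: examples
# carry pre-split "tokens" and integer "ner_tags"; the actual dataset used
# for training is unknown).
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("hfl/chinese-roberta-wwm-ext-large")

def align_labels(example):
    enc = tokenizer(example["tokens"], is_split_into_words=True,
                    truncation=True, max_length=512)
    labels = []
    for word_id in enc.word_ids():
        # -100 masks special tokens ([CLS], [SEP], padding) from the loss
        labels.append(-100 if word_id is None else example["ner_tags"][word_id])
    enc["labels"] = labels
    return enc
```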
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 128
- eval_batch_size: 128
- seed: 42
- optimizer: AdamW (torch implementation) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- num_epochs: 100
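
The log format suggests a standard Hugging Face Trainer run; a `TrainingArguments` sketch mirroring the listed values (assumptions: single-GPU training, so the per-device batch size equals the reported one, and `output_dir` plus the per-epoch evaluation strategy are inferred from the results table, not stated in the card):

```python
# TrainingArguments sketch mirroring the listed hyperparameters
# (a reconstruction under the assumptions above, not the author's config).
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="bert_bilstm_mega_crf-ner-weibo",
    learning_rate=5e-5,
    per_device_train_batch_size=128,
    per_device_eval_batch_size=128,
    seed=42,
    optim="adamw_torch",          # AdamW defaults: betas=(0.9, 0.999), eps=1e-8
    lr_scheduler_type="linear",
    num_train_epochs=100,
    eval_strategy="epoch",        # one evaluation per epoch, as in the table
    logging_strategy="epoch",
)
```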
### Training results
Training Loss | Epoch | Step | Validation Loss | Precision | Recall | F1 | Accuracy |
---|---|---|---|---|---|---|---|
1.7329 | 1.0 | 11 | 0.4907 | 0.0 | 0.0 | 0.0 | 0.9274 |
0.4493 | 2.0 | 22 | 0.3486 | 0.0 | 0.0 | 0.0 | 0.9274 |
0.3203 | 3.0 | 33 | 0.2384 | 0.2941 | 0.0629 | 0.1036 | 0.9354 |
0.2259 | 4.0 | 44 | 0.1618 | 0.5219 | 0.4874 | 0.5041 | 0.9586 |
0.1617 | 5.0 | 55 | 0.1318 | 0.5476 | 0.5975 | 0.5714 | 0.9642 |
0.1171 | 6.0 | 66 | 0.1202 | 0.5718 | 0.6509 | 0.6088 | 0.9676 |
0.0956 | 7.0 | 77 | 0.1210 | 0.6022 | 0.6855 | 0.6412 | 0.9692 |
0.0666 | 8.0 | 88 | 0.1208 | 0.5951 | 0.6887 | 0.6385 | 0.9690 |
0.0567 | 9.0 | 99 | 0.1205 | 0.5963 | 0.7107 | 0.6485 | 0.9687 |
0.0433 | 10.0 | 110 | 0.1219 | 0.6230 | 0.7170 | 0.6667 | 0.9699 |
0.0333 | 11.0 | 121 | 0.1365 | 0.6375 | 0.6635 | 0.6502 | 0.9687 |
0.0309 | 12.0 | 132 | 0.1421 | 0.6011 | 0.6918 | 0.6433 | 0.9672 |
0.0239 | 13.0 | 143 | 0.1460 | 0.6398 | 0.6981 | 0.6677 | 0.9687 |
0.0235 | 14.0 | 154 | 0.1539 | 0.6518 | 0.6887 | 0.6697 | 0.9687 |
0.0188 | 15.0 | 165 | 0.1604 | 0.6656 | 0.6824 | 0.6739 | 0.9694 |
0.0193 | 16.0 | 176 | 0.1625 | 0.6471 | 0.6918 | 0.6687 | 0.9687 |
0.0155 | 17.0 | 187 | 0.1758 | 0.6770 | 0.6855 | 0.6813 | 0.9683 |
0.0148 | 18.0 | 198 | 0.1714 | 0.6506 | 0.6792 | 0.6646 | 0.9688 |
0.014 | 19.0 | 209 | 0.1626 | 0.6391 | 0.7296 | 0.6814 | 0.9674 |
0.0116 | 20.0 | 220 | 0.1718 | 0.6459 | 0.7170 | 0.6796 | 0.9687 |
0.0111 | 21.0 | 231 | 0.1840 | 0.6718 | 0.6824 | 0.6771 | 0.9694 |
0.0097 | 22.0 | 242 | 0.1807 | 0.6479 | 0.6887 | 0.6677 | 0.9677 |
0.0098 | 23.0 | 253 | 0.1787 | 0.6391 | 0.7296 | 0.6814 | 0.9664 |
0.0089 | 24.0 | 264 | 0.1877 | 0.6518 | 0.6887 | 0.6697 | 0.9688 |
0.0077 | 25.0 | 275 | 0.1896 | 0.6519 | 0.6950 | 0.6728 | 0.9693 |
0.008 | 26.0 | 286 | 0.1915 | 0.6608 | 0.7107 | 0.6848 | 0.9690 |
0.0079 | 27.0 | 297 | 0.2008 | 0.6606 | 0.6792 | 0.6698 | 0.9687 |
0.0072 | 28.0 | 308 | 0.1961 | 0.6486 | 0.7138 | 0.6796 | 0.9681 |
0.0067 | 29.0 | 319 | 0.2040 | 0.6617 | 0.7013 | 0.6809 | 0.9691 |
0.0063 | 30.0 | 330 | 0.2028 | 0.6725 | 0.7296 | 0.6998 | 0.9688 |
0.0056 | 31.0 | 341 | 0.2053 | 0.6716 | 0.7201 | 0.6950 | 0.9689 |
0.0073 | 32.0 | 352 | 0.2088 | 0.6465 | 0.6730 | 0.6595 | 0.9674 |
0.0061 | 33.0 | 363 | 0.1936 | 0.6138 | 0.7296 | 0.6667 | 0.9673 |
0.0057 | 34.0 | 374 | 0.2061 | 0.6596 | 0.6824 | 0.6708 | 0.9683 |
0.0062 | 35.0 | 385 | 0.2077 | 0.6627 | 0.7044 | 0.6829 | 0.9680 |
0.0046 | 36.0 | 396 | 0.2133 | 0.6738 | 0.6950 | 0.6842 | 0.9689 |
0.0062 | 37.0 | 407 | 0.2029 | 0.6696 | 0.7201 | 0.6939 | 0.9680 |
0.0058 | 38.0 | 418 | 0.2039 | 0.6707 | 0.7044 | 0.6871 | 0.9678 |
0.0047 | 39.0 | 429 | 0.2055 | 0.6667 | 0.7233 | 0.6938 | 0.9685 |
0.0049 | 40.0 | 440 | 0.2105 | 0.6757 | 0.7075 | 0.6912 | 0.9692 |
0.0048 | 41.0 | 451 | 0.2052 | 0.6667 | 0.7107 | 0.6880 | 0.9683 |
0.0049 | 42.0 | 462 | 0.2081 | 0.6590 | 0.7170 | 0.6867 | 0.9687 |
0.0063 | 43.0 | 473 | 0.2011 | 0.6552 | 0.7170 | 0.6847 | 0.9683 |
0.0046 | 44.0 | 484 | 0.1994 | 0.6477 | 0.7170 | 0.6806 | 0.9676 |
0.0047 | 45.0 | 495 | 0.2122 | 0.6790 | 0.6918 | 0.6854 | 0.9693 |
0.0048 | 46.0 | 506 | 0.2082 | 0.6609 | 0.7233 | 0.6907 | 0.9687 |
0.0042 | 47.0 | 517 | 0.2140 | 0.6769 | 0.6918 | 0.6843 | 0.9695 |
0.0054 | 48.0 | 528 | 0.2054 | 0.6514 | 0.7170 | 0.6826 | 0.9681 |
0.0037 | 49.0 | 539 | 0.2070 | 0.6686 | 0.7107 | 0.6890 | 0.9689 |
0.0045 | 50.0 | 550 | 0.2093 | 0.6514 | 0.7170 | 0.6826 | 0.9686 |
0.004 | 51.0 | 561 | 0.2163 | 0.6787 | 0.7107 | 0.6943 | 0.9698 |
0.0038 | 52.0 | 572 | 0.2173 | 0.6706 | 0.7107 | 0.6901 | 0.9694 |
0.0042 | 53.0 | 583 | 0.2156 | 0.6745 | 0.7233 | 0.6980 | 0.9694 |
0.0039 | 54.0 | 594 | 0.2190 | 0.6727 | 0.6981 | 0.6852 | 0.9689 |
0.0037 | 55.0 | 605 | 0.2213 | 0.6767 | 0.7044 | 0.6903 | 0.9687 |
0.0043 | 56.0 | 616 | 0.2247 | 0.6829 | 0.7044 | 0.6935 | 0.9690 |
0.0034 | 57.0 | 627 | 0.2291 | 0.6789 | 0.6981 | 0.6884 | 0.9689 |
0.0046 | 58.0 | 638 | 0.2258 | 0.6737 | 0.7075 | 0.6902 | 0.9686 |
0.0033 | 59.0 | 649 | 0.2254 | 0.6736 | 0.7138 | 0.6931 | 0.9689 |
0.0036 | 60.0 | 660 | 0.2255 | 0.6758 | 0.7013 | 0.6883 | 0.9690 |
0.0038 | 61.0 | 671 | 0.2200 | 0.6580 | 0.7138 | 0.6848 | 0.9682 |
0.0036 | 62.0 | 682 | 0.2210 | 0.6657 | 0.7075 | 0.6860 | 0.9687 |
0.0039 | 63.0 | 693 | 0.2237 | 0.6647 | 0.7107 | 0.6869 | 0.9682 |
0.0039 | 64.0 | 704 | 0.2295 | 0.6727 | 0.6981 | 0.6852 | 0.9688 |
0.0032 | 65.0 | 715 | 0.2271 | 0.6707 | 0.7044 | 0.6871 | 0.9687 |
0.0038 | 66.0 | 726 | 0.2290 | 0.6677 | 0.7013 | 0.6840 | 0.9687 |
0.0033 | 67.0 | 737 | 0.2260 | 0.6617 | 0.7013 | 0.6809 | 0.9682 |
0.0038 | 68.0 | 748 | 0.2250 | 0.6676 | 0.7138 | 0.6900 | 0.9686 |
0.0037 | 69.0 | 759 | 0.2254 | 0.6618 | 0.7075 | 0.6839 | 0.9684 |
0.0039 | 70.0 | 770 | 0.2281 | 0.6687 | 0.6981 | 0.6831 | 0.9687 |
0.0036 | 71.0 | 781 | 0.2317 | 0.6687 | 0.6981 | 0.6831 | 0.9687 |
0.0034 | 72.0 | 792 | 0.2272 | 0.6609 | 0.7170 | 0.6878 | 0.9686 |
0.0036 | 73.0 | 803 | 0.2278 | 0.6756 | 0.7138 | 0.6942 | 0.9687 |
0.0035 | 74.0 | 814 | 0.2287 | 0.6677 | 0.7075 | 0.6870 | 0.9683 |
0.0034 | 75.0 | 825 | 0.2283 | 0.6686 | 0.7107 | 0.6890 | 0.9681 |
0.0032 | 76.0 | 836 | 0.2331 | 0.6657 | 0.7075 | 0.6860 | 0.9672 |
0.0041 | 77.0 | 847 | 0.2357 | 0.6598 | 0.7075 | 0.6829 | 0.9675 |
0.0033 | 78.0 | 858 | 0.2352 | 0.6706 | 0.7170 | 0.6930 | 0.9676 |
0.0039 | 79.0 | 869 | 0.2363 | 0.6696 | 0.7075 | 0.6881 | 0.9689 |
0.0036 | 80.0 | 880 | 0.2367 | 0.6627 | 0.6918 | 0.6769 | 0.9685 |
0.0032 | 81.0 | 891 | 0.2369 | 0.6607 | 0.6981 | 0.6789 | 0.9683 |
0.0036 | 82.0 | 902 | 0.2331 | 0.6696 | 0.7201 | 0.6939 | 0.9687 |
0.0036 | 83.0 | 913 | 0.2286 | 0.6599 | 0.7138 | 0.6858 | 0.9682 |
0.0034 | 84.0 | 924 | 0.2276 | 0.6637 | 0.7138 | 0.6879 | 0.9687 |
0.0035 | 85.0 | 935 | 0.2286 | 0.6647 | 0.7107 | 0.6869 | 0.9687 |
0.0031 | 86.0 | 946 | 0.2296 | 0.6667 | 0.7044 | 0.6850 | 0.9689 |
0.0036 | 87.0 | 957 | 0.2296 | 0.6677 | 0.7075 | 0.6870 | 0.9687 |
0.0033 | 88.0 | 968 | 0.2299 | 0.6706 | 0.7170 | 0.6930 | 0.9688 |
0.0033 | 89.0 | 979 | 0.2301 | 0.6618 | 0.7138 | 0.6868 | 0.9683 |
0.0034 | 90.0 | 990 | 0.2320 | 0.6766 | 0.7170 | 0.6962 | 0.9687 |
0.0031 | 91.0 | 1001 | 0.2309 | 0.6766 | 0.7170 | 0.6962 | 0.9686 |
0.0033 | 92.0 | 1012 | 0.2315 | 0.6736 | 0.7138 | 0.6931 | 0.9685 |
0.0037 | 93.0 | 1023 | 0.2333 | 0.6696 | 0.7075 | 0.6881 | 0.9684 |
0.0031 | 94.0 | 1034 | 0.2342 | 0.6696 | 0.7075 | 0.6881 | 0.9684 |
0.0029 | 95.0 | 1045 | 0.2351 | 0.6687 | 0.7044 | 0.6861 | 0.9683 |
0.004 | 96.0 | 1056 | 0.2347 | 0.6667 | 0.7044 | 0.6850 | 0.9683 |
0.0032 | 97.0 | 1067 | 0.2346 | 0.6667 | 0.7044 | 0.6850 | 0.9683 |
0.0033 | 98.0 | 1078 | 0.2343 | 0.6667 | 0.7044 | 0.6850 | 0.9683 |
0.0032 | 99.0 | 1089 | 0.2341 | 0.6647 | 0.7044 | 0.6840 | 0.9682 |
0.0034 | 100.0 | 1100 | 0.2341 | 0.6657 | 0.7075 | 0.6860 | 0.9683 |
### Framework versions
- Transformers 4.46.2
- Pytorch 2.4.1+cu124
- Datasets 3.1.0
- Tokenizers 0.20.3