
bert_bilstm_mega_crf-ner-weibo

This model is a fine-tuned version of hfl/chinese-roberta-wwm-ext-large on an unspecified dataset (the model name suggests the Weibo NER dataset). It achieves the following results on the evaluation set:

  • Loss: 0.2341
  • Precision: 0.6657
  • Recall: 0.7075
  • F1: 0.6860
  • Accuracy: 0.9683
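
As a quick sanity check on the numbers above, F1 is the harmonic mean of precision and recall, and the reported values are consistent:

```python
# Verify that the reported F1 is the harmonic mean of the reported
# precision and recall from the evaluation set.
precision = 0.6657
recall = 0.7075

f1 = 2 * precision * recall / (precision + recall)
print(round(f1, 4))  # ~0.686, matching the reported F1 of 0.6860
```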

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 128
  • eval_batch_size: 128
  • seed: 42
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 100
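
With lr_scheduler_type set to linear, the learning rate decays from its initial value to zero over the total number of training steps (1,100 here, per the results table). A minimal sketch of that schedule, assuming no warmup steps were configured:

```python
def linear_lr(step, base_lr=5e-5, total_steps=1100):
    """Linearly decay the learning rate from base_lr to 0 (no warmup assumed)."""
    return base_lr * max(0.0, 1.0 - step / total_steps)

print(linear_lr(0))     # 5e-05 at the start of training
print(linear_lr(550))   # 2.5e-05 halfway through
print(linear_lr(1100))  # 0.0 at the final step
```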

Training results

| Training Loss | Epoch | Step | Validation Loss | Precision | Recall | F1 | Accuracy |
|:-------------:|:-----:|:----:|:---------------:|:---------:|:------:|:------:|:--------:|
| 1.7329 | 1.0 | 11 | 0.4907 | 0.0 | 0.0 | 0.0 | 0.9274 |
| 0.4493 | 2.0 | 22 | 0.3486 | 0.0 | 0.0 | 0.0 | 0.9274 |
| 0.3203 | 3.0 | 33 | 0.2384 | 0.2941 | 0.0629 | 0.1036 | 0.9354 |
| 0.2259 | 4.0 | 44 | 0.1618 | 0.5219 | 0.4874 | 0.5041 | 0.9586 |
| 0.1617 | 5.0 | 55 | 0.1318 | 0.5476 | 0.5975 | 0.5714 | 0.9642 |
| 0.1171 | 6.0 | 66 | 0.1202 | 0.5718 | 0.6509 | 0.6088 | 0.9676 |
| 0.0956 | 7.0 | 77 | 0.1210 | 0.6022 | 0.6855 | 0.6412 | 0.9692 |
| 0.0666 | 8.0 | 88 | 0.1208 | 0.5951 | 0.6887 | 0.6385 | 0.9690 |
| 0.0567 | 9.0 | 99 | 0.1205 | 0.5963 | 0.7107 | 0.6485 | 0.9687 |
| 0.0433 | 10.0 | 110 | 0.1219 | 0.6230 | 0.7170 | 0.6667 | 0.9699 |
| 0.0333 | 11.0 | 121 | 0.1365 | 0.6375 | 0.6635 | 0.6502 | 0.9687 |
| 0.0309 | 12.0 | 132 | 0.1421 | 0.6011 | 0.6918 | 0.6433 | 0.9672 |
| 0.0239 | 13.0 | 143 | 0.1460 | 0.6398 | 0.6981 | 0.6677 | 0.9687 |
| 0.0235 | 14.0 | 154 | 0.1539 | 0.6518 | 0.6887 | 0.6697 | 0.9687 |
| 0.0188 | 15.0 | 165 | 0.1604 | 0.6656 | 0.6824 | 0.6739 | 0.9694 |
| 0.0193 | 16.0 | 176 | 0.1625 | 0.6471 | 0.6918 | 0.6687 | 0.9687 |
| 0.0155 | 17.0 | 187 | 0.1758 | 0.6770 | 0.6855 | 0.6813 | 0.9683 |
| 0.0148 | 18.0 | 198 | 0.1714 | 0.6506 | 0.6792 | 0.6646 | 0.9688 |
| 0.014 | 19.0 | 209 | 0.1626 | 0.6391 | 0.7296 | 0.6814 | 0.9674 |
| 0.0116 | 20.0 | 220 | 0.1718 | 0.6459 | 0.7170 | 0.6796 | 0.9687 |
| 0.0111 | 21.0 | 231 | 0.1840 | 0.6718 | 0.6824 | 0.6771 | 0.9694 |
| 0.0097 | 22.0 | 242 | 0.1807 | 0.6479 | 0.6887 | 0.6677 | 0.9677 |
| 0.0098 | 23.0 | 253 | 0.1787 | 0.6391 | 0.7296 | 0.6814 | 0.9664 |
| 0.0089 | 24.0 | 264 | 0.1877 | 0.6518 | 0.6887 | 0.6697 | 0.9688 |
| 0.0077 | 25.0 | 275 | 0.1896 | 0.6519 | 0.6950 | 0.6728 | 0.9693 |
| 0.008 | 26.0 | 286 | 0.1915 | 0.6608 | 0.7107 | 0.6848 | 0.9690 |
| 0.0079 | 27.0 | 297 | 0.2008 | 0.6606 | 0.6792 | 0.6698 | 0.9687 |
| 0.0072 | 28.0 | 308 | 0.1961 | 0.6486 | 0.7138 | 0.6796 | 0.9681 |
| 0.0067 | 29.0 | 319 | 0.2040 | 0.6617 | 0.7013 | 0.6809 | 0.9691 |
| 0.0056 | 30.0 | 330 | 0.2028 | 0.6725 | 0.7296 | 0.6998 | 0.9688 |
| 0.0056 | 31.0 | 341 | 0.2053 | 0.6716 | 0.7201 | 0.6950 | 0.9689 |
| 0.0073 | 32.0 | 352 | 0.2088 | 0.6465 | 0.6730 | 0.6595 | 0.9674 |
| 0.0061 | 33.0 | 363 | 0.1936 | 0.6138 | 0.7296 | 0.6667 | 0.9673 |
| 0.0057 | 34.0 | 374 | 0.2061 | 0.6596 | 0.6824 | 0.6708 | 0.9683 |
| 0.0062 | 35.0 | 385 | 0.2077 | 0.6627 | 0.7044 | 0.6829 | 0.9680 |
| 0.0046 | 36.0 | 396 | 0.2133 | 0.6738 | 0.6950 | 0.6842 | 0.9689 |
| 0.0062 | 37.0 | 407 | 0.2029 | 0.6696 | 0.7201 | 0.6939 | 0.9680 |
| 0.0058 | 38.0 | 418 | 0.2039 | 0.6707 | 0.7044 | 0.6871 | 0.9678 |
| 0.0047 | 39.0 | 429 | 0.2055 | 0.6667 | 0.7233 | 0.6938 | 0.9685 |
| 0.0049 | 40.0 | 440 | 0.2105 | 0.6757 | 0.7075 | 0.6912 | 0.9692 |
| 0.0048 | 41.0 | 451 | 0.2052 | 0.6667 | 0.7107 | 0.6880 | 0.9683 |
| 0.0049 | 42.0 | 462 | 0.2081 | 0.6590 | 0.7170 | 0.6867 | 0.9687 |
| 0.0063 | 43.0 | 473 | 0.2011 | 0.6552 | 0.7170 | 0.6847 | 0.9683 |
| 0.0046 | 44.0 | 484 | 0.1994 | 0.6477 | 0.7170 | 0.6806 | 0.9676 |
| 0.0047 | 45.0 | 495 | 0.2122 | 0.6790 | 0.6918 | 0.6854 | 0.9693 |
| 0.0048 | 46.0 | 506 | 0.2082 | 0.6609 | 0.7233 | 0.6907 | 0.9687 |
| 0.0042 | 47.0 | 517 | 0.2140 | 0.6769 | 0.6918 | 0.6843 | 0.9695 |
| 0.0054 | 48.0 | 528 | 0.2054 | 0.6514 | 0.7170 | 0.6826 | 0.9681 |
| 0.0037 | 49.0 | 539 | 0.2070 | 0.6686 | 0.7107 | 0.6890 | 0.9689 |
| 0.0045 | 50.0 | 550 | 0.2093 | 0.6514 | 0.7170 | 0.6826 | 0.9686 |
| 0.004 | 51.0 | 561 | 0.2163 | 0.6787 | 0.7107 | 0.6943 | 0.9698 |
| 0.0038 | 52.0 | 572 | 0.2173 | 0.6706 | 0.7107 | 0.6901 | 0.9694 |
| 0.0042 | 53.0 | 583 | 0.2156 | 0.6745 | 0.7233 | 0.6980 | 0.9694 |
| 0.0039 | 54.0 | 594 | 0.2190 | 0.6727 | 0.6981 | 0.6852 | 0.9689 |
| 0.0037 | 55.0 | 605 | 0.2213 | 0.6767 | 0.7044 | 0.6903 | 0.9687 |
| 0.0043 | 56.0 | 616 | 0.2247 | 0.6829 | 0.7044 | 0.6935 | 0.9690 |
| 0.0034 | 57.0 | 627 | 0.2291 | 0.6789 | 0.6981 | 0.6884 | 0.9689 |
| 0.0046 | 58.0 | 638 | 0.2258 | 0.6737 | 0.7075 | 0.6902 | 0.9686 |
| 0.0033 | 59.0 | 649 | 0.2254 | 0.6736 | 0.7138 | 0.6931 | 0.9689 |
| 0.0036 | 60.0 | 660 | 0.2255 | 0.6758 | 0.7013 | 0.6883 | 0.9690 |
| 0.0038 | 61.0 | 671 | 0.2200 | 0.6580 | 0.7138 | 0.6848 | 0.9682 |
| 0.0036 | 62.0 | 682 | 0.2210 | 0.6657 | 0.7075 | 0.6860 | 0.9687 |
| 0.0039 | 63.0 | 693 | 0.2237 | 0.6647 | 0.7107 | 0.6869 | 0.9682 |
| 0.0039 | 64.0 | 704 | 0.2295 | 0.6727 | 0.6981 | 0.6852 | 0.9688 |
| 0.0032 | 65.0 | 715 | 0.2271 | 0.6707 | 0.7044 | 0.6871 | 0.9687 |
| 0.0038 | 66.0 | 726 | 0.2290 | 0.6677 | 0.7013 | 0.6840 | 0.9687 |
| 0.0033 | 67.0 | 737 | 0.2260 | 0.6617 | 0.7013 | 0.6809 | 0.9682 |
| 0.0038 | 68.0 | 748 | 0.2250 | 0.6676 | 0.7138 | 0.6900 | 0.9686 |
| 0.0037 | 69.0 | 759 | 0.2254 | 0.6618 | 0.7075 | 0.6839 | 0.9684 |
| 0.0039 | 70.0 | 770 | 0.2281 | 0.6687 | 0.6981 | 0.6831 | 0.9687 |
| 0.0036 | 71.0 | 781 | 0.2317 | 0.6687 | 0.6981 | 0.6831 | 0.9687 |
| 0.0034 | 72.0 | 792 | 0.2272 | 0.6609 | 0.7170 | 0.6878 | 0.9686 |
| 0.0036 | 73.0 | 803 | 0.2278 | 0.6756 | 0.7138 | 0.6942 | 0.9687 |
| 0.0035 | 74.0 | 814 | 0.2287 | 0.6677 | 0.7075 | 0.6870 | 0.9683 |
| 0.0034 | 75.0 | 825 | 0.2283 | 0.6686 | 0.7107 | 0.6890 | 0.9681 |
| 0.0032 | 76.0 | 836 | 0.2331 | 0.6657 | 0.7075 | 0.6860 | 0.9672 |
| 0.0041 | 77.0 | 847 | 0.2357 | 0.6598 | 0.7075 | 0.6829 | 0.9675 |
| 0.0033 | 78.0 | 858 | 0.2352 | 0.6706 | 0.7170 | 0.6930 | 0.9676 |
| 0.0039 | 79.0 | 869 | 0.2363 | 0.6696 | 0.7075 | 0.6881 | 0.9689 |
| 0.0036 | 80.0 | 880 | 0.2367 | 0.6627 | 0.6918 | 0.6769 | 0.9685 |
| 0.0032 | 81.0 | 891 | 0.2369 | 0.6607 | 0.6981 | 0.6789 | 0.9683 |
| 0.0036 | 82.0 | 902 | 0.2331 | 0.6696 | 0.7201 | 0.6939 | 0.9687 |
| 0.0036 | 83.0 | 913 | 0.2286 | 0.6599 | 0.7138 | 0.6858 | 0.9682 |
| 0.0034 | 84.0 | 924 | 0.2276 | 0.6637 | 0.7138 | 0.6879 | 0.9687 |
| 0.0035 | 85.0 | 935 | 0.2286 | 0.6647 | 0.7107 | 0.6869 | 0.9687 |
| 0.0031 | 86.0 | 946 | 0.2296 | 0.6667 | 0.7044 | 0.6850 | 0.9689 |
| 0.0036 | 87.0 | 957 | 0.2296 | 0.6677 | 0.7075 | 0.6870 | 0.9687 |
| 0.0033 | 88.0 | 968 | 0.2299 | 0.6706 | 0.7170 | 0.6930 | 0.9688 |
| 0.0033 | 89.0 | 979 | 0.2301 | 0.6618 | 0.7138 | 0.6868 | 0.9683 |
| 0.0034 | 90.0 | 990 | 0.2320 | 0.6766 | 0.7170 | 0.6962 | 0.9687 |
| 0.0031 | 91.0 | 1001 | 0.2309 | 0.6766 | 0.7170 | 0.6962 | 0.9686 |
| 0.0033 | 92.0 | 1012 | 0.2315 | 0.6736 | 0.7138 | 0.6931 | 0.9685 |
| 0.0037 | 93.0 | 1023 | 0.2333 | 0.6696 | 0.7075 | 0.6881 | 0.9684 |
| 0.0031 | 94.0 | 1034 | 0.2342 | 0.6696 | 0.7075 | 0.6881 | 0.9684 |
| 0.0029 | 95.0 | 1045 | 0.2351 | 0.6687 | 0.7044 | 0.6861 | 0.9683 |
| 0.004 | 96.0 | 1056 | 0.2347 | 0.6667 | 0.7044 | 0.6850 | 0.9683 |
| 0.0032 | 97.0 | 1067 | 0.2346 | 0.6667 | 0.7044 | 0.6850 | 0.9683 |
| 0.0033 | 98.0 | 1078 | 0.2343 | 0.6667 | 0.7044 | 0.6850 | 0.9683 |
| 0.0032 | 99.0 | 1089 | 0.2341 | 0.6647 | 0.7044 | 0.6840 | 0.9682 |
| 0.0034 | 100.0 | 1100 | 0.2341 | 0.6657 | 0.7075 | 0.6860 | 0.9683 |
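
Note that validation F1 peaks well before the final epoch (0.6998 at epoch 30 versus 0.6860 at epoch 100), so an early-stopped checkpoint may be preferable. A small helper for picking the best checkpoint from such a log, shown here with a few rows taken from the table above:

```python
# Find the epoch with the best validation F1 from (epoch, f1) log entries.
# Sample rows are copied from the training-results table above.
log = [
    (10, 0.6667),
    (30, 0.6998),
    (53, 0.6980),
    (100, 0.6860),
]

best_epoch, best_f1 = max(log, key=lambda row: row[1])
print(best_epoch, best_f1)  # 30 0.6998
```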

Framework versions

  • Transformers 4.46.2
  • Pytorch 2.4.1+cu124
  • Datasets 3.1.0
  • Tokenizers 0.20.3
Model size: 324M params (F32, Safetensors)

Model tree for PassbyGrocer/bert_bilstm_mega_crf-ner-weibo

Fine-tuned from hfl/chinese-roberta-wwm-ext-large.