2023-10-13 09:11:34,164 ----------------------------------------------------------------------------------------------------
2023-10-13 09:11:34,165 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=25, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-13 09:11:34,165 ----------------------------------------------------------------------------------------------------
2023-10-13 09:11:34,165 MultiCorpus: 1214 train + 266 dev + 251 test sentences
 - NER_HIPE_2022 Corpus: 1214 train + 266 dev + 251 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/ajmc/en/with_doc_seperator
2023-10-13 09:11:34,165 ----------------------------------------------------------------------------------------------------
2023-10-13 09:11:34,166 Train: 1214 sentences
2023-10-13 09:11:34,166 (train_with_dev=False, train_with_test=False)
2023-10-13 09:11:34,166 ----------------------------------------------------------------------------------------------------
2023-10-13 09:11:34,166 Training Params:
2023-10-13 09:11:34,166 - learning_rate: "3e-05"
2023-10-13 09:11:34,166 - mini_batch_size: "4"
2023-10-13 09:11:34,166 - max_epochs: "10"
2023-10-13 09:11:34,166 - shuffle: "True"
2023-10-13 09:11:34,166 ----------------------------------------------------------------------------------------------------
2023-10-13 09:11:34,166 Plugins:
2023-10-13 09:11:34,166 - LinearScheduler | warmup_fraction: '0.1'
2023-10-13 09:11:34,166 ----------------------------------------------------------------------------------------------------
2023-10-13 09:11:34,166 Final evaluation on model from best epoch (best-model.pt)
2023-10-13 09:11:34,166 - metric: "('micro avg', 'f1-score')"
2023-10-13 09:11:34,166 ----------------------------------------------------------------------------------------------------
2023-10-13 09:11:34,166 Computation:
2023-10-13 09:11:34,166 - compute on device: cuda:0
2023-10-13 09:11:34,166 - embedding storage: none
2023-10-13 09:11:34,166 ----------------------------------------------------------------------------------------------------
2023-10-13 09:11:34,166 Model training base path: "hmbench-ajmc/en-dbmdz/bert-base-historic-multilingual-cased-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-2"
2023-10-13 09:11:34,166 ----------------------------------------------------------------------------------------------------
2023-10-13 09:11:34,166 ----------------------------------------------------------------------------------------------------
2023-10-13 09:11:35,576 epoch 1 - iter 30/304 - loss 3.35375126 - time (sec): 1.41 - samples/sec: 2181.26 - lr: 0.000003 - momentum: 0.000000
2023-10-13 09:11:36,887 epoch 1 - iter 60/304 - loss 2.95856678 - time (sec): 2.72 - samples/sec: 2324.09 - lr: 0.000006 - momentum: 0.000000
2023-10-13 09:11:38,204 epoch 1 - iter 90/304 - loss 2.26373515 - time (sec): 4.04 - samples/sec: 2333.05 - lr: 0.000009 - momentum: 0.000000
2023-10-13 09:11:39,515 epoch 1 - iter 120/304 - loss 1.90301912 - time (sec): 5.35 - samples/sec: 2295.26 - lr: 0.000012 - momentum: 0.000000
2023-10-13 09:11:40,828 epoch 1 - iter 150/304 - loss 1.63730036 - time (sec): 6.66 - samples/sec: 2328.35 - lr: 0.000015 - momentum: 0.000000
2023-10-13 09:11:42,128 epoch 1 - iter 180/304 - loss 1.44861964 - time (sec): 7.96 - samples/sec: 2314.20 - lr: 0.000018 - momentum: 0.000000
2023-10-13 09:11:43,445 epoch 1 - iter 210/304 - loss 1.29232475 - time (sec): 9.28 - samples/sec: 2302.80 - lr: 0.000021 - momentum: 0.000000
2023-10-13 09:11:44,775 epoch 1 - iter 240/304 - loss 1.17915743 - time (sec): 10.61 - samples/sec: 2300.40 - lr: 0.000024 - momentum: 0.000000
2023-10-13 09:11:46,095 epoch 1 - iter 270/304 - loss 1.08551974 - time (sec): 11.93 - samples/sec: 2293.21 - lr: 0.000027 - momentum: 0.000000
2023-10-13 09:11:47,410 epoch 1 - iter 300/304 - loss 1.00236337 - time (sec): 13.24 - samples/sec: 2309.81 - lr: 0.000030 - momentum: 0.000000
2023-10-13 09:11:47,587 ----------------------------------------------------------------------------------------------------
2023-10-13 09:11:47,587 EPOCH 1 done: loss 0.9936 - lr: 0.000030
2023-10-13 09:11:48,584 DEV : loss 0.2579602897167206 - f1-score (micro avg) 0.4773
2023-10-13 09:11:48,589 saving best model
2023-10-13 09:11:48,954 ----------------------------------------------------------------------------------------------------
2023-10-13 09:11:50,414 epoch 2 - iter 30/304 - loss 0.27558021 - time (sec): 1.46 - samples/sec: 2040.19 - lr: 0.000030 - momentum: 0.000000
2023-10-13 09:11:51,866 epoch 2 - iter 60/304 - loss 0.24334816 - time (sec): 2.91 - samples/sec: 2070.02 - lr: 0.000029 - momentum: 0.000000
2023-10-13 09:11:53,355 epoch 2 - iter 90/304 - loss 0.20641996 - time (sec): 4.40 - samples/sec: 2062.78 - lr: 0.000029 - momentum: 0.000000
2023-10-13 09:11:54,768 epoch 2 - iter 120/304 - loss 0.20667576 - time (sec): 5.81 - samples/sec: 2088.44 - lr: 0.000029 - momentum: 0.000000
2023-10-13 09:11:56,148 epoch 2 - iter 150/304 - loss 0.19291693 - time (sec): 7.19 - samples/sec: 2134.07 - lr: 0.000028 - momentum: 0.000000
2023-10-13 09:11:57,479 epoch 2 - iter 180/304 - loss 0.18565077 - time (sec): 8.52 - samples/sec: 2142.09 - lr: 0.000028 - momentum: 0.000000
2023-10-13 09:11:58,816 epoch 2 - iter 210/304 - loss 0.17594776 - time (sec): 9.86 - samples/sec: 2185.21 - lr: 0.000028 - momentum: 0.000000
2023-10-13 09:12:00,120 epoch 2 - iter 240/304 - loss 0.16750310 - time (sec): 11.16 - samples/sec: 2214.63 - lr: 0.000027 - momentum: 0.000000
2023-10-13 09:12:01,435 epoch 2 - iter 270/304 - loss 0.16907637 - time (sec): 12.48 - samples/sec: 2227.95 - lr: 0.000027 - momentum: 0.000000
2023-10-13 09:12:02,752 epoch 2 - iter 300/304 - loss 0.16188542 - time (sec): 13.80 - samples/sec: 2231.51 - lr: 0.000027 - momentum: 0.000000
2023-10-13 09:12:02,924 ----------------------------------------------------------------------------------------------------
2023-10-13 09:12:02,925 EPOCH 2 done: loss 0.1622 - lr: 0.000027
2023-10-13 09:12:03,976 DEV : loss 0.14629144966602325 - f1-score (micro avg) 0.8153
2023-10-13 09:12:03,986 saving best model
2023-10-13 09:12:04,499 ----------------------------------------------------------------------------------------------------
2023-10-13 09:12:05,891 epoch 3 - iter 30/304 - loss 0.05968550 - time (sec): 1.39 - samples/sec: 2190.17 - lr: 0.000026 - momentum: 0.000000
2023-10-13 09:12:07,227 epoch 3 - iter 60/304 - loss 0.06661242 - time (sec): 2.72 - samples/sec: 2235.46 - lr: 0.000026 - momentum: 0.000000
2023-10-13 09:12:08,541 epoch 3 - iter 90/304 - loss 0.07023246 - time (sec): 4.04 - samples/sec: 2324.92 - lr: 0.000026 - momentum: 0.000000
2023-10-13 09:12:09,852 epoch 3 - iter 120/304 - loss 0.06898371 - time (sec): 5.35 - samples/sec: 2301.97 - lr: 0.000025 - momentum: 0.000000
2023-10-13 09:12:11,165 epoch 3 - iter 150/304 - loss 0.07165433 - time (sec): 6.66 - samples/sec: 2299.15 - lr: 0.000025 - momentum: 0.000000
2023-10-13 09:12:12,471 epoch 3 - iter 180/304 - loss 0.07808419 - time (sec): 7.97 - samples/sec: 2282.30 - lr: 0.000025 - momentum: 0.000000
2023-10-13 09:12:13,808 epoch 3 - iter 210/304 - loss 0.08536130 - time (sec): 9.31 - samples/sec: 2294.60 - lr: 0.000024 - momentum: 0.000000
2023-10-13 09:12:15,143 epoch 3 - iter 240/304 - loss 0.08939663 - time (sec): 10.64 - samples/sec: 2288.75 - lr: 0.000024 - momentum: 0.000000
2023-10-13 09:12:16,498 epoch 3 - iter 270/304 - loss 0.08897527 - time (sec): 12.00 - samples/sec: 2314.40 - lr: 0.000024 - momentum: 0.000000
2023-10-13 09:12:17,828 epoch 3 - iter 300/304 - loss 0.09180584 - time (sec): 13.32 - samples/sec: 2292.68 - lr: 0.000023 - momentum: 0.000000
2023-10-13 09:12:18,004 ----------------------------------------------------------------------------------------------------
2023-10-13 09:12:18,004 EPOCH 3 done: loss 0.0914 - lr: 0.000023
2023-10-13 09:12:18,999 DEV : loss 0.16112016141414642 - f1-score (micro avg) 0.7892
2023-10-13 09:12:19,007 ----------------------------------------------------------------------------------------------------
2023-10-13 09:12:20,582 epoch 4 - iter 30/304 - loss 0.04115302 - time (sec): 1.57 - samples/sec: 2039.80 - lr: 0.000023 - momentum: 0.000000
2023-10-13 09:12:22,196 epoch 4 - iter 60/304 - loss 0.03239643 - time (sec): 3.19 - samples/sec: 1970.78 - lr: 0.000023 - momentum: 0.000000
2023-10-13 09:12:23,552 epoch 4 - iter 90/304 - loss 0.05223740 - time (sec): 4.54 - samples/sec: 2037.80 - lr: 0.000022 - momentum: 0.000000
2023-10-13 09:12:24,864 epoch 4 - iter 120/304 - loss 0.04822337 - time (sec): 5.85 - samples/sec: 2078.61 - lr: 0.000022 - momentum: 0.000000
2023-10-13 09:12:26,184 epoch 4 - iter 150/304 - loss 0.04568180 - time (sec): 7.17 - samples/sec: 2139.95 - lr: 0.000022 - momentum: 0.000000
2023-10-13 09:12:27,484 epoch 4 - iter 180/304 - loss 0.05593238 - time (sec): 8.48 - samples/sec: 2165.17 - lr: 0.000021 - momentum: 0.000000
2023-10-13 09:12:28,819 epoch 4 - iter 210/304 - loss 0.05875782 - time (sec): 9.81 - samples/sec: 2183.31 - lr: 0.000021 - momentum: 0.000000
2023-10-13 09:12:30,345 epoch 4 - iter 240/304 - loss 0.06307644 - time (sec): 11.34 - samples/sec: 2169.16 - lr: 0.000021 - momentum: 0.000000
2023-10-13 09:12:31,680 epoch 4 - iter 270/304 - loss 0.06423237 - time (sec): 12.67 - samples/sec: 2185.58 - lr: 0.000020 - momentum: 0.000000
2023-10-13 09:12:33,006 epoch 4 - iter 300/304 - loss 0.06965799 - time (sec): 14.00 - samples/sec: 2185.76 - lr: 0.000020 - momentum: 0.000000
2023-10-13 09:12:33,182 ----------------------------------------------------------------------------------------------------
2023-10-13 09:12:33,183 EPOCH 4 done: loss 0.0689 - lr: 0.000020
2023-10-13 09:12:34,219 DEV : loss 0.17021441459655762 - f1-score (micro avg) 0.8286
2023-10-13 09:12:34,227 saving best model
2023-10-13 09:12:34,694 ----------------------------------------------------------------------------------------------------
2023-10-13 09:12:36,292 epoch 5 - iter 30/304 - loss 0.04112044 - time (sec): 1.60 - samples/sec: 1991.10 - lr: 0.000020 - momentum: 0.000000
2023-10-13 09:12:37,843 epoch 5 - iter 60/304 - loss 0.03209144 - time (sec): 3.15 - samples/sec: 2031.90 - lr: 0.000019 - momentum: 0.000000
2023-10-13 09:12:39,400 epoch 5 - iter 90/304 - loss 0.02949647 - time (sec): 4.70 - samples/sec: 2006.05 - lr: 0.000019 - momentum: 0.000000
2023-10-13 09:12:41,037 epoch 5 - iter 120/304 - loss 0.03606315 - time (sec): 6.34 - samples/sec: 1966.62 - lr: 0.000019 - momentum: 0.000000
2023-10-13 09:12:42,703 epoch 5 - iter 150/304 - loss 0.04624669 - time (sec): 8.01 - samples/sec: 1938.38 - lr: 0.000018 - momentum: 0.000000
2023-10-13 09:12:44,336 epoch 5 - iter 180/304 - loss 0.04420425 - time (sec): 9.64 - samples/sec: 1927.80 - lr: 0.000018 - momentum: 0.000000
2023-10-13 09:12:45,947 epoch 5 - iter 210/304 - loss 0.04227590 - time (sec): 11.25 - samples/sec: 1913.60 - lr: 0.000018 - momentum: 0.000000
2023-10-13 09:12:47,650 epoch 5 - iter 240/304 - loss 0.04668771 - time (sec): 12.95 - samples/sec: 1901.23 - lr: 0.000017 - momentum: 0.000000
2023-10-13 09:12:49,300 epoch 5 - iter 270/304 - loss 0.04686959 - time (sec): 14.60 - samples/sec: 1890.67 - lr: 0.000017 - momentum: 0.000000
2023-10-13 09:12:50,826 epoch 5 - iter 300/304 - loss 0.04646029 - time (sec): 16.13 - samples/sec: 1900.96 - lr: 0.000017 - momentum: 0.000000
2023-10-13 09:12:50,994 ----------------------------------------------------------------------------------------------------
2023-10-13 09:12:50,995 EPOCH 5 done: loss 0.0465 - lr: 0.000017
2023-10-13 09:12:52,042 DEV : loss 0.2058655172586441 - f1-score (micro avg) 0.8233
2023-10-13 09:12:52,052 ----------------------------------------------------------------------------------------------------
2023-10-13 09:12:53,684 epoch 6 - iter 30/304 - loss 0.05046724 - time (sec): 1.63 - samples/sec: 1717.40 - lr: 0.000016 - momentum: 0.000000
2023-10-13 09:12:55,044 epoch 6 - iter 60/304 - loss 0.04432771 - time (sec): 2.99 - samples/sec: 2025.89 - lr: 0.000016 - momentum: 0.000000
2023-10-13 09:12:56,388 epoch 6 - iter 90/304 - loss 0.03676034 - time (sec): 4.33 - samples/sec: 2102.11 - lr: 0.000016 - momentum: 0.000000
2023-10-13 09:12:57,721 epoch 6 - iter 120/304 - loss 0.03256242 - time (sec): 5.67 - samples/sec: 2151.39 - lr: 0.000015 - momentum: 0.000000
2023-10-13 09:12:59,020 epoch 6 - iter 150/304 - loss 0.02929072 - time (sec): 6.97 - samples/sec: 2149.63 - lr: 0.000015 - momentum: 0.000000
2023-10-13 09:13:00,326 epoch 6 - iter 180/304 - loss 0.02920208 - time (sec): 8.27 - samples/sec: 2186.08 - lr: 0.000015 - momentum: 0.000000
2023-10-13 09:13:01,697 epoch 6 - iter 210/304 - loss 0.03288655 - time (sec): 9.64 - samples/sec: 2205.83 - lr: 0.000014 - momentum: 0.000000
2023-10-13 09:13:03,026 epoch 6 - iter 240/304 - loss 0.03747087 - time (sec): 10.97 - samples/sec: 2217.29 - lr: 0.000014 - momentum: 0.000000
2023-10-13 09:13:04,362 epoch 6 - iter 270/304 - loss 0.03500701 - time (sec): 12.31 - samples/sec: 2220.88 - lr: 0.000014 - momentum: 0.000000
2023-10-13 09:13:05,772 epoch 6 - iter 300/304 - loss 0.03544502 - time (sec): 13.72 - samples/sec: 2232.62 - lr: 0.000013 - momentum: 0.000000
2023-10-13 09:13:05,956 ----------------------------------------------------------------------------------------------------
2023-10-13 09:13:05,957 EPOCH 6 done: loss 0.0355 - lr: 0.000013
2023-10-13 09:13:06,892 DEV : loss 0.20123393833637238 - f1-score (micro avg) 0.8241
2023-10-13 09:13:06,899 ----------------------------------------------------------------------------------------------------
2023-10-13 09:13:08,180 epoch 7 - iter 30/304 - loss 0.01961303 - time (sec): 1.28 - samples/sec: 2408.89 - lr: 0.000013 - momentum: 0.000000
2023-10-13 09:13:09,496 epoch 7 - iter 60/304 - loss 0.03318365 - time (sec): 2.60 - samples/sec: 2358.80 - lr: 0.000013 - momentum: 0.000000
2023-10-13 09:13:10,818 epoch 7 - iter 90/304 - loss 0.03814019 - time (sec): 3.92 - samples/sec: 2318.15 - lr: 0.000012 - momentum: 0.000000
2023-10-13 09:13:12,170 epoch 7 - iter 120/304 - loss 0.03423627 - time (sec): 5.27 - samples/sec: 2376.25 - lr: 0.000012 - momentum: 0.000000
2023-10-13 09:13:13,501 epoch 7 - iter 150/304 - loss 0.02933986 - time (sec): 6.60 - samples/sec: 2344.86 - lr: 0.000012 - momentum: 0.000000
2023-10-13 09:13:14,848 epoch 7 - iter 180/304 - loss 0.02722790 - time (sec): 7.95 - samples/sec: 2343.82 - lr: 0.000011 - momentum: 0.000000
2023-10-13 09:13:16,165 epoch 7 - iter 210/304 - loss 0.02545864 - time (sec): 9.26 - samples/sec: 2346.31 - lr: 0.000011 - momentum: 0.000000
2023-10-13 09:13:17,580 epoch 7 - iter 240/304 - loss 0.02870239 - time (sec): 10.68 - samples/sec: 2349.68 - lr: 0.000011 - momentum: 0.000000
2023-10-13 09:13:18,888 epoch 7 - iter 270/304 - loss 0.02761758 - time (sec): 11.99 - samples/sec: 2320.40 - lr: 0.000010 - momentum: 0.000000
2023-10-13 09:13:20,213 epoch 7 - iter 300/304 - loss 0.02791349 - time (sec): 13.31 - samples/sec: 2307.01 - lr: 0.000010 - momentum: 0.000000
2023-10-13 09:13:20,384 ----------------------------------------------------------------------------------------------------
2023-10-13 09:13:20,384 EPOCH 7 done: loss 0.0277 - lr: 0.000010
2023-10-13 09:13:21,348 DEV : loss 0.20731189846992493 - f1-score (micro avg) 0.8302
2023-10-13 09:13:21,356 saving best model
2023-10-13 09:13:21,837 ----------------------------------------------------------------------------------------------------
2023-10-13 09:13:23,445 epoch 8 - iter 30/304 - loss 0.02594408 - time (sec): 1.61 - samples/sec: 2099.48 - lr: 0.000010 - momentum: 0.000000
2023-10-13 09:13:25,119 epoch 8 - iter 60/304 - loss 0.01870384 - time (sec): 3.28 - samples/sec: 1925.49 - lr: 0.000009 - momentum: 0.000000
2023-10-13 09:13:26,740 epoch 8 - iter 90/304 - loss 0.02645844 - time (sec): 4.90 - samples/sec: 1918.28 - lr: 0.000009 - momentum: 0.000000
2023-10-13 09:13:28,361 epoch 8 - iter 120/304 - loss 0.02459100 - time (sec): 6.52 - samples/sec: 1904.79 - lr: 0.000009 - momentum: 0.000000
2023-10-13 09:13:29,920 epoch 8 - iter 150/304 - loss 0.02686734 - time (sec): 8.08 - samples/sec: 1931.80 - lr: 0.000008 - momentum: 0.000000
2023-10-13 09:13:31,475 epoch 8 - iter 180/304 - loss 0.02297305 - time (sec): 9.63 - samples/sec: 1943.02 - lr: 0.000008 - momentum: 0.000000
2023-10-13 09:13:32,798 epoch 8 - iter 210/304 - loss 0.02610168 - time (sec): 10.96 - samples/sec: 1977.75 - lr: 0.000008 - momentum: 0.000000
2023-10-13 09:13:34,132 epoch 8 - iter 240/304 - loss 0.02625596 - time (sec): 12.29 - samples/sec: 2013.37 - lr: 0.000007 - momentum: 0.000000
2023-10-13 09:13:35,462 epoch 8 - iter 270/304 - loss 0.02450793 - time (sec): 13.62 - samples/sec: 2031.04 - lr: 0.000007 - momentum: 0.000000
2023-10-13 09:13:36,847 epoch 8 - iter 300/304 - loss 0.02535150 - time (sec): 15.01 - samples/sec: 2038.87 - lr: 0.000007 - momentum: 0.000000
2023-10-13 09:13:37,058 ----------------------------------------------------------------------------------------------------
2023-10-13 09:13:37,058 EPOCH 8 done: loss 0.0250 - lr: 0.000007
2023-10-13 09:13:38,070 DEV : loss 0.22579838335514069 - f1-score (micro avg) 0.8206
2023-10-13 09:13:38,077 ----------------------------------------------------------------------------------------------------
2023-10-13 09:13:39,666 epoch 9 - iter 30/304 - loss 0.02789099 - time (sec): 1.59 - samples/sec: 1737.54 - lr: 0.000006 - momentum: 0.000000
2023-10-13 09:13:41,230 epoch 9 - iter 60/304 - loss 0.01659011 - time (sec): 3.15 - samples/sec: 1874.35 - lr: 0.000006 - momentum: 0.000000
2023-10-13 09:13:42,851 epoch 9 - iter 90/304 - loss 0.03203659 - time (sec): 4.77 - samples/sec: 1928.77 - lr: 0.000006 - momentum: 0.000000
2023-10-13 09:13:44,201 epoch 9 - iter 120/304 - loss 0.02910105 - time (sec): 6.12 - samples/sec: 2021.87 - lr: 0.000005 - momentum: 0.000000
2023-10-13 09:13:45,535 epoch 9 - iter 150/304 - loss 0.02440992 - time (sec): 7.46 - samples/sec: 2060.58 - lr: 0.000005 - momentum: 0.000000
2023-10-13 09:13:46,861 epoch 9 - iter 180/304 - loss 0.02453832 - time (sec): 8.78 - samples/sec: 2091.82 - lr: 0.000005 - momentum: 0.000000
2023-10-13 09:13:48,164 epoch 9 - iter 210/304 - loss 0.02290375 - time (sec): 10.09 - samples/sec: 2103.04 - lr: 0.000004 - momentum: 0.000000
2023-10-13 09:13:49,488 epoch 9 - iter 240/304 - loss 0.02113890 - time (sec): 11.41 - samples/sec: 2156.02 - lr: 0.000004 - momentum: 0.000000
2023-10-13 09:13:50,787 epoch 9 - iter 270/304 - loss 0.01953599 - time (sec): 12.71 - samples/sec: 2171.98 - lr: 0.000004 - momentum: 0.000000
2023-10-13 09:13:52,088 epoch 9 - iter 300/304 - loss 0.01811998 - time (sec): 14.01 - samples/sec: 2185.33 - lr: 0.000003 - momentum: 0.000000
2023-10-13 09:13:52,256 ----------------------------------------------------------------------------------------------------
2023-10-13 09:13:52,256 EPOCH 9 done: loss 0.0179 - lr: 0.000003
2023-10-13 09:13:53,233 DEV : loss 0.22185303270816803 - f1-score (micro avg) 0.8248
2023-10-13 09:13:53,239 ----------------------------------------------------------------------------------------------------
2023-10-13 09:13:54,640 epoch 10 - iter 30/304 - loss 0.00070490 - time (sec): 1.40 - samples/sec: 2084.76 - lr: 0.000003 - momentum: 0.000000
2023-10-13 09:13:56,021 epoch 10 - iter 60/304 - loss 0.00875389 - time (sec): 2.78 - samples/sec: 2206.26 - lr: 0.000003 - momentum: 0.000000
2023-10-13 09:13:57,389 epoch 10 - iter 90/304 - loss 0.00957439 - time (sec): 4.15 - samples/sec: 2290.89 - lr: 0.000002 - momentum: 0.000000
2023-10-13 09:13:58,819 epoch 10 - iter 120/304 - loss 0.01901622 - time (sec): 5.58 - samples/sec: 2243.53 - lr: 0.000002 - momentum: 0.000000
2023-10-13 09:14:00,270 epoch 10 - iter 150/304 - loss 0.01749786 - time (sec): 7.03 - samples/sec: 2184.54 - lr: 0.000002 - momentum: 0.000000
2023-10-13 09:14:01,659 epoch 10 - iter 180/304 - loss 0.01476365 - time (sec): 8.42 - samples/sec: 2219.69 - lr: 0.000001 - momentum: 0.000000
2023-10-13 09:14:02,992 epoch 10 - iter 210/304 - loss 0.01674996 - time (sec): 9.75 - samples/sec: 2234.48 - lr: 0.000001 - momentum: 0.000000
2023-10-13 09:14:04,337 epoch 10 - iter 240/304 - loss 0.01678235 - time (sec): 11.10 - samples/sec: 2216.99 - lr: 0.000001 - momentum: 0.000000
2023-10-13 09:14:05,680 epoch 10 - iter 270/304 - loss 0.01578168 - time (sec): 12.44 - samples/sec: 2226.66 - lr: 0.000000 - momentum: 0.000000
2023-10-13 09:14:07,025 epoch 10 - iter 300/304 - loss 0.01543314 - time (sec): 13.78 - samples/sec: 2233.18 - lr: 0.000000 - momentum: 0.000000
2023-10-13 09:14:07,194 ----------------------------------------------------------------------------------------------------
2023-10-13 09:14:07,194 EPOCH 10 done: loss 0.0153 - lr: 0.000000
2023-10-13 09:14:08,184 DEV : loss 0.22600068151950836 - f1-score (micro avg) 0.82
2023-10-13 09:14:08,594 ----------------------------------------------------------------------------------------------------
2023-10-13 09:14:08,595 Loading model from best epoch ...
2023-10-13 09:14:10,239 SequenceTagger predicts: Dictionary with 25 tags: O, S-scope, B-scope, E-scope, I-scope, S-pers, B-pers, E-pers, I-pers, S-work, B-work, E-work, I-work, S-loc, B-loc, E-loc, I-loc, S-date, B-date, E-date, I-date, S-object, B-object, E-object, I-object
2023-10-13 09:14:11,351
Results:
- F-score (micro) 0.7847
- F-score (macro) 0.5885
- Accuracy 0.6542

By class:
              precision    recall  f1-score   support

       scope     0.7806    0.8013    0.7908       151
        work     0.6489    0.8947    0.7522        95
        pers     0.7479    0.9271    0.8279        96
         loc     0.5000    0.6667    0.5714         3
        date     0.0000    0.0000    0.0000         3

   micro avg     0.7262    0.8534    0.7847       348
   macro avg     0.5355    0.6580    0.5885       348
weighted avg     0.7265    0.8534    0.7818       348

2023-10-13 09:14:11,351 ----------------------------------------------------------------------------------------------------