2023-10-18 17:55:06,196 ----------------------------------------------------------------------------------------------------
2023-10-18 17:55:06,196 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 128)
        (position_embeddings): Embedding(512, 128)
        (token_type_embeddings): Embedding(2, 128)
        (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-1): 2 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=128, out_features=128, bias=True)
                (key): Linear(in_features=128, out_features=128, bias=True)
                (value): Linear(in_features=128, out_features=128, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=128, out_features=128, bias=True)
                (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=128, out_features=512, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=512, out_features=128, bias=True)
              (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=128, out_features=128, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=128, out_features=21, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-18 17:55:06,196 ----------------------------------------------------------------------------------------------------
2023-10-18 17:55:06,196 MultiCorpus: 3575 train + 1235 dev + 1266 test sentences
 - NER_HIPE_2022 Corpus: 3575 train + 1235 dev + 1266 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/hipe2020/de/with_doc_seperator
2023-10-18 17:55:06,196 ----------------------------------------------------------------------------------------------------
2023-10-18 17:55:06,196 Train: 3575 sentences
2023-10-18 17:55:06,196 (train_with_dev=False, train_with_test=False)
2023-10-18 17:55:06,196 ----------------------------------------------------------------------------------------------------
2023-10-18 17:55:06,196 Training Params:
2023-10-18 17:55:06,196 - learning_rate: "3e-05"
2023-10-18 17:55:06,196 - mini_batch_size: "8"
2023-10-18 17:55:06,196 - max_epochs: "10"
2023-10-18 17:55:06,196 - shuffle: "True"
2023-10-18 17:55:06,196 ----------------------------------------------------------------------------------------------------
2023-10-18 17:55:06,196 Plugins:
2023-10-18 17:55:06,196 - TensorboardLogger
2023-10-18 17:55:06,196 - LinearScheduler | warmup_fraction: '0.1'
2023-10-18 17:55:06,196 ----------------------------------------------------------------------------------------------------
2023-10-18 17:55:06,196 Final evaluation on model from best epoch (best-model.pt)
2023-10-18 17:55:06,196 - metric: "('micro avg', 'f1-score')"
2023-10-18 17:55:06,197 ----------------------------------------------------------------------------------------------------
2023-10-18 17:55:06,197 Computation:
2023-10-18 17:55:06,197 - compute on device: cuda:0
2023-10-18 17:55:06,197 - embedding storage: none
2023-10-18 17:55:06,197 ----------------------------------------------------------------------------------------------------
2023-10-18 17:55:06,197 Model training base path: "hmbench-hipe2020/de-dbmdz/bert-tiny-historic-multilingual-cased-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-2"
2023-10-18 17:55:06,197 ----------------------------------------------------------------------------------------------------
2023-10-18 17:55:06,197 ----------------------------------------------------------------------------------------------------
2023-10-18 17:55:06,197 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-18 17:55:07,276 epoch 1 - iter 44/447 - loss 3.21526268 - time (sec): 1.08 - samples/sec: 8384.48 - lr: 0.000003 - momentum: 0.000000
2023-10-18 17:55:08,312 epoch 1 - iter 88/447 - loss 3.11068415 - time (sec): 2.12 - samples/sec: 8839.97 - lr: 0.000006 - momentum: 0.000000
2023-10-18 17:55:09,320 epoch 1 - iter 132/447 - loss 2.96404838 - time (sec): 3.12 - samples/sec: 8377.12 - lr: 0.000009 - momentum: 0.000000
2023-10-18 17:55:10,315 epoch 1 - iter 176/447 - loss 2.74904136 - time (sec): 4.12 - samples/sec: 8216.71 - lr: 0.000012 - momentum: 0.000000
2023-10-18 17:55:11,325 epoch 1 - iter 220/447 - loss 2.49019133 - time (sec): 5.13 - samples/sec: 8152.73 - lr: 0.000015 - momentum: 0.000000
2023-10-18 17:55:12,348 epoch 1 - iter 264/447 - loss 2.21707471 - time (sec): 6.15 - samples/sec: 8184.32 - lr: 0.000018 - momentum: 0.000000
2023-10-18 17:55:13,366 epoch 1 - iter 308/447 - loss 1.98807103 - time (sec): 7.17 - samples/sec: 8266.22 - lr: 0.000021 - momentum: 0.000000
2023-10-18 17:55:14,401 epoch 1 - iter 352/447 - loss 1.79499806 - time (sec): 8.20 - samples/sec: 8382.75 - lr: 0.000024 - momentum: 0.000000
2023-10-18 17:55:15,414 epoch 1 - iter 396/447 - loss 1.66282855 - time (sec): 9.22 - samples/sec: 8369.73 - lr: 0.000027 - momentum: 0.000000
2023-10-18 17:55:16,456 epoch 1 - iter 440/447 - loss 1.56914304 - time (sec): 10.26 - samples/sec: 8314.49 - lr: 0.000029 - momentum: 0.000000
2023-10-18 17:55:16,622 ----------------------------------------------------------------------------------------------------
2023-10-18 17:55:16,623 EPOCH 1 done: loss 1.5580 - lr: 0.000029
2023-10-18 17:55:18,857 DEV : loss 0.4793676435947418 - f1-score (micro avg) 0.0
2023-10-18 17:55:18,886 ----------------------------------------------------------------------------------------------------
2023-10-18 17:55:19,932 epoch 2 - iter 44/447 - loss 0.53138539 - time (sec): 1.04 - samples/sec: 7990.03 - lr: 0.000030 - momentum: 0.000000
2023-10-18 17:55:20,938 epoch 2 - iter 88/447 - loss 0.56693289 - time (sec): 2.05 - samples/sec: 8359.24 - lr: 0.000029 - momentum: 0.000000
2023-10-18 17:55:21,950 epoch 2 - iter 132/447 - loss 0.55886578 - time (sec): 3.06 - samples/sec: 8124.75 - lr: 0.000029 - momentum: 0.000000
2023-10-18 17:55:22,968 epoch 2 - iter 176/447 - loss 0.55590184 - time (sec): 4.08 - samples/sec: 8087.50 - lr: 0.000029 - momentum: 0.000000
2023-10-18 17:55:24,032 epoch 2 - iter 220/447 - loss 0.56384809 - time (sec): 5.14 - samples/sec: 8272.06 - lr: 0.000028 - momentum: 0.000000
2023-10-18 17:55:24,999 epoch 2 - iter 264/447 - loss 0.55035800 - time (sec): 6.11 - samples/sec: 8342.82 - lr: 0.000028 - momentum: 0.000000
2023-10-18 17:55:26,073 epoch 2 - iter 308/447 - loss 0.54447525 - time (sec): 7.19 - samples/sec: 8485.09 - lr: 0.000028 - momentum: 0.000000
2023-10-18 17:55:27,099 epoch 2 - iter 352/447 - loss 0.54060510 - time (sec): 8.21 - samples/sec: 8347.52 - lr: 0.000027 - momentum: 0.000000
2023-10-18 17:55:28,159 epoch 2 - iter 396/447 - loss 0.53839069 - time (sec): 9.27 - samples/sec: 8313.01 - lr: 0.000027 - momentum: 0.000000
2023-10-18 17:55:29,206 epoch 2 - iter 440/447 - loss 0.53391658 - time (sec): 10.32 - samples/sec: 8278.84 - lr: 0.000027 - momentum: 0.000000
2023-10-18 17:55:29,356 ----------------------------------------------------------------------------------------------------
2023-10-18 17:55:29,356 EPOCH 2 done: loss 0.5351 - lr: 0.000027
2023-10-18 17:55:34,588 DEV : loss 0.37521103024482727 - f1-score (micro avg) 0.0091
2023-10-18 17:55:34,615 saving best model
2023-10-18 17:55:34,648 ----------------------------------------------------------------------------------------------------
2023-10-18 17:55:35,674 epoch 3 - iter 44/447 - loss 0.48728016 - time (sec): 1.02 - samples/sec: 8784.04 - lr: 0.000026 - momentum: 0.000000
2023-10-18 17:55:36,702 epoch 3 - iter 88/447 - loss 0.48785073 - time (sec): 2.05 - samples/sec: 8542.35 - lr: 0.000026 - momentum: 0.000000
2023-10-18 17:55:37,757 epoch 3 - iter 132/447 - loss 0.48141366 - time (sec): 3.11 - samples/sec: 8435.54 - lr: 0.000026 - momentum: 0.000000
2023-10-18 17:55:38,791 epoch 3 - iter 176/447 - loss 0.49254271 - time (sec): 4.14 - samples/sec: 8208.75 - lr: 0.000025 - momentum: 0.000000
2023-10-18 17:55:39,800 epoch 3 - iter 220/447 - loss 0.48009553 - time (sec): 5.15 - samples/sec: 8137.02 - lr: 0.000025 - momentum: 0.000000
2023-10-18 17:55:40,791 epoch 3 - iter 264/447 - loss 0.48043658 - time (sec): 6.14 - samples/sec: 8202.99 - lr: 0.000025 - momentum: 0.000000
2023-10-18 17:55:41,817 epoch 3 - iter 308/447 - loss 0.46945156 - time (sec): 7.17 - samples/sec: 8223.78 - lr: 0.000024 - momentum: 0.000000
2023-10-18 17:55:42,923 epoch 3 - iter 352/447 - loss 0.47147629 - time (sec): 8.27 - samples/sec: 8205.12 - lr: 0.000024 - momentum: 0.000000
2023-10-18 17:55:43,958 epoch 3 - iter 396/447 - loss 0.46777770 - time (sec): 9.31 - samples/sec: 8238.31 - lr: 0.000024 - momentum: 0.000000
2023-10-18 17:55:44,956 epoch 3 - iter 440/447 - loss 0.46362258 - time (sec): 10.31 - samples/sec: 8291.65 - lr: 0.000023 - momentum: 0.000000
2023-10-18 17:55:45,101 ----------------------------------------------------------------------------------------------------
2023-10-18 17:55:45,101 EPOCH 3 done: loss 0.4632 - lr: 0.000023
2023-10-18 17:55:50,289 DEV : loss 0.3417363464832306 - f1-score (micro avg) 0.121
2023-10-18 17:55:50,316 saving best model
2023-10-18 17:55:50,353 ----------------------------------------------------------------------------------------------------
2023-10-18 17:55:51,337 epoch 4 - iter 44/447 - loss 0.45316300 - time (sec): 0.98 - samples/sec: 8248.65 - lr: 0.000023 - momentum: 0.000000
2023-10-18 17:55:52,461 epoch 4 - iter 88/447 - loss 0.40894581 - time (sec): 2.11 - samples/sec: 8613.06 - lr: 0.000023 - momentum: 0.000000
2023-10-18 17:55:53,478 epoch 4 - iter 132/447 - loss 0.41135455 - time (sec): 3.12 - samples/sec: 8514.01 - lr: 0.000022 - momentum: 0.000000
2023-10-18 17:55:54,490 epoch 4 - iter 176/447 - loss 0.41549718 - time (sec): 4.14 - samples/sec: 8527.39 - lr: 0.000022 - momentum: 0.000000
2023-10-18 17:55:55,517 epoch 4 - iter 220/447 - loss 0.40581863 - time (sec): 5.16 - samples/sec: 8532.55 - lr: 0.000022 - momentum: 0.000000
2023-10-18 17:55:56,476 epoch 4 - iter 264/447 - loss 0.41064648 - time (sec): 6.12 - samples/sec: 8543.42 - lr: 0.000021 - momentum: 0.000000
2023-10-18 17:55:57,474 epoch 4 - iter 308/447 - loss 0.40762911 - time (sec): 7.12 - samples/sec: 8506.55 - lr: 0.000021 - momentum: 0.000000
2023-10-18 17:55:58,507 epoch 4 - iter 352/447 - loss 0.41255573 - time (sec): 8.15 - samples/sec: 8467.32 - lr: 0.000021 - momentum: 0.000000
2023-10-18 17:55:59,560 epoch 4 - iter 396/447 - loss 0.41309072 - time (sec): 9.21 - samples/sec: 8372.24 - lr: 0.000020 - momentum: 0.000000
2023-10-18 17:56:00,588 epoch 4 - iter 440/447 - loss 0.41180366 - time (sec): 10.23 - samples/sec: 8331.22 - lr: 0.000020 - momentum: 0.000000
2023-10-18 17:56:00,754 ----------------------------------------------------------------------------------------------------
2023-10-18 17:56:00,754 EPOCH 4 done: loss 0.4114 - lr: 0.000020
2023-10-18 17:56:06,011 DEV : loss 0.3258393108844757 - f1-score (micro avg) 0.2263
2023-10-18 17:56:06,039 saving best model
2023-10-18 17:56:06,080 ----------------------------------------------------------------------------------------------------
2023-10-18 17:56:07,143 epoch 5 - iter 44/447 - loss 0.39673676 - time (sec): 1.06 - samples/sec: 8067.49 - lr: 0.000020 - momentum: 0.000000
2023-10-18 17:56:08,174 epoch 5 - iter 88/447 - loss 0.38054984 - time (sec): 2.09 - samples/sec: 8516.95 - lr: 0.000019 - momentum: 0.000000
2023-10-18 17:56:09,178 epoch 5 - iter 132/447 - loss 0.38129747 - time (sec): 3.10 - samples/sec: 8224.78 - lr: 0.000019 - momentum: 0.000000
2023-10-18 17:56:10,186 epoch 5 - iter 176/447 - loss 0.38376034 - time (sec): 4.10 - samples/sec: 8330.52 - lr: 0.000019 - momentum: 0.000000
2023-10-18 17:56:11,164 epoch 5 - iter 220/447 - loss 0.38176883 - time (sec): 5.08 - samples/sec: 8263.63 - lr: 0.000018 - momentum: 0.000000
2023-10-18 17:56:12,170 epoch 5 - iter 264/447 - loss 0.39071888 - time (sec): 6.09 - samples/sec: 8230.29 - lr: 0.000018 - momentum: 0.000000
2023-10-18 17:56:13,215 epoch 5 - iter 308/447 - loss 0.39004818 - time (sec): 7.13 - samples/sec: 8307.87 - lr: 0.000018 - momentum: 0.000000
2023-10-18 17:56:14,251 epoch 5 - iter 352/447 - loss 0.39166105 - time (sec): 8.17 - samples/sec: 8394.62 - lr: 0.000017 - momentum: 0.000000
2023-10-18 17:56:15,268 epoch 5 - iter 396/447 - loss 0.38803979 - time (sec): 9.19 - samples/sec: 8403.08 - lr: 0.000017 - momentum: 0.000000
2023-10-18 17:56:16,234 epoch 5 - iter 440/447 - loss 0.38221068 - time (sec): 10.15 - samples/sec: 8356.57 - lr: 0.000017 - momentum: 0.000000
2023-10-18 17:56:16,414 ----------------------------------------------------------------------------------------------------
2023-10-18 17:56:16,414 EPOCH 5 done: loss 0.3805 - lr: 0.000017
2023-10-18 17:56:21,341 DEV : loss 0.31553900241851807 - f1-score (micro avg) 0.2805
2023-10-18 17:56:21,370 saving best model
2023-10-18 17:56:21,411 ----------------------------------------------------------------------------------------------------
2023-10-18 17:56:22,414 epoch 6 - iter 44/447 - loss 0.36401615 - time (sec): 1.00 - samples/sec: 8342.53 - lr: 0.000016 - momentum: 0.000000
2023-10-18 17:56:23,388 epoch 6 - iter 88/447 - loss 0.36764697 - time (sec): 1.98 - samples/sec: 8250.88 - lr: 0.000016 - momentum: 0.000000
2023-10-18 17:56:24,398 epoch 6 - iter 132/447 - loss 0.35531431 - time (sec): 2.99 - samples/sec: 8010.00 - lr: 0.000016 - momentum: 0.000000
2023-10-18 17:56:25,413 epoch 6 - iter 176/447 - loss 0.37688784 - time (sec): 4.00 - samples/sec: 8058.37 - lr: 0.000015 - momentum: 0.000000
2023-10-18 17:56:26,407 epoch 6 - iter 220/447 - loss 0.37863630 - time (sec): 5.00 - samples/sec: 8094.73 - lr: 0.000015 - momentum: 0.000000
2023-10-18 17:56:27,494 epoch 6 - iter 264/447 - loss 0.38822234 - time (sec): 6.08 - samples/sec: 8250.16 - lr: 0.000015 - momentum: 0.000000
2023-10-18 17:56:28,473 epoch 6 - iter 308/447 - loss 0.38211922 - time (sec): 7.06 - samples/sec: 8350.66 - lr: 0.000014 - momentum: 0.000000
2023-10-18 17:56:29,470 epoch 6 - iter 352/447 - loss 0.36918547 - time (sec): 8.06 - samples/sec: 8369.52 - lr: 0.000014 - momentum: 0.000000
2023-10-18 17:56:30,866 epoch 6 - iter 396/447 - loss 0.37353967 - time (sec): 9.45 - samples/sec: 8139.09 - lr: 0.000014 - momentum: 0.000000
2023-10-18 17:56:31,890 epoch 6 - iter 440/447 - loss 0.37112856 - time (sec): 10.48 - samples/sec: 8148.79 - lr: 0.000013 - momentum: 0.000000
2023-10-18 17:56:32,042 ----------------------------------------------------------------------------------------------------
2023-10-18 17:56:32,042 EPOCH 6 done: loss 0.3704 - lr: 0.000013
2023-10-18 17:56:36,973 DEV : loss 0.3080426752567291 - f1-score (micro avg) 0.3137
2023-10-18 17:56:37,001 saving best model
2023-10-18 17:56:37,033 ----------------------------------------------------------------------------------------------------
2023-10-18 17:56:38,063 epoch 7 - iter 44/447 - loss 0.34458627 - time (sec): 1.03 - samples/sec: 8532.09 - lr: 0.000013 - momentum: 0.000000
2023-10-18 17:56:39,058 epoch 7 - iter 88/447 - loss 0.36276274 - time (sec): 2.02 - samples/sec: 8477.47 - lr: 0.000013 - momentum: 0.000000
2023-10-18 17:56:40,050 epoch 7 - iter 132/447 - loss 0.36409525 - time (sec): 3.02 - samples/sec: 8458.28 - lr: 0.000012 - momentum: 0.000000
2023-10-18 17:56:41,028 epoch 7 - iter 176/447 - loss 0.36044782 - time (sec): 3.99 - samples/sec: 8392.40 - lr: 0.000012 - momentum: 0.000000
2023-10-18 17:56:42,036 epoch 7 - iter 220/447 - loss 0.35380827 - time (sec): 5.00 - samples/sec: 8372.21 - lr: 0.000012 - momentum: 0.000000
2023-10-18 17:56:43,123 epoch 7 - iter 264/447 - loss 0.35154941 - time (sec): 6.09 - samples/sec: 8383.26 - lr: 0.000011 - momentum: 0.000000
2023-10-18 17:56:44,136 epoch 7 - iter 308/447 - loss 0.35471366 - time (sec): 7.10 - samples/sec: 8397.72 - lr: 0.000011 - momentum: 0.000000
2023-10-18 17:56:45,256 epoch 7 - iter 352/447 - loss 0.35780965 - time (sec): 8.22 - samples/sec: 8385.37 - lr: 0.000011 - momentum: 0.000000
2023-10-18 17:56:46,261 epoch 7 - iter 396/447 - loss 0.35812707 - time (sec): 9.23 - samples/sec: 8401.29 - lr: 0.000010 - momentum: 0.000000
2023-10-18 17:56:47,236 epoch 7 - iter 440/447 - loss 0.35473205 - time (sec): 10.20 - samples/sec: 8363.23 - lr: 0.000010 - momentum: 0.000000
2023-10-18 17:56:47,395 ----------------------------------------------------------------------------------------------------
2023-10-18 17:56:47,395 EPOCH 7 done: loss 0.3549 - lr: 0.000010
2023-10-18 17:56:52,657 DEV : loss 0.3085794150829315 - f1-score (micro avg) 0.3269
2023-10-18 17:56:52,684 saving best model
2023-10-18 17:56:52,717 ----------------------------------------------------------------------------------------------------
2023-10-18 17:56:53,703 epoch 8 - iter 44/447 - loss 0.36921770 - time (sec): 0.99 - samples/sec: 8919.36 - lr: 0.000010 - momentum: 0.000000
2023-10-18 17:56:54,749 epoch 8 - iter 88/447 - loss 0.35527406 - time (sec): 2.03 - samples/sec: 8941.97 - lr: 0.000009 - momentum: 0.000000
2023-10-18 17:56:55,762 epoch 8 - iter 132/447 - loss 0.36100556 - time (sec): 3.04 - samples/sec: 8581.87 - lr: 0.000009 - momentum: 0.000000
2023-10-18 17:56:56,773 epoch 8 - iter 176/447 - loss 0.35302564 - time (sec): 4.06 - samples/sec: 8608.51 - lr: 0.000009 - momentum: 0.000000
2023-10-18 17:56:57,805 epoch 8 - iter 220/447 - loss 0.34583755 - time (sec): 5.09 - samples/sec: 8697.64 - lr: 0.000008 - momentum: 0.000000
2023-10-18 17:56:58,817 epoch 8 - iter 264/447 - loss 0.34574712 - time (sec): 6.10 - samples/sec: 8621.15 - lr: 0.000008 - momentum: 0.000000
2023-10-18 17:56:59,826 epoch 8 - iter 308/447 - loss 0.34898159 - time (sec): 7.11 - samples/sec: 8491.51 - lr: 0.000008 - momentum: 0.000000
2023-10-18 17:57:00,842 epoch 8 - iter 352/447 - loss 0.34473749 - time (sec): 8.12 - samples/sec: 8519.03 - lr: 0.000007 - momentum: 0.000000
2023-10-18 17:57:01,853 epoch 8 - iter 396/447 - loss 0.35007107 - time (sec): 9.14 - samples/sec: 8506.48 - lr: 0.000007 - momentum: 0.000000
2023-10-18 17:57:02,814 epoch 8 - iter 440/447 - loss 0.34545478 - time (sec): 10.10 - samples/sec: 8429.17 - lr: 0.000007 - momentum: 0.000000
2023-10-18 17:57:02,969 ----------------------------------------------------------------------------------------------------
2023-10-18 17:57:02,969 EPOCH 8 done: loss 0.3436 - lr: 0.000007
2023-10-18 17:57:08,280 DEV : loss 0.31093230843544006 - f1-score (micro avg) 0.3267
2023-10-18 17:57:08,308 ----------------------------------------------------------------------------------------------------
2023-10-18 17:57:09,296 epoch 9 - iter 44/447 - loss 0.27711530 - time (sec): 0.99 - samples/sec: 8467.89 - lr: 0.000006 - momentum: 0.000000
2023-10-18 17:57:10,263 epoch 9 - iter 88/447 - loss 0.30403006 - time (sec): 1.95 - samples/sec: 8410.40 - lr: 0.000006 - momentum: 0.000000
2023-10-18 17:57:11,341 epoch 9 - iter 132/447 - loss 0.32196509 - time (sec): 3.03 - samples/sec: 8560.05 - lr: 0.000006 - momentum: 0.000000
2023-10-18 17:57:12,347 epoch 9 - iter 176/447 - loss 0.33477590 - time (sec): 4.04 - samples/sec: 8658.73 - lr: 0.000005 - momentum: 0.000000
2023-10-18 17:57:13,354 epoch 9 - iter 220/447 - loss 0.33986829 - time (sec): 5.05 - samples/sec: 8487.89 - lr: 0.000005 - momentum: 0.000000
2023-10-18 17:57:14,336 epoch 9 - iter 264/447 - loss 0.34157564 - time (sec): 6.03 - samples/sec: 8451.12 - lr: 0.000005 - momentum: 0.000000
2023-10-18 17:57:15,340 epoch 9 - iter 308/447 - loss 0.34210988 - time (sec): 7.03 - samples/sec: 8489.03 - lr: 0.000004 - momentum: 0.000000
2023-10-18 17:57:16,382 epoch 9 - iter 352/447 - loss 0.33347179 - time (sec): 8.07 - samples/sec: 8556.03 - lr: 0.000004 - momentum: 0.000000
2023-10-18 17:57:17,378 epoch 9 - iter 396/447 - loss 0.34133189 - time (sec): 9.07 - samples/sec: 8512.57 - lr: 0.000004 - momentum: 0.000000
2023-10-18 17:57:18,400 epoch 9 - iter 440/447 - loss 0.34251923 - time (sec): 10.09 - samples/sec: 8463.40 - lr: 0.000003 - momentum: 0.000000
2023-10-18 17:57:18,555 ----------------------------------------------------------------------------------------------------
2023-10-18 17:57:18,555 EPOCH 9 done: loss 0.3423 - lr: 0.000003
2023-10-18 17:57:23,821 DEV : loss 0.3027634918689728 - f1-score (micro avg) 0.336
2023-10-18 17:57:23,848 saving best model
2023-10-18 17:57:23,883 ----------------------------------------------------------------------------------------------------
2023-10-18 17:57:24,923 epoch 10 - iter 44/447 - loss 0.32917961 - time (sec): 1.04 - samples/sec: 9385.39 - lr: 0.000003 - momentum: 0.000000
2023-10-18 17:57:25,908 epoch 10 - iter 88/447 - loss 0.34487478 - time (sec): 2.02 - samples/sec: 8835.74 - lr: 0.000003 - momentum: 0.000000
2023-10-18 17:57:26,943 epoch 10 - iter 132/447 - loss 0.32864293 - time (sec): 3.06 - samples/sec: 8583.72 - lr: 0.000002 - momentum: 0.000000
2023-10-18 17:57:27,980 epoch 10 - iter 176/447 - loss 0.33560550 - time (sec): 4.10 - samples/sec: 8595.93 - lr: 0.000002 - momentum: 0.000000
2023-10-18 17:57:28,966 epoch 10 - iter 220/447 - loss 0.33553915 - time (sec): 5.08 - samples/sec: 8386.68 - lr: 0.000002 - momentum: 0.000000
2023-10-18 17:57:29,980 epoch 10 - iter 264/447 - loss 0.33465517 - time (sec): 6.10 - samples/sec: 8469.08 - lr: 0.000001 - momentum: 0.000000
2023-10-18 17:57:30,990 epoch 10 - iter 308/447 - loss 0.33940697 - time (sec): 7.11 - samples/sec: 8554.99 - lr: 0.000001 - momentum: 0.000000
2023-10-18 17:57:31,891 epoch 10 - iter 352/447 - loss 0.34259012 - time (sec): 8.01 - samples/sec: 8631.10 - lr: 0.000001 - momentum: 0.000000
2023-10-18 17:57:32,846 epoch 10 - iter 396/447 - loss 0.33938014 - time (sec): 8.96 - samples/sec: 8600.64 - lr: 0.000000 - momentum: 0.000000
2023-10-18 17:57:33,869 epoch 10 - iter 440/447 - loss 0.33500316 - time (sec): 9.99 - samples/sec: 8516.18 - lr: 0.000000 - momentum: 0.000000
2023-10-18 17:57:34,041 ----------------------------------------------------------------------------------------------------
2023-10-18 17:57:34,041 EPOCH 10 done: loss 0.3344 - lr: 0.000000
2023-10-18 17:57:39,029 DEV : loss 0.30143699049949646 - f1-score (micro avg) 0.3358
2023-10-18 17:57:39,089 ----------------------------------------------------------------------------------------------------
2023-10-18 17:57:39,089 Loading model from best epoch ...
2023-10-18 17:57:39,170 SequenceTagger predicts: Dictionary with 21 tags: O, S-loc, B-loc, E-loc, I-loc, S-pers, B-pers, E-pers, I-pers, S-org, B-org, E-org, I-org, S-prod, B-prod, E-prod, I-prod, S-time, B-time, E-time, I-time
2023-10-18 17:57:41,510 Results:
- F-score (micro) 0.3516
- F-score (macro) 0.1343
- Accuracy 0.2215

By class:
              precision    recall  f1-score   support

         loc     0.4709    0.5436    0.5047       596
        pers     0.1922    0.1471    0.1667       333
         org     0.0000    0.0000    0.0000       132
        prod     0.0000    0.0000    0.0000        66
        time     0.0000    0.0000    0.0000        49

   micro avg     0.3943    0.3172    0.3516      1176
   macro avg     0.1326    0.1382    0.1343      1176
weighted avg     0.2931    0.3172    0.3030      1176

2023-10-18 17:57:41,510 ----------------------------------------------------------------------------------------------------
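The aggregate scores in the final table follow from the per-class numbers by standard averaging: micro F1 is the harmonic mean of the reported micro precision and recall, macro F1 is the unweighted mean of the per-class F1 scores, and the weighted average weights each class by its support. A minimal sketch re-deriving them (the numbers are copied from the log above; the `f1` helper and variable names are ours, not part of Flair):

```python
# Per-class results from the final test evaluation in the log:
# label -> (precision, recall, f1-score, support)
per_class = {
    "loc":  (0.4709, 0.5436, 0.5047, 596),
    "pers": (0.1922, 0.1471, 0.1667, 333),
    "org":  (0.0000, 0.0000, 0.0000, 132),
    "prod": (0.0000, 0.0000, 0.0000, 66),
    "time": (0.0000, 0.0000, 0.0000, 49),
}

def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0 when both are 0)."""
    return 2 * precision * recall / (precision + recall) if precision + recall else 0.0

# Micro F1 from the reported micro-average precision/recall.
micro_f1 = f1(0.3943, 0.3172)

# Macro F1: unweighted mean of per-class F1 scores.
macro_f1 = sum(f for _, _, f, _ in per_class.values()) / len(per_class)

# Weighted F1: per-class F1 weighted by support.
total_support = sum(s for _, _, _, s in per_class.values())
weighted_f1 = sum(f * s for _, _, f, s in per_class.values()) / total_support

print(round(micro_f1, 4), round(macro_f1, 4), round(weighted_f1, 4))
```

Within rounding of the 4-digit inputs, this reproduces the logged micro avg 0.3516, macro avg 0.1343, and weighted avg 0.3030.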