Upload folder using huggingface_hub
7fd2927
2023-10-18 17:55:06,196 ----------------------------------------------------------------------------------------------------
2023-10-18 17:55:06,196 Model: "SequenceTagger(
(embeddings): TransformerWordEmbeddings(
(model): BertModel(
(embeddings): BertEmbeddings(
(word_embeddings): Embedding(32001, 128)
(position_embeddings): Embedding(512, 128)
(token_type_embeddings): Embedding(2, 128)
(LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(encoder): BertEncoder(
(layer): ModuleList(
(0-1): 2 x BertLayer(
(attention): BertAttention(
(self): BertSelfAttention(
(query): Linear(in_features=128, out_features=128, bias=True)
(key): Linear(in_features=128, out_features=128, bias=True)
(value): Linear(in_features=128, out_features=128, bias=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(output): BertSelfOutput(
(dense): Linear(in_features=128, out_features=128, bias=True)
(LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
(intermediate): BertIntermediate(
(dense): Linear(in_features=128, out_features=512, bias=True)
(intermediate_act_fn): GELUActivation()
)
(output): BertOutput(
(dense): Linear(in_features=512, out_features=128, bias=True)
(LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
)
)
(pooler): BertPooler(
(dense): Linear(in_features=128, out_features=128, bias=True)
(activation): Tanh()
)
)
)
(locked_dropout): LockedDropout(p=0.5)
(linear): Linear(in_features=128, out_features=21, bias=True)
(loss_function): CrossEntropyLoss()
)"
2023-10-18 17:55:06,196 ----------------------------------------------------------------------------------------------------
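The architecture printed above is a 2-layer "BERT-tiny"-style encoder (hidden size 128) with a 21-way linear tagging head. As a rough cross-check of the model's size, the parameter count can be tallied from the printed layer shapes in plain Python (a sketch based only on the modules shown; dropout and activations contribute no parameters, and every Linear above has bias=True):

```python
# Tally parameters of the printed architecture (BERT-tiny: hidden=128, 2 layers)
V, H, POS, TYPES, INTER, TAGS = 32001, 128, 512, 2, 512, 21

def linear(n_in, n_out):
    return n_in * n_out + n_out      # weight + bias (bias=True everywhere above)

def layer_norm(n):
    return 2 * n                     # weight + bias

embeddings = V * H + POS * H + TYPES * H + layer_norm(H)
per_layer = (3 * linear(H, H)                        # query / key / value
             + linear(H, H) + layer_norm(H)          # BertSelfOutput
             + linear(H, INTER)                      # BertIntermediate
             + linear(INTER, H) + layer_norm(H))     # BertOutput
encoder = 2 * per_layer              # (0-1): 2 x BertLayer
pooler = linear(H, H)
head = linear(H, TAGS)               # final (linear): 128 -> 21

total = embeddings + encoder + pooler + head
print(total)                         # 4577941 -- roughly 4.6M parameters
```

Almost all of the budget sits in the 32001 x 128 word-embedding table; the two transformer layers together hold under 400k parameters.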
2023-10-18 17:55:06,196 MultiCorpus: 3575 train + 1235 dev + 1266 test sentences
- NER_HIPE_2022 Corpus: 3575 train + 1235 dev + 1266 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/hipe2020/de/with_doc_seperator
2023-10-18 17:55:06,196 ----------------------------------------------------------------------------------------------------
2023-10-18 17:55:06,196 Train: 3575 sentences
2023-10-18 17:55:06,196 (train_with_dev=False, train_with_test=False)
2023-10-18 17:55:06,196 ----------------------------------------------------------------------------------------------------
2023-10-18 17:55:06,196 Training Params:
2023-10-18 17:55:06,196 - learning_rate: "3e-05"
2023-10-18 17:55:06,196 - mini_batch_size: "8"
2023-10-18 17:55:06,196 - max_epochs: "10"
2023-10-18 17:55:06,196 - shuffle: "True"
2023-10-18 17:55:06,196 ----------------------------------------------------------------------------------------------------
2023-10-18 17:55:06,196 Plugins:
2023-10-18 17:55:06,196 - TensorboardLogger
2023-10-18 17:55:06,196 - LinearScheduler | warmup_fraction: '0.1'
2023-10-18 17:55:06,196 ----------------------------------------------------------------------------------------------------
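The LinearScheduler with warmup_fraction 0.1 ramps the learning rate linearly from 0 to the peak (3e-05) over the first 10% of optimizer steps, then decays it linearly back to 0. With 447 mini-batches per epoch and 10 epochs (4470 steps, 447 of them warmup), the lr column in the per-iteration lines below can be reproduced with a small sketch (plain Python; Flair's exact step bookkeeping may differ by an off-by-one):

```python
PEAK = 3e-05
STEPS_PER_EPOCH, EPOCHS = 447, 10
TOTAL = STEPS_PER_EPOCH * EPOCHS
WARMUP = int(0.1 * TOTAL)            # warmup_fraction: 0.1 -> 447 steps

def lr_at(step):
    """Linear warmup to PEAK, then linear decay to 0."""
    if step < WARMUP:
        return PEAK * step / WARMUP
    return PEAK * (TOTAL - step) / (TOTAL - WARMUP)

# epoch 1, iter 44 -> global step 44 (still in warmup)
print(round(lr_at(44), 6))           # ~0.000003, as in the first log line
# epoch 2, iter 44 -> global step 447 + 44 = 491 (decay phase)
print(round(lr_at(491), 6))          # ~0.000030
```

This matches the log: lr climbs through epoch 1, peaks near 3e-05 at the start of epoch 2, and reaches 0.000000 by the last iterations of epoch 10.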
2023-10-18 17:55:06,196 Final evaluation on model from best epoch (best-model.pt)
2023-10-18 17:55:06,196 - metric: "('micro avg', 'f1-score')"
2023-10-18 17:55:06,197 ----------------------------------------------------------------------------------------------------
2023-10-18 17:55:06,197 Computation:
2023-10-18 17:55:06,197 - compute on device: cuda:0
2023-10-18 17:55:06,197 - embedding storage: none
2023-10-18 17:55:06,197 ----------------------------------------------------------------------------------------------------
2023-10-18 17:55:06,197 Model training base path: "hmbench-hipe2020/de-dbmdz/bert-tiny-historic-multilingual-cased-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-2"
2023-10-18 17:55:06,197 ----------------------------------------------------------------------------------------------------
2023-10-18 17:55:06,197 ----------------------------------------------------------------------------------------------------
2023-10-18 17:55:06,197 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-18 17:55:07,276 epoch 1 - iter 44/447 - loss 3.21526268 - time (sec): 1.08 - samples/sec: 8384.48 - lr: 0.000003 - momentum: 0.000000
2023-10-18 17:55:08,312 epoch 1 - iter 88/447 - loss 3.11068415 - time (sec): 2.12 - samples/sec: 8839.97 - lr: 0.000006 - momentum: 0.000000
2023-10-18 17:55:09,320 epoch 1 - iter 132/447 - loss 2.96404838 - time (sec): 3.12 - samples/sec: 8377.12 - lr: 0.000009 - momentum: 0.000000
2023-10-18 17:55:10,315 epoch 1 - iter 176/447 - loss 2.74904136 - time (sec): 4.12 - samples/sec: 8216.71 - lr: 0.000012 - momentum: 0.000000
2023-10-18 17:55:11,325 epoch 1 - iter 220/447 - loss 2.49019133 - time (sec): 5.13 - samples/sec: 8152.73 - lr: 0.000015 - momentum: 0.000000
2023-10-18 17:55:12,348 epoch 1 - iter 264/447 - loss 2.21707471 - time (sec): 6.15 - samples/sec: 8184.32 - lr: 0.000018 - momentum: 0.000000
2023-10-18 17:55:13,366 epoch 1 - iter 308/447 - loss 1.98807103 - time (sec): 7.17 - samples/sec: 8266.22 - lr: 0.000021 - momentum: 0.000000
2023-10-18 17:55:14,401 epoch 1 - iter 352/447 - loss 1.79499806 - time (sec): 8.20 - samples/sec: 8382.75 - lr: 0.000024 - momentum: 0.000000
2023-10-18 17:55:15,414 epoch 1 - iter 396/447 - loss 1.66282855 - time (sec): 9.22 - samples/sec: 8369.73 - lr: 0.000027 - momentum: 0.000000
2023-10-18 17:55:16,456 epoch 1 - iter 440/447 - loss 1.56914304 - time (sec): 10.26 - samples/sec: 8314.49 - lr: 0.000029 - momentum: 0.000000
2023-10-18 17:55:16,622 ----------------------------------------------------------------------------------------------------
2023-10-18 17:55:16,623 EPOCH 1 done: loss 1.5580 - lr: 0.000029
2023-10-18 17:55:18,857 DEV : loss 0.4793676435947418 - f1-score (micro avg) 0.0
2023-10-18 17:55:18,886 ----------------------------------------------------------------------------------------------------
2023-10-18 17:55:19,932 epoch 2 - iter 44/447 - loss 0.53138539 - time (sec): 1.04 - samples/sec: 7990.03 - lr: 0.000030 - momentum: 0.000000
2023-10-18 17:55:20,938 epoch 2 - iter 88/447 - loss 0.56693289 - time (sec): 2.05 - samples/sec: 8359.24 - lr: 0.000029 - momentum: 0.000000
2023-10-18 17:55:21,950 epoch 2 - iter 132/447 - loss 0.55886578 - time (sec): 3.06 - samples/sec: 8124.75 - lr: 0.000029 - momentum: 0.000000
2023-10-18 17:55:22,968 epoch 2 - iter 176/447 - loss 0.55590184 - time (sec): 4.08 - samples/sec: 8087.50 - lr: 0.000029 - momentum: 0.000000
2023-10-18 17:55:24,032 epoch 2 - iter 220/447 - loss 0.56384809 - time (sec): 5.14 - samples/sec: 8272.06 - lr: 0.000028 - momentum: 0.000000
2023-10-18 17:55:24,999 epoch 2 - iter 264/447 - loss 0.55035800 - time (sec): 6.11 - samples/sec: 8342.82 - lr: 0.000028 - momentum: 0.000000
2023-10-18 17:55:26,073 epoch 2 - iter 308/447 - loss 0.54447525 - time (sec): 7.19 - samples/sec: 8485.09 - lr: 0.000028 - momentum: 0.000000
2023-10-18 17:55:27,099 epoch 2 - iter 352/447 - loss 0.54060510 - time (sec): 8.21 - samples/sec: 8347.52 - lr: 0.000027 - momentum: 0.000000
2023-10-18 17:55:28,159 epoch 2 - iter 396/447 - loss 0.53839069 - time (sec): 9.27 - samples/sec: 8313.01 - lr: 0.000027 - momentum: 0.000000
2023-10-18 17:55:29,206 epoch 2 - iter 440/447 - loss 0.53391658 - time (sec): 10.32 - samples/sec: 8278.84 - lr: 0.000027 - momentum: 0.000000
2023-10-18 17:55:29,356 ----------------------------------------------------------------------------------------------------
2023-10-18 17:55:29,356 EPOCH 2 done: loss 0.5351 - lr: 0.000027
2023-10-18 17:55:34,588 DEV : loss 0.37521103024482727 - f1-score (micro avg) 0.0091
2023-10-18 17:55:34,615 saving best model
2023-10-18 17:55:34,648 ----------------------------------------------------------------------------------------------------
2023-10-18 17:55:35,674 epoch 3 - iter 44/447 - loss 0.48728016 - time (sec): 1.02 - samples/sec: 8784.04 - lr: 0.000026 - momentum: 0.000000
2023-10-18 17:55:36,702 epoch 3 - iter 88/447 - loss 0.48785073 - time (sec): 2.05 - samples/sec: 8542.35 - lr: 0.000026 - momentum: 0.000000
2023-10-18 17:55:37,757 epoch 3 - iter 132/447 - loss 0.48141366 - time (sec): 3.11 - samples/sec: 8435.54 - lr: 0.000026 - momentum: 0.000000
2023-10-18 17:55:38,791 epoch 3 - iter 176/447 - loss 0.49254271 - time (sec): 4.14 - samples/sec: 8208.75 - lr: 0.000025 - momentum: 0.000000
2023-10-18 17:55:39,800 epoch 3 - iter 220/447 - loss 0.48009553 - time (sec): 5.15 - samples/sec: 8137.02 - lr: 0.000025 - momentum: 0.000000
2023-10-18 17:55:40,791 epoch 3 - iter 264/447 - loss 0.48043658 - time (sec): 6.14 - samples/sec: 8202.99 - lr: 0.000025 - momentum: 0.000000
2023-10-18 17:55:41,817 epoch 3 - iter 308/447 - loss 0.46945156 - time (sec): 7.17 - samples/sec: 8223.78 - lr: 0.000024 - momentum: 0.000000
2023-10-18 17:55:42,923 epoch 3 - iter 352/447 - loss 0.47147629 - time (sec): 8.27 - samples/sec: 8205.12 - lr: 0.000024 - momentum: 0.000000
2023-10-18 17:55:43,958 epoch 3 - iter 396/447 - loss 0.46777770 - time (sec): 9.31 - samples/sec: 8238.31 - lr: 0.000024 - momentum: 0.000000
2023-10-18 17:55:44,956 epoch 3 - iter 440/447 - loss 0.46362258 - time (sec): 10.31 - samples/sec: 8291.65 - lr: 0.000023 - momentum: 0.000000
2023-10-18 17:55:45,101 ----------------------------------------------------------------------------------------------------
2023-10-18 17:55:45,101 EPOCH 3 done: loss 0.4632 - lr: 0.000023
2023-10-18 17:55:50,289 DEV : loss 0.3417363464832306 - f1-score (micro avg) 0.121
2023-10-18 17:55:50,316 saving best model
2023-10-18 17:55:50,353 ----------------------------------------------------------------------------------------------------
2023-10-18 17:55:51,337 epoch 4 - iter 44/447 - loss 0.45316300 - time (sec): 0.98 - samples/sec: 8248.65 - lr: 0.000023 - momentum: 0.000000
2023-10-18 17:55:52,461 epoch 4 - iter 88/447 - loss 0.40894581 - time (sec): 2.11 - samples/sec: 8613.06 - lr: 0.000023 - momentum: 0.000000
2023-10-18 17:55:53,478 epoch 4 - iter 132/447 - loss 0.41135455 - time (sec): 3.12 - samples/sec: 8514.01 - lr: 0.000022 - momentum: 0.000000
2023-10-18 17:55:54,490 epoch 4 - iter 176/447 - loss 0.41549718 - time (sec): 4.14 - samples/sec: 8527.39 - lr: 0.000022 - momentum: 0.000000
2023-10-18 17:55:55,517 epoch 4 - iter 220/447 - loss 0.40581863 - time (sec): 5.16 - samples/sec: 8532.55 - lr: 0.000022 - momentum: 0.000000
2023-10-18 17:55:56,476 epoch 4 - iter 264/447 - loss 0.41064648 - time (sec): 6.12 - samples/sec: 8543.42 - lr: 0.000021 - momentum: 0.000000
2023-10-18 17:55:57,474 epoch 4 - iter 308/447 - loss 0.40762911 - time (sec): 7.12 - samples/sec: 8506.55 - lr: 0.000021 - momentum: 0.000000
2023-10-18 17:55:58,507 epoch 4 - iter 352/447 - loss 0.41255573 - time (sec): 8.15 - samples/sec: 8467.32 - lr: 0.000021 - momentum: 0.000000
2023-10-18 17:55:59,560 epoch 4 - iter 396/447 - loss 0.41309072 - time (sec): 9.21 - samples/sec: 8372.24 - lr: 0.000020 - momentum: 0.000000
2023-10-18 17:56:00,588 epoch 4 - iter 440/447 - loss 0.41180366 - time (sec): 10.23 - samples/sec: 8331.22 - lr: 0.000020 - momentum: 0.000000
2023-10-18 17:56:00,754 ----------------------------------------------------------------------------------------------------
2023-10-18 17:56:00,754 EPOCH 4 done: loss 0.4114 - lr: 0.000020
2023-10-18 17:56:06,011 DEV : loss 0.3258393108844757 - f1-score (micro avg) 0.2263
2023-10-18 17:56:06,039 saving best model
2023-10-18 17:56:06,080 ----------------------------------------------------------------------------------------------------
2023-10-18 17:56:07,143 epoch 5 - iter 44/447 - loss 0.39673676 - time (sec): 1.06 - samples/sec: 8067.49 - lr: 0.000020 - momentum: 0.000000
2023-10-18 17:56:08,174 epoch 5 - iter 88/447 - loss 0.38054984 - time (sec): 2.09 - samples/sec: 8516.95 - lr: 0.000019 - momentum: 0.000000
2023-10-18 17:56:09,178 epoch 5 - iter 132/447 - loss 0.38129747 - time (sec): 3.10 - samples/sec: 8224.78 - lr: 0.000019 - momentum: 0.000000
2023-10-18 17:56:10,186 epoch 5 - iter 176/447 - loss 0.38376034 - time (sec): 4.10 - samples/sec: 8330.52 - lr: 0.000019 - momentum: 0.000000
2023-10-18 17:56:11,164 epoch 5 - iter 220/447 - loss 0.38176883 - time (sec): 5.08 - samples/sec: 8263.63 - lr: 0.000018 - momentum: 0.000000
2023-10-18 17:56:12,170 epoch 5 - iter 264/447 - loss 0.39071888 - time (sec): 6.09 - samples/sec: 8230.29 - lr: 0.000018 - momentum: 0.000000
2023-10-18 17:56:13,215 epoch 5 - iter 308/447 - loss 0.39004818 - time (sec): 7.13 - samples/sec: 8307.87 - lr: 0.000018 - momentum: 0.000000
2023-10-18 17:56:14,251 epoch 5 - iter 352/447 - loss 0.39166105 - time (sec): 8.17 - samples/sec: 8394.62 - lr: 0.000017 - momentum: 0.000000
2023-10-18 17:56:15,268 epoch 5 - iter 396/447 - loss 0.38803979 - time (sec): 9.19 - samples/sec: 8403.08 - lr: 0.000017 - momentum: 0.000000
2023-10-18 17:56:16,234 epoch 5 - iter 440/447 - loss 0.38221068 - time (sec): 10.15 - samples/sec: 8356.57 - lr: 0.000017 - momentum: 0.000000
2023-10-18 17:56:16,414 ----------------------------------------------------------------------------------------------------
2023-10-18 17:56:16,414 EPOCH 5 done: loss 0.3805 - lr: 0.000017
2023-10-18 17:56:21,341 DEV : loss 0.31553900241851807 - f1-score (micro avg) 0.2805
2023-10-18 17:56:21,370 saving best model
2023-10-18 17:56:21,411 ----------------------------------------------------------------------------------------------------
2023-10-18 17:56:22,414 epoch 6 - iter 44/447 - loss 0.36401615 - time (sec): 1.00 - samples/sec: 8342.53 - lr: 0.000016 - momentum: 0.000000
2023-10-18 17:56:23,388 epoch 6 - iter 88/447 - loss 0.36764697 - time (sec): 1.98 - samples/sec: 8250.88 - lr: 0.000016 - momentum: 0.000000
2023-10-18 17:56:24,398 epoch 6 - iter 132/447 - loss 0.35531431 - time (sec): 2.99 - samples/sec: 8010.00 - lr: 0.000016 - momentum: 0.000000
2023-10-18 17:56:25,413 epoch 6 - iter 176/447 - loss 0.37688784 - time (sec): 4.00 - samples/sec: 8058.37 - lr: 0.000015 - momentum: 0.000000
2023-10-18 17:56:26,407 epoch 6 - iter 220/447 - loss 0.37863630 - time (sec): 5.00 - samples/sec: 8094.73 - lr: 0.000015 - momentum: 0.000000
2023-10-18 17:56:27,494 epoch 6 - iter 264/447 - loss 0.38822234 - time (sec): 6.08 - samples/sec: 8250.16 - lr: 0.000015 - momentum: 0.000000
2023-10-18 17:56:28,473 epoch 6 - iter 308/447 - loss 0.38211922 - time (sec): 7.06 - samples/sec: 8350.66 - lr: 0.000014 - momentum: 0.000000
2023-10-18 17:56:29,470 epoch 6 - iter 352/447 - loss 0.36918547 - time (sec): 8.06 - samples/sec: 8369.52 - lr: 0.000014 - momentum: 0.000000
2023-10-18 17:56:30,866 epoch 6 - iter 396/447 - loss 0.37353967 - time (sec): 9.45 - samples/sec: 8139.09 - lr: 0.000014 - momentum: 0.000000
2023-10-18 17:56:31,890 epoch 6 - iter 440/447 - loss 0.37112856 - time (sec): 10.48 - samples/sec: 8148.79 - lr: 0.000013 - momentum: 0.000000
2023-10-18 17:56:32,042 ----------------------------------------------------------------------------------------------------
2023-10-18 17:56:32,042 EPOCH 6 done: loss 0.3704 - lr: 0.000013
2023-10-18 17:56:36,973 DEV : loss 0.3080426752567291 - f1-score (micro avg) 0.3137
2023-10-18 17:56:37,001 saving best model
2023-10-18 17:56:37,033 ----------------------------------------------------------------------------------------------------
2023-10-18 17:56:38,063 epoch 7 - iter 44/447 - loss 0.34458627 - time (sec): 1.03 - samples/sec: 8532.09 - lr: 0.000013 - momentum: 0.000000
2023-10-18 17:56:39,058 epoch 7 - iter 88/447 - loss 0.36276274 - time (sec): 2.02 - samples/sec: 8477.47 - lr: 0.000013 - momentum: 0.000000
2023-10-18 17:56:40,050 epoch 7 - iter 132/447 - loss 0.36409525 - time (sec): 3.02 - samples/sec: 8458.28 - lr: 0.000012 - momentum: 0.000000
2023-10-18 17:56:41,028 epoch 7 - iter 176/447 - loss 0.36044782 - time (sec): 3.99 - samples/sec: 8392.40 - lr: 0.000012 - momentum: 0.000000
2023-10-18 17:56:42,036 epoch 7 - iter 220/447 - loss 0.35380827 - time (sec): 5.00 - samples/sec: 8372.21 - lr: 0.000012 - momentum: 0.000000
2023-10-18 17:56:43,123 epoch 7 - iter 264/447 - loss 0.35154941 - time (sec): 6.09 - samples/sec: 8383.26 - lr: 0.000011 - momentum: 0.000000
2023-10-18 17:56:44,136 epoch 7 - iter 308/447 - loss 0.35471366 - time (sec): 7.10 - samples/sec: 8397.72 - lr: 0.000011 - momentum: 0.000000
2023-10-18 17:56:45,256 epoch 7 - iter 352/447 - loss 0.35780965 - time (sec): 8.22 - samples/sec: 8385.37 - lr: 0.000011 - momentum: 0.000000
2023-10-18 17:56:46,261 epoch 7 - iter 396/447 - loss 0.35812707 - time (sec): 9.23 - samples/sec: 8401.29 - lr: 0.000010 - momentum: 0.000000
2023-10-18 17:56:47,236 epoch 7 - iter 440/447 - loss 0.35473205 - time (sec): 10.20 - samples/sec: 8363.23 - lr: 0.000010 - momentum: 0.000000
2023-10-18 17:56:47,395 ----------------------------------------------------------------------------------------------------
2023-10-18 17:56:47,395 EPOCH 7 done: loss 0.3549 - lr: 0.000010
2023-10-18 17:56:52,657 DEV : loss 0.3085794150829315 - f1-score (micro avg) 0.3269
2023-10-18 17:56:52,684 saving best model
2023-10-18 17:56:52,717 ----------------------------------------------------------------------------------------------------
2023-10-18 17:56:53,703 epoch 8 - iter 44/447 - loss 0.36921770 - time (sec): 0.99 - samples/sec: 8919.36 - lr: 0.000010 - momentum: 0.000000
2023-10-18 17:56:54,749 epoch 8 - iter 88/447 - loss 0.35527406 - time (sec): 2.03 - samples/sec: 8941.97 - lr: 0.000009 - momentum: 0.000000
2023-10-18 17:56:55,762 epoch 8 - iter 132/447 - loss 0.36100556 - time (sec): 3.04 - samples/sec: 8581.87 - lr: 0.000009 - momentum: 0.000000
2023-10-18 17:56:56,773 epoch 8 - iter 176/447 - loss 0.35302564 - time (sec): 4.06 - samples/sec: 8608.51 - lr: 0.000009 - momentum: 0.000000
2023-10-18 17:56:57,805 epoch 8 - iter 220/447 - loss 0.34583755 - time (sec): 5.09 - samples/sec: 8697.64 - lr: 0.000008 - momentum: 0.000000
2023-10-18 17:56:58,817 epoch 8 - iter 264/447 - loss 0.34574712 - time (sec): 6.10 - samples/sec: 8621.15 - lr: 0.000008 - momentum: 0.000000
2023-10-18 17:56:59,826 epoch 8 - iter 308/447 - loss 0.34898159 - time (sec): 7.11 - samples/sec: 8491.51 - lr: 0.000008 - momentum: 0.000000
2023-10-18 17:57:00,842 epoch 8 - iter 352/447 - loss 0.34473749 - time (sec): 8.12 - samples/sec: 8519.03 - lr: 0.000007 - momentum: 0.000000
2023-10-18 17:57:01,853 epoch 8 - iter 396/447 - loss 0.35007107 - time (sec): 9.14 - samples/sec: 8506.48 - lr: 0.000007 - momentum: 0.000000
2023-10-18 17:57:02,814 epoch 8 - iter 440/447 - loss 0.34545478 - time (sec): 10.10 - samples/sec: 8429.17 - lr: 0.000007 - momentum: 0.000000
2023-10-18 17:57:02,969 ----------------------------------------------------------------------------------------------------
2023-10-18 17:57:02,969 EPOCH 8 done: loss 0.3436 - lr: 0.000007
2023-10-18 17:57:08,280 DEV : loss 0.31093230843544006 - f1-score (micro avg) 0.3267
2023-10-18 17:57:08,308 ----------------------------------------------------------------------------------------------------
2023-10-18 17:57:09,296 epoch 9 - iter 44/447 - loss 0.27711530 - time (sec): 0.99 - samples/sec: 8467.89 - lr: 0.000006 - momentum: 0.000000
2023-10-18 17:57:10,263 epoch 9 - iter 88/447 - loss 0.30403006 - time (sec): 1.95 - samples/sec: 8410.40 - lr: 0.000006 - momentum: 0.000000
2023-10-18 17:57:11,341 epoch 9 - iter 132/447 - loss 0.32196509 - time (sec): 3.03 - samples/sec: 8560.05 - lr: 0.000006 - momentum: 0.000000
2023-10-18 17:57:12,347 epoch 9 - iter 176/447 - loss 0.33477590 - time (sec): 4.04 - samples/sec: 8658.73 - lr: 0.000005 - momentum: 0.000000
2023-10-18 17:57:13,354 epoch 9 - iter 220/447 - loss 0.33986829 - time (sec): 5.05 - samples/sec: 8487.89 - lr: 0.000005 - momentum: 0.000000
2023-10-18 17:57:14,336 epoch 9 - iter 264/447 - loss 0.34157564 - time (sec): 6.03 - samples/sec: 8451.12 - lr: 0.000005 - momentum: 0.000000
2023-10-18 17:57:15,340 epoch 9 - iter 308/447 - loss 0.34210988 - time (sec): 7.03 - samples/sec: 8489.03 - lr: 0.000004 - momentum: 0.000000
2023-10-18 17:57:16,382 epoch 9 - iter 352/447 - loss 0.33347179 - time (sec): 8.07 - samples/sec: 8556.03 - lr: 0.000004 - momentum: 0.000000
2023-10-18 17:57:17,378 epoch 9 - iter 396/447 - loss 0.34133189 - time (sec): 9.07 - samples/sec: 8512.57 - lr: 0.000004 - momentum: 0.000000
2023-10-18 17:57:18,400 epoch 9 - iter 440/447 - loss 0.34251923 - time (sec): 10.09 - samples/sec: 8463.40 - lr: 0.000003 - momentum: 0.000000
2023-10-18 17:57:18,555 ----------------------------------------------------------------------------------------------------
2023-10-18 17:57:18,555 EPOCH 9 done: loss 0.3423 - lr: 0.000003
2023-10-18 17:57:23,821 DEV : loss 0.3027634918689728 - f1-score (micro avg) 0.336
2023-10-18 17:57:23,848 saving best model
2023-10-18 17:57:23,883 ----------------------------------------------------------------------------------------------------
2023-10-18 17:57:24,923 epoch 10 - iter 44/447 - loss 0.32917961 - time (sec): 1.04 - samples/sec: 9385.39 - lr: 0.000003 - momentum: 0.000000
2023-10-18 17:57:25,908 epoch 10 - iter 88/447 - loss 0.34487478 - time (sec): 2.02 - samples/sec: 8835.74 - lr: 0.000003 - momentum: 0.000000
2023-10-18 17:57:26,943 epoch 10 - iter 132/447 - loss 0.32864293 - time (sec): 3.06 - samples/sec: 8583.72 - lr: 0.000002 - momentum: 0.000000
2023-10-18 17:57:27,980 epoch 10 - iter 176/447 - loss 0.33560550 - time (sec): 4.10 - samples/sec: 8595.93 - lr: 0.000002 - momentum: 0.000000
2023-10-18 17:57:28,966 epoch 10 - iter 220/447 - loss 0.33553915 - time (sec): 5.08 - samples/sec: 8386.68 - lr: 0.000002 - momentum: 0.000000
2023-10-18 17:57:29,980 epoch 10 - iter 264/447 - loss 0.33465517 - time (sec): 6.10 - samples/sec: 8469.08 - lr: 0.000001 - momentum: 0.000000
2023-10-18 17:57:30,990 epoch 10 - iter 308/447 - loss 0.33940697 - time (sec): 7.11 - samples/sec: 8554.99 - lr: 0.000001 - momentum: 0.000000
2023-10-18 17:57:31,891 epoch 10 - iter 352/447 - loss 0.34259012 - time (sec): 8.01 - samples/sec: 8631.10 - lr: 0.000001 - momentum: 0.000000
2023-10-18 17:57:32,846 epoch 10 - iter 396/447 - loss 0.33938014 - time (sec): 8.96 - samples/sec: 8600.64 - lr: 0.000000 - momentum: 0.000000
2023-10-18 17:57:33,869 epoch 10 - iter 440/447 - loss 0.33500316 - time (sec): 9.99 - samples/sec: 8516.18 - lr: 0.000000 - momentum: 0.000000
2023-10-18 17:57:34,041 ----------------------------------------------------------------------------------------------------
2023-10-18 17:57:34,041 EPOCH 10 done: loss 0.3344 - lr: 0.000000
2023-10-18 17:57:39,029 DEV : loss 0.30143699049949646 - f1-score (micro avg) 0.3358
2023-10-18 17:57:39,089 ----------------------------------------------------------------------------------------------------
2023-10-18 17:57:39,089 Loading model from best epoch ...
2023-10-18 17:57:39,170 SequenceTagger predicts: Dictionary with 21 tags: O, S-loc, B-loc, E-loc, I-loc, S-pers, B-pers, E-pers, I-pers, S-org, B-org, E-org, I-org, S-prod, B-prod, E-prod, I-prod, S-time, B-time, E-time, I-time
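The 21 tags listed above follow the BIOES scheme (Single, Begin, Inside, End, plus O) over five entity types (loc, pers, org, prod, time): 5 x 4 + 1 = 21. A minimal decoder turning such a tag sequence into entity spans might look like this (an illustrative sketch, not Flair's internal implementation):

```python
def bioes_spans(tags):
    """Extract (start, end, type) spans from a BIOES tag sequence."""
    spans, start = [], None
    for i, tag in enumerate(tags):
        if tag == "O":
            start = None
            continue
        prefix, etype = tag.split("-", 1)
        if prefix == "S":                      # single-token entity
            spans.append((i, i, etype))
            start = None
        elif prefix == "B":                    # entity begins
            start = i
        elif prefix == "E" and start is not None:
            spans.append((start, i, etype))    # entity ends
            start = None
        # "I" simply continues an open span
    return spans

print(bioes_spans(["O", "B-loc", "I-loc", "E-loc", "S-pers"]))
# -> [(1, 3, 'loc'), (4, 4, 'pers')]
```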
2023-10-18 17:57:41,510
Results:
- F-score (micro) 0.3516
- F-score (macro) 0.1343
- Accuracy 0.2215
By class:
              precision    recall  f1-score   support

         loc     0.4709    0.5436    0.5047       596
        pers     0.1922    0.1471    0.1667       333
         org     0.0000    0.0000    0.0000       132
        prod     0.0000    0.0000    0.0000        66
        time     0.0000    0.0000    0.0000        49

   micro avg     0.3943    0.3172    0.3516      1176
   macro avg     0.1326    0.1382    0.1343      1176
weighted avg     0.2931    0.3172    0.3030      1176
2023-10-18 17:57:41,510 ----------------------------------------------------------------------------------------------------
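As a sanity check on the final report: the macro F-score is the unweighted mean of the per-class f1-scores, and the micro F-score is the harmonic mean of micro precision and recall. Both headline numbers can be reproduced from the table alone (plain-Python arithmetic using only the values printed above):

```python
# Per-class f1-scores from the "By class" table: loc, pers, org, prod, time
per_class_f1 = [0.5047, 0.1667, 0.0, 0.0, 0.0]

macro_f1 = sum(per_class_f1) / len(per_class_f1)
print(round(macro_f1, 4))            # 0.1343, matching "F-score (macro)"

# Micro f1 = harmonic mean of micro-averaged precision and recall
p, r = 0.3943, 0.3172
micro_f1 = 2 * p * r / (p + r)
print(round(micro_f1, 4))            # 0.3516, matching "F-score (micro)"
```

The gap between micro (0.3516) and macro (0.1343) reflects the three classes (org, prod, time) the tiny model never predicts correctly.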