stefan-it · Upload ./training.log with huggingface_hub · dfe85dd
2023-10-25 17:10:39,021 ----------------------------------------------------------------------------------------------------
2023-10-25 17:10:39,022 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(64001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
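The module shapes in the printout above are enough to estimate the model's parameter count. A back-of-the-envelope sketch, computed only from the shapes shown (it ignores details such as any weight tying):

```python
def linear_params(n_in, n_out, bias=True):
    """Parameters of a Linear(n_in, n_out, bias=...) layer as printed above."""
    return n_in * n_out + (n_out if bias else 0)

H, FF, VOCAB, LAYERS, TAGS = 768, 3072, 64001, 12, 17

# Embedding block: word/position/token-type tables plus one LayerNorm (2 * H).
embeddings = VOCAB * H + 512 * H + 2 * H + 2 * H

# One BertLayer: Q/K/V, attention output, intermediate, output, two LayerNorms.
per_layer = (
    3 * linear_params(H, H)   # query, key, value
    + linear_params(H, H)     # attention output dense
    + 2 * H                   # attention output LayerNorm
    + linear_params(H, FF)    # intermediate dense
    + linear_params(FF, H)    # output dense
    + 2 * H                   # output LayerNorm
)

pooler = linear_params(H, H)
tagger_head = linear_params(H, TAGS)  # the final (linear) tagging layer

total = embeddings + LAYERS * per_layer + pooler + tagger_head  # roughly 135M
```

The 64k-token vocabulary alone accounts for about 49M of those parameters, which is why this hmBERT variant is noticeably larger than a standard bert-base.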
2023-10-25 17:10:39,022 ----------------------------------------------------------------------------------------------------
2023-10-25 17:10:39,022 MultiCorpus: 20847 train + 1123 dev + 3350 test sentences
- NER_HIPE_2022 Corpus: 20847 train + 1123 dev + 3350 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/de/with_doc_seperator
2023-10-25 17:10:39,022 ----------------------------------------------------------------------------------------------------
2023-10-25 17:10:39,022 Train: 20847 sentences
2023-10-25 17:10:39,022 (train_with_dev=False, train_with_test=False)
2023-10-25 17:10:39,022 ----------------------------------------------------------------------------------------------------
2023-10-25 17:10:39,022 Training Params:
2023-10-25 17:10:39,022 - learning_rate: "3e-05"
2023-10-25 17:10:39,022 - mini_batch_size: "4"
2023-10-25 17:10:39,022 - max_epochs: "10"
2023-10-25 17:10:39,022 - shuffle: "True"
2023-10-25 17:10:39,022 ----------------------------------------------------------------------------------------------------
2023-10-25 17:10:39,022 Plugins:
2023-10-25 17:10:39,022 - TensorboardLogger
2023-10-25 17:10:39,022 - LinearScheduler | warmup_fraction: '0.1'
2023-10-25 17:10:39,022 ----------------------------------------------------------------------------------------------------
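The LinearScheduler plugin warms the learning rate up over the first `warmup_fraction` of training and then decays it linearly to zero, which is what the `lr` column in the per-iteration lines below reflects. A minimal sketch of that schedule (illustrative only, not Flair's implementation):

```python
def linear_schedule_lr(progress, peak_lr=3e-05, warmup_fraction=0.1):
    """Linear warmup to peak_lr, then linear decay to zero.

    progress: fraction of total training completed, in [0, 1].
    Sketch of the schedule implied by the logged lr values.
    """
    if progress < warmup_fraction:
        return peak_lr * progress / warmup_fraction
    return peak_lr * (1.0 - progress) / (1.0 - warmup_fraction)
```

With 10 epochs and warmup_fraction 0.1, the peak of 3e-05 is reached at the end of epoch 1 and the rate then falls to zero by the end of epoch 10, matching the log.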
2023-10-25 17:10:39,022 Final evaluation on model from best epoch (best-model.pt)
2023-10-25 17:10:39,022 - metric: "('micro avg', 'f1-score')"
2023-10-25 17:10:39,022 ----------------------------------------------------------------------------------------------------
2023-10-25 17:10:39,022 Computation:
2023-10-25 17:10:39,022 - compute on device: cuda:0
2023-10-25 17:10:39,022 - embedding storage: none
2023-10-25 17:10:39,022 ----------------------------------------------------------------------------------------------------
2023-10-25 17:10:39,022 Model training base path: "hmbench-newseye/de-dbmdz/bert-base-historic-multilingual-64k-td-cased-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-4"
2023-10-25 17:10:39,022 ----------------------------------------------------------------------------------------------------
2023-10-25 17:10:39,023 ----------------------------------------------------------------------------------------------------
2023-10-25 17:10:39,023 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-25 17:11:01,402 epoch 1 - iter 521/5212 - loss 1.42812369 - time (sec): 22.38 - samples/sec: 1563.36 - lr: 0.000003 - momentum: 0.000000
2023-10-25 17:11:23,707 epoch 1 - iter 1042/5212 - loss 0.88349774 - time (sec): 44.68 - samples/sec: 1637.20 - lr: 0.000006 - momentum: 0.000000
2023-10-25 17:11:45,540 epoch 1 - iter 1563/5212 - loss 0.69007273 - time (sec): 66.52 - samples/sec: 1598.78 - lr: 0.000009 - momentum: 0.000000
2023-10-25 17:12:07,631 epoch 1 - iter 2084/5212 - loss 0.57540747 - time (sec): 88.61 - samples/sec: 1644.33 - lr: 0.000012 - momentum: 0.000000
2023-10-25 17:12:29,962 epoch 1 - iter 2605/5212 - loss 0.50612270 - time (sec): 110.94 - samples/sec: 1647.07 - lr: 0.000015 - momentum: 0.000000
2023-10-25 17:12:51,920 epoch 1 - iter 3126/5212 - loss 0.45647568 - time (sec): 132.90 - samples/sec: 1658.24 - lr: 0.000018 - momentum: 0.000000
2023-10-25 17:13:13,904 epoch 1 - iter 3647/5212 - loss 0.41815257 - time (sec): 154.88 - samples/sec: 1670.30 - lr: 0.000021 - momentum: 0.000000
2023-10-25 17:13:36,167 epoch 1 - iter 4168/5212 - loss 0.39256890 - time (sec): 177.14 - samples/sec: 1665.90 - lr: 0.000024 - momentum: 0.000000
2023-10-25 17:13:58,652 epoch 1 - iter 4689/5212 - loss 0.37195175 - time (sec): 199.63 - samples/sec: 1659.85 - lr: 0.000027 - momentum: 0.000000
2023-10-25 17:14:21,064 epoch 1 - iter 5210/5212 - loss 0.35466532 - time (sec): 222.04 - samples/sec: 1654.66 - lr: 0.000030 - momentum: 0.000000
2023-10-25 17:14:21,146 ----------------------------------------------------------------------------------------------------
2023-10-25 17:14:21,147 EPOCH 1 done: loss 0.3546 - lr: 0.000030
2023-10-25 17:14:24,869 DEV : loss 0.12253964692354202 - f1-score (micro avg) 0.1856
2023-10-25 17:14:24,895 saving best model
2023-10-25 17:14:25,372 ----------------------------------------------------------------------------------------------------
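Each per-iteration line above follows a fixed format, so the metrics can be scraped back out of the log. A small parsing sketch (the regex is inferred from the lines shown here, not taken from Flair's logging code):

```python
import re

# Pattern inferred from the per-iteration lines in this log.
ITER_RE = re.compile(
    r"epoch (?P<epoch>\d+) - iter (?P<iter>\d+)/(?P<total>\d+)"
    r" - loss (?P<loss>[\d.]+) - time \(sec\): (?P<time>[\d.]+)"
    r" - samples/sec: (?P<sps>[\d.]+) - lr: (?P<lr>[\d.]+)"
)

def parse_iter_line(line):
    """Extract the metrics from one per-iteration log line, or None."""
    m = ITER_RE.search(line)
    if m is None:
        return None
    d = m.groupdict()
    return {
        "epoch": int(d["epoch"]),
        "iter": int(d["iter"]),
        "total": int(d["total"]),
        "loss": float(d["loss"]),
        "time": float(d["time"]),
        "samples_per_sec": float(d["sps"]),
        "lr": float(d["lr"]),
    }
```

Feeding every line of the log through `parse_iter_line` and keeping the non-None results gives the full loss/lr curve for plotting.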
2023-10-25 17:14:47,269 epoch 2 - iter 521/5212 - loss 0.17884856 - time (sec): 21.90 - samples/sec: 1682.18 - lr: 0.000030 - momentum: 0.000000
2023-10-25 17:15:09,326 epoch 2 - iter 1042/5212 - loss 0.17084209 - time (sec): 43.95 - samples/sec: 1746.73 - lr: 0.000029 - momentum: 0.000000
2023-10-25 17:15:31,350 epoch 2 - iter 1563/5212 - loss 0.17404067 - time (sec): 65.98 - samples/sec: 1744.50 - lr: 0.000029 - momentum: 0.000000
2023-10-25 17:15:53,385 epoch 2 - iter 2084/5212 - loss 0.17709317 - time (sec): 88.01 - samples/sec: 1724.88 - lr: 0.000029 - momentum: 0.000000
2023-10-25 17:16:15,212 epoch 2 - iter 2605/5212 - loss 0.17555934 - time (sec): 109.84 - samples/sec: 1693.49 - lr: 0.000028 - momentum: 0.000000
2023-10-25 17:16:36,826 epoch 2 - iter 3126/5212 - loss 0.17428858 - time (sec): 131.45 - samples/sec: 1697.41 - lr: 0.000028 - momentum: 0.000000
2023-10-25 17:16:58,764 epoch 2 - iter 3647/5212 - loss 0.17078016 - time (sec): 153.39 - samples/sec: 1704.73 - lr: 0.000028 - momentum: 0.000000
2023-10-25 17:17:20,835 epoch 2 - iter 4168/5212 - loss 0.16926793 - time (sec): 175.46 - samples/sec: 1708.53 - lr: 0.000027 - momentum: 0.000000
2023-10-25 17:17:42,843 epoch 2 - iter 4689/5212 - loss 0.16791217 - time (sec): 197.47 - samples/sec: 1689.55 - lr: 0.000027 - momentum: 0.000000
2023-10-25 17:18:04,488 epoch 2 - iter 5210/5212 - loss 0.16890519 - time (sec): 219.11 - samples/sec: 1675.45 - lr: 0.000027 - momentum: 0.000000
2023-10-25 17:18:04,576 ----------------------------------------------------------------------------------------------------
2023-10-25 17:18:04,576 EPOCH 2 done: loss 0.1691 - lr: 0.000027
2023-10-25 17:18:11,491 DEV : loss 0.1517297476530075 - f1-score (micro avg) 0.3623
2023-10-25 17:18:11,516 saving best model
2023-10-25 17:18:12,122 ----------------------------------------------------------------------------------------------------
2023-10-25 17:18:34,231 epoch 3 - iter 521/5212 - loss 0.10765596 - time (sec): 22.10 - samples/sec: 1778.85 - lr: 0.000026 - momentum: 0.000000
2023-10-25 17:18:55,895 epoch 3 - iter 1042/5212 - loss 0.10781574 - time (sec): 43.77 - samples/sec: 1737.42 - lr: 0.000026 - momentum: 0.000000
2023-10-25 17:19:17,465 epoch 3 - iter 1563/5212 - loss 0.11097772 - time (sec): 65.34 - samples/sec: 1697.21 - lr: 0.000026 - momentum: 0.000000
2023-10-25 17:19:40,795 epoch 3 - iter 2084/5212 - loss 0.11524448 - time (sec): 88.67 - samples/sec: 1696.34 - lr: 0.000025 - momentum: 0.000000
2023-10-25 17:20:02,364 epoch 3 - iter 2605/5212 - loss 0.11375610 - time (sec): 110.24 - samples/sec: 1681.78 - lr: 0.000025 - momentum: 0.000000
2023-10-25 17:20:24,583 epoch 3 - iter 3126/5212 - loss 0.11298056 - time (sec): 132.46 - samples/sec: 1671.52 - lr: 0.000025 - momentum: 0.000000
2023-10-25 17:20:46,653 epoch 3 - iter 3647/5212 - loss 0.11408116 - time (sec): 154.53 - samples/sec: 1664.56 - lr: 0.000024 - momentum: 0.000000
2023-10-25 17:21:08,743 epoch 3 - iter 4168/5212 - loss 0.11524557 - time (sec): 176.62 - samples/sec: 1668.38 - lr: 0.000024 - momentum: 0.000000
2023-10-25 17:21:30,574 epoch 3 - iter 4689/5212 - loss 0.11726213 - time (sec): 198.45 - samples/sec: 1665.60 - lr: 0.000024 - momentum: 0.000000
2023-10-25 17:21:52,582 epoch 3 - iter 5210/5212 - loss 0.11559631 - time (sec): 220.45 - samples/sec: 1666.47 - lr: 0.000023 - momentum: 0.000000
2023-10-25 17:21:52,662 ----------------------------------------------------------------------------------------------------
2023-10-25 17:21:52,662 EPOCH 3 done: loss 0.1156 - lr: 0.000023
2023-10-25 17:21:58,870 DEV : loss 0.2178632616996765 - f1-score (micro avg) 0.4066
2023-10-25 17:21:58,897 saving best model
2023-10-25 17:21:59,382 ----------------------------------------------------------------------------------------------------
2023-10-25 17:22:21,118 epoch 4 - iter 521/5212 - loss 0.08093001 - time (sec): 21.73 - samples/sec: 1628.13 - lr: 0.000023 - momentum: 0.000000
2023-10-25 17:22:42,719 epoch 4 - iter 1042/5212 - loss 0.08353216 - time (sec): 43.34 - samples/sec: 1609.79 - lr: 0.000023 - momentum: 0.000000
2023-10-25 17:23:05,589 epoch 4 - iter 1563/5212 - loss 0.07892836 - time (sec): 66.21 - samples/sec: 1646.97 - lr: 0.000022 - momentum: 0.000000
2023-10-25 17:23:27,349 epoch 4 - iter 2084/5212 - loss 0.08223074 - time (sec): 87.97 - samples/sec: 1671.53 - lr: 0.000022 - momentum: 0.000000
2023-10-25 17:23:49,217 epoch 4 - iter 2605/5212 - loss 0.08120219 - time (sec): 109.83 - samples/sec: 1669.44 - lr: 0.000022 - momentum: 0.000000
2023-10-25 17:24:11,386 epoch 4 - iter 3126/5212 - loss 0.08400528 - time (sec): 132.00 - samples/sec: 1666.34 - lr: 0.000021 - momentum: 0.000000
2023-10-25 17:24:33,293 epoch 4 - iter 3647/5212 - loss 0.08287545 - time (sec): 153.91 - samples/sec: 1670.99 - lr: 0.000021 - momentum: 0.000000
2023-10-25 17:24:55,564 epoch 4 - iter 4168/5212 - loss 0.08112875 - time (sec): 176.18 - samples/sec: 1676.36 - lr: 0.000021 - momentum: 0.000000
2023-10-25 17:25:17,655 epoch 4 - iter 4689/5212 - loss 0.08165353 - time (sec): 198.27 - samples/sec: 1675.32 - lr: 0.000020 - momentum: 0.000000
2023-10-25 17:25:39,735 epoch 4 - iter 5210/5212 - loss 0.08271518 - time (sec): 220.35 - samples/sec: 1667.18 - lr: 0.000020 - momentum: 0.000000
2023-10-25 17:25:39,814 ----------------------------------------------------------------------------------------------------
2023-10-25 17:25:39,814 EPOCH 4 done: loss 0.0827 - lr: 0.000020
2023-10-25 17:25:46,171 DEV : loss 0.22580073773860931 - f1-score (micro avg) 0.3863
2023-10-25 17:25:46,199 ----------------------------------------------------------------------------------------------------
2023-10-25 17:26:08,425 epoch 5 - iter 521/5212 - loss 0.06486546 - time (sec): 22.22 - samples/sec: 1728.16 - lr: 0.000020 - momentum: 0.000000
2023-10-25 17:26:30,115 epoch 5 - iter 1042/5212 - loss 0.06119080 - time (sec): 43.92 - samples/sec: 1733.02 - lr: 0.000019 - momentum: 0.000000
2023-10-25 17:26:51,896 epoch 5 - iter 1563/5212 - loss 0.05815220 - time (sec): 65.70 - samples/sec: 1720.52 - lr: 0.000019 - momentum: 0.000000
2023-10-25 17:27:13,839 epoch 5 - iter 2084/5212 - loss 0.05823306 - time (sec): 87.64 - samples/sec: 1688.74 - lr: 0.000019 - momentum: 0.000000
2023-10-25 17:27:35,819 epoch 5 - iter 2605/5212 - loss 0.06005092 - time (sec): 109.62 - samples/sec: 1692.64 - lr: 0.000018 - momentum: 0.000000
2023-10-25 17:27:58,545 epoch 5 - iter 3126/5212 - loss 0.05992666 - time (sec): 132.34 - samples/sec: 1678.65 - lr: 0.000018 - momentum: 0.000000
2023-10-25 17:28:20,832 epoch 5 - iter 3647/5212 - loss 0.06213572 - time (sec): 154.63 - samples/sec: 1674.98 - lr: 0.000018 - momentum: 0.000000
2023-10-25 17:28:42,976 epoch 5 - iter 4168/5212 - loss 0.06186123 - time (sec): 176.78 - samples/sec: 1673.56 - lr: 0.000017 - momentum: 0.000000
2023-10-25 17:29:04,588 epoch 5 - iter 4689/5212 - loss 0.06183511 - time (sec): 198.39 - samples/sec: 1675.21 - lr: 0.000017 - momentum: 0.000000
2023-10-25 17:29:25,823 epoch 5 - iter 5210/5212 - loss 0.06174339 - time (sec): 219.62 - samples/sec: 1672.76 - lr: 0.000017 - momentum: 0.000000
2023-10-25 17:29:25,901 ----------------------------------------------------------------------------------------------------
2023-10-25 17:29:25,901 EPOCH 5 done: loss 0.0617 - lr: 0.000017
2023-10-25 17:29:32,121 DEV : loss 0.26587387919425964 - f1-score (micro avg) 0.4384
2023-10-25 17:29:32,148 saving best model
2023-10-25 17:29:32,748 ----------------------------------------------------------------------------------------------------
2023-10-25 17:29:55,254 epoch 6 - iter 521/5212 - loss 0.04134678 - time (sec): 22.50 - samples/sec: 1747.44 - lr: 0.000016 - momentum: 0.000000
2023-10-25 17:30:17,890 epoch 6 - iter 1042/5212 - loss 0.03876985 - time (sec): 45.14 - samples/sec: 1695.68 - lr: 0.000016 - momentum: 0.000000
2023-10-25 17:30:40,689 epoch 6 - iter 1563/5212 - loss 0.03994635 - time (sec): 67.94 - samples/sec: 1680.45 - lr: 0.000016 - momentum: 0.000000
2023-10-25 17:31:02,501 epoch 6 - iter 2084/5212 - loss 0.04059620 - time (sec): 89.75 - samples/sec: 1674.02 - lr: 0.000015 - momentum: 0.000000
2023-10-25 17:31:24,817 epoch 6 - iter 2605/5212 - loss 0.04082835 - time (sec): 112.07 - samples/sec: 1649.42 - lr: 0.000015 - momentum: 0.000000
2023-10-25 17:31:46,564 epoch 6 - iter 3126/5212 - loss 0.04210774 - time (sec): 133.81 - samples/sec: 1643.11 - lr: 0.000015 - momentum: 0.000000
2023-10-25 17:32:09,005 epoch 6 - iter 3647/5212 - loss 0.04150600 - time (sec): 156.26 - samples/sec: 1658.69 - lr: 0.000014 - momentum: 0.000000
2023-10-25 17:32:31,318 epoch 6 - iter 4168/5212 - loss 0.04190735 - time (sec): 178.57 - samples/sec: 1652.85 - lr: 0.000014 - momentum: 0.000000
2023-10-25 17:32:53,308 epoch 6 - iter 4689/5212 - loss 0.04188495 - time (sec): 200.56 - samples/sec: 1651.75 - lr: 0.000014 - momentum: 0.000000
2023-10-25 17:33:15,961 epoch 6 - iter 5210/5212 - loss 0.04187712 - time (sec): 223.21 - samples/sec: 1645.76 - lr: 0.000013 - momentum: 0.000000
2023-10-25 17:33:16,042 ----------------------------------------------------------------------------------------------------
2023-10-25 17:33:16,043 EPOCH 6 done: loss 0.0419 - lr: 0.000013
2023-10-25 17:33:22,241 DEV : loss 0.3574075996875763 - f1-score (micro avg) 0.397
2023-10-25 17:33:22,267 ----------------------------------------------------------------------------------------------------
2023-10-25 17:33:44,354 epoch 7 - iter 521/5212 - loss 0.03039355 - time (sec): 22.09 - samples/sec: 1677.53 - lr: 0.000013 - momentum: 0.000000
2023-10-25 17:34:07,659 epoch 7 - iter 1042/5212 - loss 0.03164622 - time (sec): 45.39 - samples/sec: 1622.40 - lr: 0.000013 - momentum: 0.000000
2023-10-25 17:34:29,729 epoch 7 - iter 1563/5212 - loss 0.03063082 - time (sec): 67.46 - samples/sec: 1650.05 - lr: 0.000012 - momentum: 0.000000
2023-10-25 17:34:51,817 epoch 7 - iter 2084/5212 - loss 0.03002049 - time (sec): 89.55 - samples/sec: 1647.24 - lr: 0.000012 - momentum: 0.000000
2023-10-25 17:35:14,190 epoch 7 - iter 2605/5212 - loss 0.02845181 - time (sec): 111.92 - samples/sec: 1641.54 - lr: 0.000012 - momentum: 0.000000
2023-10-25 17:35:36,402 epoch 7 - iter 3126/5212 - loss 0.02817657 - time (sec): 134.13 - samples/sec: 1652.56 - lr: 0.000011 - momentum: 0.000000
2023-10-25 17:35:58,702 epoch 7 - iter 3647/5212 - loss 0.02817558 - time (sec): 156.43 - samples/sec: 1675.68 - lr: 0.000011 - momentum: 0.000000
2023-10-25 17:36:20,947 epoch 7 - iter 4168/5212 - loss 0.03018002 - time (sec): 178.68 - samples/sec: 1667.05 - lr: 0.000011 - momentum: 0.000000
2023-10-25 17:36:42,851 epoch 7 - iter 4689/5212 - loss 0.02972812 - time (sec): 200.58 - samples/sec: 1657.09 - lr: 0.000010 - momentum: 0.000000
2023-10-25 17:37:04,941 epoch 7 - iter 5210/5212 - loss 0.02981065 - time (sec): 222.67 - samples/sec: 1649.75 - lr: 0.000010 - momentum: 0.000000
2023-10-25 17:37:05,019 ----------------------------------------------------------------------------------------------------
2023-10-25 17:37:05,019 EPOCH 7 done: loss 0.0298 - lr: 0.000010
2023-10-25 17:37:11,912 DEV : loss 0.41519442200660706 - f1-score (micro avg) 0.3892
2023-10-25 17:37:11,938 ----------------------------------------------------------------------------------------------------
2023-10-25 17:37:33,570 epoch 8 - iter 521/5212 - loss 0.01779766 - time (sec): 21.63 - samples/sec: 1669.98 - lr: 0.000010 - momentum: 0.000000
2023-10-25 17:37:55,659 epoch 8 - iter 1042/5212 - loss 0.01916489 - time (sec): 43.72 - samples/sec: 1668.19 - lr: 0.000009 - momentum: 0.000000
2023-10-25 17:38:17,995 epoch 8 - iter 1563/5212 - loss 0.02003561 - time (sec): 66.06 - samples/sec: 1647.34 - lr: 0.000009 - momentum: 0.000000
2023-10-25 17:38:40,210 epoch 8 - iter 2084/5212 - loss 0.01983414 - time (sec): 88.27 - samples/sec: 1640.49 - lr: 0.000009 - momentum: 0.000000
2023-10-25 17:39:02,258 epoch 8 - iter 2605/5212 - loss 0.02006743 - time (sec): 110.32 - samples/sec: 1647.44 - lr: 0.000008 - momentum: 0.000000
2023-10-25 17:39:24,542 epoch 8 - iter 3126/5212 - loss 0.02243560 - time (sec): 132.60 - samples/sec: 1646.87 - lr: 0.000008 - momentum: 0.000000
2023-10-25 17:39:46,461 epoch 8 - iter 3647/5212 - loss 0.02188283 - time (sec): 154.52 - samples/sec: 1668.89 - lr: 0.000008 - momentum: 0.000000
2023-10-25 17:40:08,478 epoch 8 - iter 4168/5212 - loss 0.02198132 - time (sec): 176.54 - samples/sec: 1666.15 - lr: 0.000007 - momentum: 0.000000
2023-10-25 17:40:31,186 epoch 8 - iter 4689/5212 - loss 0.02180658 - time (sec): 199.25 - samples/sec: 1656.91 - lr: 0.000007 - momentum: 0.000000
2023-10-25 17:40:53,169 epoch 8 - iter 5210/5212 - loss 0.02133745 - time (sec): 221.23 - samples/sec: 1660.21 - lr: 0.000007 - momentum: 0.000000
2023-10-25 17:40:53,252 ----------------------------------------------------------------------------------------------------
2023-10-25 17:40:53,252 EPOCH 8 done: loss 0.0213 - lr: 0.000007
2023-10-25 17:41:00,281 DEV : loss 0.3952998220920563 - f1-score (micro avg) 0.3988
2023-10-25 17:41:00,307 ----------------------------------------------------------------------------------------------------
2023-10-25 17:41:22,341 epoch 9 - iter 521/5212 - loss 0.01866035 - time (sec): 22.03 - samples/sec: 1715.77 - lr: 0.000006 - momentum: 0.000000
2023-10-25 17:41:45,005 epoch 9 - iter 1042/5212 - loss 0.01578284 - time (sec): 44.70 - samples/sec: 1662.33 - lr: 0.000006 - momentum: 0.000000
2023-10-25 17:42:07,148 epoch 9 - iter 1563/5212 - loss 0.01409868 - time (sec): 66.84 - samples/sec: 1649.15 - lr: 0.000006 - momentum: 0.000000
2023-10-25 17:42:29,360 epoch 9 - iter 2084/5212 - loss 0.01328640 - time (sec): 89.05 - samples/sec: 1672.80 - lr: 0.000005 - momentum: 0.000000
2023-10-25 17:42:51,215 epoch 9 - iter 2605/5212 - loss 0.01404307 - time (sec): 110.91 - samples/sec: 1671.13 - lr: 0.000005 - momentum: 0.000000
2023-10-25 17:43:13,076 epoch 9 - iter 3126/5212 - loss 0.01393654 - time (sec): 132.77 - samples/sec: 1662.03 - lr: 0.000005 - momentum: 0.000000
2023-10-25 17:43:35,371 epoch 9 - iter 3647/5212 - loss 0.01363863 - time (sec): 155.06 - samples/sec: 1667.51 - lr: 0.000004 - momentum: 0.000000
2023-10-25 17:43:57,538 epoch 9 - iter 4168/5212 - loss 0.01418941 - time (sec): 177.23 - samples/sec: 1662.81 - lr: 0.000004 - momentum: 0.000000
2023-10-25 17:44:19,964 epoch 9 - iter 4689/5212 - loss 0.01443190 - time (sec): 199.66 - samples/sec: 1664.10 - lr: 0.000004 - momentum: 0.000000
2023-10-25 17:44:42,080 epoch 9 - iter 5210/5212 - loss 0.01434777 - time (sec): 221.77 - samples/sec: 1656.30 - lr: 0.000003 - momentum: 0.000000
2023-10-25 17:44:42,163 ----------------------------------------------------------------------------------------------------
2023-10-25 17:44:42,163 EPOCH 9 done: loss 0.0143 - lr: 0.000003
2023-10-25 17:44:49,057 DEV : loss 0.4204791486263275 - f1-score (micro avg) 0.4073
2023-10-25 17:44:49,084 ----------------------------------------------------------------------------------------------------
2023-10-25 17:45:11,388 epoch 10 - iter 521/5212 - loss 0.01094782 - time (sec): 22.30 - samples/sec: 1662.01 - lr: 0.000003 - momentum: 0.000000
2023-10-25 17:45:33,362 epoch 10 - iter 1042/5212 - loss 0.00960251 - time (sec): 44.28 - samples/sec: 1649.16 - lr: 0.000003 - momentum: 0.000000
2023-10-25 17:45:55,731 epoch 10 - iter 1563/5212 - loss 0.00887085 - time (sec): 66.65 - samples/sec: 1680.98 - lr: 0.000002 - momentum: 0.000000
2023-10-25 17:46:17,794 epoch 10 - iter 2084/5212 - loss 0.00932400 - time (sec): 88.71 - samples/sec: 1696.50 - lr: 0.000002 - momentum: 0.000000
2023-10-25 17:46:40,350 epoch 10 - iter 2605/5212 - loss 0.00933893 - time (sec): 111.27 - samples/sec: 1701.90 - lr: 0.000002 - momentum: 0.000000
2023-10-25 17:47:02,076 epoch 10 - iter 3126/5212 - loss 0.00908651 - time (sec): 132.99 - samples/sec: 1681.57 - lr: 0.000001 - momentum: 0.000000
2023-10-25 17:47:23,742 epoch 10 - iter 3647/5212 - loss 0.00933220 - time (sec): 154.66 - samples/sec: 1667.79 - lr: 0.000001 - momentum: 0.000000
2023-10-25 17:47:45,231 epoch 10 - iter 4168/5212 - loss 0.00889164 - time (sec): 176.15 - samples/sec: 1661.09 - lr: 0.000001 - momentum: 0.000000
2023-10-25 17:48:06,956 epoch 10 - iter 4689/5212 - loss 0.00870107 - time (sec): 197.87 - samples/sec: 1669.40 - lr: 0.000000 - momentum: 0.000000
2023-10-25 17:48:28,945 epoch 10 - iter 5210/5212 - loss 0.00875027 - time (sec): 219.86 - samples/sec: 1669.93 - lr: 0.000000 - momentum: 0.000000
2023-10-25 17:48:29,040 ----------------------------------------------------------------------------------------------------
2023-10-25 17:48:29,040 EPOCH 10 done: loss 0.0087 - lr: 0.000000
2023-10-25 17:48:35,948 DEV : loss 0.4886936545372009 - f1-score (micro avg) 0.3916
2023-10-25 17:48:36,442 ----------------------------------------------------------------------------------------------------
2023-10-25 17:48:36,443 Loading model from best epoch ...
2023-10-25 17:48:38,032 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
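The 17 tags form a BIOES scheme: O plus S(ingle)/B(egin)/E(nd)/I(nside) variants of the four entity types LOC, PER, ORG and HumanProd (1 + 4 × 4 = 17). A minimal decoder sketch for that scheme (illustrative; Flair uses its own span decoding):

```python
def bioes_to_spans(tags):
    """Decode a BIOES tag sequence into (label, start, end_exclusive) spans.

    Sketch of the standard BIOES convention, not Flair's decoder.
    """
    spans, start, label = [], None, None
    for i, tag in enumerate(tags):
        if tag == "O":
            start, label = None, None
            continue
        prefix, lab = tag.split("-", 1)
        if prefix == "S":                 # single-token entity
            spans.append((lab, i, i + 1))
            start, label = None, None
        elif prefix == "B":               # entity begins here
            start, label = i, lab
        elif prefix == "E" and label == lab and start is not None:
            spans.append((lab, start, i + 1))  # entity ends here
            start, label = None, None
        # "I" continues the currently open span
    return spans
```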
2023-10-25 17:48:48,031
Results:
- F-score (micro) 0.4406
- F-score (macro) 0.298
- Accuracy 0.2872
By class:
              precision    recall  f1-score   support

         LOC     0.4976    0.5049    0.5012      1214
         PER     0.4266    0.4282    0.4274       808
         ORG     0.2867    0.2436    0.2634       353
   HumanProd     0.0000    0.0000    0.0000        15

   micro avg     0.4441    0.4372    0.4406      2390
   macro avg     0.3027    0.2942    0.2980      2390
weighted avg     0.4393    0.4372    0.4380      2390
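The macro and weighted averages can be reproduced from the per-class rows. A quick consistency check (f1 and support values copied from the table above):

```python
# Per-class (f1, support) pairs copied from the final results table.
classes = {
    "LOC": (0.5012, 1214),
    "PER": (0.4274, 808),
    "ORG": (0.2634, 353),
    "HumanProd": (0.0000, 15),
}

total_support = sum(n for _, n in classes.values())
# Macro: unweighted mean over classes; weighted: support-weighted mean.
macro_f1 = sum(f1 for f1, _ in classes.values()) / len(classes)
weighted_f1 = sum(f1 * n for f1, n in classes.values()) / total_support
```

The 15-example HumanProd class scores zero everywhere, which is why the macro average (0.2980) sits so far below the micro and weighted averages.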
2023-10-25 17:48:48,031 ----------------------------------------------------------------------------------------------------