2023-10-13 08:49:59,758 ----------------------------------------------------------------------------------------------------
2023-10-13 08:49:59,759 Model: "SequenceTagger(
(embeddings): TransformerWordEmbeddings(
(model): BertModel(
(embeddings): BertEmbeddings(
(word_embeddings): Embedding(32001, 768)
(position_embeddings): Embedding(512, 768)
(token_type_embeddings): Embedding(2, 768)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(encoder): BertEncoder(
(layer): ModuleList(
(0-11): 12 x BertLayer(
(attention): BertAttention(
(self): BertSelfAttention(
(query): Linear(in_features=768, out_features=768, bias=True)
(key): Linear(in_features=768, out_features=768, bias=True)
(value): Linear(in_features=768, out_features=768, bias=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(output): BertSelfOutput(
(dense): Linear(in_features=768, out_features=768, bias=True)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
(intermediate): BertIntermediate(
(dense): Linear(in_features=768, out_features=3072, bias=True)
(intermediate_act_fn): GELUActivation()
)
(output): BertOutput(
(dense): Linear(in_features=3072, out_features=768, bias=True)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
)
)
(pooler): BertPooler(
(dense): Linear(in_features=768, out_features=768, bias=True)
(activation): Tanh()
)
)
)
(locked_dropout): LockedDropout(p=0.5)
(linear): Linear(in_features=768, out_features=25, bias=True)
(loss_function): CrossEntropyLoss()
)"
2023-10-13 08:49:59,759 ----------------------------------------------------------------------------------------------------
2023-10-13 08:49:59,759 MultiCorpus: 1100 train + 206 dev + 240 test sentences
- NER_HIPE_2022 Corpus: 1100 train + 206 dev + 240 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/ajmc/de/with_doc_seperator
2023-10-13 08:49:59,759 ----------------------------------------------------------------------------------------------------
2023-10-13 08:49:59,759 Train: 1100 sentences
2023-10-13 08:49:59,759 (train_with_dev=False, train_with_test=False)
2023-10-13 08:49:59,759 ----------------------------------------------------------------------------------------------------
2023-10-13 08:49:59,759 Training Params:
2023-10-13 08:49:59,760 - learning_rate: "3e-05"
2023-10-13 08:49:59,760 - mini_batch_size: "4"
2023-10-13 08:49:59,760 - max_epochs: "10"
2023-10-13 08:49:59,760 - shuffle: "True"
2023-10-13 08:49:59,760 ----------------------------------------------------------------------------------------------------
2023-10-13 08:49:59,760 Plugins:
2023-10-13 08:49:59,760 - LinearScheduler | warmup_fraction: '0.1'
2023-10-13 08:49:59,760 ----------------------------------------------------------------------------------------------------
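Annotation (not part of the original log): the `LinearScheduler | warmup_fraction: '0.1'` plugin explains the `lr` column below. A minimal sketch of that schedule, assuming it mirrors the usual linear-warmup/linear-decay rule (as in Hugging Face's `get_linear_schedule_with_warmup`) with the step counts taken from this log (275 iterations/epoch x 10 epochs = 2750 steps):

```python
def linear_lr(step: int, peak_lr: float = 3e-05,
              total_steps: int = 2750, warmup_fraction: float = 0.1) -> float:
    """Learning rate at a given optimizer step: linear warmup, then linear decay to 0."""
    warmup_steps = int(total_steps * warmup_fraction)  # 275 steps here, i.e. all of epoch 1
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)

# Warmup covers epoch 1, which is why the logged lr climbs from ~3e-06 at
# iter 27 towards the 3e-05 peak, then decays linearly to 0 by epoch 10.
```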
2023-10-13 08:49:59,760 Final evaluation on model from best epoch (best-model.pt)
2023-10-13 08:49:59,760 - metric: "('micro avg', 'f1-score')"
2023-10-13 08:49:59,760 ----------------------------------------------------------------------------------------------------
2023-10-13 08:49:59,760 Computation:
2023-10-13 08:49:59,760 - compute on device: cuda:0
2023-10-13 08:49:59,760 - embedding storage: none
2023-10-13 08:49:59,760 ----------------------------------------------------------------------------------------------------
2023-10-13 08:49:59,760 Model training base path: "hmbench-ajmc/de-dbmdz/bert-base-historic-multilingual-cased-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-5"
2023-10-13 08:49:59,760 ----------------------------------------------------------------------------------------------------
2023-10-13 08:49:59,760 ----------------------------------------------------------------------------------------------------
2023-10-13 08:50:01,127 epoch 1 - iter 27/275 - loss 3.18511496 - time (sec): 1.37 - samples/sec: 1543.30 - lr: 0.000003 - momentum: 0.000000
2023-10-13 08:50:02,538 epoch 1 - iter 54/275 - loss 2.81783400 - time (sec): 2.78 - samples/sec: 1472.59 - lr: 0.000006 - momentum: 0.000000
2023-10-13 08:50:04,007 epoch 1 - iter 81/275 - loss 2.24004312 - time (sec): 4.25 - samples/sec: 1466.57 - lr: 0.000009 - momentum: 0.000000
2023-10-13 08:50:05,345 epoch 1 - iter 108/275 - loss 1.85643127 - time (sec): 5.58 - samples/sec: 1552.80 - lr: 0.000012 - momentum: 0.000000
2023-10-13 08:50:06,626 epoch 1 - iter 135/275 - loss 1.59454014 - time (sec): 6.87 - samples/sec: 1599.77 - lr: 0.000015 - momentum: 0.000000
2023-10-13 08:50:07,837 epoch 1 - iter 162/275 - loss 1.42708412 - time (sec): 8.08 - samples/sec: 1637.63 - lr: 0.000018 - momentum: 0.000000
2023-10-13 08:50:09,006 epoch 1 - iter 189/275 - loss 1.27632203 - time (sec): 9.25 - samples/sec: 1691.77 - lr: 0.000021 - momentum: 0.000000
2023-10-13 08:50:10,173 epoch 1 - iter 216/275 - loss 1.17092393 - time (sec): 10.41 - samples/sec: 1711.69 - lr: 0.000023 - momentum: 0.000000
2023-10-13 08:50:11,346 epoch 1 - iter 243/275 - loss 1.07418944 - time (sec): 11.59 - samples/sec: 1735.60 - lr: 0.000026 - momentum: 0.000000
2023-10-13 08:50:12,564 epoch 1 - iter 270/275 - loss 1.00087942 - time (sec): 12.80 - samples/sec: 1746.13 - lr: 0.000029 - momentum: 0.000000
2023-10-13 08:50:12,786 ----------------------------------------------------------------------------------------------------
2023-10-13 08:50:12,787 EPOCH 1 done: loss 0.9886 - lr: 0.000029
2023-10-13 08:50:13,394 DEV : loss 0.21467053890228271 - f1-score (micro avg) 0.7057
2023-10-13 08:50:13,399 saving best model
2023-10-13 08:50:13,826 ----------------------------------------------------------------------------------------------------
2023-10-13 08:50:15,065 epoch 2 - iter 27/275 - loss 0.27193987 - time (sec): 1.24 - samples/sec: 1819.61 - lr: 0.000030 - momentum: 0.000000
2023-10-13 08:50:16,282 epoch 2 - iter 54/275 - loss 0.21355875 - time (sec): 2.45 - samples/sec: 1813.82 - lr: 0.000029 - momentum: 0.000000
2023-10-13 08:50:17,494 epoch 2 - iter 81/275 - loss 0.19361040 - time (sec): 3.67 - samples/sec: 1870.35 - lr: 0.000029 - momentum: 0.000000
2023-10-13 08:50:18,729 epoch 2 - iter 108/275 - loss 0.19697749 - time (sec): 4.90 - samples/sec: 1916.06 - lr: 0.000029 - momentum: 0.000000
2023-10-13 08:50:19,914 epoch 2 - iter 135/275 - loss 0.19480719 - time (sec): 6.09 - samples/sec: 1925.14 - lr: 0.000028 - momentum: 0.000000
2023-10-13 08:50:21,167 epoch 2 - iter 162/275 - loss 0.19036163 - time (sec): 7.34 - samples/sec: 1875.08 - lr: 0.000028 - momentum: 0.000000
2023-10-13 08:50:22,536 epoch 2 - iter 189/275 - loss 0.18915738 - time (sec): 8.71 - samples/sec: 1807.53 - lr: 0.000028 - momentum: 0.000000
2023-10-13 08:50:23,861 epoch 2 - iter 216/275 - loss 0.18240104 - time (sec): 10.03 - samples/sec: 1803.76 - lr: 0.000027 - momentum: 0.000000
2023-10-13 08:50:25,223 epoch 2 - iter 243/275 - loss 0.17734318 - time (sec): 11.40 - samples/sec: 1784.59 - lr: 0.000027 - momentum: 0.000000
2023-10-13 08:50:26,645 epoch 2 - iter 270/275 - loss 0.17468446 - time (sec): 12.82 - samples/sec: 1746.37 - lr: 0.000027 - momentum: 0.000000
2023-10-13 08:50:26,888 ----------------------------------------------------------------------------------------------------
2023-10-13 08:50:26,889 EPOCH 2 done: loss 0.1748 - lr: 0.000027
2023-10-13 08:50:27,579 DEV : loss 0.1474207043647766 - f1-score (micro avg) 0.8391
2023-10-13 08:50:27,590 saving best model
2023-10-13 08:50:28,107 ----------------------------------------------------------------------------------------------------
2023-10-13 08:50:29,336 epoch 3 - iter 27/275 - loss 0.08983742 - time (sec): 1.22 - samples/sec: 1806.13 - lr: 0.000026 - momentum: 0.000000
2023-10-13 08:50:30,584 epoch 3 - iter 54/275 - loss 0.08149810 - time (sec): 2.47 - samples/sec: 1848.52 - lr: 0.000026 - momentum: 0.000000
2023-10-13 08:50:31,790 epoch 3 - iter 81/275 - loss 0.09230059 - time (sec): 3.68 - samples/sec: 1816.70 - lr: 0.000026 - momentum: 0.000000
2023-10-13 08:50:32,999 epoch 3 - iter 108/275 - loss 0.09328727 - time (sec): 4.88 - samples/sec: 1796.11 - lr: 0.000025 - momentum: 0.000000
2023-10-13 08:50:34,547 epoch 3 - iter 135/275 - loss 0.09080755 - time (sec): 6.43 - samples/sec: 1724.53 - lr: 0.000025 - momentum: 0.000000
2023-10-13 08:50:35,986 epoch 3 - iter 162/275 - loss 0.09660612 - time (sec): 7.87 - samples/sec: 1684.26 - lr: 0.000025 - momentum: 0.000000
2023-10-13 08:50:37,446 epoch 3 - iter 189/275 - loss 0.09770689 - time (sec): 9.33 - samples/sec: 1685.88 - lr: 0.000024 - momentum: 0.000000
2023-10-13 08:50:38,923 epoch 3 - iter 216/275 - loss 0.09415039 - time (sec): 10.81 - samples/sec: 1676.24 - lr: 0.000024 - momentum: 0.000000
2023-10-13 08:50:40,394 epoch 3 - iter 243/275 - loss 0.09545758 - time (sec): 12.28 - samples/sec: 1646.69 - lr: 0.000024 - momentum: 0.000000
2023-10-13 08:50:41,859 epoch 3 - iter 270/275 - loss 0.09920655 - time (sec): 13.74 - samples/sec: 1628.69 - lr: 0.000023 - momentum: 0.000000
2023-10-13 08:50:42,134 ----------------------------------------------------------------------------------------------------
2023-10-13 08:50:42,135 EPOCH 3 done: loss 0.0982 - lr: 0.000023
2023-10-13 08:50:42,865 DEV : loss 0.16546553373336792 - f1-score (micro avg) 0.845
2023-10-13 08:50:42,871 saving best model
2023-10-13 08:50:43,384 ----------------------------------------------------------------------------------------------------
2023-10-13 08:50:44,768 epoch 4 - iter 27/275 - loss 0.04689526 - time (sec): 1.38 - samples/sec: 1725.94 - lr: 0.000023 - momentum: 0.000000
2023-10-13 08:50:46,183 epoch 4 - iter 54/275 - loss 0.06555587 - time (sec): 2.80 - samples/sec: 1710.68 - lr: 0.000023 - momentum: 0.000000
2023-10-13 08:50:47,607 epoch 4 - iter 81/275 - loss 0.05828751 - time (sec): 4.22 - samples/sec: 1642.25 - lr: 0.000022 - momentum: 0.000000
2023-10-13 08:50:48,987 epoch 4 - iter 108/275 - loss 0.06656748 - time (sec): 5.60 - samples/sec: 1588.98 - lr: 0.000022 - momentum: 0.000000
2023-10-13 08:50:50,314 epoch 4 - iter 135/275 - loss 0.07012843 - time (sec): 6.93 - samples/sec: 1583.06 - lr: 0.000022 - momentum: 0.000000
2023-10-13 08:50:51,524 epoch 4 - iter 162/275 - loss 0.06218253 - time (sec): 8.14 - samples/sec: 1629.60 - lr: 0.000021 - momentum: 0.000000
2023-10-13 08:50:52,693 epoch 4 - iter 189/275 - loss 0.07424359 - time (sec): 9.31 - samples/sec: 1652.20 - lr: 0.000021 - momentum: 0.000000
2023-10-13 08:50:53,958 epoch 4 - iter 216/275 - loss 0.07690610 - time (sec): 10.57 - samples/sec: 1686.64 - lr: 0.000021 - momentum: 0.000000
2023-10-13 08:50:55,225 epoch 4 - iter 243/275 - loss 0.07530572 - time (sec): 11.84 - samples/sec: 1684.34 - lr: 0.000020 - momentum: 0.000000
2023-10-13 08:50:56,482 epoch 4 - iter 270/275 - loss 0.07860860 - time (sec): 13.09 - samples/sec: 1702.90 - lr: 0.000020 - momentum: 0.000000
2023-10-13 08:50:56,725 ----------------------------------------------------------------------------------------------------
2023-10-13 08:50:56,725 EPOCH 4 done: loss 0.0777 - lr: 0.000020
2023-10-13 08:50:57,403 DEV : loss 0.1455877721309662 - f1-score (micro avg) 0.8412
2023-10-13 08:50:57,408 ----------------------------------------------------------------------------------------------------
2023-10-13 08:50:58,637 epoch 5 - iter 27/275 - loss 0.03583915 - time (sec): 1.23 - samples/sec: 1934.21 - lr: 0.000020 - momentum: 0.000000
2023-10-13 08:50:59,864 epoch 5 - iter 54/275 - loss 0.05038324 - time (sec): 2.46 - samples/sec: 1863.62 - lr: 0.000019 - momentum: 0.000000
2023-10-13 08:51:01,072 epoch 5 - iter 81/275 - loss 0.06314964 - time (sec): 3.66 - samples/sec: 1809.70 - lr: 0.000019 - momentum: 0.000000
2023-10-13 08:51:02,316 epoch 5 - iter 108/275 - loss 0.06525363 - time (sec): 4.91 - samples/sec: 1847.91 - lr: 0.000019 - momentum: 0.000000
2023-10-13 08:51:03,592 epoch 5 - iter 135/275 - loss 0.05861941 - time (sec): 6.18 - samples/sec: 1848.08 - lr: 0.000018 - momentum: 0.000000
2023-10-13 08:51:04,844 epoch 5 - iter 162/275 - loss 0.05839377 - time (sec): 7.43 - samples/sec: 1805.68 - lr: 0.000018 - momentum: 0.000000
2023-10-13 08:51:06,076 epoch 5 - iter 189/275 - loss 0.05582493 - time (sec): 8.67 - samples/sec: 1817.74 - lr: 0.000018 - momentum: 0.000000
2023-10-13 08:51:07,365 epoch 5 - iter 216/275 - loss 0.06190340 - time (sec): 9.96 - samples/sec: 1804.07 - lr: 0.000017 - momentum: 0.000000
2023-10-13 08:51:08,651 epoch 5 - iter 243/275 - loss 0.06255433 - time (sec): 11.24 - samples/sec: 1804.27 - lr: 0.000017 - momentum: 0.000000
2023-10-13 08:51:09,868 epoch 5 - iter 270/275 - loss 0.05876610 - time (sec): 12.46 - samples/sec: 1798.45 - lr: 0.000017 - momentum: 0.000000
2023-10-13 08:51:10,093 ----------------------------------------------------------------------------------------------------
2023-10-13 08:51:10,093 EPOCH 5 done: loss 0.0580 - lr: 0.000017
2023-10-13 08:51:10,766 DEV : loss 0.17545810341835022 - f1-score (micro avg) 0.8642
2023-10-13 08:51:10,770 saving best model
2023-10-13 08:51:11,298 ----------------------------------------------------------------------------------------------------
2023-10-13 08:51:12,492 epoch 6 - iter 27/275 - loss 0.05239298 - time (sec): 1.18 - samples/sec: 1904.41 - lr: 0.000016 - momentum: 0.000000
2023-10-13 08:51:13,699 epoch 6 - iter 54/275 - loss 0.05168950 - time (sec): 2.39 - samples/sec: 1782.35 - lr: 0.000016 - momentum: 0.000000
2023-10-13 08:51:14,902 epoch 6 - iter 81/275 - loss 0.06269989 - time (sec): 3.59 - samples/sec: 1766.63 - lr: 0.000016 - momentum: 0.000000
2023-10-13 08:51:16,121 epoch 6 - iter 108/275 - loss 0.05317394 - time (sec): 4.81 - samples/sec: 1798.04 - lr: 0.000015 - momentum: 0.000000
2023-10-13 08:51:17,315 epoch 6 - iter 135/275 - loss 0.05372513 - time (sec): 6.01 - samples/sec: 1786.78 - lr: 0.000015 - momentum: 0.000000
2023-10-13 08:51:18,506 epoch 6 - iter 162/275 - loss 0.05411059 - time (sec): 7.20 - samples/sec: 1781.64 - lr: 0.000015 - momentum: 0.000000
2023-10-13 08:51:19,732 epoch 6 - iter 189/275 - loss 0.05236838 - time (sec): 8.42 - samples/sec: 1801.09 - lr: 0.000014 - momentum: 0.000000
2023-10-13 08:51:20,898 epoch 6 - iter 216/275 - loss 0.05119018 - time (sec): 9.59 - samples/sec: 1827.68 - lr: 0.000014 - momentum: 0.000000
2023-10-13 08:51:22,082 epoch 6 - iter 243/275 - loss 0.04738262 - time (sec): 10.77 - samples/sec: 1859.14 - lr: 0.000014 - momentum: 0.000000
2023-10-13 08:51:23,273 epoch 6 - iter 270/275 - loss 0.04608281 - time (sec): 11.97 - samples/sec: 1878.38 - lr: 0.000013 - momentum: 0.000000
2023-10-13 08:51:23,483 ----------------------------------------------------------------------------------------------------
2023-10-13 08:51:23,483 EPOCH 6 done: loss 0.0463 - lr: 0.000013
2023-10-13 08:51:24,131 DEV : loss 0.17988666892051697 - f1-score (micro avg) 0.864
2023-10-13 08:51:24,136 ----------------------------------------------------------------------------------------------------
2023-10-13 08:51:25,333 epoch 7 - iter 27/275 - loss 0.02589880 - time (sec): 1.20 - samples/sec: 1865.96 - lr: 0.000013 - momentum: 0.000000
2023-10-13 08:51:26,516 epoch 7 - iter 54/275 - loss 0.03031099 - time (sec): 2.38 - samples/sec: 1929.73 - lr: 0.000013 - momentum: 0.000000
2023-10-13 08:51:27,690 epoch 7 - iter 81/275 - loss 0.03342662 - time (sec): 3.55 - samples/sec: 1846.57 - lr: 0.000012 - momentum: 0.000000
2023-10-13 08:51:28,873 epoch 7 - iter 108/275 - loss 0.03723331 - time (sec): 4.74 - samples/sec: 1908.80 - lr: 0.000012 - momentum: 0.000000
2023-10-13 08:51:30,058 epoch 7 - iter 135/275 - loss 0.03485540 - time (sec): 5.92 - samples/sec: 1871.16 - lr: 0.000012 - momentum: 0.000000
2023-10-13 08:51:31,247 epoch 7 - iter 162/275 - loss 0.03562915 - time (sec): 7.11 - samples/sec: 1849.76 - lr: 0.000011 - momentum: 0.000000
2023-10-13 08:51:32,433 epoch 7 - iter 189/275 - loss 0.03341349 - time (sec): 8.30 - samples/sec: 1853.46 - lr: 0.000011 - momentum: 0.000000
2023-10-13 08:51:33,607 epoch 7 - iter 216/275 - loss 0.03492135 - time (sec): 9.47 - samples/sec: 1834.75 - lr: 0.000011 - momentum: 0.000000
2023-10-13 08:51:34,799 epoch 7 - iter 243/275 - loss 0.03638176 - time (sec): 10.66 - samples/sec: 1860.31 - lr: 0.000010 - momentum: 0.000000
2023-10-13 08:51:35,984 epoch 7 - iter 270/275 - loss 0.03517408 - time (sec): 11.85 - samples/sec: 1878.78 - lr: 0.000010 - momentum: 0.000000
2023-10-13 08:51:36,220 ----------------------------------------------------------------------------------------------------
2023-10-13 08:51:36,220 EPOCH 7 done: loss 0.0353 - lr: 0.000010
2023-10-13 08:51:36,891 DEV : loss 0.16233113408088684 - f1-score (micro avg) 0.8772
2023-10-13 08:51:36,895 saving best model
2023-10-13 08:51:37,357 ----------------------------------------------------------------------------------------------------
2023-10-13 08:51:38,552 epoch 8 - iter 27/275 - loss 0.04415221 - time (sec): 1.19 - samples/sec: 1885.22 - lr: 0.000010 - momentum: 0.000000
2023-10-13 08:51:39,939 epoch 8 - iter 54/275 - loss 0.03910510 - time (sec): 2.58 - samples/sec: 1807.48 - lr: 0.000009 - momentum: 0.000000
2023-10-13 08:51:41,321 epoch 8 - iter 81/275 - loss 0.03450106 - time (sec): 3.96 - samples/sec: 1763.73 - lr: 0.000009 - momentum: 0.000000
2023-10-13 08:51:42,682 epoch 8 - iter 108/275 - loss 0.03322144 - time (sec): 5.32 - samples/sec: 1711.64 - lr: 0.000009 - momentum: 0.000000
2023-10-13 08:51:43,883 epoch 8 - iter 135/275 - loss 0.03863140 - time (sec): 6.52 - samples/sec: 1706.65 - lr: 0.000008 - momentum: 0.000000
2023-10-13 08:51:45,051 epoch 8 - iter 162/275 - loss 0.03522197 - time (sec): 7.69 - samples/sec: 1732.02 - lr: 0.000008 - momentum: 0.000000
2023-10-13 08:51:46,264 epoch 8 - iter 189/275 - loss 0.03190950 - time (sec): 8.91 - samples/sec: 1713.96 - lr: 0.000008 - momentum: 0.000000
2023-10-13 08:51:47,536 epoch 8 - iter 216/275 - loss 0.03309587 - time (sec): 10.18 - samples/sec: 1733.68 - lr: 0.000007 - momentum: 0.000000
2023-10-13 08:51:48,783 epoch 8 - iter 243/275 - loss 0.03109874 - time (sec): 11.42 - samples/sec: 1748.54 - lr: 0.000007 - momentum: 0.000000
2023-10-13 08:51:50,021 epoch 8 - iter 270/275 - loss 0.02852667 - time (sec): 12.66 - samples/sec: 1763.71 - lr: 0.000007 - momentum: 0.000000
2023-10-13 08:51:50,253 ----------------------------------------------------------------------------------------------------
2023-10-13 08:51:50,253 EPOCH 8 done: loss 0.0287 - lr: 0.000007
2023-10-13 08:51:50,944 DEV : loss 0.1571986824274063 - f1-score (micro avg) 0.8717
2023-10-13 08:51:50,949 ----------------------------------------------------------------------------------------------------
2023-10-13 08:51:52,202 epoch 9 - iter 27/275 - loss 0.00898256 - time (sec): 1.25 - samples/sec: 1911.20 - lr: 0.000006 - momentum: 0.000000
2023-10-13 08:51:53,512 epoch 9 - iter 54/275 - loss 0.01148626 - time (sec): 2.56 - samples/sec: 1804.78 - lr: 0.000006 - momentum: 0.000000
2023-10-13 08:51:54,729 epoch 9 - iter 81/275 - loss 0.01045942 - time (sec): 3.78 - samples/sec: 1802.75 - lr: 0.000006 - momentum: 0.000000
2023-10-13 08:51:55,911 epoch 9 - iter 108/275 - loss 0.02500147 - time (sec): 4.96 - samples/sec: 1866.79 - lr: 0.000005 - momentum: 0.000000
2023-10-13 08:51:57,227 epoch 9 - iter 135/275 - loss 0.02624988 - time (sec): 6.28 - samples/sec: 1820.88 - lr: 0.000005 - momentum: 0.000000
2023-10-13 08:51:58,610 epoch 9 - iter 162/275 - loss 0.02513813 - time (sec): 7.66 - samples/sec: 1793.48 - lr: 0.000005 - momentum: 0.000000
2023-10-13 08:51:59,992 epoch 9 - iter 189/275 - loss 0.02662093 - time (sec): 9.04 - samples/sec: 1760.08 - lr: 0.000004 - momentum: 0.000000
2023-10-13 08:52:01,384 epoch 9 - iter 216/275 - loss 0.02526243 - time (sec): 10.43 - samples/sec: 1735.36 - lr: 0.000004 - momentum: 0.000000
2023-10-13 08:52:02,777 epoch 9 - iter 243/275 - loss 0.02403872 - time (sec): 11.83 - samples/sec: 1723.98 - lr: 0.000004 - momentum: 0.000000
2023-10-13 08:52:04,099 epoch 9 - iter 270/275 - loss 0.02364584 - time (sec): 13.15 - samples/sec: 1698.66 - lr: 0.000003 - momentum: 0.000000
2023-10-13 08:52:04,342 ----------------------------------------------------------------------------------------------------
2023-10-13 08:52:04,342 EPOCH 9 done: loss 0.0245 - lr: 0.000003
2023-10-13 08:52:05,007 DEV : loss 0.1661010980606079 - f1-score (micro avg) 0.8759
2023-10-13 08:52:05,012 ----------------------------------------------------------------------------------------------------
2023-10-13 08:52:06,284 epoch 10 - iter 27/275 - loss 0.04455625 - time (sec): 1.27 - samples/sec: 1725.22 - lr: 0.000003 - momentum: 0.000000
2023-10-13 08:52:07,467 epoch 10 - iter 54/275 - loss 0.04334836 - time (sec): 2.45 - samples/sec: 1846.71 - lr: 0.000003 - momentum: 0.000000
2023-10-13 08:52:08,648 epoch 10 - iter 81/275 - loss 0.03198027 - time (sec): 3.64 - samples/sec: 1869.62 - lr: 0.000002 - momentum: 0.000000
2023-10-13 08:52:09,837 epoch 10 - iter 108/275 - loss 0.02643589 - time (sec): 4.82 - samples/sec: 1886.01 - lr: 0.000002 - momentum: 0.000000
2023-10-13 08:52:11,031 epoch 10 - iter 135/275 - loss 0.02498862 - time (sec): 6.02 - samples/sec: 1889.90 - lr: 0.000002 - momentum: 0.000000
2023-10-13 08:52:12,224 epoch 10 - iter 162/275 - loss 0.02255539 - time (sec): 7.21 - samples/sec: 1901.99 - lr: 0.000001 - momentum: 0.000000
2023-10-13 08:52:13,464 epoch 10 - iter 189/275 - loss 0.02401308 - time (sec): 8.45 - samples/sec: 1878.69 - lr: 0.000001 - momentum: 0.000000
2023-10-13 08:52:14,648 epoch 10 - iter 216/275 - loss 0.02170947 - time (sec): 9.63 - samples/sec: 1886.06 - lr: 0.000001 - momentum: 0.000000
2023-10-13 08:52:15,829 epoch 10 - iter 243/275 - loss 0.02142910 - time (sec): 10.82 - samples/sec: 1889.66 - lr: 0.000000 - momentum: 0.000000
2023-10-13 08:52:16,995 epoch 10 - iter 270/275 - loss 0.02083004 - time (sec): 11.98 - samples/sec: 1865.83 - lr: 0.000000 - momentum: 0.000000
2023-10-13 08:52:17,210 ----------------------------------------------------------------------------------------------------
2023-10-13 08:52:17,210 EPOCH 10 done: loss 0.0205 - lr: 0.000000
2023-10-13 08:52:17,860 DEV : loss 0.16067276895046234 - f1-score (micro avg) 0.8732
2023-10-13 08:52:18,259 ----------------------------------------------------------------------------------------------------
2023-10-13 08:52:18,261 Loading model from best epoch ...
2023-10-13 08:52:19,947 SequenceTagger predicts: Dictionary with 25 tags: O, S-scope, B-scope, E-scope, I-scope, S-pers, B-pers, E-pers, I-pers, S-work, B-work, E-work, I-work, S-loc, B-loc, E-loc, I-loc, S-object, B-object, E-object, I-object, S-date, B-date, E-date, I-date
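Annotation (not part of the original log): the 25-tag dictionary above is the BIOES tagging scheme over the six entity types plus the outside tag `O` (6 types x 4 positions + 1 = 25). A sketch reconstructing it, assuming the S/B/E/I position order shown in the log:

```python
# Rebuild the tag dictionary: "O" first, then S/B/E/I variants per entity type,
# in the order the log prints them.
entity_types = ["scope", "pers", "work", "loc", "object", "date"]
tags = ["O"] + [f"{pos}-{etype}" for etype in entity_types for pos in "SBEI"]
assert len(tags) == 25
```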
2023-10-13 08:52:20,758
Results:
- F-score (micro) 0.9105
- F-score (macro) 0.6799
- Accuracy 0.8439
By class:
              precision    recall  f1-score   support

       scope     0.8977    0.8977    0.8977       176
        pers     0.9606    0.9531    0.9569       128
        work     0.8784    0.8784    0.8784        74
         loc     1.0000    0.5000    0.6667         2
      object     0.0000    0.0000    0.0000         2

   micro avg     0.9153    0.9058    0.9105       382
   macro avg     0.7473    0.6458    0.6799       382
weighted avg     0.9109    0.9058    0.9079       382
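Annotation (not part of the original log): the micro-avg row can be sanity-checked from the per-class rows alone, since TP = recall x support and predicted = TP / precision per class. A pure-Python check using the numbers reported above (agreement is up to the 4-decimal rounding of the per-class figures):

```python
per_class = {
    # name: (precision, recall, support) as reported in the table above
    "scope":  (0.8977, 0.8977, 176),
    "pers":   (0.9606, 0.9531, 128),
    "work":   (0.8784, 0.8784, 74),
    "loc":    (1.0000, 0.5000, 2),
    "object": (0.0000, 0.0000, 2),
}
tp = pred = gold = 0.0
for precision, recall, support in per_class.values():
    class_tp = recall * support          # true positives for this class
    tp += class_tp
    gold += support                      # gold spans
    if precision > 0:
        pred += class_tp / precision     # predicted spans for this class
micro_p, micro_r = tp / pred, tp / gold
micro_f1 = 2 * micro_p * micro_r / (micro_p + micro_r)
# micro_p, micro_r, micro_f1 recover the logged 0.9153 / 0.9058 / 0.9105 up to rounding.
```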
2023-10-13 08:52:20,758 ----------------------------------------------------------------------------------------------------