2023-10-20 00:26:09,861 ----------------------------------------------------------------------------------------------------
2023-10-20 00:26:09,862 Model: "SequenceTagger(
(embeddings): TransformerWordEmbeddings(
(model): BertModel(
(embeddings): BertEmbeddings(
(word_embeddings): Embedding(32001, 128)
(position_embeddings): Embedding(512, 128)
(token_type_embeddings): Embedding(2, 128)
(LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(encoder): BertEncoder(
(layer): ModuleList(
(0-1): 2 x BertLayer(
(attention): BertAttention(
(self): BertSelfAttention(
(query): Linear(in_features=128, out_features=128, bias=True)
(key): Linear(in_features=128, out_features=128, bias=True)
(value): Linear(in_features=128, out_features=128, bias=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(output): BertSelfOutput(
(dense): Linear(in_features=128, out_features=128, bias=True)
(LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
(intermediate): BertIntermediate(
(dense): Linear(in_features=128, out_features=512, bias=True)
(intermediate_act_fn): GELUActivation()
)
(output): BertOutput(
(dense): Linear(in_features=512, out_features=128, bias=True)
(LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
)
)
(pooler): BertPooler(
(dense): Linear(in_features=128, out_features=128, bias=True)
(activation): Tanh()
)
)
)
(locked_dropout): LockedDropout(p=0.5)
(linear): Linear(in_features=128, out_features=17, bias=True)
(loss_function): CrossEntropyLoss()
)"
2023-10-20 00:26:09,862 ----------------------------------------------------------------------------------------------------
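The architecture printout above fully determines the model's size. As a sanity check, the parameter count can be recomputed from the printed shapes (vocab 32001, hidden 128, 2 layers, intermediate 512, 17 output tags); this is an illustrative sketch, not output from the training run:

```python
# Parameter count implied by the architecture printout above.
hidden, vocab, max_pos, inter, layers, tags = 128, 32001, 512, 512, 2, 17

def linear(n_in, n_out):          # weight matrix + bias vector
    return n_in * n_out + n_out

embeddings = (vocab * hidden      # word embeddings
              + max_pos * hidden  # position embeddings
              + 2 * hidden        # token-type embeddings
              + 2 * hidden)       # embedding LayerNorm (weight + bias)

per_layer = (
    3 * linear(hidden, hidden)    # query, key, value projections
    + linear(hidden, hidden)      # self-attention output dense
    + 2 * hidden                  # attention LayerNorm
    + linear(hidden, inter)       # intermediate dense (128 -> 512)
    + linear(inter, hidden)       # output dense (512 -> 128)
    + 2 * hidden                  # output LayerNorm
)
pooler = linear(hidden, hidden)

bert_total = embeddings + layers * per_layer + pooler
tagger_head = linear(hidden, tags)

print(bert_total)    # → 4575232, i.e. ~4.6M parameters in the tiny BERT
print(tagger_head)   # → 2193 parameters in the tagging head
```

Almost all of the capacity sits in the word-embedding matrix (32001 × 128 ≈ 4.1M of the ~4.6M parameters), which is typical for tiny BERT variants.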
2023-10-20 00:26:09,862 MultiCorpus: 1085 train + 148 dev + 364 test sentences
- NER_HIPE_2022 Corpus: 1085 train + 148 dev + 364 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/sv/with_doc_seperator
2023-10-20 00:26:09,862 ----------------------------------------------------------------------------------------------------
2023-10-20 00:26:09,862 Train: 1085 sentences
2023-10-20 00:26:09,862 (train_with_dev=False, train_with_test=False)
2023-10-20 00:26:09,862 ----------------------------------------------------------------------------------------------------
2023-10-20 00:26:09,862 Training Params:
2023-10-20 00:26:09,862 - learning_rate: "3e-05"
2023-10-20 00:26:09,862 - mini_batch_size: "8"
2023-10-20 00:26:09,862 - max_epochs: "10"
2023-10-20 00:26:09,862 - shuffle: "True"
2023-10-20 00:26:09,862 ----------------------------------------------------------------------------------------------------
2023-10-20 00:26:09,862 Plugins:
2023-10-20 00:26:09,862 - TensorboardLogger
2023-10-20 00:26:09,862 - LinearScheduler | warmup_fraction: '0.1'
2023-10-20 00:26:09,862 ----------------------------------------------------------------------------------------------------
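The LinearScheduler with warmup_fraction 0.1 ramps the learning rate linearly up to the 3e-05 peak over the first 10% of steps (10 epochs × 136 batches = 1360 steps, so 136 warmup steps), then decays it linearly to zero. A minimal sketch of that schedule, which reproduces the rounded lr values printed in the per-iteration lines below (not Flair's internal implementation):

```python
def linear_schedule_lr(step, peak_lr=3e-5, total_steps=1360, warmup_fraction=0.1):
    """Linear warmup to peak_lr, then linear decay to zero."""
    warmup_steps = int(total_steps * warmup_fraction)  # 136 steps here
    if step <= warmup_steps:
        return peak_lr * step / warmup_steps
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)

print(round(linear_schedule_lr(13), 6))    # → 3e-06 (epoch 1, iter 13: "lr: 0.000003")
print(round(linear_schedule_lr(149), 6))   # → 3e-05 (epoch 2, iter 13: "lr: 0.000030")
print(round(linear_schedule_lr(1360), 6))  # → 0.0   (final step)
```

The peak is reached at step 136, i.e. at the very end of epoch 1, which is why epoch 2 starts at the highest printed rate and every later epoch shows a slowly decreasing lr.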
2023-10-20 00:26:09,862 Final evaluation on model from best epoch (best-model.pt)
2023-10-20 00:26:09,862 - metric: "('micro avg', 'f1-score')"
2023-10-20 00:26:09,863 ----------------------------------------------------------------------------------------------------
2023-10-20 00:26:09,863 Computation:
2023-10-20 00:26:09,863 - compute on device: cuda:0
2023-10-20 00:26:09,863 - embedding storage: none
2023-10-20 00:26:09,863 ----------------------------------------------------------------------------------------------------
2023-10-20 00:26:09,863 Model training base path: "hmbench-newseye/sv-dbmdz/bert-tiny-historic-multilingual-cased-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-5"
2023-10-20 00:26:09,863 ----------------------------------------------------------------------------------------------------
2023-10-20 00:26:09,863 ----------------------------------------------------------------------------------------------------
2023-10-20 00:26:09,863 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-20 00:26:10,210 epoch 1 - iter 13/136 - loss 2.99665919 - time (sec): 0.35 - samples/sec: 15121.83 - lr: 0.000003 - momentum: 0.000000
2023-10-20 00:26:10,561 epoch 1 - iter 26/136 - loss 2.98449998 - time (sec): 0.70 - samples/sec: 14134.00 - lr: 0.000006 - momentum: 0.000000
2023-10-20 00:26:10,907 epoch 1 - iter 39/136 - loss 2.89043289 - time (sec): 1.04 - samples/sec: 13984.31 - lr: 0.000008 - momentum: 0.000000
2023-10-20 00:26:11,270 epoch 1 - iter 52/136 - loss 2.85357802 - time (sec): 1.41 - samples/sec: 13781.60 - lr: 0.000011 - momentum: 0.000000
2023-10-20 00:26:11,614 epoch 1 - iter 65/136 - loss 2.75183850 - time (sec): 1.75 - samples/sec: 13924.46 - lr: 0.000014 - momentum: 0.000000
2023-10-20 00:26:11,952 epoch 1 - iter 78/136 - loss 2.64363232 - time (sec): 2.09 - samples/sec: 14180.22 - lr: 0.000017 - momentum: 0.000000
2023-10-20 00:26:12,327 epoch 1 - iter 91/136 - loss 2.53903376 - time (sec): 2.46 - samples/sec: 14420.25 - lr: 0.000020 - momentum: 0.000000
2023-10-20 00:26:12,662 epoch 1 - iter 104/136 - loss 2.46104683 - time (sec): 2.80 - samples/sec: 14082.63 - lr: 0.000023 - momentum: 0.000000
2023-10-20 00:26:13,028 epoch 1 - iter 117/136 - loss 2.34908798 - time (sec): 3.17 - samples/sec: 13958.98 - lr: 0.000026 - momentum: 0.000000
2023-10-20 00:26:13,383 epoch 1 - iter 130/136 - loss 2.19851671 - time (sec): 3.52 - samples/sec: 14090.33 - lr: 0.000028 - momentum: 0.000000
2023-10-20 00:26:13,551 ----------------------------------------------------------------------------------------------------
2023-10-20 00:26:13,552 EPOCH 1 done: loss 2.1422 - lr: 0.000028
2023-10-20 00:26:13,822 DEV : loss 0.5441645383834839 - f1-score (micro avg) 0.0
2023-10-20 00:26:13,826 ----------------------------------------------------------------------------------------------------
2023-10-20 00:26:14,196 epoch 2 - iter 13/136 - loss 0.80962263 - time (sec): 0.37 - samples/sec: 15895.25 - lr: 0.000030 - momentum: 0.000000
2023-10-20 00:26:14,534 epoch 2 - iter 26/136 - loss 0.80754832 - time (sec): 0.71 - samples/sec: 14377.71 - lr: 0.000029 - momentum: 0.000000
2023-10-20 00:26:14,877 epoch 2 - iter 39/136 - loss 0.83588442 - time (sec): 1.05 - samples/sec: 13875.51 - lr: 0.000029 - momentum: 0.000000
2023-10-20 00:26:15,214 epoch 2 - iter 52/136 - loss 0.76853305 - time (sec): 1.39 - samples/sec: 14236.28 - lr: 0.000029 - momentum: 0.000000
2023-10-20 00:26:15,545 epoch 2 - iter 65/136 - loss 0.73432052 - time (sec): 1.72 - samples/sec: 14525.79 - lr: 0.000028 - momentum: 0.000000
2023-10-20 00:26:15,855 epoch 2 - iter 78/136 - loss 0.71393397 - time (sec): 2.03 - samples/sec: 14640.23 - lr: 0.000028 - momentum: 0.000000
2023-10-20 00:26:16,199 epoch 2 - iter 91/136 - loss 0.72380019 - time (sec): 2.37 - samples/sec: 14811.50 - lr: 0.000028 - momentum: 0.000000
2023-10-20 00:26:16,537 epoch 2 - iter 104/136 - loss 0.71622364 - time (sec): 2.71 - samples/sec: 14663.57 - lr: 0.000027 - momentum: 0.000000
2023-10-20 00:26:16,907 epoch 2 - iter 117/136 - loss 0.70749680 - time (sec): 3.08 - samples/sec: 14619.10 - lr: 0.000027 - momentum: 0.000000
2023-10-20 00:26:17,247 epoch 2 - iter 130/136 - loss 0.70273919 - time (sec): 3.42 - samples/sec: 14563.18 - lr: 0.000027 - momentum: 0.000000
2023-10-20 00:26:17,407 ----------------------------------------------------------------------------------------------------
2023-10-20 00:26:17,407 EPOCH 2 done: loss 0.6946 - lr: 0.000027
2023-10-20 00:26:18,182 DEV : loss 0.43850278854370117 - f1-score (micro avg) 0.0
2023-10-20 00:26:18,186 ----------------------------------------------------------------------------------------------------
2023-10-20 00:26:18,735 epoch 3 - iter 13/136 - loss 0.52277741 - time (sec): 0.55 - samples/sec: 9947.25 - lr: 0.000026 - momentum: 0.000000
2023-10-20 00:26:19,065 epoch 3 - iter 26/136 - loss 0.58588897 - time (sec): 0.88 - samples/sec: 11437.94 - lr: 0.000026 - momentum: 0.000000
2023-10-20 00:26:19,405 epoch 3 - iter 39/136 - loss 0.58906191 - time (sec): 1.22 - samples/sec: 12078.54 - lr: 0.000026 - momentum: 0.000000
2023-10-20 00:26:19,758 epoch 3 - iter 52/136 - loss 0.55673249 - time (sec): 1.57 - samples/sec: 12046.72 - lr: 0.000025 - momentum: 0.000000
2023-10-20 00:26:20,128 epoch 3 - iter 65/136 - loss 0.57035076 - time (sec): 1.94 - samples/sec: 12361.01 - lr: 0.000025 - momentum: 0.000000
2023-10-20 00:26:20,531 epoch 3 - iter 78/136 - loss 0.56576822 - time (sec): 2.34 - samples/sec: 12895.07 - lr: 0.000025 - momentum: 0.000000
2023-10-20 00:26:20,904 epoch 3 - iter 91/136 - loss 0.54724953 - time (sec): 2.72 - samples/sec: 13498.02 - lr: 0.000024 - momentum: 0.000000
2023-10-20 00:26:21,257 epoch 3 - iter 104/136 - loss 0.54406757 - time (sec): 3.07 - samples/sec: 13490.92 - lr: 0.000024 - momentum: 0.000000
2023-10-20 00:26:21,618 epoch 3 - iter 117/136 - loss 0.54369216 - time (sec): 3.43 - samples/sec: 13313.69 - lr: 0.000024 - momentum: 0.000000
2023-10-20 00:26:21,967 epoch 3 - iter 130/136 - loss 0.54215594 - time (sec): 3.78 - samples/sec: 13275.25 - lr: 0.000024 - momentum: 0.000000
2023-10-20 00:26:22,121 ----------------------------------------------------------------------------------------------------
2023-10-20 00:26:22,121 EPOCH 3 done: loss 0.5394 - lr: 0.000024
2023-10-20 00:26:22,876 DEV : loss 0.38147714734077454 - f1-score (micro avg) 0.0142
2023-10-20 00:26:22,881 saving best model
2023-10-20 00:26:22,910 ----------------------------------------------------------------------------------------------------
2023-10-20 00:26:23,233 epoch 4 - iter 13/136 - loss 0.52978791 - time (sec): 0.32 - samples/sec: 13672.32 - lr: 0.000023 - momentum: 0.000000
2023-10-20 00:26:23,574 epoch 4 - iter 26/136 - loss 0.47230636 - time (sec): 0.66 - samples/sec: 14382.16 - lr: 0.000023 - momentum: 0.000000
2023-10-20 00:26:23,913 epoch 4 - iter 39/136 - loss 0.47590949 - time (sec): 1.00 - samples/sec: 14642.28 - lr: 0.000022 - momentum: 0.000000
2023-10-20 00:26:24,303 epoch 4 - iter 52/136 - loss 0.47117945 - time (sec): 1.39 - samples/sec: 14811.71 - lr: 0.000022 - momentum: 0.000000
2023-10-20 00:26:24,657 epoch 4 - iter 65/136 - loss 0.47644536 - time (sec): 1.75 - samples/sec: 14692.50 - lr: 0.000022 - momentum: 0.000000
2023-10-20 00:26:25,009 epoch 4 - iter 78/136 - loss 0.47178728 - time (sec): 2.10 - samples/sec: 14508.13 - lr: 0.000021 - momentum: 0.000000
2023-10-20 00:26:25,355 epoch 4 - iter 91/136 - loss 0.48317000 - time (sec): 2.44 - samples/sec: 14580.25 - lr: 0.000021 - momentum: 0.000000
2023-10-20 00:26:25,698 epoch 4 - iter 104/136 - loss 0.48174032 - time (sec): 2.79 - samples/sec: 14507.93 - lr: 0.000021 - momentum: 0.000000
2023-10-20 00:26:26,024 epoch 4 - iter 117/136 - loss 0.48549408 - time (sec): 3.11 - samples/sec: 14283.01 - lr: 0.000021 - momentum: 0.000000
2023-10-20 00:26:26,378 epoch 4 - iter 130/136 - loss 0.48274212 - time (sec): 3.47 - samples/sec: 14381.63 - lr: 0.000020 - momentum: 0.000000
2023-10-20 00:26:26,542 ----------------------------------------------------------------------------------------------------
2023-10-20 00:26:26,542 EPOCH 4 done: loss 0.4827 - lr: 0.000020
2023-10-20 00:26:27,303 DEV : loss 0.36432167887687683 - f1-score (micro avg) 0.0
2023-10-20 00:26:27,308 ----------------------------------------------------------------------------------------------------
2023-10-20 00:26:27,677 epoch 5 - iter 13/136 - loss 0.47913941 - time (sec): 0.37 - samples/sec: 13765.02 - lr: 0.000020 - momentum: 0.000000
2023-10-20 00:26:28,041 epoch 5 - iter 26/136 - loss 0.43933384 - time (sec): 0.73 - samples/sec: 13207.64 - lr: 0.000019 - momentum: 0.000000
2023-10-20 00:26:28,390 epoch 5 - iter 39/136 - loss 0.45975437 - time (sec): 1.08 - samples/sec: 13365.86 - lr: 0.000019 - momentum: 0.000000
2023-10-20 00:26:28,750 epoch 5 - iter 52/136 - loss 0.46104503 - time (sec): 1.44 - samples/sec: 13503.49 - lr: 0.000019 - momentum: 0.000000
2023-10-20 00:26:29,114 epoch 5 - iter 65/136 - loss 0.46501693 - time (sec): 1.81 - samples/sec: 13818.84 - lr: 0.000018 - momentum: 0.000000
2023-10-20 00:26:29,469 epoch 5 - iter 78/136 - loss 0.45833541 - time (sec): 2.16 - samples/sec: 13670.31 - lr: 0.000018 - momentum: 0.000000
2023-10-20 00:26:29,829 epoch 5 - iter 91/136 - loss 0.44943291 - time (sec): 2.52 - samples/sec: 13757.89 - lr: 0.000018 - momentum: 0.000000
2023-10-20 00:26:30,180 epoch 5 - iter 104/136 - loss 0.44174929 - time (sec): 2.87 - samples/sec: 13812.30 - lr: 0.000018 - momentum: 0.000000
2023-10-20 00:26:30,528 epoch 5 - iter 117/136 - loss 0.45142578 - time (sec): 3.22 - samples/sec: 13756.39 - lr: 0.000017 - momentum: 0.000000
2023-10-20 00:26:30,890 epoch 5 - iter 130/136 - loss 0.45187743 - time (sec): 3.58 - samples/sec: 13844.00 - lr: 0.000017 - momentum: 0.000000
2023-10-20 00:26:31,204 ----------------------------------------------------------------------------------------------------
2023-10-20 00:26:31,205 EPOCH 5 done: loss 0.4502 - lr: 0.000017
2023-10-20 00:26:31,947 DEV : loss 0.3404705226421356 - f1-score (micro avg) 0.0528
2023-10-20 00:26:31,951 saving best model
2023-10-20 00:26:31,980 ----------------------------------------------------------------------------------------------------
2023-10-20 00:26:32,338 epoch 6 - iter 13/136 - loss 0.50175299 - time (sec): 0.36 - samples/sec: 12501.55 - lr: 0.000016 - momentum: 0.000000
2023-10-20 00:26:32,697 epoch 6 - iter 26/136 - loss 0.49982765 - time (sec): 0.72 - samples/sec: 13040.53 - lr: 0.000016 - momentum: 0.000000
2023-10-20 00:26:33,053 epoch 6 - iter 39/136 - loss 0.45027033 - time (sec): 1.07 - samples/sec: 13951.57 - lr: 0.000016 - momentum: 0.000000
2023-10-20 00:26:33,417 epoch 6 - iter 52/136 - loss 0.45574613 - time (sec): 1.44 - samples/sec: 14000.59 - lr: 0.000015 - momentum: 0.000000
2023-10-20 00:26:33,777 epoch 6 - iter 65/136 - loss 0.45240642 - time (sec): 1.80 - samples/sec: 13905.97 - lr: 0.000015 - momentum: 0.000000
2023-10-20 00:26:34,132 epoch 6 - iter 78/136 - loss 0.43718518 - time (sec): 2.15 - samples/sec: 14091.94 - lr: 0.000015 - momentum: 0.000000
2023-10-20 00:26:34,492 epoch 6 - iter 91/136 - loss 0.42547239 - time (sec): 2.51 - samples/sec: 14165.45 - lr: 0.000015 - momentum: 0.000000
2023-10-20 00:26:34,836 epoch 6 - iter 104/136 - loss 0.43048018 - time (sec): 2.86 - samples/sec: 14021.86 - lr: 0.000014 - momentum: 0.000000
2023-10-20 00:26:35,181 epoch 6 - iter 117/136 - loss 0.42841402 - time (sec): 3.20 - samples/sec: 13965.90 - lr: 0.000014 - momentum: 0.000000
2023-10-20 00:26:35,534 epoch 6 - iter 130/136 - loss 0.43120978 - time (sec): 3.55 - samples/sec: 14001.95 - lr: 0.000014 - momentum: 0.000000
2023-10-20 00:26:35,699 ----------------------------------------------------------------------------------------------------
2023-10-20 00:26:35,699 EPOCH 6 done: loss 0.4306 - lr: 0.000014
2023-10-20 00:26:36,462 DEV : loss 0.3289472460746765 - f1-score (micro avg) 0.0831
2023-10-20 00:26:36,465 saving best model
2023-10-20 00:26:36,494 ----------------------------------------------------------------------------------------------------
2023-10-20 00:26:36,854 epoch 7 - iter 13/136 - loss 0.45984081 - time (sec): 0.36 - samples/sec: 14860.40 - lr: 0.000013 - momentum: 0.000000
2023-10-20 00:26:37,212 epoch 7 - iter 26/136 - loss 0.44705805 - time (sec): 0.72 - samples/sec: 14789.76 - lr: 0.000013 - momentum: 0.000000
2023-10-20 00:26:37,577 epoch 7 - iter 39/136 - loss 0.43546882 - time (sec): 1.08 - samples/sec: 14894.11 - lr: 0.000012 - momentum: 0.000000
2023-10-20 00:26:37,929 epoch 7 - iter 52/136 - loss 0.41545710 - time (sec): 1.43 - samples/sec: 14654.94 - lr: 0.000012 - momentum: 0.000000
2023-10-20 00:26:38,268 epoch 7 - iter 65/136 - loss 0.41513872 - time (sec): 1.77 - samples/sec: 14218.19 - lr: 0.000012 - momentum: 0.000000
2023-10-20 00:26:38,630 epoch 7 - iter 78/136 - loss 0.40326853 - time (sec): 2.13 - samples/sec: 14374.55 - lr: 0.000012 - momentum: 0.000000
2023-10-20 00:26:38,978 epoch 7 - iter 91/136 - loss 0.40592087 - time (sec): 2.48 - samples/sec: 14354.66 - lr: 0.000011 - momentum: 0.000000
2023-10-20 00:26:39,324 epoch 7 - iter 104/136 - loss 0.40184235 - time (sec): 2.83 - samples/sec: 14327.56 - lr: 0.000011 - momentum: 0.000000
2023-10-20 00:26:39,661 epoch 7 - iter 117/136 - loss 0.40453948 - time (sec): 3.17 - samples/sec: 14140.94 - lr: 0.000011 - momentum: 0.000000
2023-10-20 00:26:40,013 epoch 7 - iter 130/136 - loss 0.40372435 - time (sec): 3.52 - samples/sec: 14226.77 - lr: 0.000010 - momentum: 0.000000
2023-10-20 00:26:40,164 ----------------------------------------------------------------------------------------------------
2023-10-20 00:26:40,164 EPOCH 7 done: loss 0.4053 - lr: 0.000010
2023-10-20 00:26:40,916 DEV : loss 0.31105130910873413 - f1-score (micro avg) 0.1194
2023-10-20 00:26:40,919 saving best model
2023-10-20 00:26:40,949 ----------------------------------------------------------------------------------------------------
2023-10-20 00:26:41,317 epoch 8 - iter 13/136 - loss 0.31696442 - time (sec): 0.37 - samples/sec: 13473.49 - lr: 0.000010 - momentum: 0.000000
2023-10-20 00:26:41,679 epoch 8 - iter 26/136 - loss 0.37699195 - time (sec): 0.73 - samples/sec: 13792.15 - lr: 0.000009 - momentum: 0.000000
2023-10-20 00:26:42,209 epoch 8 - iter 39/136 - loss 0.39759404 - time (sec): 1.26 - samples/sec: 13162.24 - lr: 0.000009 - momentum: 0.000000
2023-10-20 00:26:42,560 epoch 8 - iter 52/136 - loss 0.39396828 - time (sec): 1.61 - samples/sec: 12838.80 - lr: 0.000009 - momentum: 0.000000
2023-10-20 00:26:42,926 epoch 8 - iter 65/136 - loss 0.40850508 - time (sec): 1.98 - samples/sec: 13199.17 - lr: 0.000009 - momentum: 0.000000
2023-10-20 00:26:43,273 epoch 8 - iter 78/136 - loss 0.41490364 - time (sec): 2.32 - samples/sec: 13452.86 - lr: 0.000008 - momentum: 0.000000
2023-10-20 00:26:43,655 epoch 8 - iter 91/136 - loss 0.40558301 - time (sec): 2.71 - samples/sec: 13657.84 - lr: 0.000008 - momentum: 0.000000
2023-10-20 00:26:43,992 epoch 8 - iter 104/136 - loss 0.40220342 - time (sec): 3.04 - samples/sec: 13476.75 - lr: 0.000008 - momentum: 0.000000
2023-10-20 00:26:44,343 epoch 8 - iter 117/136 - loss 0.40353876 - time (sec): 3.39 - samples/sec: 13427.45 - lr: 0.000007 - momentum: 0.000000
2023-10-20 00:26:44,680 epoch 8 - iter 130/136 - loss 0.41196536 - time (sec): 3.73 - samples/sec: 13411.49 - lr: 0.000007 - momentum: 0.000000
2023-10-20 00:26:44,842 ----------------------------------------------------------------------------------------------------
2023-10-20 00:26:44,842 EPOCH 8 done: loss 0.4095 - lr: 0.000007
2023-10-20 00:26:45,596 DEV : loss 0.31313076615333557 - f1-score (micro avg) 0.1201
2023-10-20 00:26:45,600 saving best model
2023-10-20 00:26:45,629 ----------------------------------------------------------------------------------------------------
2023-10-20 00:26:45,969 epoch 9 - iter 13/136 - loss 0.43448810 - time (sec): 0.34 - samples/sec: 14081.82 - lr: 0.000006 - momentum: 0.000000
2023-10-20 00:26:46,337 epoch 9 - iter 26/136 - loss 0.42036923 - time (sec): 0.71 - samples/sec: 14653.60 - lr: 0.000006 - momentum: 0.000000
2023-10-20 00:26:46,699 epoch 9 - iter 39/136 - loss 0.44891315 - time (sec): 1.07 - samples/sec: 14429.92 - lr: 0.000006 - momentum: 0.000000
2023-10-20 00:26:47,084 epoch 9 - iter 52/136 - loss 0.41622964 - time (sec): 1.45 - samples/sec: 14667.10 - lr: 0.000006 - momentum: 0.000000
2023-10-20 00:26:47,461 epoch 9 - iter 65/136 - loss 0.40763375 - time (sec): 1.83 - samples/sec: 14216.77 - lr: 0.000005 - momentum: 0.000000
2023-10-20 00:26:47,817 epoch 9 - iter 78/136 - loss 0.41340501 - time (sec): 2.19 - samples/sec: 13746.41 - lr: 0.000005 - momentum: 0.000000
2023-10-20 00:26:48,204 epoch 9 - iter 91/136 - loss 0.40735232 - time (sec): 2.57 - samples/sec: 13774.69 - lr: 0.000005 - momentum: 0.000000
2023-10-20 00:26:48,568 epoch 9 - iter 104/136 - loss 0.39881921 - time (sec): 2.94 - samples/sec: 13795.60 - lr: 0.000004 - momentum: 0.000000
2023-10-20 00:26:48,953 epoch 9 - iter 117/136 - loss 0.40044632 - time (sec): 3.32 - samples/sec: 13851.86 - lr: 0.000004 - momentum: 0.000000
2023-10-20 00:26:49,322 epoch 9 - iter 130/136 - loss 0.40023516 - time (sec): 3.69 - samples/sec: 13655.76 - lr: 0.000004 - momentum: 0.000000
2023-10-20 00:26:49,482 ----------------------------------------------------------------------------------------------------
2023-10-20 00:26:49,482 EPOCH 9 done: loss 0.3993 - lr: 0.000004
2023-10-20 00:26:50,255 DEV : loss 0.31110045313835144 - f1-score (micro avg) 0.1349
2023-10-20 00:26:50,259 saving best model
2023-10-20 00:26:50,289 ----------------------------------------------------------------------------------------------------
2023-10-20 00:26:50,686 epoch 10 - iter 13/136 - loss 0.43622969 - time (sec): 0.40 - samples/sec: 14217.95 - lr: 0.000003 - momentum: 0.000000
2023-10-20 00:26:51,076 epoch 10 - iter 26/136 - loss 0.39387330 - time (sec): 0.79 - samples/sec: 13004.00 - lr: 0.000003 - momentum: 0.000000
2023-10-20 00:26:51,448 epoch 10 - iter 39/136 - loss 0.38967037 - time (sec): 1.16 - samples/sec: 13242.34 - lr: 0.000003 - momentum: 0.000000
2023-10-20 00:26:51,810 epoch 10 - iter 52/136 - loss 0.38697306 - time (sec): 1.52 - samples/sec: 13404.58 - lr: 0.000002 - momentum: 0.000000
2023-10-20 00:26:52,169 epoch 10 - iter 65/136 - loss 0.37368761 - time (sec): 1.88 - samples/sec: 13531.42 - lr: 0.000002 - momentum: 0.000000
2023-10-20 00:26:52,513 epoch 10 - iter 78/136 - loss 0.37859373 - time (sec): 2.22 - samples/sec: 13428.23 - lr: 0.000002 - momentum: 0.000000
2023-10-20 00:26:52,868 epoch 10 - iter 91/136 - loss 0.37944359 - time (sec): 2.58 - samples/sec: 13472.23 - lr: 0.000001 - momentum: 0.000000
2023-10-20 00:26:53,223 epoch 10 - iter 104/136 - loss 0.39037495 - time (sec): 2.93 - samples/sec: 13526.05 - lr: 0.000001 - momentum: 0.000000
2023-10-20 00:26:53,576 epoch 10 - iter 117/136 - loss 0.38402079 - time (sec): 3.29 - samples/sec: 13590.61 - lr: 0.000001 - momentum: 0.000000
2023-10-20 00:26:54,080 epoch 10 - iter 130/136 - loss 0.38923555 - time (sec): 3.79 - samples/sec: 13176.70 - lr: 0.000000 - momentum: 0.000000
2023-10-20 00:26:54,226 ----------------------------------------------------------------------------------------------------
2023-10-20 00:26:54,226 EPOCH 10 done: loss 0.3892 - lr: 0.000000
2023-10-20 00:26:54,980 DEV : loss 0.3083120286464691 - f1-score (micro avg) 0.1399
2023-10-20 00:26:54,984 saving best model
2023-10-20 00:26:55,039 ----------------------------------------------------------------------------------------------------
2023-10-20 00:26:55,040 Loading model from best epoch ...
2023-10-20 00:26:55,115 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd, S-ORG, B-ORG, E-ORG, I-ORG
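The 17-tag dictionary above follows the BIOES scheme over the four entity types (LOC, PER, HumanProd, ORG): S marks a single-token entity, B/I/E mark the begin, inside, and end of a multi-token one. A minimal illustrative decoder for such sequences (a sketch with a hypothetical helper name, not Flair's internal span extraction):

```python
def bioes_to_spans(tags):
    """Decode a BIOES tag sequence into (label, start, end) spans, end inclusive."""
    spans, start = [], None
    for i, tag in enumerate(tags):
        if tag == "O":                    # outside any entity
            start = None
            continue
        prefix, label = tag.split("-", 1)
        if prefix == "S":                 # single-token entity
            spans.append((label, i, i))
        elif prefix == "B":               # entity begins
            start = i
        elif prefix == "E" and start is not None:  # entity ends
            spans.append((label, start, i))
            start = None
        # "I" continues an open span; nothing to record yet
    return spans

print(bioes_to_spans(["S-LOC", "O", "B-PER", "I-PER", "E-PER"]))
# → [('LOC', 0, 0), ('PER', 2, 4)]
```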
2023-10-20 00:26:55,900
Results:
- F-score (micro) 0.1263
- F-score (macro) 0.0633
- Accuracy 0.07
By class:
              precision    recall  f1-score   support

         PER     0.1901    0.2212    0.2044       208
         LOC     0.5000    0.0256    0.0488       312
         ORG     0.0000    0.0000    0.0000        55
   HumanProd     0.0000    0.0000    0.0000        22

   micro avg     0.2093    0.0905    0.1263       597
   macro avg     0.1725    0.0617    0.0633       597
weighted avg     0.3275    0.0905    0.0967       597
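The aggregate rows can be cross-checked against the per-class rows: the macro F-score is the plain mean of the four class F1 values, and the micro scores follow from pooled true-positive and prediction counts, which can be reconstructed (up to rounding, since the report prints four decimals) from each class's recall, precision, and support. A sanity-check sketch:

```python
# (precision, recall, f1, support) per class, as printed in the report above.
per_class = {
    "PER":       (0.1901, 0.2212, 0.2044, 208),
    "LOC":       (0.5000, 0.0256, 0.0488, 312),
    "ORG":       (0.0000, 0.0000, 0.0000,  55),
    "HumanProd": (0.0000, 0.0000, 0.0000,  22),
}

# Reconstruct pooled true-positive and predicted counts (rounded).
tp = sum(round(r * s) for p, r, f, s in per_class.values())
pred = sum(round(round(r * s) / p) for p, r, f, s in per_class.values() if p > 0)
support = sum(s for p, r, f, s in per_class.values())

micro_p = tp / pred                                  # 54 / 258
micro_r = tp / support                               # 54 / 597
micro_f1 = 2 * micro_p * micro_r / (micro_p + micro_r)
macro_f1 = sum(f for p, r, f, s in per_class.values()) / len(per_class)

print(round(micro_f1, 4))  # → 0.1263, the "F-score (micro)" above
print(round(macro_f1, 4))  # → 0.0633, the "F-score (macro)" above
```

The gap between the two (0.1263 vs. 0.0633) reflects that ORG and HumanProd are never predicted correctly, dragging the unweighted macro average down.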
2023-10-20 00:26:55,900 ----------------------------------------------------------------------------------------------------