2023-10-20 00:26:09,861 ----------------------------------------------------------------------------------------------------
2023-10-20 00:26:09,862 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 128)
        (position_embeddings): Embedding(512, 128)
        (token_type_embeddings): Embedding(2, 128)
        (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-1): 2 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=128, out_features=128, bias=True)
                (key): Linear(in_features=128, out_features=128, bias=True)
                (value): Linear(in_features=128, out_features=128, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=128, out_features=128, bias=True)
                (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=128, out_features=512, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=512, out_features=128, bias=True)
              (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=128, out_features=128, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=128, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-20 00:26:09,862 ----------------------------------------------------------------------------------------------------
2023-10-20 00:26:09,862 MultiCorpus: 1085 train + 148 dev + 364 test sentences
 - NER_HIPE_2022 Corpus: 1085 train + 148 dev + 364 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/sv/with_doc_seperator
2023-10-20 00:26:09,862 ----------------------------------------------------------------------------------------------------
2023-10-20 00:26:09,862 Train: 1085 sentences
2023-10-20 00:26:09,862 (train_with_dev=False, train_with_test=False)
2023-10-20 00:26:09,862 ----------------------------------------------------------------------------------------------------
2023-10-20 00:26:09,862 Training Params:
2023-10-20 00:26:09,862 - learning_rate: "3e-05"
2023-10-20 00:26:09,862 - mini_batch_size: "8"
2023-10-20 00:26:09,862 - max_epochs: "10"
2023-10-20 00:26:09,862 - shuffle: "True"
2023-10-20 00:26:09,862 ----------------------------------------------------------------------------------------------------
2023-10-20 00:26:09,862 Plugins:
2023-10-20 00:26:09,862 - TensorboardLogger
2023-10-20 00:26:09,862 - LinearScheduler | warmup_fraction: '0.1'
2023-10-20 00:26:09,862 ----------------------------------------------------------------------------------------------------
2023-10-20 00:26:09,862 Final evaluation on model from best epoch (best-model.pt)
2023-10-20 00:26:09,862 - metric: "('micro avg', 'f1-score')"
2023-10-20 00:26:09,863 ----------------------------------------------------------------------------------------------------
2023-10-20 00:26:09,863 Computation:
2023-10-20 00:26:09,863 - compute on device: cuda:0
2023-10-20 00:26:09,863 - embedding storage: none
2023-10-20 00:26:09,863 ----------------------------------------------------------------------------------------------------
2023-10-20 00:26:09,863 Model training base path: "hmbench-newseye/sv-dbmdz/bert-tiny-historic-multilingual-cased-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-5"
2023-10-20 00:26:09,863 ----------------------------------------------------------------------------------------------------
2023-10-20 00:26:09,863 ----------------------------------------------------------------------------------------------------
2023-10-20 00:26:09,863 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-20 00:26:10,210 epoch 1 - iter 13/136 - loss 2.99665919 - time (sec): 0.35 - samples/sec: 15121.83 - lr: 0.000003 - momentum: 0.000000
2023-10-20 00:26:10,561 epoch 1 - iter 26/136 - loss 2.98449998 - time (sec): 0.70 - samples/sec: 14134.00 - lr: 0.000006 - momentum: 0.000000
2023-10-20 00:26:10,907 epoch 1 - iter 39/136 - loss 2.89043289 - time (sec): 1.04 - samples/sec: 13984.31 - lr: 0.000008 - momentum: 0.000000
2023-10-20 00:26:11,270 epoch 1 - iter 52/136 - loss 2.85357802 - time (sec): 1.41 - samples/sec: 13781.60 - lr: 0.000011 - momentum: 0.000000
2023-10-20 00:26:11,614 epoch 1 - iter 65/136 - loss 2.75183850 - time (sec): 1.75 - samples/sec: 13924.46 - lr: 0.000014 - momentum: 0.000000
2023-10-20 00:26:11,952 epoch 1 - iter 78/136 - loss 2.64363232 - time (sec): 2.09 - samples/sec: 14180.22 - lr: 0.000017 - momentum: 0.000000
2023-10-20 00:26:12,327 epoch 1 - iter 91/136 - loss 2.53903376 - time (sec): 2.46 - samples/sec: 14420.25 - lr: 0.000020 - momentum: 0.000000
2023-10-20 00:26:12,662 epoch 1 - iter 104/136 - loss 2.46104683 - time (sec): 2.80 - samples/sec: 14082.63 - lr: 0.000023 - momentum: 0.000000
2023-10-20 00:26:13,028 epoch 1 - iter 117/136 - loss 2.34908798 - time (sec): 3.17 - samples/sec: 13958.98 - lr: 0.000026 - momentum: 0.000000
2023-10-20 00:26:13,383 epoch 1 - iter 130/136 - loss 2.19851671 - time (sec): 3.52 - samples/sec: 14090.33 - lr: 0.000028 - momentum: 0.000000
2023-10-20 00:26:13,551 ----------------------------------------------------------------------------------------------------
2023-10-20 00:26:13,552 EPOCH 1 done: loss 2.1422 - lr: 0.000028
2023-10-20 00:26:13,822 DEV : loss 0.5441645383834839 - f1-score (micro avg) 0.0
2023-10-20 00:26:13,826 ----------------------------------------------------------------------------------------------------
2023-10-20 00:26:14,196 epoch 2 - iter 13/136 - loss 0.80962263 - time (sec): 0.37 - samples/sec: 15895.25 - lr: 0.000030 - momentum: 0.000000
2023-10-20 00:26:14,534 epoch 2 - iter 26/136 - loss 0.80754832 - time (sec): 0.71 - samples/sec: 14377.71 - lr: 0.000029 - momentum: 0.000000
2023-10-20 00:26:14,877 epoch 2 - iter 39/136 - loss 0.83588442 - time (sec): 1.05 - samples/sec: 13875.51 - lr: 0.000029 - momentum: 0.000000
2023-10-20 00:26:15,214 epoch 2 - iter 52/136 - loss 0.76853305 - time (sec): 1.39 - samples/sec: 14236.28 - lr: 0.000029 - momentum: 0.000000
2023-10-20 00:26:15,545 epoch 2 - iter 65/136 - loss 0.73432052 - time (sec): 1.72 - samples/sec: 14525.79 - lr: 0.000028 - momentum: 0.000000
2023-10-20 00:26:15,855 epoch 2 - iter 78/136 - loss 0.71393397 - time (sec): 2.03 - samples/sec: 14640.23 - lr: 0.000028 - momentum: 0.000000
2023-10-20 00:26:16,199 epoch 2 - iter 91/136 - loss 0.72380019 - time (sec): 2.37 - samples/sec: 14811.50 - lr: 0.000028 - momentum: 0.000000
2023-10-20 00:26:16,537 epoch 2 - iter 104/136 - loss 0.71622364 - time (sec): 2.71 - samples/sec: 14663.57 - lr: 0.000027 - momentum: 0.000000
2023-10-20 00:26:16,907 epoch 2 - iter 117/136 - loss 0.70749680 - time (sec): 3.08 - samples/sec: 14619.10 - lr: 0.000027 - momentum: 0.000000
2023-10-20 00:26:17,247 epoch 2 - iter 130/136 - loss 0.70273919 - time (sec): 3.42 - samples/sec: 14563.18 - lr: 0.000027 - momentum: 0.000000
2023-10-20 00:26:17,407 ----------------------------------------------------------------------------------------------------
2023-10-20 00:26:17,407 EPOCH 2 done: loss 0.6946 - lr: 0.000027
2023-10-20 00:26:18,182 DEV : loss 0.43850278854370117 - f1-score (micro avg) 0.0
2023-10-20 00:26:18,186 ----------------------------------------------------------------------------------------------------
2023-10-20 00:26:18,735 epoch 3 - iter 13/136 - loss 0.52277741 - time (sec): 0.55 - samples/sec: 9947.25 - lr: 0.000026 - momentum: 0.000000
2023-10-20 00:26:19,065 epoch 3 - iter 26/136 - loss 0.58588897 - time (sec): 0.88 - samples/sec: 11437.94 - lr: 0.000026 - momentum: 0.000000
2023-10-20 00:26:19,405 epoch 3 - iter 39/136 - loss 0.58906191 - time (sec): 1.22 - samples/sec: 12078.54 - lr: 0.000026 - momentum: 0.000000
2023-10-20 00:26:19,758 epoch 3 - iter 52/136 - loss 0.55673249 - time (sec): 1.57 - samples/sec: 12046.72 - lr: 0.000025 - momentum: 0.000000
2023-10-20 00:26:20,128 epoch 3 - iter 65/136 - loss 0.57035076 - time (sec): 1.94 - samples/sec: 12361.01 - lr: 0.000025 - momentum: 0.000000
2023-10-20 00:26:20,531 epoch 3 - iter 78/136 - loss 0.56576822 - time (sec): 2.34 - samples/sec: 12895.07 - lr: 0.000025 - momentum: 0.000000
2023-10-20 00:26:20,904 epoch 3 - iter 91/136 - loss 0.54724953 - time (sec): 2.72 - samples/sec: 13498.02 - lr: 0.000024 - momentum: 0.000000
2023-10-20 00:26:21,257 epoch 3 - iter 104/136 - loss 0.54406757 - time (sec): 3.07 - samples/sec: 13490.92 - lr: 0.000024 - momentum: 0.000000
2023-10-20 00:26:21,618 epoch 3 - iter 117/136 - loss 0.54369216 - time (sec): 3.43 - samples/sec: 13313.69 - lr: 0.000024 - momentum: 0.000000
2023-10-20 00:26:21,967 epoch 3 - iter 130/136 - loss 0.54215594 - time (sec): 3.78 - samples/sec: 13275.25 - lr: 0.000024 - momentum: 0.000000
2023-10-20 00:26:22,121 ----------------------------------------------------------------------------------------------------
2023-10-20 00:26:22,121 EPOCH 3 done: loss 0.5394 - lr: 0.000024
2023-10-20 00:26:22,876 DEV : loss 0.38147714734077454 - f1-score (micro avg) 0.0142
2023-10-20 00:26:22,881 saving best model
2023-10-20 00:26:22,910 ----------------------------------------------------------------------------------------------------
2023-10-20 00:26:23,233 epoch 4 - iter 13/136 - loss 0.52978791 - time (sec): 0.32 - samples/sec: 13672.32 - lr: 0.000023 - momentum: 0.000000
2023-10-20 00:26:23,574 epoch 4 - iter 26/136 - loss 0.47230636 - time (sec): 0.66 - samples/sec: 14382.16 - lr: 0.000023 - momentum: 0.000000
2023-10-20 00:26:23,913 epoch 4 - iter 39/136 - loss 0.47590949 - time (sec): 1.00 - samples/sec: 14642.28 - lr: 0.000022 - momentum: 0.000000
2023-10-20 00:26:24,303 epoch 4 - iter 52/136 - loss 0.47117945 - time (sec): 1.39 - samples/sec: 14811.71 - lr: 0.000022 - momentum: 0.000000
2023-10-20 00:26:24,657 epoch 4 - iter 65/136 - loss 0.47644536 - time (sec): 1.75 - samples/sec: 14692.50 - lr: 0.000022 - momentum: 0.000000
2023-10-20 00:26:25,009 epoch 4 - iter 78/136 - loss 0.47178728 - time (sec): 2.10 - samples/sec: 14508.13 - lr: 0.000021 - momentum: 0.000000
2023-10-20 00:26:25,355 epoch 4 - iter 91/136 - loss 0.48317000 - time (sec): 2.44 - samples/sec: 14580.25 - lr: 0.000021 - momentum: 0.000000
2023-10-20 00:26:25,698 epoch 4 - iter 104/136 - loss 0.48174032 - time (sec): 2.79 - samples/sec: 14507.93 - lr: 0.000021 - momentum: 0.000000
2023-10-20 00:26:26,024 epoch 4 - iter 117/136 - loss 0.48549408 - time (sec): 3.11 - samples/sec: 14283.01 - lr: 0.000021 - momentum: 0.000000
2023-10-20 00:26:26,378 epoch 4 - iter 130/136 - loss 0.48274212 - time (sec): 3.47 - samples/sec: 14381.63 - lr: 0.000020 - momentum: 0.000000
2023-10-20 00:26:26,542 ----------------------------------------------------------------------------------------------------
2023-10-20 00:26:26,542 EPOCH 4 done: loss 0.4827 - lr: 0.000020
2023-10-20 00:26:27,303 DEV : loss 0.36432167887687683 - f1-score (micro avg) 0.0
2023-10-20 00:26:27,308 ----------------------------------------------------------------------------------------------------
2023-10-20 00:26:27,677 epoch 5 - iter 13/136 - loss 0.47913941 - time (sec): 0.37 - samples/sec: 13765.02 - lr: 0.000020 - momentum: 0.000000
2023-10-20 00:26:28,041 epoch 5 - iter 26/136 - loss 0.43933384 - time (sec): 0.73 - samples/sec: 13207.64 - lr: 0.000019 - momentum: 0.000000
2023-10-20 00:26:28,390 epoch 5 - iter 39/136 - loss 0.45975437 - time (sec): 1.08 - samples/sec: 13365.86 - lr: 0.000019 - momentum: 0.000000
2023-10-20 00:26:28,750 epoch 5 - iter 52/136 - loss 0.46104503 - time (sec): 1.44 - samples/sec: 13503.49 - lr: 0.000019 - momentum: 0.000000
2023-10-20 00:26:29,114 epoch 5 - iter 65/136 - loss 0.46501693 - time (sec): 1.81 - samples/sec: 13818.84 - lr: 0.000018 - momentum: 0.000000
2023-10-20 00:26:29,469 epoch 5 - iter 78/136 - loss 0.45833541 - time (sec): 2.16 - samples/sec: 13670.31 - lr: 0.000018 - momentum: 0.000000
2023-10-20 00:26:29,829 epoch 5 - iter 91/136 - loss 0.44943291 - time (sec): 2.52 - samples/sec: 13757.89 - lr: 0.000018 - momentum: 0.000000
2023-10-20 00:26:30,180 epoch 5 - iter 104/136 - loss 0.44174929 - time (sec): 2.87 - samples/sec: 13812.30 - lr: 0.000018 - momentum: 0.000000
2023-10-20 00:26:30,528 epoch 5 - iter 117/136 - loss 0.45142578 - time (sec): 3.22 - samples/sec: 13756.39 - lr: 0.000017 - momentum: 0.000000
2023-10-20 00:26:30,890 epoch 5 - iter 130/136 - loss 0.45187743 - time (sec): 3.58 - samples/sec: 13844.00 - lr: 0.000017 - momentum: 0.000000
2023-10-20 00:26:31,204 ----------------------------------------------------------------------------------------------------
2023-10-20 00:26:31,205 EPOCH 5 done: loss 0.4502 - lr: 0.000017
2023-10-20 00:26:31,947 DEV : loss 0.3404705226421356 - f1-score (micro avg) 0.0528
2023-10-20 00:26:31,951 saving best model
2023-10-20 00:26:31,980 ----------------------------------------------------------------------------------------------------
2023-10-20 00:26:32,338 epoch 6 - iter 13/136 - loss 0.50175299 - time (sec): 0.36 - samples/sec: 12501.55 - lr: 0.000016 - momentum: 0.000000
2023-10-20 00:26:32,697 epoch 6 - iter 26/136 - loss 0.49982765 - time (sec): 0.72 - samples/sec: 13040.53 - lr: 0.000016 - momentum: 0.000000
2023-10-20 00:26:33,053 epoch 6 - iter 39/136 - loss 0.45027033 - time (sec): 1.07 - samples/sec: 13951.57 - lr: 0.000016 - momentum: 0.000000
2023-10-20 00:26:33,417 epoch 6 - iter 52/136 - loss 0.45574613 - time (sec): 1.44 - samples/sec: 14000.59 - lr: 0.000015 - momentum: 0.000000
2023-10-20 00:26:33,777 epoch 6 - iter 65/136 - loss 0.45240642 - time (sec): 1.80 - samples/sec: 13905.97 - lr: 0.000015 - momentum: 0.000000
2023-10-20 00:26:34,132 epoch 6 - iter 78/136 - loss 0.43718518 - time (sec): 2.15 - samples/sec: 14091.94 - lr: 0.000015 - momentum: 0.000000
2023-10-20 00:26:34,492 epoch 6 - iter 91/136 - loss 0.42547239 - time (sec): 2.51 - samples/sec: 14165.45 - lr: 0.000015 - momentum: 0.000000
2023-10-20 00:26:34,836 epoch 6 - iter 104/136 - loss 0.43048018 - time (sec): 2.86 - samples/sec: 14021.86 - lr: 0.000014 - momentum: 0.000000
2023-10-20 00:26:35,181 epoch 6 - iter 117/136 - loss 0.42841402 - time (sec): 3.20 - samples/sec: 13965.90 - lr: 0.000014 - momentum: 0.000000
2023-10-20 00:26:35,534 epoch 6 - iter 130/136 - loss 0.43120978 - time (sec): 3.55 - samples/sec: 14001.95 - lr: 0.000014 - momentum: 0.000000
2023-10-20 00:26:35,699 ----------------------------------------------------------------------------------------------------
2023-10-20 00:26:35,699 EPOCH 6 done: loss 0.4306 - lr: 0.000014
2023-10-20 00:26:36,462 DEV : loss 0.3289472460746765 - f1-score (micro avg) 0.0831
2023-10-20 00:26:36,465 saving best model
2023-10-20 00:26:36,494 ----------------------------------------------------------------------------------------------------
2023-10-20 00:26:36,854 epoch 7 - iter 13/136 - loss 0.45984081 - time (sec): 0.36 - samples/sec: 14860.40 - lr: 0.000013 - momentum: 0.000000
2023-10-20 00:26:37,212 epoch 7 - iter 26/136 - loss 0.44705805 - time (sec): 0.72 - samples/sec: 14789.76 - lr: 0.000013 - momentum: 0.000000
2023-10-20 00:26:37,577 epoch 7 - iter 39/136 - loss 0.43546882 - time (sec): 1.08 - samples/sec: 14894.11 - lr: 0.000012 - momentum: 0.000000
2023-10-20 00:26:37,929 epoch 7 - iter 52/136 - loss 0.41545710 - time (sec): 1.43 - samples/sec: 14654.94 - lr: 0.000012 - momentum: 0.000000
2023-10-20 00:26:38,268 epoch 7 - iter 65/136 - loss 0.41513872 - time (sec): 1.77 - samples/sec: 14218.19 - lr: 0.000012 - momentum: 0.000000
2023-10-20 00:26:38,630 epoch 7 - iter 78/136 - loss 0.40326853 - time (sec): 2.13 - samples/sec: 14374.55 - lr: 0.000012 - momentum: 0.000000
2023-10-20 00:26:38,978 epoch 7 - iter 91/136 - loss 0.40592087 - time (sec): 2.48 - samples/sec: 14354.66 - lr: 0.000011 - momentum: 0.000000
2023-10-20 00:26:39,324 epoch 7 - iter 104/136 - loss 0.40184235 - time (sec): 2.83 - samples/sec: 14327.56 - lr: 0.000011 - momentum: 0.000000
2023-10-20 00:26:39,661 epoch 7 - iter 117/136 - loss 0.40453948 - time (sec): 3.17 - samples/sec: 14140.94 - lr: 0.000011 - momentum: 0.000000
2023-10-20 00:26:40,013 epoch 7 - iter 130/136 - loss 0.40372435 - time (sec): 3.52 - samples/sec: 14226.77 - lr: 0.000010 - momentum: 0.000000
2023-10-20 00:26:40,164 ----------------------------------------------------------------------------------------------------
2023-10-20 00:26:40,164 EPOCH 7 done: loss 0.4053 - lr: 0.000010
2023-10-20 00:26:40,916 DEV : loss 0.31105130910873413 - f1-score (micro avg) 0.1194
2023-10-20 00:26:40,919 saving best model
2023-10-20 00:26:40,949 ----------------------------------------------------------------------------------------------------
2023-10-20 00:26:41,317 epoch 8 - iter 13/136 - loss 0.31696442 - time (sec): 0.37 - samples/sec: 13473.49 - lr: 0.000010 - momentum: 0.000000
2023-10-20 00:26:41,679 epoch 8 - iter 26/136 - loss 0.37699195 - time (sec): 0.73 - samples/sec: 13792.15 - lr: 0.000009 - momentum: 0.000000
2023-10-20 00:26:42,209 epoch 8 - iter 39/136 - loss 0.39759404 - time (sec): 1.26 - samples/sec: 13162.24 - lr: 0.000009 - momentum: 0.000000
2023-10-20 00:26:42,560 epoch 8 - iter 52/136 - loss 0.39396828 - time (sec): 1.61 - samples/sec: 12838.80 - lr: 0.000009 - momentum: 0.000000
2023-10-20 00:26:42,926 epoch 8 - iter 65/136 - loss 0.40850508 - time (sec): 1.98 - samples/sec: 13199.17 - lr: 0.000009 - momentum: 0.000000
2023-10-20 00:26:43,273 epoch 8 - iter 78/136 - loss 0.41490364 - time (sec): 2.32 - samples/sec: 13452.86 - lr: 0.000008 - momentum: 0.000000
2023-10-20 00:26:43,655 epoch 8 - iter 91/136 - loss 0.40558301 - time (sec): 2.71 - samples/sec: 13657.84 - lr: 0.000008 - momentum: 0.000000
2023-10-20 00:26:43,992 epoch 8 - iter 104/136 - loss 0.40220342 - time (sec): 3.04 - samples/sec: 13476.75 - lr: 0.000008 - momentum: 0.000000
2023-10-20 00:26:44,343 epoch 8 - iter 117/136 - loss 0.40353876 - time (sec): 3.39 - samples/sec: 13427.45 - lr: 0.000007 - momentum: 0.000000
2023-10-20 00:26:44,680 epoch 8 - iter 130/136 - loss 0.41196536 - time (sec): 3.73 - samples/sec: 13411.49 - lr: 0.000007 - momentum: 0.000000
2023-10-20 00:26:44,842 ----------------------------------------------------------------------------------------------------
2023-10-20 00:26:44,842 EPOCH 8 done: loss 0.4095 - lr: 0.000007
2023-10-20 00:26:45,596 DEV : loss 0.31313076615333557 - f1-score (micro avg) 0.1201
2023-10-20 00:26:45,600 saving best model
2023-10-20 00:26:45,629 ----------------------------------------------------------------------------------------------------
2023-10-20 00:26:45,969 epoch 9 - iter 13/136 - loss 0.43448810 - time (sec): 0.34 - samples/sec: 14081.82 - lr: 0.000006 - momentum: 0.000000
2023-10-20 00:26:46,337 epoch 9 - iter 26/136 - loss 0.42036923 - time (sec): 0.71 - samples/sec: 14653.60 - lr: 0.000006 - momentum: 0.000000
2023-10-20 00:26:46,699 epoch 9 - iter 39/136 - loss 0.44891315 - time (sec): 1.07 - samples/sec: 14429.92 - lr: 0.000006 - momentum: 0.000000
2023-10-20 00:26:47,084 epoch 9 - iter 52/136 - loss 0.41622964 - time (sec): 1.45 - samples/sec: 14667.10 - lr: 0.000006 - momentum: 0.000000
2023-10-20 00:26:47,461 epoch 9 - iter 65/136 - loss 0.40763375 - time (sec): 1.83 - samples/sec: 14216.77 - lr: 0.000005 - momentum: 0.000000
2023-10-20 00:26:47,817 epoch 9 - iter 78/136 - loss 0.41340501 - time (sec): 2.19 - samples/sec: 13746.41 - lr: 0.000005 - momentum: 0.000000
2023-10-20 00:26:48,204 epoch 9 - iter 91/136 - loss 0.40735232 - time (sec): 2.57 - samples/sec: 13774.69 - lr: 0.000005 - momentum: 0.000000
2023-10-20 00:26:48,568 epoch 9 - iter 104/136 - loss 0.39881921 - time (sec): 2.94 - samples/sec: 13795.60 - lr: 0.000004 - momentum: 0.000000
2023-10-20 00:26:48,953 epoch 9 - iter 117/136 - loss 0.40044632 - time (sec): 3.32 - samples/sec: 13851.86 - lr: 0.000004 - momentum: 0.000000
2023-10-20 00:26:49,322 epoch 9 - iter 130/136 - loss 0.40023516 - time (sec): 3.69 - samples/sec: 13655.76 - lr: 0.000004 - momentum: 0.000000
2023-10-20 00:26:49,482 ----------------------------------------------------------------------------------------------------
2023-10-20 00:26:49,482 EPOCH 9 done: loss 0.3993 - lr: 0.000004
2023-10-20 00:26:50,255 DEV : loss 0.31110045313835144 - f1-score (micro avg) 0.1349
2023-10-20 00:26:50,259 saving best model
2023-10-20 00:26:50,289 ----------------------------------------------------------------------------------------------------
2023-10-20 00:26:50,686 epoch 10 - iter 13/136 - loss 0.43622969 - time (sec): 0.40 - samples/sec: 14217.95 - lr: 0.000003 - momentum: 0.000000
2023-10-20 00:26:51,076 epoch 10 - iter 26/136 - loss 0.39387330 - time (sec): 0.79 - samples/sec: 13004.00 - lr: 0.000003 - momentum: 0.000000
2023-10-20 00:26:51,448 epoch 10 - iter 39/136 - loss 0.38967037 - time (sec): 1.16 - samples/sec: 13242.34 - lr: 0.000003 - momentum: 0.000000
2023-10-20 00:26:51,810 epoch 10 - iter 52/136 - loss 0.38697306 - time (sec): 1.52 - samples/sec: 13404.58 - lr: 0.000002 - momentum: 0.000000
2023-10-20 00:26:52,169 epoch 10 - iter 65/136 - loss 0.37368761 - time (sec): 1.88 - samples/sec: 13531.42 - lr: 0.000002 - momentum: 0.000000
2023-10-20 00:26:52,513 epoch 10 - iter 78/136 - loss 0.37859373 - time (sec): 2.22 - samples/sec: 13428.23 - lr: 0.000002 - momentum: 0.000000
2023-10-20 00:26:52,868 epoch 10 - iter 91/136 - loss 0.37944359 - time (sec): 2.58 - samples/sec: 13472.23 - lr: 0.000001 - momentum: 0.000000
2023-10-20 00:26:53,223 epoch 10 - iter 104/136 - loss 0.39037495 - time (sec): 2.93 - samples/sec: 13526.05 - lr: 0.000001 - momentum: 0.000000
2023-10-20 00:26:53,576 epoch 10 - iter 117/136 - loss 0.38402079 - time (sec): 3.29 - samples/sec: 13590.61 - lr: 0.000001 - momentum: 0.000000
2023-10-20 00:26:54,080 epoch 10 - iter 130/136 - loss 0.38923555 - time (sec): 3.79 - samples/sec: 13176.70 - lr: 0.000000 - momentum: 0.000000
2023-10-20 00:26:54,226 ----------------------------------------------------------------------------------------------------
2023-10-20 00:26:54,226 EPOCH 10 done: loss 0.3892 - lr: 0.000000
2023-10-20 00:26:54,980 DEV : loss 0.3083120286464691 - f1-score (micro avg) 0.1399
2023-10-20 00:26:54,984 saving best model
2023-10-20 00:26:55,039 ----------------------------------------------------------------------------------------------------
2023-10-20 00:26:55,040 Loading model from best epoch ...
2023-10-20 00:26:55,115 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd, S-ORG, B-ORG, E-ORG, I-ORG
2023-10-20 00:26:55,900 
Results:
- F-score (micro) 0.1263
- F-score (macro) 0.0633
- Accuracy 0.07

By class:
              precision    recall  f1-score   support

         PER     0.1901    0.2212    0.2044       208
         LOC     0.5000    0.0256    0.0488       312
         ORG     0.0000    0.0000    0.0000        55
   HumanProd     0.0000    0.0000    0.0000        22

   micro avg     0.2093    0.0905    0.1263       597
   macro avg     0.1725    0.0617    0.0633       597
weighted avg     0.3275    0.0905    0.0967       597

2023-10-20 00:26:55,900 ----------------------------------------------------------------------------------------------------