2023-10-20 00:14:53,713 ----------------------------------------------------------------------------------------------------
2023-10-20 00:14:53,713 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 128)
        (position_embeddings): Embedding(512, 128)
        (token_type_embeddings): Embedding(2, 128)
        (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-1): 2 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=128, out_features=128, bias=True)
                (key): Linear(in_features=128, out_features=128, bias=True)
                (value): Linear(in_features=128, out_features=128, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=128, out_features=128, bias=True)
                (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=128, out_features=512, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=512, out_features=128, bias=True)
              (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=128, out_features=128, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=128, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-20 00:14:53,713 ----------------------------------------------------------------------------------------------------
2023-10-20 00:14:53,713 MultiCorpus: 1085 train + 148 dev + 364 test sentences
 - NER_HIPE_2022 Corpus: 1085 train + 148 dev + 364 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/sv/with_doc_seperator
2023-10-20 00:14:53,713 ----------------------------------------------------------------------------------------------------
2023-10-20 00:14:53,714 Train:  1085 sentences
2023-10-20 00:14:53,714         (train_with_dev=False, train_with_test=False)
2023-10-20 00:14:53,714 ----------------------------------------------------------------------------------------------------
2023-10-20 00:14:53,714 Training Params:
2023-10-20 00:14:53,714  - learning_rate: "3e-05"
2023-10-20 00:14:53,714  - mini_batch_size: "8"
2023-10-20 00:14:53,714  - max_epochs: "10"
2023-10-20 00:14:53,714  - shuffle: "True"
2023-10-20 00:14:53,714 ----------------------------------------------------------------------------------------------------
2023-10-20 00:14:53,714 Plugins:
2023-10-20 00:14:53,714  - TensorboardLogger
2023-10-20 00:14:53,714  - LinearScheduler | warmup_fraction: '0.1'
2023-10-20 00:14:53,714 ----------------------------------------------------------------------------------------------------
2023-10-20 00:14:53,714 Final evaluation on model from best epoch (best-model.pt)
2023-10-20 00:14:53,714  - metric: "('micro avg', 'f1-score')"
2023-10-20 00:14:53,714 ----------------------------------------------------------------------------------------------------
2023-10-20 00:14:53,714 Computation:
2023-10-20 00:14:53,714  - compute on device: cuda:0
2023-10-20 00:14:53,714  - embedding storage: none
2023-10-20 00:14:53,714 ----------------------------------------------------------------------------------------------------
2023-10-20 00:14:53,714 Model training base path: "hmbench-newseye/sv-dbmdz/bert-tiny-historic-multilingual-cased-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-2"
2023-10-20 00:14:53,714 ----------------------------------------------------------------------------------------------------
2023-10-20 00:14:53,714 ----------------------------------------------------------------------------------------------------
2023-10-20 00:14:53,714 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-20 00:14:54,037 epoch 1 - iter 13/136 - loss 2.72152588 - time (sec): 0.32 - samples/sec: 14670.99 - lr: 0.000003 - momentum: 0.000000
2023-10-20 00:14:54,392 epoch 1 - iter 26/136 - loss 2.78487699 - time (sec): 0.68 - samples/sec: 15197.47 - lr: 0.000006 - momentum: 0.000000
2023-10-20 00:14:54,714 epoch 1 - iter 39/136 - loss 2.77142009 - time (sec): 1.00 - samples/sec: 13959.57 - lr: 0.000008 - momentum: 0.000000
2023-10-20 00:14:55,063 epoch 1 - iter 52/136 - loss 2.70481600 - time (sec): 1.35 - samples/sec: 13540.38 - lr: 0.000011 - momentum: 0.000000
2023-10-20 00:14:55,437 epoch 1 - iter 65/136 - loss 2.61063044 - time (sec): 1.72 - samples/sec: 13659.15 - lr: 0.000014 - momentum: 0.000000
2023-10-20 00:14:55,789 epoch 1 - iter 78/136 - loss 2.51998300 - time (sec): 2.07 - samples/sec: 13812.47 - lr: 0.000017 - momentum: 0.000000
2023-10-20 00:14:56,150 epoch 1 - iter 91/136 - loss 2.40568426 - time (sec): 2.44 - samples/sec: 14092.89 - lr: 0.000020 - momentum: 0.000000
2023-10-20 00:14:56,497 epoch 1 - iter 104/136 - loss 2.29489111 - time (sec): 2.78 - samples/sec: 14204.10 - lr: 0.000023 - momentum: 0.000000
2023-10-20 00:14:56,871 epoch 1 - iter 117/136 - loss 2.14436655 - time (sec): 3.16 - samples/sec: 14496.12 - lr: 0.000026 - momentum: 0.000000
2023-10-20 00:14:57,217 epoch 1 - iter 130/136 - loss 2.03866895 - time (sec): 3.50 - samples/sec: 14314.43 - lr: 0.000028 - momentum: 0.000000
2023-10-20 00:14:57,370 ----------------------------------------------------------------------------------------------------
2023-10-20 00:14:57,370 EPOCH 1 done: loss 2.0027 - lr: 0.000028
2023-10-20 00:14:57,644 DEV : loss 0.5147674083709717 - f1-score (micro avg)  0.0
2023-10-20 00:14:57,647 ----------------------------------------------------------------------------------------------------
2023-10-20 00:14:57,998 epoch 2 - iter 13/136 - loss 0.76075803 - time (sec): 0.35 - samples/sec: 13093.46 - lr: 0.000030 - momentum: 0.000000
2023-10-20 00:14:58,358 epoch 2 - iter 26/136 - loss 0.68474176 - time (sec): 0.71 - samples/sec: 13609.77 - lr: 0.000029 - momentum: 0.000000
2023-10-20 00:14:58,720 epoch 2 - iter 39/136 - loss 0.67307661 - time (sec): 1.07 - samples/sec: 14304.35 - lr: 0.000029 - momentum: 0.000000
2023-10-20 00:14:59,064 epoch 2 - iter 52/136 - loss 0.67974143 - time (sec): 1.42 - samples/sec: 14231.98 - lr: 0.000029 - momentum: 0.000000
2023-10-20 00:14:59,438 epoch 2 - iter 65/136 - loss 0.66999883 - time (sec): 1.79 - samples/sec: 14118.85 - lr: 0.000028 - momentum: 0.000000
2023-10-20 00:14:59,787 epoch 2 - iter 78/136 - loss 0.64717688 - time (sec): 2.14 - samples/sec: 14069.74 - lr: 0.000028 - momentum: 0.000000
2023-10-20 00:15:00,127 epoch 2 - iter 91/136 - loss 0.64394003 - time (sec): 2.48 - samples/sec: 14098.17 - lr: 0.000028 - momentum: 0.000000
2023-10-20 00:15:00,476 epoch 2 - iter 104/136 - loss 0.64957491 - time (sec): 2.83 - samples/sec: 14269.41 - lr: 0.000027 - momentum: 0.000000
2023-10-20 00:15:00,819 epoch 2 - iter 117/136 - loss 0.65572779 - time (sec): 3.17 - samples/sec: 14018.30 - lr: 0.000027 - momentum: 0.000000
2023-10-20 00:15:01,171 epoch 2 - iter 130/136 - loss 0.64649843 - time (sec): 3.52 - samples/sec: 14109.96 - lr: 0.000027 - momentum: 0.000000
2023-10-20 00:15:01,334 ----------------------------------------------------------------------------------------------------
2023-10-20 00:15:01,334 EPOCH 2 done: loss 0.6472 - lr: 0.000027
2023-10-20 00:15:02,256 DEV : loss 0.4484618604183197 - f1-score (micro avg)  0.0
2023-10-20 00:15:02,261 ----------------------------------------------------------------------------------------------------
2023-10-20 00:15:02,614 epoch 3 - iter 13/136 - loss 0.52962342 - time (sec): 0.35 - samples/sec: 13201.03 - lr: 0.000026 - momentum: 0.000000
2023-10-20 00:15:02,957 epoch 3 - iter 26/136 - loss 0.57354178 - time (sec): 0.70 - samples/sec: 12403.69 - lr: 0.000026 - momentum: 0.000000
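The Plugins section lists a LinearScheduler with warmup_fraction '0.1', and the per-iteration lr values above trace exactly that shape: a ramp from about 3e-06 up to the peak learning_rate of 3e-05, then a linear decay toward zero. A minimal sketch of such a one-cycle linear schedule, assuming 136 batches per epoch over 10 epochs as in this run (`linear_warmup_lr` is an illustrative helper, not Flair's actual implementation):

```python
def linear_warmup_lr(step, total_steps, peak_lr, warmup_fraction=0.1):
    """Linear warmup for the first warmup_fraction of steps, then linear decay to 0."""
    warmup_steps = int(total_steps * warmup_fraction)
    if step < warmup_steps:
        # ramp up: 0 -> peak_lr over the warmup window
        return peak_lr * step / warmup_steps
    # decay: peak_lr -> 0 over the remaining steps
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)

total = 136 * 10  # 136 iterations per epoch x 10 epochs = 1360 optimizer steps
print(round(linear_warmup_lr(13, total, 3e-05), 6))    # epoch 1, iter 13  -> 3e-06
print(round(linear_warmup_lr(136, total, 3e-05), 6))   # end of warmup    -> 3e-05
print(round(linear_warmup_lr(1360, total, 3e-05), 6))  # end of training  -> 0.0
```

The rounded values match the logged `lr:` columns, e.g. 0.000003 at epoch 1 iter 13 and 0.000000 near the end of epoch 10.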
2023-10-20 00:15:03,296 epoch 3 - iter 39/136 - loss 0.59345434 - time (sec): 1.03 - samples/sec: 12821.34 - lr: 0.000026 - momentum: 0.000000
2023-10-20 00:15:03,639 epoch 3 - iter 52/136 - loss 0.56801542 - time (sec): 1.38 - samples/sec: 13196.90 - lr: 0.000025 - momentum: 0.000000
2023-10-20 00:15:03,996 epoch 3 - iter 65/136 - loss 0.56542056 - time (sec): 1.73 - samples/sec: 13651.98 - lr: 0.000025 - momentum: 0.000000
2023-10-20 00:15:04,353 epoch 3 - iter 78/136 - loss 0.57839887 - time (sec): 2.09 - samples/sec: 13666.00 - lr: 0.000025 - momentum: 0.000000
2023-10-20 00:15:04,710 epoch 3 - iter 91/136 - loss 0.57167273 - time (sec): 2.45 - samples/sec: 13820.74 - lr: 0.000024 - momentum: 0.000000
2023-10-20 00:15:05,077 epoch 3 - iter 104/136 - loss 0.56833936 - time (sec): 2.82 - samples/sec: 14255.05 - lr: 0.000024 - momentum: 0.000000
2023-10-20 00:15:05,443 epoch 3 - iter 117/136 - loss 0.55811131 - time (sec): 3.18 - samples/sec: 14123.93 - lr: 0.000024 - momentum: 0.000000
2023-10-20 00:15:05,795 epoch 3 - iter 130/136 - loss 0.55001629 - time (sec): 3.53 - samples/sec: 14209.42 - lr: 0.000024 - momentum: 0.000000
2023-10-20 00:15:05,939 ----------------------------------------------------------------------------------------------------
2023-10-20 00:15:05,939 EPOCH 3 done: loss 0.5457 - lr: 0.000024
2023-10-20 00:15:06,696 DEV : loss 0.38889962434768677 - f1-score (micro avg)  0.0
2023-10-20 00:15:06,700 ----------------------------------------------------------------------------------------------------
2023-10-20 00:15:07,059 epoch 4 - iter 13/136 - loss 0.50154065 - time (sec): 0.36 - samples/sec: 14223.36 - lr: 0.000023 - momentum: 0.000000
2023-10-20 00:15:07,433 epoch 4 - iter 26/136 - loss 0.50606387 - time (sec): 0.73 - samples/sec: 14296.91 - lr: 0.000023 - momentum: 0.000000
2023-10-20 00:15:07,798 epoch 4 - iter 39/136 - loss 0.49365564 - time (sec): 1.10 - samples/sec: 14267.18 - lr: 0.000022 - momentum: 0.000000
2023-10-20 00:15:08,136 epoch 4 - iter 52/136 - loss 0.49818870 - time (sec): 1.44 - samples/sec: 13973.64 - lr: 0.000022 - momentum: 0.000000
2023-10-20 00:15:08,477 epoch 4 - iter 65/136 - loss 0.50072732 - time (sec): 1.78 - samples/sec: 13848.54 - lr: 0.000022 - momentum: 0.000000
2023-10-20 00:15:08,829 epoch 4 - iter 78/136 - loss 0.50225318 - time (sec): 2.13 - samples/sec: 13777.37 - lr: 0.000021 - momentum: 0.000000
2023-10-20 00:15:09,189 epoch 4 - iter 91/136 - loss 0.49964287 - time (sec): 2.49 - samples/sec: 13999.68 - lr: 0.000021 - momentum: 0.000000
2023-10-20 00:15:09,563 epoch 4 - iter 104/136 - loss 0.48388305 - time (sec): 2.86 - samples/sec: 14128.12 - lr: 0.000021 - momentum: 0.000000
2023-10-20 00:15:09,899 epoch 4 - iter 117/136 - loss 0.48686068 - time (sec): 3.20 - samples/sec: 14056.48 - lr: 0.000021 - momentum: 0.000000
2023-10-20 00:15:10,249 epoch 4 - iter 130/136 - loss 0.49113062 - time (sec): 3.55 - samples/sec: 14245.09 - lr: 0.000020 - momentum: 0.000000
2023-10-20 00:15:10,390 ----------------------------------------------------------------------------------------------------
2023-10-20 00:15:10,390 EPOCH 4 done: loss 0.4923 - lr: 0.000020
2023-10-20 00:15:11,161 DEV : loss 0.36093470454216003 - f1-score (micro avg)  0.0138
2023-10-20 00:15:11,165 saving best model
2023-10-20 00:15:11,191 ----------------------------------------------------------------------------------------------------
2023-10-20 00:15:11,535 epoch 5 - iter 13/136 - loss 0.48685833 - time (sec): 0.34 - samples/sec: 11961.22 - lr: 0.000020 - momentum: 0.000000
2023-10-20 00:15:11,874 epoch 5 - iter 26/136 - loss 0.48360129 - time (sec): 0.68 - samples/sec: 13230.05 - lr: 0.000019 - momentum: 0.000000
2023-10-20 00:15:12,217 epoch 5 - iter 39/136 - loss 0.46610858 - time (sec): 1.03 - samples/sec: 13717.31 - lr: 0.000019 - momentum: 0.000000
2023-10-20 00:15:12,525 epoch 5 - iter 52/136 - loss 0.45470638 - time (sec): 1.33 - samples/sec: 14260.50 - lr: 0.000019 - momentum: 0.000000
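The "saving best model" lines first appear at epoch 4, once the dev micro-F1 rises above 0.0; checkpoints are then rewritten at every subsequent improvement. The selection logic can be sketched with a hypothetical `BestModelTracker` (an illustration, not Flair's implementation; it assumes the metric's floor is 0.0, which holds for micro-F1):

```python
class BestModelTracker:
    """Track the best dev score seen so far; a True return means 'checkpoint now'."""
    def __init__(self):
        self.best_score = 0.0  # micro-F1 cannot go below 0.0
        self.best_epoch = None

    def update(self, epoch, dev_f1):
        """Return True only when this epoch strictly beats the previous best."""
        if dev_f1 > self.best_score:
            self.best_score, self.best_epoch = dev_f1, epoch
            return True
        return False

# dev micro-F1 per epoch, taken from the first five DEV lines of this log
tracker = BestModelTracker()
dev_scores = {1: 0.0, 2: 0.0, 3: 0.0, 4: 0.0138, 5: 0.0583}
saved = [e for e, f1 in dev_scores.items() if tracker.update(e, f1)]
print(saved)  # -> [4, 5]: only epochs 4 and 5 trigger "saving best model"
```

This reproduces why epochs 1 to 3 log no save despite a falling dev loss: model selection here keys on the ('micro avg', 'f1-score') metric, not on the loss.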
2023-10-20 00:15:12,824 epoch 5 - iter 65/136 - loss 0.46582992 - time (sec): 1.63 - samples/sec: 15190.07 - lr: 0.000018 - momentum: 0.000000
2023-10-20 00:15:13,106 epoch 5 - iter 78/136 - loss 0.47319200 - time (sec): 1.91 - samples/sec: 15193.34 - lr: 0.000018 - momentum: 0.000000
2023-10-20 00:15:13,405 epoch 5 - iter 91/136 - loss 0.47193998 - time (sec): 2.21 - samples/sec: 15268.23 - lr: 0.000018 - momentum: 0.000000
2023-10-20 00:15:13,875 epoch 5 - iter 104/136 - loss 0.47051975 - time (sec): 2.68 - samples/sec: 14771.05 - lr: 0.000018 - momentum: 0.000000
2023-10-20 00:15:14,181 epoch 5 - iter 117/136 - loss 0.46638781 - time (sec): 2.99 - samples/sec: 14963.65 - lr: 0.000017 - momentum: 0.000000
2023-10-20 00:15:14,471 epoch 5 - iter 130/136 - loss 0.46766506 - time (sec): 3.28 - samples/sec: 15010.97 - lr: 0.000017 - momentum: 0.000000
2023-10-20 00:15:14,621 ----------------------------------------------------------------------------------------------------
2023-10-20 00:15:14,621 EPOCH 5 done: loss 0.4669 - lr: 0.000017
2023-10-20 00:15:15,381 DEV : loss 0.33118170499801636 - f1-score (micro avg)  0.0583
2023-10-20 00:15:15,385 saving best model
2023-10-20 00:15:15,415 ----------------------------------------------------------------------------------------------------
2023-10-20 00:15:15,713 epoch 6 - iter 13/136 - loss 0.42746417 - time (sec): 0.30 - samples/sec: 17301.41 - lr: 0.000016 - momentum: 0.000000
2023-10-20 00:15:16,005 epoch 6 - iter 26/136 - loss 0.40067902 - time (sec): 0.59 - samples/sec: 17521.70 - lr: 0.000016 - momentum: 0.000000
2023-10-20 00:15:16,294 epoch 6 - iter 39/136 - loss 0.39703205 - time (sec): 0.88 - samples/sec: 17066.06 - lr: 0.000016 - momentum: 0.000000
2023-10-20 00:15:16,598 epoch 6 - iter 52/136 - loss 0.41625900 - time (sec): 1.18 - samples/sec: 16871.78 - lr: 0.000015 - momentum: 0.000000
2023-10-20 00:15:16,918 epoch 6 - iter 65/136 - loss 0.43477326 - time (sec): 1.50 - samples/sec: 16945.81 - lr: 0.000015 - momentum: 0.000000
2023-10-20 00:15:17,231 epoch 6 - iter 78/136 - loss 0.44587471 - time (sec): 1.82 - samples/sec: 16725.04 - lr: 0.000015 - momentum: 0.000000
2023-10-20 00:15:17,526 epoch 6 - iter 91/136 - loss 0.44234516 - time (sec): 2.11 - samples/sec: 16608.76 - lr: 0.000015 - momentum: 0.000000
2023-10-20 00:15:17,820 epoch 6 - iter 104/136 - loss 0.43939936 - time (sec): 2.40 - samples/sec: 16605.70 - lr: 0.000014 - momentum: 0.000000
2023-10-20 00:15:18,115 epoch 6 - iter 117/136 - loss 0.43863542 - time (sec): 2.70 - samples/sec: 16716.66 - lr: 0.000014 - momentum: 0.000000
2023-10-20 00:15:18,398 epoch 6 - iter 130/136 - loss 0.43744152 - time (sec): 2.98 - samples/sec: 16595.78 - lr: 0.000014 - momentum: 0.000000
2023-10-20 00:15:18,540 ----------------------------------------------------------------------------------------------------
2023-10-20 00:15:18,540 EPOCH 6 done: loss 0.4393 - lr: 0.000014
2023-10-20 00:15:19,304 DEV : loss 0.3201183080673218 - f1-score (micro avg)  0.0688
2023-10-20 00:15:19,308 saving best model
2023-10-20 00:15:19,338 ----------------------------------------------------------------------------------------------------
2023-10-20 00:15:19,640 epoch 7 - iter 13/136 - loss 0.47262970 - time (sec): 0.30 - samples/sec: 16933.74 - lr: 0.000013 - momentum: 0.000000
2023-10-20 00:15:19,941 epoch 7 - iter 26/136 - loss 0.46150833 - time (sec): 0.60 - samples/sec: 17857.65 - lr: 0.000013 - momentum: 0.000000
2023-10-20 00:15:20,217 epoch 7 - iter 39/136 - loss 0.45981501 - time (sec): 0.88 - samples/sec: 16783.72 - lr: 0.000012 - momentum: 0.000000
2023-10-20 00:15:20,507 epoch 7 - iter 52/136 - loss 0.45397364 - time (sec): 1.17 - samples/sec: 16965.94 - lr: 0.000012 - momentum: 0.000000
2023-10-20 00:15:20,814 epoch 7 - iter 65/136 - loss 0.43705463 - time (sec): 1.47 - samples/sec: 17092.66 - lr: 0.000012 - momentum: 0.000000
2023-10-20 00:15:21,105 epoch 7 - iter 78/136 - loss 0.42748089 - time (sec): 1.77 - samples/sec: 16888.56 - lr: 0.000012 - momentum: 0.000000
2023-10-20 00:15:21,421 epoch 7 - iter 91/136 - loss 0.42287302 - time (sec): 2.08 - samples/sec: 16908.57 - lr: 0.000011 - momentum: 0.000000
2023-10-20 00:15:21,729 epoch 7 - iter 104/136 - loss 0.42334313 - time (sec): 2.39 - samples/sec: 16828.07 - lr: 0.000011 - momentum: 0.000000
2023-10-20 00:15:22,020 epoch 7 - iter 117/136 - loss 0.42227118 - time (sec): 2.68 - samples/sec: 16714.39 - lr: 0.000011 - momentum: 0.000000
2023-10-20 00:15:22,309 epoch 7 - iter 130/136 - loss 0.42414636 - time (sec): 2.97 - samples/sec: 16698.50 - lr: 0.000010 - momentum: 0.000000
2023-10-20 00:15:22,453 ----------------------------------------------------------------------------------------------------
2023-10-20 00:15:22,453 EPOCH 7 done: loss 0.4222 - lr: 0.000010
2023-10-20 00:15:23,228 DEV : loss 0.3113357424736023 - f1-score (micro avg)  0.1068
2023-10-20 00:15:23,232 saving best model
2023-10-20 00:15:23,264 ----------------------------------------------------------------------------------------------------
2023-10-20 00:15:23,593 epoch 8 - iter 13/136 - loss 0.34479809 - time (sec): 0.33 - samples/sec: 17802.61 - lr: 0.000010 - momentum: 0.000000
2023-10-20 00:15:23,908 epoch 8 - iter 26/136 - loss 0.39009149 - time (sec): 0.64 - samples/sec: 16766.92 - lr: 0.000009 - momentum: 0.000000
2023-10-20 00:15:24,387 epoch 8 - iter 39/136 - loss 0.40721604 - time (sec): 1.12 - samples/sec: 13488.55 - lr: 0.000009 - momentum: 0.000000
2023-10-20 00:15:24,735 epoch 8 - iter 52/136 - loss 0.38471006 - time (sec): 1.47 - samples/sec: 14020.41 - lr: 0.000009 - momentum: 0.000000
2023-10-20 00:15:25,083 epoch 8 - iter 65/136 - loss 0.39657642 - time (sec): 1.82 - samples/sec: 14084.04 - lr: 0.000009 - momentum: 0.000000
2023-10-20 00:15:25,441 epoch 8 - iter 78/136 - loss 0.39140758 - time (sec): 2.18 - samples/sec: 13946.80 - lr: 0.000008 - momentum: 0.000000
2023-10-20 00:15:25,803 epoch 8 - iter 91/136 - loss 0.39569377 - time (sec): 2.54 - samples/sec: 13747.17 - lr: 0.000008 - momentum: 0.000000
2023-10-20 00:15:26,169 epoch 8 - iter 104/136 - loss 0.39722167 - time (sec): 2.90 - samples/sec: 13769.71 - lr: 0.000008 - momentum: 0.000000
2023-10-20 00:15:26,507 epoch 8 - iter 117/136 - loss 0.40225322 - time (sec): 3.24 - samples/sec: 13804.68 - lr: 0.000007 - momentum: 0.000000
2023-10-20 00:15:26,850 epoch 8 - iter 130/136 - loss 0.40087826 - time (sec): 3.59 - samples/sec: 13697.91 - lr: 0.000007 - momentum: 0.000000
2023-10-20 00:15:27,013 ----------------------------------------------------------------------------------------------------
2023-10-20 00:15:27,014 EPOCH 8 done: loss 0.4033 - lr: 0.000007
2023-10-20 00:15:27,801 DEV : loss 0.306252121925354 - f1-score (micro avg)  0.1214
2023-10-20 00:15:27,806 saving best model
2023-10-20 00:15:27,837 ----------------------------------------------------------------------------------------------------
2023-10-20 00:15:28,208 epoch 9 - iter 13/136 - loss 0.37030615 - time (sec): 0.37 - samples/sec: 13553.37 - lr: 0.000006 - momentum: 0.000000
2023-10-20 00:15:28,588 epoch 9 - iter 26/136 - loss 0.41854927 - time (sec): 0.75 - samples/sec: 13488.88 - lr: 0.000006 - momentum: 0.000000
2023-10-20 00:15:28,952 epoch 9 - iter 39/136 - loss 0.43664378 - time (sec): 1.11 - samples/sec: 13185.13 - lr: 0.000006 - momentum: 0.000000
2023-10-20 00:15:29,344 epoch 9 - iter 52/136 - loss 0.41430370 - time (sec): 1.51 - samples/sec: 13793.08 - lr: 0.000006 - momentum: 0.000000
2023-10-20 00:15:29,711 epoch 9 - iter 65/136 - loss 0.40126303 - time (sec): 1.87 - samples/sec: 13950.75 - lr: 0.000005 - momentum: 0.000000
2023-10-20 00:15:30,073 epoch 9 - iter 78/136 - loss 0.40287344 - time (sec): 2.24 - samples/sec: 13858.82 - lr: 0.000005 - momentum: 0.000000
2023-10-20 00:15:30,423 epoch 9 - iter 91/136 - loss 0.40298450 - time (sec): 2.59 - samples/sec: 13699.40 - lr: 0.000005 - momentum: 0.000000
2023-10-20 00:15:30,761 epoch 9 - iter 104/136 - loss 0.40534819 - time (sec): 2.92 - samples/sec: 13763.92 - lr: 0.000004 - momentum: 0.000000
2023-10-20 00:15:31,103 epoch 9 - iter 117/136 - loss 0.40660709 - time (sec): 3.27 - samples/sec: 13741.73 - lr: 0.000004 - momentum: 0.000000
2023-10-20 00:15:31,430 epoch 9 - iter 130/136 - loss 0.40302480 - time (sec): 3.59 - samples/sec: 13749.44 - lr: 0.000004 - momentum: 0.000000
2023-10-20 00:15:31,600 ----------------------------------------------------------------------------------------------------
2023-10-20 00:15:31,600 EPOCH 9 done: loss 0.4046 - lr: 0.000004
2023-10-20 00:15:32,365 DEV : loss 0.3038671314716339 - f1-score (micro avg)  0.142
2023-10-20 00:15:32,369 saving best model
2023-10-20 00:15:32,402 ----------------------------------------------------------------------------------------------------
2023-10-20 00:15:32,754 epoch 10 - iter 13/136 - loss 0.46654899 - time (sec): 0.35 - samples/sec: 13147.01 - lr: 0.000003 - momentum: 0.000000
2023-10-20 00:15:33,126 epoch 10 - iter 26/136 - loss 0.39348560 - time (sec): 0.72 - samples/sec: 13843.17 - lr: 0.000003 - momentum: 0.000000
2023-10-20 00:15:33,494 epoch 10 - iter 39/136 - loss 0.37965230 - time (sec): 1.09 - samples/sec: 13304.68 - lr: 0.000003 - momentum: 0.000000
2023-10-20 00:15:33,857 epoch 10 - iter 52/136 - loss 0.40319403 - time (sec): 1.45 - samples/sec: 13658.42 - lr: 0.000002 - momentum: 0.000000
2023-10-20 00:15:34,187 epoch 10 - iter 65/136 - loss 0.38310015 - time (sec): 1.78 - samples/sec: 13803.09 - lr: 0.000002 - momentum: 0.000000
2023-10-20 00:15:34,536 epoch 10 - iter 78/136 - loss 0.39158606 - time (sec): 2.13 - samples/sec: 13825.33 - lr: 0.000002 - momentum: 0.000000
2023-10-20 00:15:34,902 epoch 10 - iter 91/136 - loss 0.39702863 - time (sec): 2.50 - samples/sec: 13852.96 - lr: 0.000001 - momentum: 0.000000
2023-10-20 00:15:35,254 epoch 10 - iter 104/136 - loss 0.40694497 - time (sec): 2.85 - samples/sec: 13624.47 - lr: 0.000001 - momentum: 0.000000
2023-10-20 00:15:35,628 epoch 10 - iter 117/136 - loss 0.39072910 - time (sec): 3.23 - samples/sec: 14022.56 - lr: 0.000001 - momentum: 0.000000
2023-10-20 00:15:36,110 epoch 10 - iter 130/136 - loss 0.39313748 - time (sec): 3.71 - samples/sec: 13420.73 - lr: 0.000000 - momentum: 0.000000
2023-10-20 00:15:36,265 ----------------------------------------------------------------------------------------------------
2023-10-20 00:15:36,265 EPOCH 10 done: loss 0.3934 - lr: 0.000000
2023-10-20 00:15:37,034 DEV : loss 0.3026784360408783 - f1-score (micro avg)  0.1521
2023-10-20 00:15:37,038 saving best model
2023-10-20 00:15:37,094 ----------------------------------------------------------------------------------------------------
2023-10-20 00:15:37,094 Loading model from best epoch ...
2023-10-20 00:15:37,164 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd, S-ORG, B-ORG, E-ORG, I-ORG
2023-10-20 00:15:37,989 Results:
- F-score (micro) 0.1433
- F-score (macro) 0.0781
- Accuracy 0.0802

By class:
              precision    recall  f1-score   support

         PER     0.1779    0.1779    0.1779       208
         LOC     0.3514    0.0833    0.1347       312
         ORG     0.0000    0.0000    0.0000        55
   HumanProd     0.0000    0.0000    0.0000        22

   micro avg     0.2234    0.1055    0.1433       597
   macro avg     0.1323    0.0653    0.0781       597
weighted avg     0.2456    0.1055    0.1324       597

2023-10-20 00:15:37,990 ----------------------------------------------------------------------------------------------------
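The tag dictionary logged at the end contains 17 tags: O plus the four BIOES prefixes (S/B/E/I) for each of the four entity types, which also explains the model's final `Linear(in_features=128, out_features=17)` layer. A sketch of how such a tagset expands (`bioes_tagset` is an illustrative helper, not a Flair function):

```python
def bioes_tagset(entity_types):
    """Expand entity types into a BIOES tagset: O plus S/B/E/I per type."""
    tags = ["O"]
    for etype in entity_types:
        tags += [f"{prefix}-{etype}" for prefix in ("S", "B", "E", "I")]
    return tags

# the four entity types from this corpus, in the order the log prints them
tags = bioes_tagset(["LOC", "PER", "HumanProd", "ORG"])
print(len(tags))  # -> 17, matching the linear layer's out_features
```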
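The gap between the micro F-score (0.1433) and the macro F-score (0.0781) comes from the averaging order: micro pools counts across classes, so the larger PER and LOC classes dominate, while macro averages per-class F1, so the two classes with zero F1 (ORG, HumanProd) pull it down. The sketch below reproduces both numbers; note that the per-class true-positive and prediction counts are reconstructed from the logged precision/recall/support figures and are assumptions, not values printed in the log:

```python
# (tp, predicted, support) per class -- counts inferred from the logged
# precision/recall/support, e.g. PER: recall 0.1779 * support 208 ~= 37 TP
counts = {
    "PER":       (37, 208, 208),
    "LOC":       (26,  74, 312),
    "ORG":       ( 0,   0,  55),
    "HumanProd": ( 0,   0,  22),
}

def f1(tp, pred, supp):
    # 2*TP / (pred + supp) is the harmonic mean of precision and recall
    return 2 * tp / (pred + supp) if pred + supp else 0.0

# macro: unweighted mean of the per-class F1 scores
macro = sum(f1(*c) for c in counts.values()) / len(counts)

# micro: pool TP, predictions, and support across classes, then one F1
tp = sum(c[0] for c in counts.values())
pred = sum(c[1] for c in counts.values())
supp = sum(c[2] for c in counts.values())
micro = f1(tp, pred, supp)

print(round(micro, 4), round(macro, 4))  # -> 0.1433 0.0781
```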