2023-10-19 23:57:29,912 ----------------------------------------------------------------------------------------------------
2023-10-19 23:57:29,913 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 128)
        (position_embeddings): Embedding(512, 128)
        (token_type_embeddings): Embedding(2, 128)
        (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-1): 2 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=128, out_features=128, bias=True)
                (key): Linear(in_features=128, out_features=128, bias=True)
                (value): Linear(in_features=128, out_features=128, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=128, out_features=128, bias=True)
                (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=128, out_features=512, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=512, out_features=128, bias=True)
              (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=128, out_features=128, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=128, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-19 23:57:29,913 ----------------------------------------------------------------------------------------------------
2023-10-19 23:57:29,913 MultiCorpus: 1166 train + 165 dev + 415 test sentences
 - NER_HIPE_2022 Corpus: 1166 train + 165 dev + 415 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fi/with_doc_seperator
2023-10-19 23:57:29,913 ----------------------------------------------------------------------------------------------------
2023-10-19 23:57:29,913 Train:  1166 sentences
2023-10-19 23:57:29,913         (train_with_dev=False, train_with_test=False)
2023-10-19 23:57:29,913 ----------------------------------------------------------------------------------------------------
2023-10-19 23:57:29,913 Training Params:
2023-10-19 23:57:29,913  - learning_rate: "3e-05"
2023-10-19 23:57:29,913  - mini_batch_size: "4"
2023-10-19 23:57:29,913  - max_epochs: "10"
2023-10-19 23:57:29,913  - shuffle: "True"
2023-10-19 23:57:29,913 ----------------------------------------------------------------------------------------------------
2023-10-19 23:57:29,913 Plugins:
2023-10-19 23:57:29,913  - TensorboardLogger
2023-10-19 23:57:29,913  - LinearScheduler | warmup_fraction: '0.1'
2023-10-19 23:57:29,913 ----------------------------------------------------------------------------------------------------
2023-10-19 23:57:29,913 Final evaluation on model from best epoch (best-model.pt)
2023-10-19 23:57:29,913  - metric: "('micro avg', 'f1-score')"
2023-10-19 23:57:29,914 ----------------------------------------------------------------------------------------------------
2023-10-19 23:57:29,914 Computation:
2023-10-19 23:57:29,914  - compute on device: cuda:0
2023-10-19 23:57:29,914  - embedding storage: none
2023-10-19 23:57:29,914 ----------------------------------------------------------------------------------------------------
2023-10-19 23:57:29,914 Model training base path: "hmbench-newseye/fi-dbmdz/bert-tiny-historic-multilingual-cased-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-5"
2023-10-19 23:57:29,914 ----------------------------------------------------------------------------------------------------
2023-10-19 23:57:29,914 ----------------------------------------------------------------------------------------------------
2023-10-19 23:57:29,914 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-19 23:57:30,355 epoch 1 - iter 29/292 - loss 3.15858994 - time (sec): 0.44 - samples/sec: 8983.13 - lr: 0.000003 - momentum: 0.000000
2023-10-19 23:57:30,831 epoch 1 - iter 58/292 - loss 3.12098443 - time (sec): 0.92 - samples/sec: 8389.04 - lr: 0.000006 - momentum: 0.000000
2023-10-19 23:57:31,365 epoch 1 - iter 87/292 - loss 3.09355439 - time (sec): 1.45 - samples/sec: 8367.32 - lr: 0.000009 - momentum: 0.000000
2023-10-19 23:57:31,877 epoch 1 - iter 116/292 - loss 2.97640957 - time (sec): 1.96 - samples/sec: 8214.22 - lr: 0.000012 - momentum: 0.000000
2023-10-19 23:57:32,386 epoch 1 - iter 145/292 - loss 2.82557112 - time (sec): 2.47 - samples/sec: 8349.50 - lr: 0.000015 - momentum: 0.000000
2023-10-19 23:57:32,887 epoch 1 - iter 174/292 - loss 2.65406568 - time (sec): 2.97 - samples/sec: 8321.21 - lr: 0.000018 - momentum: 0.000000
2023-10-19 23:57:33,409 epoch 1 - iter 203/292 - loss 2.39656343 - time (sec): 3.50 - samples/sec: 8608.21 - lr: 0.000021 - momentum: 0.000000
2023-10-19 23:57:33,935 epoch 1 - iter 232/292 - loss 2.21048062 - time (sec): 4.02 - samples/sec: 8588.98 - lr: 0.000024 - momentum: 0.000000
2023-10-19 23:57:34,473 epoch 1 - iter 261/292 - loss 2.03861344 - time (sec): 4.56 - samples/sec: 8670.12 - lr: 0.000027 - momentum: 0.000000
2023-10-19 23:57:34,988 epoch 1 - iter 290/292 - loss 1.90732826 - time (sec): 5.07 - samples/sec: 8730.43 - lr: 0.000030 - momentum: 0.000000
2023-10-19 23:57:35,016 ----------------------------------------------------------------------------------------------------
2023-10-19 23:57:35,016 EPOCH 1 done: loss 1.9026 - lr: 0.000030
2023-10-19 23:57:35,275 DEV : loss 0.4726361334323883 - f1-score (micro avg)  0.0
2023-10-19 23:57:35,279 ----------------------------------------------------------------------------------------------------
2023-10-19 23:57:35,786 epoch 2 - iter 29/292 - loss 0.88527386 - time (sec): 0.51 - samples/sec: 9692.47 - lr: 0.000030 - momentum: 0.000000
2023-10-19 23:57:36,304 epoch 2 - iter 58/292 - loss 0.82889086 - time (sec): 1.02 - samples/sec: 9394.52 - lr: 0.000029 - momentum: 0.000000
2023-10-19 23:57:36,803 epoch 2 - iter 87/292 - loss 0.78547384 - time (sec): 1.52 - samples/sec: 9030.81 - lr: 0.000029 - momentum: 0.000000
2023-10-19 23:57:37,269 epoch 2 - iter 116/292 - loss 0.78010344 - time (sec): 1.99 - samples/sec: 8830.24 - lr: 0.000029 - momentum: 0.000000
2023-10-19 23:57:37,757 epoch 2 - iter 145/292 - loss 0.75316303 - time (sec): 2.48 - samples/sec: 8742.69 - lr: 0.000028 - momentum: 0.000000
2023-10-19 23:57:38,256 epoch 2 - iter 174/292 - loss 0.73185571 - time (sec): 2.98 - samples/sec: 8727.51 - lr: 0.000028 - momentum: 0.000000
2023-10-19 23:57:38,755 epoch 2 - iter 203/292 - loss 0.71216728 - time (sec): 3.48 - samples/sec: 8641.50 - lr: 0.000028 - momentum: 0.000000
2023-10-19 23:57:39,287 epoch 2 - iter 232/292 - loss 0.67823755 - time (sec): 4.01 - samples/sec: 8867.99 - lr: 0.000027 - momentum: 0.000000
2023-10-19 23:57:39,818 epoch 2 - iter 261/292 - loss 0.66528632 - time (sec): 4.54 - samples/sec: 8918.28 - lr: 0.000027 - momentum: 0.000000
2023-10-19 23:57:40,304 epoch 2 - iter 290/292 - loss 0.66538611 - time (sec): 5.02 - samples/sec: 8778.91 - lr: 0.000027 - momentum: 0.000000
2023-10-19 23:57:40,338 ----------------------------------------------------------------------------------------------------
2023-10-19 23:57:40,338 EPOCH 2 done: loss 0.6640 - lr: 0.000027
2023-10-19 23:57:40,964 DEV : loss 0.4068935811519623 - f1-score (micro avg)  0.0
2023-10-19 23:57:40,968 ----------------------------------------------------------------------------------------------------
2023-10-19 23:57:41,468 epoch 3 - iter 29/292 - loss 0.50362782 - time (sec): 0.50 - samples/sec: 8760.21 - lr: 0.000026 - momentum: 0.000000
2023-10-19 23:57:41,968 epoch 3 - iter 58/292 - loss 0.52872549 - time (sec): 1.00 - samples/sec: 8770.81 - lr: 0.000026 - momentum: 0.000000
2023-10-19 23:57:42,484 epoch 3 - iter 87/292 - loss 0.55048926 - time (sec): 1.51 - samples/sec: 8999.06 - lr: 0.000026 - momentum: 0.000000
2023-10-19 23:57:43,018 epoch 3 - iter 116/292 - loss 0.59262551 - time (sec): 2.05 - samples/sec: 8752.53 - lr: 0.000025 - momentum: 0.000000
2023-10-19 23:57:43,690 epoch 3 - iter 145/292 - loss 0.58792425 - time (sec): 2.72 - samples/sec: 8189.27 - lr: 0.000025 - momentum: 0.000000
2023-10-19 23:57:44,225 epoch 3 - iter 174/292 - loss 0.57780683 - time (sec): 3.26 - samples/sec: 8332.04 - lr: 0.000025 - momentum: 0.000000
2023-10-19 23:57:44,708 epoch 3 - iter 203/292 - loss 0.57106053 - time (sec): 3.74 - samples/sec: 8318.01 - lr: 0.000024 - momentum: 0.000000
2023-10-19 23:57:45,293 epoch 3 - iter 232/292 - loss 0.56074661 - time (sec): 4.32 - samples/sec: 8263.42 - lr: 0.000024 - momentum: 0.000000
2023-10-19 23:57:45,792 epoch 3 - iter 261/292 - loss 0.55453111 - time (sec): 4.82 - samples/sec: 8194.65 - lr: 0.000024 - momentum: 0.000000
2023-10-19 23:57:46,308 epoch 3 - iter 290/292 - loss 0.55053773 - time (sec): 5.34 - samples/sec: 8260.74 - lr: 0.000023 - momentum: 0.000000
2023-10-19 23:57:46,345 ----------------------------------------------------------------------------------------------------
2023-10-19 23:57:46,345 EPOCH 3 done: loss 0.5490 - lr: 0.000023
2023-10-19 23:57:46,967 DEV : loss 0.3729143738746643 - f1-score (micro avg)  0.0
2023-10-19 23:57:46,971 ----------------------------------------------------------------------------------------------------
2023-10-19 23:57:47,455 epoch 4 - iter 29/292 - loss 0.44360140 - time (sec): 0.48 - samples/sec: 8111.28 - lr: 0.000023 - momentum: 0.000000
2023-10-19 23:57:47,971 epoch 4 - iter 58/292 - loss 0.46127904 - time (sec): 1.00 - samples/sec: 8078.16 - lr: 0.000023 - momentum: 0.000000
2023-10-19 23:57:48,506 epoch 4 - iter 87/292 - loss 0.45386077 - time (sec): 1.53 - samples/sec: 8329.82 - lr: 0.000022 - momentum: 0.000000
2023-10-19 23:57:49,025 epoch 4 - iter 116/292 - loss 0.45066227 - time (sec): 2.05 - samples/sec: 8285.59 - lr: 0.000022 - momentum: 0.000000
2023-10-19 23:57:49,555 epoch 4 - iter 145/292 - loss 0.45473106 - time (sec): 2.58 - samples/sec: 8211.85 - lr: 0.000022 - momentum: 0.000000
2023-10-19 23:57:50,065 epoch 4 - iter 174/292 - loss 0.45803790 - time (sec): 3.09 - samples/sec: 8220.40 - lr: 0.000021 - momentum: 0.000000
2023-10-19 23:57:50,606 epoch 4 - iter 203/292 - loss 0.47679097 - time (sec): 3.63 - samples/sec: 8472.15 - lr: 0.000021 - momentum: 0.000000
2023-10-19 23:57:51,100 epoch 4 - iter 232/292 - loss 0.47649244 - time (sec): 4.13 - samples/sec: 8360.31 - lr: 0.000021 - momentum: 0.000000
2023-10-19 23:57:51,602 epoch 4 - iter 261/292 - loss 0.47283559 - time (sec): 4.63 - samples/sec: 8370.13 - lr: 0.000020 - momentum: 0.000000
2023-10-19 23:57:52,144 epoch 4 - iter 290/292 - loss 0.47776120 - time (sec): 5.17 - samples/sec: 8569.51 - lr: 0.000020 - momentum: 0.000000
2023-10-19 23:57:52,176 ----------------------------------------------------------------------------------------------------
2023-10-19 23:57:52,176 EPOCH 4 done: loss 0.4775 - lr: 0.000020
2023-10-19 23:57:52,799 DEV : loss 0.33535653352737427 - f1-score (micro avg)  0.0368
2023-10-19 23:57:52,803 saving best model
2023-10-19 23:57:52,832 ----------------------------------------------------------------------------------------------------
2023-10-19 23:57:53,377 epoch 5 - iter 29/292 - loss 0.45846940 - time (sec): 0.54 - samples/sec: 9610.96 - lr: 0.000020 - momentum: 0.000000
2023-10-19 23:57:53,907 epoch 5 - iter 58/292 - loss 0.51705999 - time (sec): 1.07 - samples/sec: 9124.70 - lr: 0.000019 - momentum: 0.000000
2023-10-19 23:57:54,398 epoch 5 - iter 87/292 - loss 0.48221681 - time (sec): 1.57 - samples/sec: 8759.51 - lr: 0.000019 - momentum: 0.000000
2023-10-19 23:57:54,929 epoch 5 - iter 116/292 - loss 0.46910511 - time (sec): 2.10 - samples/sec: 8518.22 - lr: 0.000019 - momentum: 0.000000
2023-10-19 23:57:55,498 epoch 5 - iter 145/292 - loss 0.46148846 - time (sec): 2.67 - samples/sec: 8533.84 - lr: 0.000018 - momentum: 0.000000
2023-10-19 23:57:56,036 epoch 5 - iter 174/292 - loss 0.44684775 - time (sec): 3.20 - samples/sec: 8422.22 - lr: 0.000018 - momentum: 0.000000
2023-10-19 23:57:56,551 epoch 5 - iter 203/292 - loss 0.44853019 - time (sec): 3.72 - samples/sec: 8311.04 - lr: 0.000018 - momentum: 0.000000
2023-10-19 23:57:57,125 epoch 5 - iter 232/292 - loss 0.46105102 - time (sec): 4.29 - samples/sec: 8270.59 - lr: 0.000017 - momentum: 0.000000
2023-10-19 23:57:57,639 epoch 5 - iter 261/292 - loss 0.45781694 - time (sec): 4.81 - samples/sec: 8194.08 - lr: 0.000017 - momentum: 0.000000
2023-10-19 23:57:58,166 epoch 5 - iter 290/292 - loss 0.44694408 - time (sec): 5.33 - samples/sec: 8314.90 - lr: 0.000017 - momentum: 0.000000
2023-10-19 23:57:58,196 ----------------------------------------------------------------------------------------------------
2023-10-19 23:57:58,196 EPOCH 5 done: loss 0.4463 - lr: 0.000017
2023-10-19 23:57:58,831 DEV : loss 0.33983153104782104 - f1-score (micro avg)  0.0687
2023-10-19 23:57:58,835 saving best model
2023-10-19 23:57:58,868 ----------------------------------------------------------------------------------------------------
2023-10-19 23:57:59,383 epoch 6 - iter 29/292 - loss 0.44689006 - time (sec): 0.51 - samples/sec: 9067.63 - lr: 0.000016 - momentum: 0.000000
2023-10-19 23:57:59,890 epoch 6 - iter 58/292 - loss 0.42022345 - time (sec): 1.02 - samples/sec: 8248.87 - lr: 0.000016 - momentum: 0.000000
2023-10-19 23:58:00,381 epoch 6 - iter 87/292 - loss 0.43547237 - time (sec): 1.51 - samples/sec: 8374.78 - lr: 0.000016 - momentum: 0.000000
2023-10-19 23:58:00,902 epoch 6 - iter 116/292 - loss 0.42767037 - time (sec): 2.03 - samples/sec: 8496.53 - lr: 0.000015 - momentum: 0.000000
2023-10-19 23:58:01,413 epoch 6 - iter 145/292 - loss 0.41360136 - time (sec): 2.54 - samples/sec: 8648.10 - lr: 0.000015 - momentum: 0.000000
2023-10-19 23:58:01,950 epoch 6 - iter 174/292 - loss 0.41434310 - time (sec): 3.08 - samples/sec: 8717.91 - lr: 0.000015 - momentum: 0.000000
2023-10-19 23:58:02,459 epoch 6 - iter 203/292 - loss 0.40041453 - time (sec): 3.59 - samples/sec: 8764.22 - lr: 0.000014 - momentum: 0.000000
2023-10-19 23:58:02,969 epoch 6 - iter 232/292 - loss 0.40130692 - time (sec): 4.10 - samples/sec: 8773.71 - lr: 0.000014 - momentum: 0.000000
2023-10-19 23:58:03,485 epoch 6 - iter 261/292 - loss 0.40438520 - time (sec): 4.62 - samples/sec: 8665.74 - lr: 0.000014 - momentum: 0.000000
2023-10-19 23:58:04,000 epoch 6 - iter 290/292 - loss 0.41311196 - time (sec): 5.13 - samples/sec: 8614.21 - lr: 0.000013 - momentum: 0.000000
2023-10-19 23:58:04,028 ----------------------------------------------------------------------------------------------------
2023-10-19 23:58:04,028 EPOCH 6 done: loss 0.4149 - lr: 0.000013
2023-10-19 23:58:04,664 DEV : loss 0.327802449464798 - f1-score (micro avg)  0.1477
2023-10-19 23:58:04,668 saving best model
2023-10-19 23:58:04,701 ----------------------------------------------------------------------------------------------------
2023-10-19 23:58:05,249 epoch 7 - iter 29/292 - loss 0.32104118 - time (sec): 0.55 - samples/sec: 10152.81 - lr: 0.000013 - momentum: 0.000000
2023-10-19 23:58:05,738 epoch 7 - iter 58/292 - loss 0.39113470 - time (sec): 1.04 - samples/sec: 9059.74 - lr: 0.000013 - momentum: 0.000000
2023-10-19 23:58:06,250 epoch 7 - iter 87/292 - loss 0.40749106 - time (sec): 1.55 - samples/sec: 8684.18 - lr: 0.000012 - momentum: 0.000000
2023-10-19 23:58:06,752 epoch 7 - iter 116/292 - loss 0.38645042 - time (sec): 2.05 - samples/sec: 8722.49 - lr: 0.000012 - momentum: 0.000000
2023-10-19 23:58:07,237 epoch 7 - iter 145/292 - loss 0.39391474 - time (sec): 2.53 - samples/sec: 8539.01 - lr: 0.000012 - momentum: 0.000000
2023-10-19 23:58:07,756 epoch 7 - iter 174/292 - loss 0.40980659 - time (sec): 3.05 - samples/sec: 8761.86 - lr: 0.000011 - momentum: 0.000000
2023-10-19 23:58:08,271 epoch 7 - iter 203/292 - loss 0.40209976 - time (sec): 3.57 - samples/sec: 8856.92 - lr: 0.000011 - momentum: 0.000000
2023-10-19 23:58:08,783 epoch 7 - iter 232/292 - loss 0.40744235 - time (sec): 4.08 - samples/sec: 8804.31 - lr: 0.000011 - momentum: 0.000000
2023-10-19 23:58:09,293 epoch 7 - iter 261/292 - loss 0.39712786 - time (sec): 4.59 - samples/sec: 8721.50 - lr: 0.000010 - momentum: 0.000000
2023-10-19 23:58:09,787 epoch 7 - iter 290/292 - loss 0.39302474 - time (sec): 5.09 - samples/sec: 8678.71 - lr: 0.000010 - momentum: 0.000000
2023-10-19 23:58:09,819 ----------------------------------------------------------------------------------------------------
2023-10-19 23:58:09,819 EPOCH 7 done: loss 0.3938 - lr: 0.000010
2023-10-19 23:58:10,451 DEV : loss 0.31045615673065186 - f1-score (micro avg)  0.1753
2023-10-19 23:58:10,455 saving best model
2023-10-19 23:58:10,487 ----------------------------------------------------------------------------------------------------
2023-10-19 23:58:10,978 epoch 8 - iter 29/292 - loss 0.36499034 - time (sec): 0.49 - samples/sec: 8858.41 - lr: 0.000010 - momentum: 0.000000
2023-10-19 23:58:11,501 epoch 8 - iter 58/292 - loss 0.39731528 - time (sec): 1.01 - samples/sec: 9008.12 - lr: 0.000009 - momentum: 0.000000
2023-10-19 23:58:12,045 epoch 8 - iter 87/292 - loss 0.35849483 - time (sec): 1.56 - samples/sec: 9379.00 - lr: 0.000009 - momentum: 0.000000
2023-10-19 23:58:12,536 epoch 8 - iter 116/292 - loss 0.36888550 - time (sec): 2.05 - samples/sec: 8977.23 - lr: 0.000009 - momentum: 0.000000
2023-10-19 23:58:13,008 epoch 8 - iter 145/292 - loss 0.37562375 - time (sec): 2.52 - samples/sec: 8673.85 - lr: 0.000008 - momentum: 0.000000
2023-10-19 23:58:13,513 epoch 8 - iter 174/292 - loss 0.38262840 - time (sec): 3.03 - samples/sec: 8619.98 - lr: 0.000008 - momentum: 0.000000
2023-10-19 23:58:14,029 epoch 8 - iter 203/292 - loss 0.37658732 - time (sec): 3.54 - samples/sec: 8551.23 - lr: 0.000008 - momentum: 0.000000
2023-10-19 23:58:14,533 epoch 8 - iter 232/292 - loss 0.38303644 - time (sec): 4.05 - samples/sec: 8498.09 - lr: 0.000007 - momentum: 0.000000
2023-10-19 23:58:15,048 epoch 8 - iter 261/292 - loss 0.37867777 - time (sec): 4.56 - samples/sec: 8501.41 - lr: 0.000007 - momentum: 0.000000
2023-10-19 23:58:15,578 epoch 8 - iter 290/292 - loss 0.39456462 - time (sec): 5.09 - samples/sec: 8667.21 - lr: 0.000007 - momentum: 0.000000
2023-10-19 23:58:15,611 ----------------------------------------------------------------------------------------------------
2023-10-19 23:58:15,611 EPOCH 8 done: loss 0.3925 - lr: 0.000007
2023-10-19 23:58:16,250 DEV : loss 0.31813567876815796 - f1-score (micro avg)  0.1717
2023-10-19 23:58:16,254 ----------------------------------------------------------------------------------------------------
2023-10-19 23:58:16,758 epoch 9 - iter 29/292 - loss 0.40428918 - time (sec): 0.50 - samples/sec: 7980.78 - lr: 0.000006 - momentum: 0.000000
2023-10-19 23:58:17,287 epoch 9 - iter 58/292 - loss 0.36836772 - time (sec): 1.03 - samples/sec: 7894.99 - lr: 0.000006 - momentum: 0.000000
2023-10-19 23:58:17,850 epoch 9 - iter 87/292 - loss 0.37138841 - time (sec): 1.60 - samples/sec: 8060.88 - lr: 0.000006 - momentum: 0.000000
2023-10-19 23:58:18,362 epoch 9 - iter 116/292 - loss 0.35584352 - time (sec): 2.11 - samples/sec: 8099.67 - lr: 0.000005 - momentum: 0.000000
2023-10-19 23:58:18,884 epoch 9 - iter 145/292 - loss 0.36577051 - time (sec): 2.63 - samples/sec: 7966.11 - lr: 0.000005 - momentum: 0.000000
2023-10-19 23:58:19,552 epoch 9 - iter 174/292 - loss 0.36762424 - time (sec): 3.30 - samples/sec: 7731.22 - lr: 0.000005 - momentum: 0.000000
2023-10-19 23:58:20,069 epoch 9 - iter 203/292 - loss 0.36914510 - time (sec): 3.82 - samples/sec: 7877.82 - lr: 0.000004 - momentum: 0.000000
2023-10-19 23:58:20,623 epoch 9 - iter 232/292 - loss 0.38112081 - time (sec): 4.37 - samples/sec: 8101.40 - lr: 0.000004 - momentum: 0.000000
2023-10-19 23:58:21,126 epoch 9 - iter 261/292 - loss 0.37727984 - time (sec): 4.87 - samples/sec: 8146.24 - lr: 0.000004 - momentum: 0.000000
2023-10-19 23:58:21,648 epoch 9 - iter 290/292 - loss 0.38320637 - time (sec): 5.39 - samples/sec: 8211.37 - lr: 0.000003 - momentum: 0.000000
2023-10-19 23:58:21,676 ----------------------------------------------------------------------------------------------------
2023-10-19 23:58:21,676 EPOCH 9 done: loss 0.3842 - lr: 0.000003
2023-10-19 23:58:22,306 DEV : loss 0.31648480892181396 - f1-score (micro avg)  0.1803
2023-10-19 23:58:22,310 saving best model
2023-10-19 23:58:22,343 ----------------------------------------------------------------------------------------------------
2023-10-19 23:58:22,869 epoch 10 - iter 29/292 - loss 0.31764849 - time (sec): 0.53 - samples/sec: 9812.80 - lr: 0.000003 - momentum: 0.000000
2023-10-19 23:58:23,401 epoch 10 - iter 58/292 - loss 0.35718997 - time (sec): 1.06 - samples/sec: 10149.09 - lr: 0.000003 - momentum: 0.000000
2023-10-19 23:58:23,875 epoch 10 - iter 87/292 - loss 0.37047487 - time (sec): 1.53 - samples/sec: 9426.05 - lr: 0.000002 - momentum: 0.000000
2023-10-19 23:58:24,360 epoch 10 - iter 116/292 - loss 0.36595183 - time (sec): 2.02 - samples/sec: 9236.35 - lr: 0.000002 - momentum: 0.000000
2023-10-19 23:58:24,860 epoch 10 - iter 145/292 - loss 0.36830394 - time (sec): 2.52 - samples/sec: 8935.26 - lr: 0.000002 - momentum: 0.000000
2023-10-19 23:58:25,373 epoch 10 - iter 174/292 - loss 0.37044997 - time (sec): 3.03 - samples/sec: 8789.73 - lr: 0.000001 - momentum: 0.000000
2023-10-19 23:58:25,857 epoch 10 - iter 203/292 - loss 0.36942048 - time (sec): 3.51 - samples/sec: 8685.67 - lr: 0.000001 - momentum: 0.000000
2023-10-19 23:58:26,366 epoch 10 - iter 232/292 - loss 0.37678363 - time (sec): 4.02 - samples/sec: 8658.91 - lr: 0.000001 - momentum: 0.000000
2023-10-19 23:58:26,886 epoch 10 - iter 261/292 - loss 0.37899069 - time (sec): 4.54 - samples/sec: 8531.89 - lr: 0.000000 - momentum: 0.000000
2023-10-19 23:58:27,446 epoch 10 - iter 290/292 - loss 0.37973279 - time (sec): 5.10 - samples/sec: 8676.83 - lr: 0.000000 - momentum: 0.000000
2023-10-19 23:58:27,478 ----------------------------------------------------------------------------------------------------
2023-10-19 23:58:27,478 EPOCH 10 done: loss 0.3791 - lr: 0.000000
2023-10-19 23:58:28,125 DEV : loss 0.31487029790878296 - f1-score (micro avg)  0.1848
2023-10-19 23:58:28,129 saving best model
2023-10-19 23:58:28,189 ----------------------------------------------------------------------------------------------------
2023-10-19 23:58:28,190 Loading model from best epoch ...
2023-10-19 23:58:28,270 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
2023-10-19 23:58:29,173
Results:
- F-score (micro) 0.2928
- F-score (macro) 0.1505
- Accuracy 0.1777

By class:
              precision    recall  f1-score   support

         PER     0.3594    0.3563    0.3579       348
         LOC     0.2658    0.2261    0.2443       261
         ORG     0.0000    0.0000    0.0000        52
   HumanProd     0.0000    0.0000    0.0000        22

   micro avg     0.3228    0.2679    0.2928       683
   macro avg     0.1563    0.1456    0.1505       683
weighted avg     0.2847    0.2679    0.2757       683

2023-10-19 23:58:29,173 ----------------------------------------------------------------------------------------------------
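The micro and macro F1 scores in the final results block can be cross-checked from per-class counts. A minimal sketch follows; the true-positive and predicted-span counts used here are back-calculated from the logged precision/recall/support values (they do not appear in the log itself, so treat them as assumptions):

```python
# Sketch: recompute the logged micro/macro F1 from per-class span counts.
# TP and predicted counts are back-calculated from the reported
# precision/recall/support, so they are assumptions, not log values.

# class: (true positives, predicted spans, gold spans = support)
counts = {
    "PER":       (124, 345, 348),
    "LOC":       (59,  222, 261),
    "ORG":       (0,   0,   52),
    "HumanProd": (0,   0,   22),
}

def f1(tp: int, pred: int, gold: int) -> float:
    # Span-level F1 = 2*TP / (predicted + gold); 0.0 when there are no TPs.
    return 2 * tp / (pred + gold) if tp else 0.0

tp_total = sum(tp for tp, _, _ in counts.values())
pred_total = sum(p for _, p, _ in counts.values())
gold_total = sum(g for _, _, g in counts.values())

# Micro: pool counts across classes; macro: unweighted mean of per-class F1.
micro_f1 = f1(tp_total, pred_total, gold_total)
macro_f1 = sum(f1(*c) for c in counts.values()) / len(counts)

print(f"micro F1: {micro_f1:.4f}")  # 0.2928, matching the log
print(f"macro F1: {macro_f1:.4f}")  # 0.1505, matching the log
```

The gap between the two averages (0.2928 vs. 0.1505) reflects that ORG and HumanProd contribute zero F1: macro averaging weights every class equally, while micro averaging is dominated by the larger PER and LOC classes.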