2023-10-18 22:40:00,249 ----------------------------------------------------------------------------------------------------
2023-10-18 22:40:00,249 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 128)
        (position_embeddings): Embedding(512, 128)
        (token_type_embeddings): Embedding(2, 128)
        (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-1): 2 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=128, out_features=128, bias=True)
                (key): Linear(in_features=128, out_features=128, bias=True)
                (value): Linear(in_features=128, out_features=128, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=128, out_features=128, bias=True)
                (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=128, out_features=512, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=512, out_features=128, bias=True)
              (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=128, out_features=128, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=128, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-18 22:40:00,249 ----------------------------------------------------------------------------------------------------
2023-10-18 22:40:00,249 MultiCorpus: 5777 train + 722 dev + 723 test sentences
 - NER_ICDAR_EUROPEANA Corpus: 5777 train + 722 dev + 723 test sentences - /root/.flair/datasets/ner_icdar_europeana/nl
2023-10-18 22:40:00,249 ----------------------------------------------------------------------------------------------------
2023-10-18 22:40:00,249 Train:  5777 sentences
2023-10-18 22:40:00,249         (train_with_dev=False, train_with_test=False)
2023-10-18 22:40:00,250 ----------------------------------------------------------------------------------------------------
2023-10-18 22:40:00,250 Training Params:
2023-10-18 22:40:00,250  - learning_rate: "3e-05"
2023-10-18 22:40:00,250  - mini_batch_size: "8"
2023-10-18 22:40:00,250  - max_epochs: "10"
2023-10-18 22:40:00,250  - shuffle: "True"
2023-10-18 22:40:00,250 ----------------------------------------------------------------------------------------------------
2023-10-18 22:40:00,250 Plugins:
2023-10-18 22:40:00,250  - TensorboardLogger
2023-10-18 22:40:00,250  - LinearScheduler | warmup_fraction: '0.1'
2023-10-18 22:40:00,250 ----------------------------------------------------------------------------------------------------
2023-10-18 22:40:00,250 Final evaluation on model from best epoch (best-model.pt)
2023-10-18 22:40:00,250  - metric: "('micro avg', 'f1-score')"
2023-10-18 22:40:00,250 ----------------------------------------------------------------------------------------------------
2023-10-18 22:40:00,250 Computation:
2023-10-18 22:40:00,250  - compute on device: cuda:0
2023-10-18 22:40:00,250  - embedding storage: none
2023-10-18 22:40:00,250 ----------------------------------------------------------------------------------------------------
2023-10-18 22:40:00,250 Model training base path: "hmbench-icdar/nl-dbmdz/bert-tiny-historic-multilingual-cased-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-3"
2023-10-18 22:40:00,250 ----------------------------------------------------------------------------------------------------
2023-10-18 22:40:00,250 ----------------------------------------------------------------------------------------------------
2023-10-18 22:40:00,250 Logging anything other than scalars to TensorBoard is currently
not supported.
2023-10-18 22:40:02,113 epoch 1 - iter 72/723 - loss 2.98841079 - time (sec): 1.86 - samples/sec: 9578.47 - lr: 0.000003 - momentum: 0.000000
2023-10-18 22:40:04,053 epoch 1 - iter 144/723 - loss 2.79139642 - time (sec): 3.80 - samples/sec: 9516.22 - lr: 0.000006 - momentum: 0.000000
2023-10-18 22:40:05,867 epoch 1 - iter 216/723 - loss 2.52716576 - time (sec): 5.62 - samples/sec: 9503.17 - lr: 0.000009 - momentum: 0.000000
2023-10-18 22:40:07,667 epoch 1 - iter 288/723 - loss 2.18597493 - time (sec): 7.42 - samples/sec: 9593.00 - lr: 0.000012 - momentum: 0.000000
2023-10-18 22:40:09,442 epoch 1 - iter 360/723 - loss 1.85823175 - time (sec): 9.19 - samples/sec: 9731.15 - lr: 0.000015 - momentum: 0.000000
2023-10-18 22:40:11,226 epoch 1 - iter 432/723 - loss 1.61794529 - time (sec): 10.98 - samples/sec: 9752.10 - lr: 0.000018 - momentum: 0.000000
2023-10-18 22:40:13,025 epoch 1 - iter 504/723 - loss 1.43043055 - time (sec): 12.77 - samples/sec: 9802.57 - lr: 0.000021 - momentum: 0.000000
2023-10-18 22:40:14,790 epoch 1 - iter 576/723 - loss 1.29820059 - time (sec): 14.54 - samples/sec: 9794.94 - lr: 0.000024 - momentum: 0.000000
2023-10-18 22:40:16,555 epoch 1 - iter 648/723 - loss 1.19724166 - time (sec): 16.30 - samples/sec: 9758.71 - lr: 0.000027 - momentum: 0.000000
2023-10-18 22:40:18,265 epoch 1 - iter 720/723 - loss 1.11187930 - time (sec): 18.01 - samples/sec: 9759.03 - lr: 0.000030 - momentum: 0.000000
2023-10-18 22:40:18,320 ----------------------------------------------------------------------------------------------------
2023-10-18 22:40:18,320 EPOCH 1 done: loss 1.1102 - lr: 0.000030
2023-10-18 22:40:19,582 DEV : loss 0.35777369141578674 - f1-score (micro avg)  0.0
2023-10-18 22:40:19,596 ----------------------------------------------------------------------------------------------------
2023-10-18 22:40:21,365 epoch 2 - iter 72/723 - loss 0.27341211 - time (sec): 1.77 - samples/sec: 9574.80 - lr: 0.000030 - momentum: 0.000000
2023-10-18 22:40:23,118 epoch 2 - iter 144/723 - loss 0.27156084 - time (sec): 3.52 - samples/sec: 9803.83 - lr: 0.000029 - momentum: 0.000000
2023-10-18 22:40:24,968 epoch 2 - iter 216/723 - loss 0.25524188 - time (sec): 5.37 - samples/sec: 9778.40 - lr: 0.000029 - momentum: 0.000000
2023-10-18 22:40:26,723 epoch 2 - iter 288/723 - loss 0.25718151 - time (sec): 7.13 - samples/sec: 9726.70 - lr: 0.000029 - momentum: 0.000000
2023-10-18 22:40:28,489 epoch 2 - iter 360/723 - loss 0.25119017 - time (sec): 8.89 - samples/sec: 9725.42 - lr: 0.000028 - momentum: 0.000000
2023-10-18 22:40:30,277 epoch 2 - iter 432/723 - loss 0.24390917 - time (sec): 10.68 - samples/sec: 9774.04 - lr: 0.000028 - momentum: 0.000000
2023-10-18 22:40:32,050 epoch 2 - iter 504/723 - loss 0.24273201 - time (sec): 12.45 - samples/sec: 9809.63 - lr: 0.000028 - momentum: 0.000000
2023-10-18 22:40:33,864 epoch 2 - iter 576/723 - loss 0.24040013 - time (sec): 14.27 - samples/sec: 9892.91 - lr: 0.000027 - momentum: 0.000000
2023-10-18 22:40:35,585 epoch 2 - iter 648/723 - loss 0.23365611 - time (sec): 15.99 - samples/sec: 9932.90 - lr: 0.000027 - momentum: 0.000000
2023-10-18 22:40:37,385 epoch 2 - iter 720/723 - loss 0.23607899 - time (sec): 17.79 - samples/sec: 9875.14 - lr: 0.000027 - momentum: 0.000000
2023-10-18 22:40:37,455 ----------------------------------------------------------------------------------------------------
2023-10-18 22:40:37,455 EPOCH 2 done: loss 0.2362 - lr: 0.000027
2023-10-18 22:40:39,578 DEV : loss 0.2550312876701355 - f1-score (micro avg)  0.2002
2023-10-18 22:40:39,592 saving best model
2023-10-18 22:40:39,622 ----------------------------------------------------------------------------------------------------
2023-10-18 22:40:41,214 epoch 3 - iter 72/723 - loss 0.22079626 - time (sec): 1.59 - samples/sec: 10781.78 - lr: 0.000026 - momentum: 0.000000
2023-10-18 22:40:42,794 epoch 3 - iter 144/723 - loss 0.20244048 - time (sec): 3.17 - samples/sec: 11109.25 - lr: 0.000026 - momentum: 0.000000
2023-10-18 22:40:44,423 epoch 3 - iter 216/723 - loss 0.19849172 - time (sec): 4.80 - samples/sec: 11002.12 - lr: 0.000026 - momentum: 0.000000
2023-10-18 22:40:46,182 epoch 3 - iter 288/723 - loss 0.19769252 - time (sec): 6.56 - samples/sec: 10779.32 - lr: 0.000025 - momentum: 0.000000
2023-10-18 22:40:47,893 epoch 3 - iter 360/723 - loss 0.19995418 - time (sec): 8.27 - samples/sec: 10451.44 - lr: 0.000025 - momentum: 0.000000
2023-10-18 22:40:49,679 epoch 3 - iter 432/723 - loss 0.20011392 - time (sec): 10.06 - samples/sec: 10377.07 - lr: 0.000025 - momentum: 0.000000
2023-10-18 22:40:51,486 epoch 3 - iter 504/723 - loss 0.19929608 - time (sec): 11.86 - samples/sec: 10340.39 - lr: 0.000024 - momentum: 0.000000
2023-10-18 22:40:53,195 epoch 3 - iter 576/723 - loss 0.20030929 - time (sec): 13.57 - samples/sec: 10252.78 - lr: 0.000024 - momentum: 0.000000
2023-10-18 22:40:54,998 epoch 3 - iter 648/723 - loss 0.20073613 - time (sec): 15.38 - samples/sec: 10270.49 - lr: 0.000024 - momentum: 0.000000
2023-10-18 22:40:56,810 epoch 3 - iter 720/723 - loss 0.19498318 - time (sec): 17.19 - samples/sec: 10222.93 - lr: 0.000023 - momentum: 0.000000
2023-10-18 22:40:56,873 ----------------------------------------------------------------------------------------------------
2023-10-18 22:40:56,873 EPOCH 3 done: loss 0.1949 - lr: 0.000023
2023-10-18 22:40:58,634 DEV : loss 0.2251613438129425 - f1-score (micro avg)  0.3257
2023-10-18 22:40:58,648 saving best model
2023-10-18 22:40:58,686 ----------------------------------------------------------------------------------------------------
2023-10-18 22:41:00,437 epoch 4 - iter 72/723 - loss 0.19403979 - time (sec): 1.75 - samples/sec: 10149.67 - lr: 0.000023 - momentum: 0.000000
2023-10-18 22:41:02,183 epoch 4 - iter 144/723 - loss 0.18784146 - time (sec): 3.50 - samples/sec: 9748.92 - lr: 0.000023 - momentum: 0.000000
2023-10-18 22:41:03,952 epoch 4 - iter 216/723 - loss 0.19203285 - time (sec): 5.26 - samples/sec: 9895.01 - lr: 0.000022 - momentum: 0.000000
2023-10-18 22:41:05,781 epoch 4 - iter 288/723 - loss 0.18324681 - time (sec): 7.09 - samples/sec: 9871.53 - lr: 0.000022 - momentum: 0.000000
2023-10-18 22:41:07,582 epoch 4 - iter 360/723 - loss 0.18156486 - time (sec): 8.89 - samples/sec: 9953.49 - lr: 0.000022 - momentum: 0.000000
2023-10-18 22:41:09,325 epoch 4 - iter 432/723 - loss 0.18116754 - time (sec): 10.64 - samples/sec: 9991.78 - lr: 0.000021 - momentum: 0.000000
2023-10-18 22:41:11,053 epoch 4 - iter 504/723 - loss 0.18093047 - time (sec): 12.37 - samples/sec: 9946.49 - lr: 0.000021 - momentum: 0.000000
2023-10-18 22:41:12,847 epoch 4 - iter 576/723 - loss 0.18201373 - time (sec): 14.16 - samples/sec: 9930.77 - lr: 0.000021 - momentum: 0.000000
2023-10-18 22:41:14,597 epoch 4 - iter 648/723 - loss 0.18085010 - time (sec): 15.91 - samples/sec: 9954.75 - lr: 0.000020 - momentum: 0.000000
2023-10-18 22:41:16,329 epoch 4 - iter 720/723 - loss 0.17999634 - time (sec): 17.64 - samples/sec: 9957.84 - lr: 0.000020 - momentum: 0.000000
2023-10-18 22:41:16,401 ----------------------------------------------------------------------------------------------------
2023-10-18 22:41:16,401 EPOCH 4 done: loss 0.1797 - lr: 0.000020
2023-10-18 22:41:18,476 DEV : loss 0.20678313076496124 - f1-score (micro avg)  0.3891
2023-10-18 22:41:18,490 saving best model
2023-10-18 22:41:18,525 ----------------------------------------------------------------------------------------------------
2023-10-18 22:41:20,277 epoch 5 - iter 72/723 - loss 0.17834339 - time (sec): 1.75 - samples/sec: 9709.97 - lr: 0.000020 - momentum: 0.000000
2023-10-18 22:41:22,062 epoch 5 - iter 144/723 - loss 0.16479979 - time (sec): 3.54 - samples/sec: 9719.79 - lr: 0.000019 - momentum: 0.000000
2023-10-18 22:41:23,813 epoch 5 - iter 216/723 - loss 0.16544296 - time (sec): 5.29 - samples/sec: 9775.51 - lr: 0.000019 - momentum: 0.000000
2023-10-18 22:41:25,599 epoch 5 - iter 288/723 - loss 0.16908757 - time (sec): 7.07 - samples/sec: 9698.46 - lr: 0.000019 - momentum: 0.000000
2023-10-18 22:41:27,411 epoch 5 - iter 360/723 - loss 0.16724889 - time (sec): 8.89 - samples/sec: 9833.64 - lr: 0.000018 - momentum: 0.000000
2023-10-18 22:41:29,173 epoch 5 - iter 432/723 - loss 0.16559838 - time (sec): 10.65 - samples/sec: 9937.96 - lr: 0.000018 - momentum: 0.000000
2023-10-18 22:41:30,855 epoch 5 - iter 504/723 - loss 0.16547999 - time (sec): 12.33 - samples/sec: 9999.51 - lr: 0.000018 - momentum: 0.000000
2023-10-18 22:41:32,671 epoch 5 - iter 576/723 - loss 0.16944376 - time (sec): 14.15 - samples/sec: 10007.08 - lr: 0.000017 - momentum: 0.000000
2023-10-18 22:41:34,392 epoch 5 - iter 648/723 - loss 0.17032973 - time (sec): 15.87 - samples/sec: 9953.67 - lr: 0.000017 - momentum: 0.000000
2023-10-18 22:41:36,124 epoch 5 - iter 720/723 - loss 0.16793137 - time (sec): 17.60 - samples/sec: 9971.25 - lr: 0.000017 - momentum: 0.000000
2023-10-18 22:41:36,187 ----------------------------------------------------------------------------------------------------
2023-10-18 22:41:36,187 EPOCH 5 done: loss 0.1682 - lr: 0.000017
2023-10-18 22:41:37,950 DEV : loss 0.21079857647418976 - f1-score (micro avg)  0.4172
2023-10-18 22:41:37,965 saving best model
2023-10-18 22:41:38,003 ----------------------------------------------------------------------------------------------------
2023-10-18 22:41:39,687 epoch 6 - iter 72/723 - loss 0.15977195 - time (sec): 1.68 - samples/sec: 9982.07 - lr: 0.000016 - momentum: 0.000000
2023-10-18 22:41:41,451 epoch 6 - iter 144/723 - loss 0.16016708 - time (sec): 3.45 - samples/sec: 10061.70 - lr: 0.000016 - momentum: 0.000000
2023-10-18 22:41:43,215 epoch 6 - iter 216/723 - loss 0.16708014 - time (sec): 5.21 - samples/sec: 10028.21 - lr: 0.000016 - momentum: 0.000000
2023-10-18 22:41:45,039 epoch 6 - iter 288/723 - loss 0.16529748 - time (sec): 7.04 - samples/sec: 10064.56 - lr: 0.000015 - momentum: 0.000000
2023-10-18 22:41:46,864 epoch 6 - iter 360/723 - loss 0.16782442 - time (sec): 8.86 - samples/sec: 10167.58 - lr: 0.000015 - momentum: 0.000000
2023-10-18 22:41:48,558 epoch 6 - iter 432/723 - loss 0.16816204 - time (sec): 10.55 - samples/sec: 10059.28 - lr: 0.000015 - momentum: 0.000000
2023-10-18 22:41:50,301 epoch 6 - iter 504/723 - loss 0.16538144 - time (sec): 12.30 - samples/sec: 10045.86 - lr: 0.000014 - momentum: 0.000000
2023-10-18 22:41:52,379 epoch 6 - iter 576/723 - loss 0.16214967 - time (sec): 14.38 - samples/sec: 9747.49 - lr: 0.000014 - momentum: 0.000000
2023-10-18 22:41:53,863 epoch 6 - iter 648/723 - loss 0.16245142 - time (sec): 15.86 - samples/sec: 9935.83 - lr: 0.000014 - momentum: 0.000000
2023-10-18 22:41:55,341 epoch 6 - iter 720/723 - loss 0.16180617 - time (sec): 17.34 - samples/sec: 10132.02 - lr: 0.000013 - momentum: 0.000000
2023-10-18 22:41:55,395 ----------------------------------------------------------------------------------------------------
2023-10-18 22:41:55,395 EPOCH 6 done: loss 0.1620 - lr: 0.000013
2023-10-18 22:41:57,177 DEV : loss 0.19605979323387146 - f1-score (micro avg)  0.4388
2023-10-18 22:41:57,192 saving best model
2023-10-18 22:41:57,229 ----------------------------------------------------------------------------------------------------
2023-10-18 22:41:59,142 epoch 7 - iter 72/723 - loss 0.15681213 - time (sec): 1.91 - samples/sec: 9891.47 - lr: 0.000013 - momentum: 0.000000
2023-10-18 22:42:01,017 epoch 7 - iter 144/723 - loss 0.16028128 - time (sec): 3.79 - samples/sec: 9988.93 - lr: 0.000013 - momentum: 0.000000
2023-10-18 22:42:02,852 epoch 7 - iter 216/723 - loss 0.15737959 - time (sec): 5.62 - samples/sec: 9661.12 - lr: 0.000012 - momentum: 0.000000
2023-10-18 22:42:04,713 epoch 7 - iter 288/723 - loss 0.15461758 - time (sec): 7.48 - samples/sec: 9652.08 - lr: 0.000012 - momentum: 0.000000
2023-10-18 22:42:06,639 epoch 7 - iter 360/723 - loss 0.15562415 - time (sec): 9.41 - samples/sec: 9633.34 - lr: 0.000012 - momentum: 0.000000
2023-10-18 22:42:08,393 epoch 7 - iter 432/723 - loss 0.15601858 - time (sec): 11.16 - samples/sec: 9547.00 - lr: 0.000011 - momentum: 0.000000
2023-10-18 22:42:10,195 epoch 7 - iter 504/723 - loss 0.15634551 - time (sec): 12.97 - samples/sec: 9604.06 - lr: 0.000011 - momentum: 0.000000
2023-10-18 22:42:11,917 epoch 7 - iter 576/723 - loss 0.15605432 - time (sec): 14.69 - samples/sec: 9593.42 - lr: 0.000011 - momentum: 0.000000
2023-10-18 22:42:13,742 epoch 7 - iter 648/723 - loss 0.15788652 - time (sec): 16.51 - samples/sec: 9581.30 - lr: 0.000010 - momentum: 0.000000
2023-10-18 22:42:15,506 epoch 7 - iter 720/723 - loss 0.15585343 - time (sec): 18.28 - samples/sec: 9595.68 - lr: 0.000010 - momentum: 0.000000
2023-10-18 22:42:15,576 ----------------------------------------------------------------------------------------------------
2023-10-18 22:42:15,577 EPOCH 7 done: loss 0.1557 - lr: 0.000010
2023-10-18 22:42:17,340 DEV : loss 0.18651294708251953 - f1-score (micro avg)  0.4956
2023-10-18 22:42:17,355 saving best model
2023-10-18 22:42:17,391 ----------------------------------------------------------------------------------------------------
2023-10-18 22:42:19,153 epoch 8 - iter 72/723 - loss 0.14374815 - time (sec): 1.76 - samples/sec: 10673.65 - lr: 0.000010 - momentum: 0.000000
2023-10-18 22:42:20,917 epoch 8 - iter 144/723 - loss 0.14794415 - time (sec): 3.53 - samples/sec: 10127.24 - lr: 0.000009 - momentum: 0.000000
2023-10-18 22:42:22,804 epoch 8 - iter 216/723 - loss 0.14360194 - time (sec): 5.41 - samples/sec: 10006.46 - lr: 0.000009 - momentum: 0.000000
2023-10-18 22:42:25,061 epoch 8 - iter 288/723 - loss 0.14897909 - time (sec): 7.67 - samples/sec: 9460.82 - lr: 0.000009 - momentum: 0.000000
2023-10-18 22:42:26,865 epoch 8 - iter 360/723 - loss 0.14900531 - time (sec): 9.47 - samples/sec: 9464.33 - lr: 0.000008 - momentum: 0.000000
2023-10-18 22:42:28,695 epoch 8 - iter 432/723 - loss 0.15163665 - time (sec): 11.30 - samples/sec: 9484.24 - lr: 0.000008 - momentum: 0.000000
2023-10-18 22:42:30,451 epoch 8 - iter 504/723 - loss 0.15049384 - time (sec): 13.06 - samples/sec: 9426.82 - lr: 0.000008 - momentum: 0.000000
2023-10-18 22:42:32,371 epoch 8 - iter 576/723 - loss 0.15391483 - time (sec): 14.98 - samples/sec: 9457.19 - lr: 0.000007 - momentum: 0.000000
2023-10-18 22:42:34,142 epoch 8 - iter 648/723 - loss 0.15242765 - time (sec): 16.75 - samples/sec: 9459.25 - lr: 0.000007 - momentum: 0.000000
2023-10-18 22:42:35,955 epoch 8 - iter 720/723 - loss 0.15035247 - time (sec): 18.56 - samples/sec: 9469.65 - lr: 0.000007 - momentum: 0.000000
2023-10-18 22:42:36,018 ----------------------------------------------------------------------------------------------------
2023-10-18 22:42:36,018 EPOCH 8 done: loss 0.1509 - lr: 0.000007
2023-10-18 22:42:37,781 DEV : loss 0.1935824304819107 - f1-score (micro avg)  0.4684
2023-10-18 22:42:37,796 ----------------------------------------------------------------------------------------------------
2023-10-18 22:42:39,582 epoch 9 - iter 72/723 - loss 0.14444690 - time (sec): 1.78 - samples/sec: 10509.51 - lr: 0.000006 - momentum: 0.000000
2023-10-18 22:42:41,335 epoch 9 - iter 144/723 - loss 0.14963003 - time (sec): 3.54 - samples/sec: 10394.85 - lr: 0.000006 - momentum: 0.000000
2023-10-18 22:42:43,151 epoch 9 - iter 216/723 - loss 0.14951345 - time (sec): 5.35 - samples/sec: 10242.28 - lr: 0.000006 - momentum: 0.000000
2023-10-18 22:42:45,037 epoch 9 - iter 288/723 - loss 0.15149133 - time (sec): 7.24 - samples/sec: 10096.19 - lr: 0.000005 - momentum: 0.000000
2023-10-18 22:42:46,793 epoch 9 - iter 360/723 - loss 0.15160429 - time (sec): 9.00 - samples/sec: 9962.62 - lr: 0.000005 - momentum: 0.000000
2023-10-18 22:42:48,624 epoch 9 - iter 432/723 - loss 0.15049137 - time (sec): 10.83 - samples/sec: 9947.91 - lr: 0.000005 - momentum: 0.000000
2023-10-18 22:42:50,372 epoch 9 - iter 504/723 - loss 0.15011278 - time (sec): 12.58 - samples/sec: 9938.17 - lr: 0.000004 - momentum: 0.000000
2023-10-18 22:42:52,084 epoch 9 - iter 576/723 - loss 0.14872092 - time (sec): 14.29 - samples/sec: 9907.26 - lr: 0.000004 - momentum: 0.000000
2023-10-18 22:42:53,983 epoch 9 - iter 648/723 - loss 0.14660520 - time (sec): 16.19 - samples/sec: 9860.50 - lr: 0.000004 - momentum: 0.000000
2023-10-18 22:42:55,746 epoch 9 - iter 720/723 - loss 0.14745168 - time (sec): 17.95 - samples/sec: 9787.18 - lr: 0.000003 - momentum: 0.000000
2023-10-18 22:42:55,805 ----------------------------------------------------------------------------------------------------
2023-10-18 22:42:55,805 EPOCH 9 done: loss 0.1474 - lr: 0.000003
2023-10-18 22:42:57,583 DEV : loss 0.18601642549037933 - f1-score (micro avg)  0.4934
2023-10-18 22:42:57,598 ----------------------------------------------------------------------------------------------------
2023-10-18 22:42:59,400 epoch 10 - iter 72/723 - loss 0.14058348 - time (sec): 1.80 - samples/sec: 9685.58 - lr: 0.000003 - momentum: 0.000000
2023-10-18 22:43:01,619 epoch 10 - iter 144/723 - loss 0.12600196 - time (sec): 4.02 - samples/sec: 8799.97 - lr: 0.000003 - momentum: 0.000000
2023-10-18 22:43:03,414 epoch 10 - iter 216/723 - loss 0.13939819 - time (sec): 5.82 - samples/sec: 9148.79 - lr: 0.000002 - momentum: 0.000000
2023-10-18 22:43:05,189 epoch 10 - iter 288/723 - loss 0.14322160 - time (sec): 7.59 - samples/sec: 9259.14 - lr: 0.000002 - momentum: 0.000000
2023-10-18 22:43:07,030 epoch 10 - iter 360/723 - loss 0.14226797 - time (sec): 9.43 - samples/sec: 9420.73 - lr: 0.000002 - momentum: 0.000000
2023-10-18 22:43:08,799 epoch 10 - iter 432/723 - loss 0.14213505 - time (sec): 11.20 - samples/sec: 9483.74 - lr: 0.000001 - momentum: 0.000000
2023-10-18 22:43:10,537 epoch 10 - iter 504/723 - loss 0.14173779 - time (sec): 12.94 - samples/sec: 9470.36 - lr: 0.000001 - momentum: 0.000000
2023-10-18 22:43:12,404 epoch 10 - iter 576/723 - loss 0.14185805 - time (sec): 14.81 - samples/sec: 9513.49 - lr: 0.000001 - momentum: 0.000000
2023-10-18 22:43:14,188 epoch 10 - iter 648/723 - loss 0.14424631 - time (sec): 16.59 - samples/sec: 9497.25 - lr: 0.000000 - momentum: 0.000000
2023-10-18 22:43:16,019 epoch 10 - iter 720/723 - loss 0.14588941 - time (sec): 18.42 - samples/sec: 9528.18 - lr: 0.000000 - momentum: 0.000000
2023-10-18 22:43:16,079 ----------------------------------------------------------------------------------------------------
2023-10-18 22:43:16,080 EPOCH 10 done: loss 0.1456 - lr: 0.000000
2023-10-18 22:43:17,855 DEV : loss 0.1870052069425583 - f1-score (micro avg)  0.4892
2023-10-18 22:43:17,901 ----------------------------------------------------------------------------------------------------
2023-10-18 22:43:17,902 Loading model from best epoch ...
2023-10-18 22:43:17,984 SequenceTagger predicts: Dictionary with 13 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG
2023-10-18 22:43:19,327 Results:
- F-score (micro) 0.5152
- F-score (macro) 0.3532
- Accuracy 0.3613

By class:
              precision    recall  f1-score   support

         LOC     0.5805    0.5983    0.5892       458
         PER     0.6744    0.3610    0.4703       482
         ORG     0.0000    0.0000    0.0000        69

   micro avg     0.6137    0.4440    0.5152      1009
   macro avg     0.4183    0.3197    0.3532      1009
weighted avg     0.5857    0.4440    0.4921      1009

2023-10-18 22:43:19,327 ----------------------------------------------------------------------------------------------------
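A note on the per-iteration `lr` values above: the `LinearScheduler | warmup_fraction: '0.1'` plugin ramps the learning rate linearly from 0 to the 3e-05 peak over the first 10% of the 7230 total steps (10 epochs × 723 batches), then decays it linearly back to 0. A minimal sketch of that shape in plain Python (this is an illustration, not Flair's implementation; the function name and signature are my own):

```python
def linear_schedule(step, total_steps, peak_lr, warmup_fraction=0.1):
    """Linear warmup to peak_lr, then linear decay to zero."""
    warmup_steps = max(1, int(total_steps * warmup_fraction))
    if step < warmup_steps:
        # warmup phase: 0 -> peak_lr
        return peak_lr * step / warmup_steps
    # decay phase: peak_lr -> 0 over the remaining steps
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)

# e.g. linear_schedule(72, 7230, 3e-05) is ~0.000003, as logged at epoch 1, iter 72
```

With `warmup_fraction=0.1` and 7230 steps, the peak lands exactly at the end of epoch 1 (step 723), which matches the log reaching lr 0.000030 there before decaying.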
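The 13-tag dictionary printed after loading the best model is a BIOES encoding: `O` plus S(ingle)/B(egin)/I(nside)/E(nd) variants of LOC, PER, and ORG. Decoding such a tag sequence into entity spans can be sketched as follows (an illustrative decoder written for this note, not Flair's own span-extraction code):

```python
def bioes_spans(tags):
    """Decode a BIOES tag sequence into (label, start, end) spans, end exclusive."""
    spans, start = [], None
    for i, tag in enumerate(tags):
        prefix, _, label = tag.partition("-")
        if prefix == "S":            # single-token entity
            spans.append((label, i, i + 1))
            start = None
        elif prefix == "B":          # entity begins here
            start = i
        elif prefix == "E" and start is not None:
            spans.append((label, start, i + 1))  # entity ends here
            start = None
        elif prefix != "I":          # "O" or a malformed tag resets any open span
            start = None
    return spans
```

For example, `["O", "S-LOC", "B-PER", "I-PER", "E-PER", "O"]` decodes to one LOC span and one three-token PER span.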
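The micro and macro F-scores in the final results follow directly from the by-class table: each f1 is the harmonic mean of precision and recall, macro averages the per-class f1 values unweighted, and micro is computed from the micro-averaged precision/recall. A quick sanity-check sketch (values copied from the log, so small rounding differences against the reported figures are expected):

```python
def f1(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Per-class precision/recall from the final evaluation table
loc_f1 = f1(0.5805, 0.5983)   # ~0.5892 (LOC)
per_f1 = f1(0.6744, 0.3610)   # ~0.4703 (PER)
org_f1 = f1(0.0, 0.0)         # 0.0000 (ORG: nothing predicted correctly)

macro_f1 = (loc_f1 + per_f1 + org_f1) / 3   # ~0.3532, unweighted class mean
micro_f1 = f1(0.6137, 0.4440)               # ~0.5152, from micro-averaged p/r
```

Note that the ORG class (only 69 support) scores zero across the board, which is what drags the macro average (0.3532) so far below the micro average (0.5152).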