2023-10-18 22:55:56,510 ----------------------------------------------------------------------------------------------------
2023-10-18 22:55:56,510 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 128)
        (position_embeddings): Embedding(512, 128)
        (token_type_embeddings): Embedding(2, 128)
        (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-1): 2 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=128, out_features=128, bias=True)
                (key): Linear(in_features=128, out_features=128, bias=True)
                (value): Linear(in_features=128, out_features=128, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=128, out_features=128, bias=True)
                (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=128, out_features=512, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=512, out_features=128, bias=True)
              (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=128, out_features=128, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=128, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-18 22:55:56,510 ----------------------------------------------------------------------------------------------------
2023-10-18 22:55:56,510 MultiCorpus: 5777 train + 722 dev + 723 test sentences
 - NER_ICDAR_EUROPEANA Corpus: 5777 train + 722 dev + 723 test sentences - /root/.flair/datasets/ner_icdar_europeana/nl
2023-10-18 22:55:56,510 ----------------------------------------------------------------------------------------------------
2023-10-18 22:55:56,510 Train: 5777 sentences
2023-10-18 22:55:56,510 (train_with_dev=False, train_with_test=False)
2023-10-18 22:55:56,510 ----------------------------------------------------------------------------------------------------
2023-10-18 22:55:56,510 Training Params:
2023-10-18 22:55:56,510 - learning_rate: "3e-05"
2023-10-18 22:55:56,510 - mini_batch_size: "8"
2023-10-18 22:55:56,510 - max_epochs: "10"
2023-10-18 22:55:56,510 - shuffle: "True"
2023-10-18 22:55:56,510 ----------------------------------------------------------------------------------------------------
2023-10-18 22:55:56,510 Plugins:
2023-10-18 22:55:56,511 - TensorboardLogger
2023-10-18 22:55:56,511 - LinearScheduler | warmup_fraction: '0.1'
2023-10-18 22:55:56,511 ----------------------------------------------------------------------------------------------------
2023-10-18 22:55:56,511 Final evaluation on model from best epoch (best-model.pt)
2023-10-18 22:55:56,511 - metric: "('micro avg', 'f1-score')"
2023-10-18 22:55:56,511 ----------------------------------------------------------------------------------------------------
2023-10-18 22:55:56,511 Computation:
2023-10-18 22:55:56,511 - compute on device: cuda:0
2023-10-18 22:55:56,511 - embedding storage: none
2023-10-18 22:55:56,511 ----------------------------------------------------------------------------------------------------
2023-10-18 22:55:56,511 Model training base path: "hmbench-icdar/nl-dbmdz/bert-tiny-historic-multilingual-cased-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-4"
2023-10-18 22:55:56,511 ----------------------------------------------------------------------------------------------------
2023-10-18 22:55:56,511 ----------------------------------------------------------------------------------------------------
2023-10-18 22:55:56,511 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-18 22:55:58,254 epoch 1 - iter 72/723 - loss 3.42262036 - time (sec): 1.74 - samples/sec: 9302.87 - lr: 0.000003 - momentum: 0.000000
2023-10-18 22:56:00,005 epoch 1 - iter 144/723 - loss 3.24227113 - time (sec): 3.49 - samples/sec: 9609.69 - lr: 0.000006 - momentum: 0.000000
2023-10-18 22:56:01,802 epoch 1 - iter 216/723 - loss 2.95510672 - time (sec): 5.29 - samples/sec: 9718.03 - lr: 0.000009 - momentum: 0.000000
2023-10-18 22:56:03,598 epoch 1 - iter 288/723 - loss 2.57569294 - time (sec): 7.09 - samples/sec: 9834.91 - lr: 0.000012 - momentum: 0.000000
2023-10-18 22:56:05,446 epoch 1 - iter 360/723 - loss 2.21089906 - time (sec): 8.93 - samples/sec: 9689.02 - lr: 0.000015 - momentum: 0.000000
2023-10-18 22:56:07,307 epoch 1 - iter 432/723 - loss 1.92171641 - time (sec): 10.80 - samples/sec: 9584.33 - lr: 0.000018 - momentum: 0.000000
2023-10-18 22:56:09,078 epoch 1 - iter 504/723 - loss 1.70254512 - time (sec): 12.57 - samples/sec: 9580.68 - lr: 0.000021 - momentum: 0.000000
2023-10-18 22:56:10,968 epoch 1 - iter 576/723 - loss 1.52679826 - time (sec): 14.46 - samples/sec: 9625.95 - lr: 0.000024 - momentum: 0.000000
2023-10-18 22:56:12,823 epoch 1 - iter 648/723 - loss 1.38048154 - time (sec): 16.31 - samples/sec: 9688.82 - lr: 0.000027 - momentum: 0.000000
2023-10-18 22:56:14,681 epoch 1 - iter 720/723 - loss 1.27452802 - time (sec): 18.17 - samples/sec: 9669.11 - lr: 0.000030 - momentum: 0.000000
2023-10-18 22:56:14,747 ----------------------------------------------------------------------------------------------------
2023-10-18 22:56:14,747 EPOCH 1 done: loss 1.2718 - lr: 0.000030
2023-10-18 22:56:16,024 DEV : loss 0.383542537689209 - f1-score (micro avg) 0.0
2023-10-18 22:56:16,039 ----------------------------------------------------------------------------------------------------
2023-10-18 22:56:17,801 epoch 2 - iter 72/723 - loss 0.26690832 - time (sec): 1.76 - samples/sec: 10084.86 - lr: 0.000030 - momentum: 0.000000
2023-10-18 22:56:19,552 epoch 2 - iter 144/723 - loss 0.27198191 - time (sec): 3.51 - samples/sec: 9845.28 - lr: 0.000029 - momentum: 0.000000
2023-10-18 22:56:21,325 epoch 2 - iter 216/723 - loss 0.26855355 - time (sec): 5.28 - samples/sec: 9696.90 - lr: 0.000029 - momentum: 0.000000
2023-10-18 22:56:23,331 epoch 2 - iter 288/723 - loss 0.25857843 - time (sec): 7.29 - samples/sec: 9665.14 - lr: 0.000029 - momentum: 0.000000
2023-10-18 22:56:25,245 epoch 2 - iter 360/723 - loss 0.25269809 - time (sec): 9.21 - samples/sec: 9480.00 - lr: 0.000028 - momentum: 0.000000
2023-10-18 22:56:27,132 epoch 2 - iter 432/723 - loss 0.24641116 - time (sec): 11.09 - samples/sec: 9490.38 - lr: 0.000028 - momentum: 0.000000
2023-10-18 22:56:28,913 epoch 2 - iter 504/723 - loss 0.24268049 - time (sec): 12.87 - samples/sec: 9567.70 - lr: 0.000028 - momentum: 0.000000
2023-10-18 22:56:30,665 epoch 2 - iter 576/723 - loss 0.24535974 - time (sec): 14.63 - samples/sec: 9572.90 - lr: 0.000027 - momentum: 0.000000
2023-10-18 22:56:32,433 epoch 2 - iter 648/723 - loss 0.24136581 - time (sec): 16.39 - samples/sec: 9638.28 - lr: 0.000027 - momentum: 0.000000
2023-10-18 22:56:34,226 epoch 2 - iter 720/723 - loss 0.24079891 - time (sec): 18.19 - samples/sec: 9647.42 - lr: 0.000027 - momentum: 0.000000
2023-10-18 22:56:34,291 ----------------------------------------------------------------------------------------------------
2023-10-18 22:56:34,291 EPOCH 2 done: loss 0.2406 - lr: 0.000027
2023-10-18 22:56:36,363 DEV : loss 0.2568000555038452 - f1-score (micro avg) 0.1136
2023-10-18 22:56:36,378 saving best model
2023-10-18 22:56:36,409 ----------------------------------------------------------------------------------------------------
2023-10-18 22:56:38,173 epoch 3 - iter 72/723 - loss 0.20902337 - time (sec): 1.76 - samples/sec: 10632.39 - lr: 0.000026 - momentum: 0.000000
2023-10-18 22:56:40,014 epoch 3 - iter 144/723 - loss 0.21487658 - time (sec): 3.60 - samples/sec: 10443.35 - lr: 0.000026 - momentum: 0.000000
2023-10-18 22:56:41,785 epoch 3 - iter 216/723 - loss 0.20843859 - time (sec): 5.38 - samples/sec: 10261.69 - lr: 0.000026 - momentum: 0.000000
2023-10-18 22:56:43,590 epoch 3 - iter 288/723 - loss 0.20311414 - time (sec): 7.18 - samples/sec: 10178.10 - lr: 0.000025 - momentum: 0.000000
2023-10-18 22:56:45,423 epoch 3 - iter 360/723 - loss 0.20319710 - time (sec): 9.01 - samples/sec: 10025.50 - lr: 0.000025 - momentum: 0.000000
2023-10-18 22:56:47,180 epoch 3 - iter 432/723 - loss 0.20226321 - time (sec): 10.77 - samples/sec: 9889.11 - lr: 0.000025 - momentum: 0.000000
2023-10-18 22:56:48,993 epoch 3 - iter 504/723 - loss 0.20421469 - time (sec): 12.58 - samples/sec: 9890.71 - lr: 0.000024 - momentum: 0.000000
2023-10-18 22:56:50,707 epoch 3 - iter 576/723 - loss 0.20341312 - time (sec): 14.30 - samples/sec: 9879.51 - lr: 0.000024 - momentum: 0.000000
2023-10-18 22:56:52,498 epoch 3 - iter 648/723 - loss 0.19940795 - time (sec): 16.09 - samples/sec: 9891.77 - lr: 0.000024 - momentum: 0.000000
2023-10-18 22:56:54,232 epoch 3 - iter 720/723 - loss 0.19888487 - time (sec): 17.82 - samples/sec: 9845.70 - lr: 0.000023 - momentum: 0.000000
2023-10-18 22:56:54,305 ----------------------------------------------------------------------------------------------------
2023-10-18 22:56:54,305 EPOCH 3 done: loss 0.1986 - lr: 0.000023
2023-10-18 22:56:56,067 DEV : loss 0.23752760887145996 - f1-score (micro avg) 0.2402
2023-10-18 22:56:56,081 saving best model
2023-10-18 22:56:56,117 ----------------------------------------------------------------------------------------------------
2023-10-18 22:56:57,864 epoch 4 - iter 72/723 - loss 0.18226854 - time (sec): 1.75 - samples/sec: 9748.99 - lr: 0.000023 - momentum: 0.000000
2023-10-18 22:56:59,625 epoch 4 - iter 144/723 - loss 0.17750412 - time (sec): 3.51 - samples/sec: 9415.68 - lr: 0.000023 - momentum: 0.000000
2023-10-18 22:57:01,393 epoch 4 - iter 216/723 - loss 0.18220143 - time (sec): 5.28 - samples/sec: 9528.19 - lr: 0.000022 - momentum: 0.000000
2023-10-18 22:57:03,194 epoch 4 - iter 288/723 - loss 0.18036256 - time (sec): 7.08 - samples/sec: 9602.73 - lr: 0.000022 - momentum: 0.000000
2023-10-18 22:57:04,967 epoch 4 - iter 360/723 - loss 0.18084117 - time (sec): 8.85 - samples/sec: 9642.48 - lr: 0.000022 - momentum: 0.000000
2023-10-18 22:57:06,777 epoch 4 - iter 432/723 - loss 0.18157715 - time (sec): 10.66 - samples/sec: 9629.56 - lr: 0.000021 - momentum: 0.000000
2023-10-18 22:57:08,656 epoch 4 - iter 504/723 - loss 0.17979702 - time (sec): 12.54 - samples/sec: 9608.03 - lr: 0.000021 - momentum: 0.000000
2023-10-18 22:57:10,668 epoch 4 - iter 576/723 - loss 0.18168708 - time (sec): 14.55 - samples/sec: 9652.05 - lr: 0.000021 - momentum: 0.000000
2023-10-18 22:57:12,484 epoch 4 - iter 648/723 - loss 0.18042569 - time (sec): 16.37 - samples/sec: 9619.66 - lr: 0.000020 - momentum: 0.000000
2023-10-18 22:57:14,305 epoch 4 - iter 720/723 - loss 0.18043801 - time (sec): 18.19 - samples/sec: 9660.63 - lr: 0.000020 - momentum: 0.000000
2023-10-18 22:57:14,373 ----------------------------------------------------------------------------------------------------
2023-10-18 22:57:14,373 EPOCH 4 done: loss 0.1804 - lr: 0.000020
2023-10-18 22:57:16,480 DEV : loss 0.21891361474990845 - f1-score (micro avg) 0.3434
2023-10-18 22:57:16,495 saving best model
2023-10-18 22:57:16,531 ----------------------------------------------------------------------------------------------------
2023-10-18 22:57:18,323 epoch 5 - iter 72/723 - loss 0.17790732 - time (sec): 1.79 - samples/sec: 9958.89 - lr: 0.000020 - momentum: 0.000000
2023-10-18 22:57:20,100 epoch 5 - iter 144/723 - loss 0.17861548 - time (sec): 3.57 - samples/sec: 10105.93 - lr: 0.000019 - momentum: 0.000000
2023-10-18 22:57:21,842 epoch 5 - iter 216/723 - loss 0.17749356 - time (sec): 5.31 - samples/sec: 10116.70 - lr: 0.000019 - momentum: 0.000000
2023-10-18 22:57:23,657 epoch 5 - iter 288/723 - loss 0.17769605 - time (sec): 7.13 - samples/sec: 9890.75 - lr: 0.000019 - momentum: 0.000000
2023-10-18 22:57:25,353 epoch 5 - iter 360/723 - loss 0.17341481 - time (sec): 8.82 - samples/sec: 9828.62 - lr: 0.000018 - momentum: 0.000000
2023-10-18 22:57:27,135 epoch 5 - iter 432/723 - loss 0.17512851 - time (sec): 10.60 - samples/sec: 9813.50 - lr: 0.000018 - momentum: 0.000000
2023-10-18 22:57:28,937 epoch 5 - iter 504/723 - loss 0.17373162 - time (sec): 12.40 - samples/sec: 9795.38 - lr: 0.000018 - momentum: 0.000000
2023-10-18 22:57:30,802 epoch 5 - iter 576/723 - loss 0.17323872 - time (sec): 14.27 - samples/sec: 9816.46 - lr: 0.000017 - momentum: 0.000000
2023-10-18 22:57:32,548 epoch 5 - iter 648/723 - loss 0.17259816 - time (sec): 16.02 - samples/sec: 9833.06 - lr: 0.000017 - momentum: 0.000000
2023-10-18 22:57:34,416 epoch 5 - iter 720/723 - loss 0.17201555 - time (sec): 17.88 - samples/sec: 9836.04 - lr: 0.000017 - momentum: 0.000000
2023-10-18 22:57:34,471 ----------------------------------------------------------------------------------------------------
2023-10-18 22:57:34,471 EPOCH 5 done: loss 0.1721 - lr: 0.000017
2023-10-18 22:57:36,235 DEV : loss 0.21598245203495026 - f1-score (micro avg) 0.3603
2023-10-18 22:57:36,250 saving best model
2023-10-18 22:57:36,284 ----------------------------------------------------------------------------------------------------
2023-10-18 22:57:38,054 epoch 6 - iter 72/723 - loss 0.16484753 - time (sec): 1.77 - samples/sec: 9347.04 - lr: 0.000016 - momentum: 0.000000
2023-10-18 22:57:39,819 epoch 6 - iter 144/723 - loss 0.15606069 - time (sec): 3.53 - samples/sec: 9582.22 - lr: 0.000016 - momentum: 0.000000
2023-10-18 22:57:41,637 epoch 6 - iter 216/723 - loss 0.15864873 - time (sec): 5.35 - samples/sec: 9773.71 - lr: 0.000016 - momentum: 0.000000
2023-10-18 22:57:43,473 epoch 6 - iter 288/723 - loss 0.15870321 - time (sec): 7.19 - samples/sec: 9828.73 - lr: 0.000015 - momentum: 0.000000
2023-10-18 22:57:45,196 epoch 6 - iter 360/723 - loss 0.16234323 - time (sec): 8.91 - samples/sec: 9815.47 - lr: 0.000015 - momentum: 0.000000
2023-10-18 22:57:46,945 epoch 6 - iter 432/723 - loss 0.15856690 - time (sec): 10.66 - samples/sec: 9844.04 - lr: 0.000015 - momentum: 0.000000
2023-10-18 22:57:48,750 epoch 6 - iter 504/723 - loss 0.16153975 - time (sec): 12.47 - samples/sec: 9746.37 - lr: 0.000014 - momentum: 0.000000
2023-10-18 22:57:50,516 epoch 6 - iter 576/723 - loss 0.15864405 - time (sec): 14.23 - samples/sec: 9779.22 - lr: 0.000014 - momentum: 0.000000
2023-10-18 22:57:52,302 epoch 6 - iter 648/723 - loss 0.16169816 - time (sec): 16.02 - samples/sec: 9789.70 - lr: 0.000014 - momentum: 0.000000
2023-10-18 22:57:54,492 epoch 6 - iter 720/723 - loss 0.16273531 - time (sec): 18.21 - samples/sec: 9651.21 - lr: 0.000013 - momentum: 0.000000
2023-10-18 22:57:54,553 ----------------------------------------------------------------------------------------------------
2023-10-18 22:57:54,553 EPOCH 6 done: loss 0.1627 - lr: 0.000013
2023-10-18 22:57:56,329 DEV : loss 0.20462197065353394 - f1-score (micro avg) 0.4453
2023-10-18 22:57:56,343 saving best model
2023-10-18 22:57:56,380 ----------------------------------------------------------------------------------------------------
2023-10-18 22:57:58,120 epoch 7 - iter 72/723 - loss 0.15662991 - time (sec): 1.74 - samples/sec: 9472.72 - lr: 0.000013 - momentum: 0.000000
2023-10-18 22:57:59,877 epoch 7 - iter 144/723 - loss 0.15924704 - time (sec): 3.50 - samples/sec: 9819.69 - lr: 0.000013 - momentum: 0.000000
2023-10-18 22:58:01,664 epoch 7 - iter 216/723 - loss 0.16189171 - time (sec): 5.28 - samples/sec: 9682.85 - lr: 0.000012 - momentum: 0.000000
2023-10-18 22:58:03,519 epoch 7 - iter 288/723 - loss 0.16308365 - time (sec): 7.14 - samples/sec: 9742.21 - lr: 0.000012 - momentum: 0.000000
2023-10-18 22:58:05,279 epoch 7 - iter 360/723 - loss 0.15997371 - time (sec): 8.90 - samples/sec: 9728.59 - lr: 0.000012 - momentum: 0.000000
2023-10-18 22:58:07,107 epoch 7 - iter 432/723 - loss 0.15929076 - time (sec): 10.73 - samples/sec: 9703.79 - lr: 0.000011 - momentum: 0.000000
2023-10-18 22:58:08,914 epoch 7 - iter 504/723 - loss 0.15943523 - time (sec): 12.53 - samples/sec: 9763.34 - lr: 0.000011 - momentum: 0.000000
2023-10-18 22:58:10,735 epoch 7 - iter 576/723 - loss 0.15929299 - time (sec): 14.35 - samples/sec: 9753.67 - lr: 0.000011 - momentum: 0.000000
2023-10-18 22:58:12,561 epoch 7 - iter 648/723 - loss 0.16055045 - time (sec): 16.18 - samples/sec: 9764.87 - lr: 0.000010 - momentum: 0.000000
2023-10-18 22:58:14,285 epoch 7 - iter 720/723 - loss 0.15893188 - time (sec): 17.90 - samples/sec: 9809.31 - lr: 0.000010 - momentum: 0.000000
2023-10-18 22:58:14,357 ----------------------------------------------------------------------------------------------------
2023-10-18 22:58:14,357 EPOCH 7 done: loss 0.1588 - lr: 0.000010
2023-10-18 22:58:16,151 DEV : loss 0.20282398164272308 - f1-score (micro avg) 0.4363
2023-10-18 22:58:16,166 ----------------------------------------------------------------------------------------------------
2023-10-18 22:58:17,915 epoch 8 - iter 72/723 - loss 0.17813877 - time (sec): 1.75 - samples/sec: 9433.30 - lr: 0.000010 - momentum: 0.000000
2023-10-18 22:58:19,745 epoch 8 - iter 144/723 - loss 0.16727864 - time (sec): 3.58 - samples/sec: 9705.54 - lr: 0.000009 - momentum: 0.000000
2023-10-18 22:58:21,556 epoch 8 - iter 216/723 - loss 0.16087750 - time (sec): 5.39 - samples/sec: 9840.08 - lr: 0.000009 - momentum: 0.000000
2023-10-18 22:58:23,697 epoch 8 - iter 288/723 - loss 0.16062473 - time (sec): 7.53 - samples/sec: 9269.48 - lr: 0.000009 - momentum: 0.000000
2023-10-18 22:58:25,503 epoch 8 - iter 360/723 - loss 0.15637198 - time (sec): 9.34 - samples/sec: 9412.21 - lr: 0.000008 - momentum: 0.000000
2023-10-18 22:58:27,293 epoch 8 - iter 432/723 - loss 0.15307484 - time (sec): 11.13 - samples/sec: 9499.23 - lr: 0.000008 - momentum: 0.000000
2023-10-18 22:58:29,140 epoch 8 - iter 504/723 - loss 0.15285181 - time (sec): 12.97 - samples/sec: 9527.96 - lr: 0.000008 - momentum: 0.000000
2023-10-18 22:58:30,949 epoch 8 - iter 576/723 - loss 0.15081701 - time (sec): 14.78 - samples/sec: 9483.93 - lr: 0.000007 - momentum: 0.000000
2023-10-18 22:58:32,695 epoch 8 - iter 648/723 - loss 0.15212614 - time (sec): 16.53 - samples/sec: 9541.68 - lr: 0.000007 - momentum: 0.000000
2023-10-18 22:58:34,593 epoch 8 - iter 720/723 - loss 0.15293930 - time (sec): 18.43 - samples/sec: 9539.14 - lr: 0.000007 - momentum: 0.000000
2023-10-18 22:58:34,658 ----------------------------------------------------------------------------------------------------
2023-10-18 22:58:34,658 EPOCH 8 done: loss 0.1528 - lr: 0.000007
2023-10-18 22:58:36,427 DEV : loss 0.20108488202095032 - f1-score (micro avg) 0.4437
2023-10-18 22:58:36,441 ----------------------------------------------------------------------------------------------------
2023-10-18 22:58:38,298 epoch 9 - iter 72/723 - loss 0.15671408 - time (sec): 1.86 - samples/sec: 9684.00 - lr: 0.000006 - momentum: 0.000000
2023-10-18 22:58:40,047 epoch 9 - iter 144/723 - loss 0.15606990 - time (sec): 3.61 - samples/sec: 9977.58 - lr: 0.000006 - momentum: 0.000000
2023-10-18 22:58:41,672 epoch 9 - iter 216/723 - loss 0.14775714 - time (sec): 5.23 - samples/sec: 10078.84 - lr: 0.000006 - momentum: 0.000000
2023-10-18 22:58:43,185 epoch 9 - iter 288/723 - loss 0.14736930 - time (sec): 6.74 - samples/sec: 10426.94 - lr: 0.000005 - momentum: 0.000000
2023-10-18 22:58:44,941 epoch 9 - iter 360/723 - loss 0.14937082 - time (sec): 8.50 - samples/sec: 10387.93 - lr: 0.000005 - momentum: 0.000000
2023-10-18 22:58:46,773 epoch 9 - iter 432/723 - loss 0.14931943 - time (sec): 10.33 - samples/sec: 10324.00 - lr: 0.000005 - momentum: 0.000000
2023-10-18 22:58:48,632 epoch 9 - iter 504/723 - loss 0.15130844 - time (sec): 12.19 - samples/sec: 10210.11 - lr: 0.000004 - momentum: 0.000000
2023-10-18 22:58:50,375 epoch 9 - iter 576/723 - loss 0.15312491 - time (sec): 13.93 - samples/sec: 10138.98 - lr: 0.000004 - momentum: 0.000000
2023-10-18 22:58:52,141 epoch 9 - iter 648/723 - loss 0.15285247 - time (sec): 15.70 - samples/sec: 10116.75 - lr: 0.000004 - momentum: 0.000000
2023-10-18 22:58:53,970 epoch 9 - iter 720/723 - loss 0.15229974 - time (sec): 17.53 - samples/sec: 10023.57 - lr: 0.000003 - momentum: 0.000000
2023-10-18 22:58:54,038 ----------------------------------------------------------------------------------------------------
2023-10-18 22:58:54,038 EPOCH 9 done: loss 0.1524 - lr: 0.000003
2023-10-18 22:58:55,801 DEV : loss 0.19397485256195068 - f1-score (micro avg) 0.4627
2023-10-18 22:58:55,816 saving best model
2023-10-18 22:58:55,853 ----------------------------------------------------------------------------------------------------
2023-10-18 22:58:57,695 epoch 10 - iter 72/723 - loss 0.15773780 - time (sec): 1.84 - samples/sec: 9363.32 - lr: 0.000003 - momentum: 0.000000
2023-10-18 22:58:59,247 epoch 10 - iter 144/723 - loss 0.16358437 - time (sec): 3.39 - samples/sec: 10063.93 - lr: 0.000003 - momentum: 0.000000
2023-10-18 22:59:00,997 epoch 10 - iter 216/723 - loss 0.16315365 - time (sec): 5.14 - samples/sec: 10100.77 - lr: 0.000002 - momentum: 0.000000
2023-10-18 22:59:02,867 epoch 10 - iter 288/723 - loss 0.15797071 - time (sec): 7.01 - samples/sec: 10138.78 - lr: 0.000002 - momentum: 0.000000
2023-10-18 22:59:04,640 epoch 10 - iter 360/723 - loss 0.15242034 - time (sec): 8.79 - samples/sec: 10132.19 - lr: 0.000002 - momentum: 0.000000
2023-10-18 22:59:06,374 epoch 10 - iter 432/723 - loss 0.15330741 - time (sec): 10.52 - samples/sec: 10040.08 - lr: 0.000001 - momentum: 0.000000
2023-10-18 22:59:08,160 epoch 10 - iter 504/723 - loss 0.14940519 - time (sec): 12.31 - samples/sec: 10018.59 - lr: 0.000001 - momentum: 0.000000
2023-10-18 22:59:10,041 epoch 10 - iter 576/723 - loss 0.14798129 - time (sec): 14.19 - samples/sec: 9994.37 - lr: 0.000001 - momentum: 0.000000
2023-10-18 22:59:11,814 epoch 10 - iter 648/723 - loss 0.14933500 - time (sec): 15.96 - samples/sec: 9919.85 - lr: 0.000000 - momentum: 0.000000
2023-10-18 22:59:13,565 epoch 10 - iter 720/723 - loss 0.15018807 - time (sec): 17.71 - samples/sec: 9915.89 - lr: 0.000000 - momentum: 0.000000
2023-10-18 22:59:13,636 ----------------------------------------------------------------------------------------------------
2023-10-18 22:59:13,636 EPOCH 10 done: loss 0.1503 - lr: 0.000000
2023-10-18 22:59:15,409 DEV : loss 0.19576019048690796 - f1-score (micro avg) 0.4566
2023-10-18 22:59:15,453 ----------------------------------------------------------------------------------------------------
2023-10-18 22:59:15,453 Loading model from best epoch ...
2023-10-18 22:59:15,532 SequenceTagger predicts: Dictionary with 13 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG
2023-10-18 22:59:16,857 Results:
- F-score (micro) 0.4758
- F-score (macro) 0.3265
- Accuracy 0.322

By class:
              precision    recall  f1-score   support

         LOC     0.5332    0.5437    0.5384       458
         PER     0.6653    0.3299    0.4411       482
         ORG     0.0000    0.0000    0.0000        69

   micro avg     0.5779    0.4044    0.4758      1009
   macro avg     0.3995    0.2912    0.3265      1009
weighted avg     0.5598    0.4044    0.4551      1009

2023-10-18 22:59:16,857 ----------------------------------------------------------------------------------------------------