2023-10-18 22:19:42,151 ----------------------------------------------------------------------------------------------------
2023-10-18 22:19:42,151 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 128)
        (position_embeddings): Embedding(512, 128)
        (token_type_embeddings): Embedding(2, 128)
        (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-1): 2 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=128, out_features=128, bias=True)
                (key): Linear(in_features=128, out_features=128, bias=True)
                (value): Linear(in_features=128, out_features=128, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=128, out_features=128, bias=True)
                (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=128, out_features=512, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=512, out_features=128, bias=True)
              (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=128, out_features=128, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=128, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-18 22:19:42,151 ----------------------------------------------------------------------------------------------------
2023-10-18 22:19:42,151 MultiCorpus: 5777 train + 722 dev + 723 test sentences
 - NER_ICDAR_EUROPEANA Corpus: 5777 train + 722 dev + 723 test sentences - /root/.flair/datasets/ner_icdar_europeana/nl
2023-10-18 22:19:42,152 ----------------------------------------------------------------------------------------------------
2023-10-18 22:19:42,152 Train: 5777 sentences
2023-10-18 22:19:42,152 (train_with_dev=False, train_with_test=False)
2023-10-18 22:19:42,152 ----------------------------------------------------------------------------------------------------
2023-10-18 22:19:42,152 Training Params:
2023-10-18 22:19:42,152 - learning_rate: "5e-05"
2023-10-18 22:19:42,152 - mini_batch_size: "4"
2023-10-18 22:19:42,152 - max_epochs: "10"
2023-10-18 22:19:42,152 - shuffle: "True"
2023-10-18 22:19:42,152 ----------------------------------------------------------------------------------------------------
2023-10-18 22:19:42,152 Plugins:
2023-10-18 22:19:42,152 - TensorboardLogger
2023-10-18 22:19:42,152 - LinearScheduler | warmup_fraction: '0.1'
2023-10-18 22:19:42,152 ----------------------------------------------------------------------------------------------------
2023-10-18 22:19:42,152 Final evaluation on model from best epoch (best-model.pt)
2023-10-18 22:19:42,152 - metric: "('micro avg', 'f1-score')"
2023-10-18 22:19:42,152 ----------------------------------------------------------------------------------------------------
2023-10-18 22:19:42,152 Computation:
2023-10-18 22:19:42,152 - compute on device: cuda:0
2023-10-18 22:19:42,152 - embedding storage: none
2023-10-18 22:19:42,152 ----------------------------------------------------------------------------------------------------
2023-10-18 22:19:42,152 Model training base path: "hmbench-icdar/nl-dbmdz/bert-tiny-historic-multilingual-cased-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-2"
2023-10-18 22:19:42,152 ----------------------------------------------------------------------------------------------------
2023-10-18 22:19:42,152 ----------------------------------------------------------------------------------------------------
2023-10-18 22:19:42,152 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-18 22:19:44,520 epoch 1 - iter 144/1445 - loss 2.32460041 - time (sec): 2.37 - samples/sec: 7843.84 - lr: 0.000005 - momentum: 0.000000
2023-10-18 22:19:46,876 epoch 1 - iter 288/1445 - loss 1.92642473 - time (sec): 4.72 - samples/sec: 7464.80 - lr: 0.000010 - momentum: 0.000000
2023-10-18 22:19:49,279 epoch 1 - iter 432/1445 - loss 1.48629302 - time (sec): 7.13 - samples/sec: 7314.71 - lr: 0.000015 - momentum: 0.000000
2023-10-18 22:19:51,709 epoch 1 - iter 576/1445 - loss 1.18873264 - time (sec): 9.56 - samples/sec: 7369.69 - lr: 0.000020 - momentum: 0.000000
2023-10-18 22:19:54,136 epoch 1 - iter 720/1445 - loss 1.01230605 - time (sec): 11.98 - samples/sec: 7321.67 - lr: 0.000025 - momentum: 0.000000
2023-10-18 22:19:56,577 epoch 1 - iter 864/1445 - loss 0.89552495 - time (sec): 14.42 - samples/sec: 7303.88 - lr: 0.000030 - momentum: 0.000000
2023-10-18 22:19:59,023 epoch 1 - iter 1008/1445 - loss 0.80131175 - time (sec): 16.87 - samples/sec: 7321.53 - lr: 0.000035 - momentum: 0.000000
2023-10-18 22:20:01,418 epoch 1 - iter 1152/1445 - loss 0.73544654 - time (sec): 19.26 - samples/sec: 7307.75 - lr: 0.000040 - momentum: 0.000000
2023-10-18 22:20:03,856 epoch 1 - iter 1296/1445 - loss 0.67844139 - time (sec): 21.70 - samples/sec: 7311.07 - lr: 0.000045 - momentum: 0.000000
2023-10-18 22:20:06,242 epoch 1 - iter 1440/1445 - loss 0.63343893 - time (sec): 24.09 - samples/sec: 7293.28 - lr: 0.000050 - momentum: 0.000000
2023-10-18 22:20:06,318 ----------------------------------------------------------------------------------------------------
2023-10-18 22:20:06,318 EPOCH 1 done: loss 0.6319 - lr: 0.000050
2023-10-18 22:20:07,617 DEV : loss 0.2580135762691498 - f1-score (micro avg) 0.0321
2023-10-18 22:20:07,631 saving best model
2023-10-18 22:20:07,663 ----------------------------------------------------------------------------------------------------
2023-10-18 22:20:10,040 epoch 2 - iter 144/1445 - loss 0.25850124 - time (sec): 2.38 - samples/sec: 6901.52 - lr: 0.000049 - momentum: 0.000000
2023-10-18 22:20:12,456 epoch 2 - iter 288/1445 - loss 0.21607017 - time (sec): 4.79 - samples/sec: 7204.82 - lr: 0.000049 - momentum: 0.000000
2023-10-18 22:20:14,896 epoch 2 - iter 432/1445 - loss 0.21285833 - time (sec): 7.23 - samples/sec: 7316.52 - lr: 0.000048 - momentum: 0.000000
2023-10-18 22:20:17,318 epoch 2 - iter 576/1445 - loss 0.20677689 - time (sec): 9.65 - samples/sec: 7313.33 - lr: 0.000048 - momentum: 0.000000
2023-10-18 22:20:19,730 epoch 2 - iter 720/1445 - loss 0.20570996 - time (sec): 12.07 - samples/sec: 7274.59 - lr: 0.000047 - momentum: 0.000000
2023-10-18 22:20:22,104 epoch 2 - iter 864/1445 - loss 0.20513499 - time (sec): 14.44 - samples/sec: 7277.59 - lr: 0.000047 - momentum: 0.000000
2023-10-18 22:20:24,416 epoch 2 - iter 1008/1445 - loss 0.20635190 - time (sec): 16.75 - samples/sec: 7260.09 - lr: 0.000046 - momentum: 0.000000
2023-10-18 22:20:26,712 epoch 2 - iter 1152/1445 - loss 0.20450060 - time (sec): 19.05 - samples/sec: 7370.36 - lr: 0.000046 - momentum: 0.000000
2023-10-18 22:20:29,099 epoch 2 - iter 1296/1445 - loss 0.20474662 - time (sec): 21.43 - samples/sec: 7327.74 - lr: 0.000045 - momentum: 0.000000
2023-10-18 22:20:31,543 epoch 2 - iter 1440/1445 - loss 0.20185081 - time (sec): 23.88 - samples/sec: 7358.25 - lr: 0.000044 - momentum: 0.000000
2023-10-18 22:20:31,615 ----------------------------------------------------------------------------------------------------
2023-10-18 22:20:31,615 EPOCH 2 done: loss 0.2018 - lr: 0.000044
2023-10-18 22:20:33,720 DEV : loss 0.22160635888576508 - f1-score (micro avg) 0.4189
2023-10-18 22:20:33,734 saving best model
2023-10-18 22:20:33,772 ----------------------------------------------------------------------------------------------------
2023-10-18 22:20:36,151 epoch 3 - iter 144/1445 - loss 0.17588072 - time (sec): 2.38 - samples/sec: 7116.26 - lr: 0.000044 - momentum: 0.000000
2023-10-18 22:20:38,467 epoch 3 - iter 288/1445 - loss 0.18389477 - time (sec): 4.69 - samples/sec: 7249.76 - lr: 0.000043 - momentum: 0.000000
2023-10-18 22:20:41,000 epoch 3 - iter 432/1445 - loss 0.18625268 - time (sec): 7.23 - samples/sec: 7422.60 - lr: 0.000043 - momentum: 0.000000
2023-10-18 22:20:43,401 epoch 3 - iter 576/1445 - loss 0.18382187 - time (sec): 9.63 - samples/sec: 7308.74 - lr: 0.000042 - momentum: 0.000000
2023-10-18 22:20:45,750 epoch 3 - iter 720/1445 - loss 0.18063974 - time (sec): 11.98 - samples/sec: 7350.99 - lr: 0.000042 - momentum: 0.000000
2023-10-18 22:20:48,188 epoch 3 - iter 864/1445 - loss 0.17872188 - time (sec): 14.42 - samples/sec: 7299.23 - lr: 0.000041 - momentum: 0.000000
2023-10-18 22:20:50,696 epoch 3 - iter 1008/1445 - loss 0.17759296 - time (sec): 16.92 - samples/sec: 7346.30 - lr: 0.000041 - momentum: 0.000000
2023-10-18 22:20:53,006 epoch 3 - iter 1152/1445 - loss 0.17684439 - time (sec): 19.23 - samples/sec: 7314.90 - lr: 0.000040 - momentum: 0.000000
2023-10-18 22:20:55,421 epoch 3 - iter 1296/1445 - loss 0.17496314 - time (sec): 21.65 - samples/sec: 7340.69 - lr: 0.000039 - momentum: 0.000000
2023-10-18 22:20:57,832 epoch 3 - iter 1440/1445 - loss 0.17443954 - time (sec): 24.06 - samples/sec: 7302.03 - lr: 0.000039 - momentum: 0.000000
2023-10-18 22:20:57,909 ----------------------------------------------------------------------------------------------------
2023-10-18 22:20:57,909 EPOCH 3 done: loss 0.1743 - lr: 0.000039
2023-10-18 22:20:59,671 DEV : loss 0.24006225168704987 - f1-score (micro avg) 0.3782
2023-10-18 22:20:59,685 ----------------------------------------------------------------------------------------------------
2023-10-18 22:21:02,053 epoch 4 - iter 144/1445 - loss 0.15704023 - time (sec): 2.37 - samples/sec: 7426.25 - lr: 0.000038 - momentum: 0.000000
2023-10-18 22:21:04,482 epoch 4 - iter 288/1445 - loss 0.14220206 - time (sec): 4.80 - samples/sec: 7531.94 - lr: 0.000038 - momentum: 0.000000
2023-10-18 22:21:06,890 epoch 4 - iter 432/1445 - loss 0.14888125 - time (sec): 7.20 - samples/sec: 7379.14 - lr: 0.000037 - momentum: 0.000000
2023-10-18 22:21:09,305 epoch 4 - iter 576/1445 - loss 0.14940799 - time (sec): 9.62 - samples/sec: 7363.11 - lr: 0.000037 - momentum: 0.000000
2023-10-18 22:21:11,606 epoch 4 - iter 720/1445 - loss 0.15192125 - time (sec): 11.92 - samples/sec: 7313.30 - lr: 0.000036 - momentum: 0.000000
2023-10-18 22:21:13,981 epoch 4 - iter 864/1445 - loss 0.15366328 - time (sec): 14.30 - samples/sec: 7376.79 - lr: 0.000036 - momentum: 0.000000
2023-10-18 22:21:16,386 epoch 4 - iter 1008/1445 - loss 0.15610653 - time (sec): 16.70 - samples/sec: 7371.99 - lr: 0.000035 - momentum: 0.000000
2023-10-18 22:21:18,758 epoch 4 - iter 1152/1445 - loss 0.15517198 - time (sec): 19.07 - samples/sec: 7361.65 - lr: 0.000034 - momentum: 0.000000
2023-10-18 22:21:20,878 epoch 4 - iter 1296/1445 - loss 0.15593427 - time (sec): 21.19 - samples/sec: 7424.63 - lr: 0.000034 - momentum: 0.000000
2023-10-18 22:21:23,147 epoch 4 - iter 1440/1445 - loss 0.15639925 - time (sec): 23.46 - samples/sec: 7485.62 - lr: 0.000033 - momentum: 0.000000
2023-10-18 22:21:23,246 ----------------------------------------------------------------------------------------------------
2023-10-18 22:21:23,246 EPOCH 4 done: loss 0.1564 - lr: 0.000033
2023-10-18 22:21:25,049 DEV : loss 0.18633760511875153 - f1-score (micro avg) 0.4766
2023-10-18 22:21:25,065 saving best model
2023-10-18 22:21:25,102 ----------------------------------------------------------------------------------------------------
2023-10-18 22:21:27,566 epoch 5 - iter 144/1445 - loss 0.14453777 - time (sec): 2.46 - samples/sec: 6981.73 - lr: 0.000033 - momentum: 0.000000
2023-10-18 22:21:30,047 epoch 5 - iter 288/1445 - loss 0.13657981 - time (sec): 4.94 - samples/sec: 7330.13 - lr: 0.000032 - momentum: 0.000000
2023-10-18 22:21:32,419 epoch 5 - iter 432/1445 - loss 0.13339998 - time (sec): 7.32 - samples/sec: 7300.07 - lr: 0.000032 - momentum: 0.000000
2023-10-18 22:21:34,892 epoch 5 - iter 576/1445 - loss 0.13696365 - time (sec): 9.79 - samples/sec: 7313.46 - lr: 0.000031 - momentum: 0.000000
2023-10-18 22:21:37,351 epoch 5 - iter 720/1445 - loss 0.13780642 - time (sec): 12.25 - samples/sec: 7348.95 - lr: 0.000031 - momentum: 0.000000
2023-10-18 22:21:39,733 epoch 5 - iter 864/1445 - loss 0.13896165 - time (sec): 14.63 - samples/sec: 7378.00 - lr: 0.000030 - momentum: 0.000000
2023-10-18 22:21:42,160 epoch 5 - iter 1008/1445 - loss 0.13873270 - time (sec): 17.06 - samples/sec: 7319.29 - lr: 0.000029 - momentum: 0.000000
2023-10-18 22:21:44,629 epoch 5 - iter 1152/1445 - loss 0.14038297 - time (sec): 19.53 - samples/sec: 7340.56 - lr: 0.000029 - momentum: 0.000000
2023-10-18 22:21:46,979 epoch 5 - iter 1296/1445 - loss 0.14262285 - time (sec): 21.88 - samples/sec: 7313.89 - lr: 0.000028 - momentum: 0.000000
2023-10-18 22:21:49,338 epoch 5 - iter 1440/1445 - loss 0.14218793 - time (sec): 24.24 - samples/sec: 7241.77 - lr: 0.000028 - momentum: 0.000000
2023-10-18 22:21:49,415 ----------------------------------------------------------------------------------------------------
2023-10-18 22:21:49,416 EPOCH 5 done: loss 0.1420 - lr: 0.000028
2023-10-18 22:21:51,587 DEV : loss 0.1872384250164032 - f1-score (micro avg) 0.4926
2023-10-18 22:21:51,602 saving best model
2023-10-18 22:21:51,637 ----------------------------------------------------------------------------------------------------
2023-10-18 22:21:54,062 epoch 6 - iter 144/1445 - loss 0.14976542 - time (sec): 2.42 - samples/sec: 7553.71 - lr: 0.000027 - momentum: 0.000000
2023-10-18 22:21:56,442 epoch 6 - iter 288/1445 - loss 0.14270623 - time (sec): 4.80 - samples/sec: 7364.78 - lr: 0.000027 - momentum: 0.000000
2023-10-18 22:21:58,860 epoch 6 - iter 432/1445 - loss 0.14528908 - time (sec): 7.22 - samples/sec: 7275.36 - lr: 0.000026 - momentum: 0.000000
2023-10-18 22:22:01,266 epoch 6 - iter 576/1445 - loss 0.13874679 - time (sec): 9.63 - samples/sec: 7350.00 - lr: 0.000026 - momentum: 0.000000
2023-10-18 22:22:03,716 epoch 6 - iter 720/1445 - loss 0.13598692 - time (sec): 12.08 - samples/sec: 7368.26 - lr: 0.000025 - momentum: 0.000000
2023-10-18 22:22:06,081 epoch 6 - iter 864/1445 - loss 0.13518192 - time (sec): 14.44 - samples/sec: 7323.88 - lr: 0.000024 - momentum: 0.000000
2023-10-18 22:22:08,376 epoch 6 - iter 1008/1445 - loss 0.13600947 - time (sec): 16.74 - samples/sec: 7333.86 - lr: 0.000024 - momentum: 0.000000
2023-10-18 22:22:10,772 epoch 6 - iter 1152/1445 - loss 0.13696181 - time (sec): 19.13 - samples/sec: 7338.30 - lr: 0.000023 - momentum: 0.000000
2023-10-18 22:22:13,239 epoch 6 - iter 1296/1445 - loss 0.13477708 - time (sec): 21.60 - samples/sec: 7319.71 - lr: 0.000023 - momentum: 0.000000
2023-10-18 22:22:15,372 epoch 6 - iter 1440/1445 - loss 0.13372540 - time (sec): 23.73 - samples/sec: 7394.25 - lr: 0.000022 - momentum: 0.000000
2023-10-18 22:22:15,481 ----------------------------------------------------------------------------------------------------
2023-10-18 22:22:15,481 EPOCH 6 done: loss 0.1338 - lr: 0.000022
2023-10-18 22:22:17,258 DEV : loss 0.18421348929405212 - f1-score (micro avg) 0.4972
2023-10-18 22:22:17,272 saving best model
2023-10-18 22:22:17,311 ----------------------------------------------------------------------------------------------------
2023-10-18 22:22:19,730 epoch 7 - iter 144/1445 - loss 0.12880539 - time (sec): 2.42 - samples/sec: 6843.39 - lr: 0.000022 - momentum: 0.000000
2023-10-18 22:22:22,186 epoch 7 - iter 288/1445 - loss 0.12742790 - time (sec): 4.87 - samples/sec: 7312.81 - lr: 0.000021 - momentum: 0.000000
2023-10-18 22:22:24,578 epoch 7 - iter 432/1445 - loss 0.12845746 - time (sec): 7.27 - samples/sec: 7399.15 - lr: 0.000021 - momentum: 0.000000
2023-10-18 22:22:26,964 epoch 7 - iter 576/1445 - loss 0.12798111 - time (sec): 9.65 - samples/sec: 7358.81 - lr: 0.000020 - momentum: 0.000000
2023-10-18 22:22:29,289 epoch 7 - iter 720/1445 - loss 0.13016094 - time (sec): 11.98 - samples/sec: 7385.20 - lr: 0.000019 - momentum: 0.000000
2023-10-18 22:22:31,764 epoch 7 - iter 864/1445 - loss 0.12801634 - time (sec): 14.45 - samples/sec: 7342.76 - lr: 0.000019 - momentum: 0.000000
2023-10-18 22:22:34,149 epoch 7 - iter 1008/1445 - loss 0.12857662 - time (sec): 16.84 - samples/sec: 7337.88 - lr: 0.000018 - momentum: 0.000000
2023-10-18 22:22:36,569 epoch 7 - iter 1152/1445 - loss 0.13054183 - time (sec): 19.26 - samples/sec: 7408.41 - lr: 0.000018 - momentum: 0.000000
2023-10-18 22:22:38,945 epoch 7 - iter 1296/1445 - loss 0.13113824 - time (sec): 21.63 - samples/sec: 7358.19 - lr: 0.000017 - momentum: 0.000000
2023-10-18 22:22:41,336 epoch 7 - iter 1440/1445 - loss 0.12852518 - time (sec): 24.02 - samples/sec: 7317.06 - lr: 0.000017 - momentum: 0.000000
2023-10-18 22:22:41,408 ----------------------------------------------------------------------------------------------------
2023-10-18 22:22:41,408 EPOCH 7 done: loss 0.1285 - lr: 0.000017
2023-10-18 22:22:43,179 DEV : loss 0.19626782834529877 - f1-score (micro avg) 0.502
2023-10-18 22:22:43,193 saving best model
2023-10-18 22:22:43,230 ----------------------------------------------------------------------------------------------------
2023-10-18 22:22:45,753 epoch 8 - iter 144/1445 - loss 0.14102370 - time (sec): 2.52 - samples/sec: 7557.82 - lr: 0.000016 - momentum: 0.000000
2023-10-18 22:22:48,152 epoch 8 - iter 288/1445 - loss 0.13225445 - time (sec): 4.92 - samples/sec: 7426.91 - lr: 0.000016 - momentum: 0.000000
2023-10-18 22:22:50,490 epoch 8 - iter 432/1445 - loss 0.12664049 - time (sec): 7.26 - samples/sec: 7311.35 - lr: 0.000015 - momentum: 0.000000
2023-10-18 22:22:52,853 epoch 8 - iter 576/1445 - loss 0.12982041 - time (sec): 9.62 - samples/sec: 7233.61 - lr: 0.000014 - momentum: 0.000000
2023-10-18 22:22:55,227 epoch 8 - iter 720/1445 - loss 0.12689940 - time (sec): 12.00 - samples/sec: 7186.26 - lr: 0.000014 - momentum: 0.000000
2023-10-18 22:22:57,640 epoch 8 - iter 864/1445 - loss 0.12659690 - time (sec): 14.41 - samples/sec: 7191.10 - lr: 0.000013 - momentum: 0.000000
2023-10-18 22:23:00,061 epoch 8 - iter 1008/1445 - loss 0.12444468 - time (sec): 16.83 - samples/sec: 7235.97 - lr: 0.000013 - momentum: 0.000000
2023-10-18 22:23:02,526 epoch 8 - iter 1152/1445 - loss 0.12377697 - time (sec): 19.29 - samples/sec: 7266.53 - lr: 0.000012 - momentum: 0.000000
2023-10-18 22:23:04,948 epoch 8 - iter 1296/1445 - loss 0.12275030 - time (sec): 21.72 - samples/sec: 7270.44 - lr: 0.000012 - momentum: 0.000000
2023-10-18 22:23:07,360 epoch 8 - iter 1440/1445 - loss 0.12073718 - time (sec): 24.13 - samples/sec: 7281.52 - lr: 0.000011 - momentum: 0.000000
2023-10-18 22:23:07,443 ----------------------------------------------------------------------------------------------------
2023-10-18 22:23:07,443 EPOCH 8 done: loss 0.1209 - lr: 0.000011
2023-10-18 22:23:09,575 DEV : loss 0.19142475724220276 - f1-score (micro avg) 0.5434
2023-10-18 22:23:09,590 saving best model
2023-10-18 22:23:09,625 ----------------------------------------------------------------------------------------------------
2023-10-18 22:23:12,078 epoch 9 - iter 144/1445 - loss 0.10959720 - time (sec): 2.45 - samples/sec: 6974.25 - lr: 0.000011 - momentum: 0.000000
2023-10-18 22:23:14,476 epoch 9 - iter 288/1445 - loss 0.10667037 - time (sec): 4.85 - samples/sec: 7252.19 - lr: 0.000010 - momentum: 0.000000
2023-10-18 22:23:16,919 epoch 9 - iter 432/1445 - loss 0.11178967 - time (sec): 7.29 - samples/sec: 7120.08 - lr: 0.000009 - momentum: 0.000000
2023-10-18 22:23:19,305 epoch 9 - iter 576/1445 - loss 0.11103154 - time (sec): 9.68 - samples/sec: 7277.83 - lr: 0.000009 - momentum: 0.000000
2023-10-18 22:23:21,730 epoch 9 - iter 720/1445 - loss 0.11192122 - time (sec): 12.10 - samples/sec: 7281.87 - lr: 0.000008 - momentum: 0.000000
2023-10-18 22:23:24,086 epoch 9 - iter 864/1445 - loss 0.11451662 - time (sec): 14.46 - samples/sec: 7238.30 - lr: 0.000008 - momentum: 0.000000
2023-10-18 22:23:26,606 epoch 9 - iter 1008/1445 - loss 0.11503792 - time (sec): 16.98 - samples/sec: 7204.60 - lr: 0.000007 - momentum: 0.000000
2023-10-18 22:23:29,063 epoch 9 - iter 1152/1445 - loss 0.11784699 - time (sec): 19.44 - samples/sec: 7246.04 - lr: 0.000007 - momentum: 0.000000
2023-10-18 22:23:31,381 epoch 9 - iter 1296/1445 - loss 0.11854548 - time (sec): 21.75 - samples/sec: 7285.23 - lr: 0.000006 - momentum: 0.000000
2023-10-18 22:23:33,674 epoch 9 - iter 1440/1445 - loss 0.11760527 - time (sec): 24.05 - samples/sec: 7305.87 - lr: 0.000006 - momentum: 0.000000
2023-10-18 22:23:33,747 ----------------------------------------------------------------------------------------------------
2023-10-18 22:23:33,747 EPOCH 9 done: loss 0.1177 - lr: 0.000006
2023-10-18 22:23:35,527 DEV : loss 0.19222453236579895 - f1-score (micro avg) 0.5287
2023-10-18 22:23:35,541 ----------------------------------------------------------------------------------------------------
2023-10-18 22:23:37,961 epoch 10 - iter 144/1445 - loss 0.14194788 - time (sec): 2.42 - samples/sec: 7135.18 - lr: 0.000005 - momentum: 0.000000
2023-10-18 22:23:40,379 epoch 10 - iter 288/1445 - loss 0.12535540 - time (sec): 4.84 - samples/sec: 7232.39 - lr: 0.000004 - momentum: 0.000000
2023-10-18 22:23:42,883 epoch 10 - iter 432/1445 - loss 0.11875922 - time (sec): 7.34 - samples/sec: 7294.08 - lr: 0.000004 - momentum: 0.000000
2023-10-18 22:23:45,300 epoch 10 - iter 576/1445 - loss 0.11839753 - time (sec): 9.76 - samples/sec: 7234.05 - lr: 0.000003 - momentum: 0.000000
2023-10-18 22:23:47,747 epoch 10 - iter 720/1445 - loss 0.11701466 - time (sec): 12.20 - samples/sec: 7228.56 - lr: 0.000003 - momentum: 0.000000
2023-10-18 22:23:50,178 epoch 10 - iter 864/1445 - loss 0.11883233 - time (sec): 14.64 - samples/sec: 7197.45 - lr: 0.000002 - momentum: 0.000000
2023-10-18 22:23:52,601 epoch 10 - iter 1008/1445 - loss 0.11719932 - time (sec): 17.06 - samples/sec: 7194.57 - lr: 0.000002 - momentum: 0.000000
2023-10-18 22:23:55,010 epoch 10 - iter 1152/1445 - loss 0.11486660 - time (sec): 19.47 - samples/sec: 7252.97 - lr: 0.000001 - momentum: 0.000000
2023-10-18 22:23:57,481 epoch 10 - iter 1296/1445 - loss 0.11531310 - time (sec): 21.94 - samples/sec: 7272.64 - lr: 0.000001 - momentum: 0.000000
2023-10-18 22:23:59,863 epoch 10 - iter 1440/1445 - loss 0.11535435 - time (sec): 24.32 - samples/sec: 7227.50 - lr: 0.000000 - momentum: 0.000000
2023-10-18 22:23:59,939 ----------------------------------------------------------------------------------------------------
2023-10-18 22:23:59,939 EPOCH 10 done: loss 0.1154 - lr: 0.000000
2023-10-18 22:24:01,714 DEV : loss 0.19508427381515503 - f1-score (micro avg) 0.5306
2023-10-18 22:24:01,760 ----------------------------------------------------------------------------------------------------
2023-10-18 22:24:01,761 Loading model from best epoch ...
2023-10-18 22:24:01,844 SequenceTagger predicts: Dictionary with 13 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG
2023-10-18 22:24:03,182 
Results:
- F-score (micro) 0.5675
- F-score (macro) 0.4004
- Accuracy 0.4057

By class:
              precision    recall  f1-score   support

         LOC     0.6191    0.6638    0.6407       458
         PER     0.5795    0.4917    0.5320       482
         ORG     1.0000    0.0145    0.0286        69

   micro avg     0.6016    0.5372    0.5675      1009
   macro avg     0.7329    0.3900    0.4004      1009
weighted avg     0.6262    0.5372    0.5469      1009

2023-10-18 22:24:03,182 ----------------------------------------------------------------------------------------------------
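
Note on the averaged rows: the summary figures in the final report can be re-derived from the per-class rows. The sketch below is not part of the training run; it only copies the rounded values printed above and assumes the usual definitions (macro = unweighted mean over classes, weighted = support-weighted mean, micro F1 = harmonic mean of micro precision and micro recall). Because the inputs are rounded to four decimals, the results may differ from the logged values in the last digit.

```python
# Per-class rows copied from the "By class" report in this log:
# label -> (precision, recall, f1, support)
report = {
    "LOC": (0.6191, 0.6638, 0.6407, 458),
    "PER": (0.5795, 0.4917, 0.5320, 482),
    "ORG": (1.0000, 0.0145, 0.0286, 69),
}

# Micro-averaged precision and recall as printed in the "micro avg" row.
micro_precision, micro_recall = 0.6016, 0.5372


def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


total_support = sum(row[3] for row in report.values())

# Macro average: every class counts equally, regardless of support.
macro_f1 = sum(row[2] for row in report.values()) / len(report)

# Weighted average: each class F1 is weighted by its support.
weighted_f1 = sum(row[2] * row[3] for row in report.values()) / total_support

# Micro F1 recomputed from the rounded micro precision and recall.
micro_f1 = f1(micro_precision, micro_recall)

print(f"support={total_support}")
print(f"macro F1={macro_f1:.4f} weighted F1={weighted_f1:.4f} micro F1={micro_f1:.4f}")
```

The gap between micro F1 (0.5675) and macro F1 (0.4004) comes from the ORG row: with recall 0.0145 on 69 entities, its F1 of 0.0286 drags the unweighted mean down while contributing little to the micro average.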