2023-10-17 17:38:15,449 ----------------------------------------------------------------------------------------------------
2023-10-17 17:38:15,450 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-17 17:38:15,450 ----------------------------------------------------------------------------------------------------
2023-10-17 17:38:15,450 MultiCorpus: 5777 train + 722 dev + 723 test sentences
 - NER_ICDAR_EUROPEANA Corpus: 5777 train + 722 dev + 723 test sentences - /root/.flair/datasets/ner_icdar_europeana/nl
2023-10-17 17:38:15,451 ----------------------------------------------------------------------------------------------------
2023-10-17 17:38:15,451 Train: 5777 sentences
2023-10-17 17:38:15,451 (train_with_dev=False, train_with_test=False)
2023-10-17 17:38:15,451 ----------------------------------------------------------------------------------------------------
2023-10-17 17:38:15,451 Training Params:
2023-10-17 17:38:15,451  - learning_rate: "3e-05"
2023-10-17 17:38:15,451  - mini_batch_size: "4"
2023-10-17 17:38:15,451  - max_epochs: "10"
2023-10-17 17:38:15,451  - shuffle: "True"
2023-10-17 17:38:15,451 ----------------------------------------------------------------------------------------------------
2023-10-17 17:38:15,451 Plugins:
2023-10-17 17:38:15,451  - TensorboardLogger
2023-10-17 17:38:15,451  - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 17:38:15,451 ----------------------------------------------------------------------------------------------------
2023-10-17 17:38:15,451 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 17:38:15,451  - metric: "('micro avg', 'f1-score')"
2023-10-17 17:38:15,451 ----------------------------------------------------------------------------------------------------
2023-10-17 17:38:15,451 Computation:
2023-10-17 17:38:15,451  - compute on device: cuda:0
2023-10-17 17:38:15,451  - embedding storage: none
2023-10-17 17:38:15,451 ----------------------------------------------------------------------------------------------------
2023-10-17 17:38:15,451 Model training base path: "hmbench-icdar/nl-hmteams/teams-base-historic-multilingual-discriminator-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-4"
2023-10-17 17:38:15,451 ----------------------------------------------------------------------------------------------------
2023-10-17 17:38:15,451 ----------------------------------------------------------------------------------------------------
2023-10-17 17:38:15,451 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-17 17:38:22,526 epoch 1 - iter 144/1445 - loss 2.78078658 - time (sec): 7.07 - samples/sec: 2289.63 - lr: 0.000003 - momentum: 0.000000
2023-10-17 17:38:29,934 epoch 1 - iter 288/1445 - loss 1.51861606 - time (sec): 14.48 - samples/sec: 2361.02 - lr: 0.000006 - momentum: 0.000000
2023-10-17 17:38:37,086 epoch 1 - iter 432/1445 - loss 1.07913069 - time (sec): 21.63 - samples/sec: 2389.34 - lr: 0.000009 - momentum: 0.000000
2023-10-17 17:38:44,314 epoch 1 - iter 576/1445 - loss 0.85096615 - time (sec): 28.86 - samples/sec: 2396.22 - lr: 0.000012 - momentum: 0.000000
2023-10-17 17:38:51,463 epoch 1 - iter 720/1445 - loss 0.70608937 - time (sec): 36.01 - samples/sec: 2419.98 - lr: 0.000015 - momentum: 0.000000
2023-10-17 17:38:58,764 epoch 1 - iter 864/1445 - loss 0.60898489 - time (sec): 43.31 - samples/sec: 2437.90 - lr: 0.000018 - momentum: 0.000000
2023-10-17 17:39:06,108 epoch 1 - iter 1008/1445 - loss 0.53851173 - time (sec): 50.66 - samples/sec: 2434.66 - lr: 0.000021 - momentum: 0.000000
2023-10-17 17:39:13,470 epoch 1 - iter 1152/1445 - loss 0.48695757 - time (sec): 58.02 - samples/sec: 2434.01 - lr: 0.000024 - momentum: 0.000000
2023-10-17 17:39:20,969 epoch 1 - iter 1296/1445 - loss 0.44708419 - time (sec): 65.52 - samples/sec: 2427.59 - lr: 0.000027 - momentum: 0.000000
2023-10-17 17:39:27,888 epoch 1 - iter 1440/1445 - loss 0.41563598 - time (sec): 72.44 - samples/sec: 2424.95 - lr: 0.000030 - momentum: 0.000000
2023-10-17 17:39:28,119 ----------------------------------------------------------------------------------------------------
2023-10-17 17:39:28,120 EPOCH 1 done: loss 0.4146 - lr: 0.000030
2023-10-17 17:39:31,097 DEV : loss 0.10123064368963242 - f1-score (micro avg) 0.6885
2023-10-17 17:39:31,119 saving best model
2023-10-17 17:39:31,520 ----------------------------------------------------------------------------------------------------
2023-10-17 17:39:38,950 epoch 2 - iter 144/1445 - loss 0.09208746 - time (sec): 7.43 - samples/sec: 2505.79 - lr: 0.000030 - momentum: 0.000000
2023-10-17 17:39:46,043 epoch 2 - iter 288/1445 - loss 0.09395078 - time (sec): 14.52 - samples/sec: 2443.88 - lr: 0.000029 - momentum: 0.000000
2023-10-17 17:39:53,806 epoch 2 - iter 432/1445 - loss 0.09675004 - time (sec): 22.28 - samples/sec: 2402.53 - lr: 0.000029 - momentum: 0.000000
2023-10-17 17:40:01,000 epoch 2 - iter 576/1445 - loss 0.09339561 - time (sec): 29.48 - samples/sec: 2413.55 - lr: 0.000029 - momentum: 0.000000
2023-10-17 17:40:08,104 epoch 2 - iter 720/1445 - loss 0.09667081 - time (sec): 36.58 - samples/sec: 2401.10 - lr: 0.000028 - momentum: 0.000000
2023-10-17 17:40:15,323 epoch 2 - iter 864/1445 - loss 0.09827203 - time (sec): 43.80 - samples/sec: 2384.73 - lr: 0.000028 - momentum: 0.000000
2023-10-17 17:40:23,213 epoch 2 - iter 1008/1445 - loss 0.09597707 - time (sec): 51.69 - samples/sec: 2384.96 - lr: 0.000028 - momentum: 0.000000
2023-10-17 17:40:30,320 epoch 2 - iter 1152/1445 - loss 0.09430318 - time (sec): 58.80 - samples/sec: 2373.58 - lr: 0.000027 - momentum: 0.000000
2023-10-17 17:40:37,393 epoch 2 - iter 1296/1445 - loss 0.09280542 - time (sec): 65.87 - samples/sec: 2384.11 - lr: 0.000027 - momentum: 0.000000
2023-10-17 17:40:44,895 epoch 2 - iter 1440/1445 - loss 0.09043549 - time (sec): 73.37 - samples/sec: 2395.99 - lr: 0.000027 - momentum: 0.000000
2023-10-17 17:40:45,128 ----------------------------------------------------------------------------------------------------
2023-10-17 17:40:45,129 EPOCH 2 done: loss 0.0906 - lr: 0.000027
2023-10-17 17:40:48,972 DEV : loss 0.08994048088788986 - f1-score (micro avg) 0.8041
2023-10-17 17:40:48,997 saving best model
2023-10-17 17:40:49,509 ----------------------------------------------------------------------------------------------------
2023-10-17 17:40:56,927 epoch 3 - iter 144/1445 - loss 0.05872201 - time (sec): 7.41 - samples/sec: 2360.16 - lr: 0.000026 - momentum: 0.000000
2023-10-17 17:41:03,913 epoch 3 - iter 288/1445 - loss 0.05857562 - time (sec): 14.40 - samples/sec: 2420.88 - lr: 0.000026 - momentum: 0.000000
2023-10-17 17:41:11,051 epoch 3 - iter 432/1445 - loss 0.06614001 - time (sec): 21.53 - samples/sec: 2394.73 - lr: 0.000026 - momentum: 0.000000
2023-10-17 17:41:18,016 epoch 3 - iter 576/1445 - loss 0.06718519 - time (sec): 28.50 - samples/sec: 2399.07 - lr: 0.000025 - momentum: 0.000000
2023-10-17 17:41:25,276 epoch 3 - iter 720/1445 - loss 0.06571201 - time (sec): 35.76 - samples/sec: 2412.78 - lr: 0.000025 - momentum: 0.000000
2023-10-17 17:41:32,934 epoch 3 - iter 864/1445 - loss 0.06765137 - time (sec): 43.42 - samples/sec: 2442.62 - lr: 0.000025 - momentum: 0.000000
2023-10-17 17:41:40,390 epoch 3 - iter 1008/1445 - loss 0.06716439 - time (sec): 50.87 - samples/sec: 2446.41 - lr: 0.000024 - momentum: 0.000000
2023-10-17 17:41:47,590 epoch 3 - iter 1152/1445 - loss 0.06595038 - time (sec): 58.07 - samples/sec: 2439.72 - lr: 0.000024 - momentum: 0.000000
2023-10-17 17:41:54,737 epoch 3 - iter 1296/1445 - loss 0.06557252 - time (sec): 65.22 - samples/sec: 2434.72 - lr: 0.000024 - momentum: 0.000000
2023-10-17 17:42:01,793 epoch 3 - iter 1440/1445 - loss 0.06558181 - time (sec): 72.28 - samples/sec: 2428.06 - lr: 0.000023 - momentum: 0.000000
2023-10-17 17:42:02,065 ----------------------------------------------------------------------------------------------------
2023-10-17 17:42:02,065 EPOCH 3 done: loss 0.0655 - lr: 0.000023
2023-10-17 17:42:05,446 DEV : loss 0.08329488337039948 - f1-score (micro avg) 0.8576
2023-10-17 17:42:05,463 saving best model
2023-10-17 17:42:05,984 ----------------------------------------------------------------------------------------------------
2023-10-17 17:42:13,079 epoch 4 - iter 144/1445 - loss 0.03908482 - time (sec): 7.09 - samples/sec: 2434.08 - lr: 0.000023 - momentum: 0.000000
2023-10-17 17:42:20,247 epoch 4 - iter 288/1445 - loss 0.04314761 - time (sec): 14.26 - samples/sec: 2426.27 - lr: 0.000023 - momentum: 0.000000
2023-10-17 17:42:27,325 epoch 4 - iter 432/1445 - loss 0.04479377 - time (sec): 21.34 - samples/sec: 2452.70 - lr: 0.000022 - momentum: 0.000000
2023-10-17 17:42:34,473 epoch 4 - iter 576/1445 - loss 0.04639701 - time (sec): 28.49 - samples/sec: 2443.28 - lr: 0.000022 - momentum: 0.000000
2023-10-17 17:42:41,304 epoch 4 - iter 720/1445 - loss 0.04663083 - time (sec): 35.32 - samples/sec: 2432.99 - lr: 0.000022 - momentum: 0.000000
2023-10-17 17:42:48,590 epoch 4 - iter 864/1445 - loss 0.04877294 - time (sec): 42.60 - samples/sec: 2446.39 - lr: 0.000021 - momentum: 0.000000
2023-10-17 17:42:56,157 epoch 4 - iter 1008/1445 - loss 0.05120701 - time (sec): 50.17 - samples/sec: 2442.48 - lr: 0.000021 - momentum: 0.000000
2023-10-17 17:43:03,410 epoch 4 - iter 1152/1445 - loss 0.05143852 - time (sec): 57.42 - samples/sec: 2428.03 - lr: 0.000021 - momentum: 0.000000
2023-10-17 17:43:11,041 epoch 4 - iter 1296/1445 - loss 0.05077155 - time (sec): 65.05 - samples/sec: 2416.48 - lr: 0.000020 - momentum: 0.000000
2023-10-17 17:43:18,170 epoch 4 - iter 1440/1445 - loss 0.05272378 - time (sec): 72.18 - samples/sec: 2432.26 - lr: 0.000020 - momentum: 0.000000
2023-10-17 17:43:18,411 ----------------------------------------------------------------------------------------------------
2023-10-17 17:43:18,411 EPOCH 4 done: loss 0.0528 - lr: 0.000020
2023-10-17 17:43:22,251 DEV : loss 0.09850851446390152 - f1-score (micro avg) 0.8597
2023-10-17 17:43:22,267 saving best model
2023-10-17 17:43:22,711 ----------------------------------------------------------------------------------------------------
2023-10-17 17:43:29,952 epoch 5 - iter 144/1445 - loss 0.02913580 - time (sec): 7.24 - samples/sec: 2455.71 - lr: 0.000020 - momentum: 0.000000
2023-10-17 17:43:37,014 epoch 5 - iter 288/1445 - loss 0.03254544 - time (sec): 14.30 - samples/sec: 2472.75 - lr: 0.000019 - momentum: 0.000000
2023-10-17 17:43:44,071 epoch 5 - iter 432/1445 - loss 0.03570376 - time (sec): 21.36 - samples/sec: 2430.96 - lr: 0.000019 - momentum: 0.000000
2023-10-17 17:43:51,427 epoch 5 - iter 576/1445 - loss 0.03709659 - time (sec): 28.71 - samples/sec: 2451.97 - lr: 0.000019 - momentum: 0.000000
2023-10-17 17:43:58,280 epoch 5 - iter 720/1445 - loss 0.03495719 - time (sec): 35.57 - samples/sec: 2450.94 - lr: 0.000018 - momentum: 0.000000
2023-10-17 17:44:05,351 epoch 5 - iter 864/1445 - loss 0.03454321 - time (sec): 42.64 - samples/sec: 2449.50 - lr: 0.000018 - momentum: 0.000000
2023-10-17 17:44:12,501 epoch 5 - iter 1008/1445 - loss 0.03603597 - time (sec): 49.79 - samples/sec: 2433.62 - lr: 0.000018 - momentum: 0.000000
2023-10-17 17:44:19,757 epoch 5 - iter 1152/1445 - loss 0.03646270 - time (sec): 57.04 - samples/sec: 2435.72 - lr: 0.000017 - momentum: 0.000000
2023-10-17 17:44:27,159 epoch 5 - iter 1296/1445 - loss 0.03650110 - time (sec): 64.45 - samples/sec: 2431.31 - lr: 0.000017 - momentum: 0.000000
2023-10-17 17:44:34,782 epoch 5 - iter 1440/1445 - loss 0.03803697 - time (sec): 72.07 - samples/sec: 2437.25 - lr: 0.000017 - momentum: 0.000000
2023-10-17 17:44:35,026 ----------------------------------------------------------------------------------------------------
2023-10-17 17:44:35,027 EPOCH 5 done: loss 0.0381 - lr: 0.000017
2023-10-17 17:44:38,550 DEV : loss 0.1178504228591919 - f1-score (micro avg) 0.8518
2023-10-17 17:44:38,569 ----------------------------------------------------------------------------------------------------
2023-10-17 17:44:46,150 epoch 6 - iter 144/1445 - loss 0.02532486 - time (sec): 7.58 - samples/sec: 2270.25 - lr: 0.000016 - momentum: 0.000000
2023-10-17 17:44:53,390 epoch 6 - iter 288/1445 - loss 0.02904802 - time (sec): 14.82 - samples/sec: 2313.74 - lr: 0.000016 - momentum: 0.000000
2023-10-17 17:45:00,617 epoch 6 - iter 432/1445 - loss 0.02918864 - time (sec): 22.05 - samples/sec: 2378.19 - lr: 0.000016 - momentum: 0.000000
2023-10-17 17:45:07,572 epoch 6 - iter 576/1445 - loss 0.02552071 - time (sec): 29.00 - samples/sec: 2348.34 - lr: 0.000015 - momentum: 0.000000
2023-10-17 17:45:15,347 epoch 6 - iter 720/1445 - loss 0.02625914 - time (sec): 36.78 - samples/sec: 2380.60 - lr: 0.000015 - momentum: 0.000000
2023-10-17 17:45:22,880 epoch 6 - iter 864/1445 - loss 0.02735437 - time (sec): 44.31 - samples/sec: 2377.18 - lr: 0.000015 - momentum: 0.000000
2023-10-17 17:45:30,013 epoch 6 - iter 1008/1445 - loss 0.02621190 - time (sec): 51.44 - samples/sec: 2380.53 - lr: 0.000014 - momentum: 0.000000
2023-10-17 17:45:37,375 epoch 6 - iter 1152/1445 - loss 0.02598037 - time (sec): 58.80 - samples/sec: 2407.78 - lr: 0.000014 - momentum: 0.000000
2023-10-17 17:45:44,521 epoch 6 - iter 1296/1445 - loss 0.02707724 - time (sec): 65.95 - samples/sec: 2399.92 - lr: 0.000014 - momentum: 0.000000
2023-10-17 17:45:51,591 epoch 6 - iter 1440/1445 - loss 0.02796616 - time (sec): 73.02 - samples/sec: 2407.42 - lr: 0.000013 - momentum: 0.000000
2023-10-17 17:45:51,807 ----------------------------------------------------------------------------------------------------
2023-10-17 17:45:51,807 EPOCH 6 done: loss 0.0280 - lr: 0.000013
2023-10-17 17:45:55,246 DEV : loss 0.12150729447603226 - f1-score (micro avg) 0.8515
2023-10-17 17:45:55,264 ----------------------------------------------------------------------------------------------------
2023-10-17 17:46:02,158 epoch 7 - iter 144/1445 - loss 0.01685010 - time (sec): 6.89 - samples/sec: 2430.06 - lr: 0.000013 - momentum: 0.000000
2023-10-17 17:46:09,209 epoch 7 - iter 288/1445 - loss 0.01264635 - time (sec): 13.94 - samples/sec: 2462.29 - lr: 0.000013 - momentum: 0.000000
2023-10-17 17:46:16,421 epoch 7 - iter 432/1445 - loss 0.01653909 - time (sec): 21.16 - samples/sec: 2502.88 - lr: 0.000012 - momentum: 0.000000
2023-10-17 17:46:23,521 epoch 7 - iter 576/1445 - loss 0.01929903 - time (sec): 28.26 - samples/sec: 2487.16 - lr: 0.000012 - momentum: 0.000000
2023-10-17 17:46:30,784 epoch 7 - iter 720/1445 - loss 0.01985335 - time (sec): 35.52 - samples/sec: 2493.20 - lr: 0.000012 - momentum: 0.000000
2023-10-17 17:46:38,378 epoch 7 - iter 864/1445 - loss 0.02389990 - time (sec): 43.11 - samples/sec: 2459.10 - lr: 0.000011 - momentum: 0.000000
2023-10-17 17:46:45,985 epoch 7 - iter 1008/1445 - loss 0.02333411 - time (sec): 50.72 - samples/sec: 2462.85 - lr: 0.000011 - momentum: 0.000000
2023-10-17 17:46:53,205 epoch 7 - iter 1152/1445 - loss 0.02230595 - time (sec): 57.94 - samples/sec: 2440.83 - lr: 0.000011 - momentum: 0.000000
2023-10-17 17:47:00,241 epoch 7 - iter 1296/1445 - loss 0.02208829 - time (sec): 64.98 - samples/sec: 2444.66 - lr: 0.000010 - momentum: 0.000000
2023-10-17 17:47:07,359 epoch 7 - iter 1440/1445 - loss 0.02176833 - time (sec): 72.09 - samples/sec: 2437.16 - lr: 0.000010 - momentum: 0.000000
2023-10-17 17:47:07,585 ----------------------------------------------------------------------------------------------------
2023-10-17 17:47:07,586 EPOCH 7 done: loss 0.0219 - lr: 0.000010
2023-10-17 17:47:10,966 DEV : loss 0.12723445892333984 - f1-score (micro avg) 0.8684
2023-10-17 17:47:10,985 saving best model
2023-10-17 17:47:11,514 ----------------------------------------------------------------------------------------------------
2023-10-17 17:47:18,708 epoch 8 - iter 144/1445 - loss 0.01150973 - time (sec): 7.19 - samples/sec: 2544.39 - lr: 0.000010 - momentum: 0.000000
2023-10-17 17:47:25,677 epoch 8 - iter 288/1445 - loss 0.01301164 - time (sec): 14.15 - samples/sec: 2555.66 - lr: 0.000009 - momentum: 0.000000
2023-10-17 17:47:32,800 epoch 8 - iter 432/1445 - loss 0.01277090 - time (sec): 21.28 - samples/sec: 2514.23 - lr: 0.000009 - momentum: 0.000000
2023-10-17 17:47:39,780 epoch 8 - iter 576/1445 - loss 0.01323955 - time (sec): 28.26 - samples/sec: 2479.48 - lr: 0.000009 - momentum: 0.000000
2023-10-17 17:47:46,861 epoch 8 - iter 720/1445 - loss 0.01367934 - time (sec): 35.34 - samples/sec: 2505.38 - lr: 0.000008 - momentum: 0.000000
2023-10-17 17:47:53,573 epoch 8 - iter 864/1445 - loss 0.01302397 - time (sec): 42.05 - samples/sec: 2521.56 - lr: 0.000008 - momentum: 0.000000
2023-10-17 17:48:00,449 epoch 8 - iter 1008/1445 - loss 0.01325247 - time (sec): 48.93 - samples/sec: 2499.40 - lr: 0.000008 - momentum: 0.000000
2023-10-17 17:48:07,708 epoch 8 - iter 1152/1445 - loss 0.01320999 - time (sec): 56.19 - samples/sec: 2489.07 - lr: 0.000007 - momentum: 0.000000
2023-10-17 17:48:15,148 epoch 8 - iter 1296/1445 - loss 0.01416894 - time (sec): 63.63 - samples/sec: 2487.61 - lr: 0.000007 - momentum: 0.000000
2023-10-17 17:48:22,372 epoch 8 - iter 1440/1445 - loss 0.01469288 - time (sec): 70.85 - samples/sec: 2481.73 - lr: 0.000007 - momentum: 0.000000
2023-10-17 17:48:22,622 ----------------------------------------------------------------------------------------------------
2023-10-17 17:48:22,623 EPOCH 8 done: loss 0.0147 - lr: 0.000007
2023-10-17 17:48:25,928 DEV : loss 0.13977086544036865 - f1-score (micro avg) 0.8641
2023-10-17 17:48:25,946 ----------------------------------------------------------------------------------------------------
2023-10-17 17:48:33,142 epoch 9 - iter 144/1445 - loss 0.00822659 - time (sec): 7.19 - samples/sec: 2436.78 - lr: 0.000006 - momentum: 0.000000
2023-10-17 17:48:40,157 epoch 9 - iter 288/1445 - loss 0.00559292 - time (sec): 14.21 - samples/sec: 2471.90 - lr: 0.000006 - momentum: 0.000000
2023-10-17 17:48:47,291 epoch 9 - iter 432/1445 - loss 0.00857791 - time (sec): 21.34 - samples/sec: 2500.95 - lr: 0.000006 - momentum: 0.000000
2023-10-17 17:48:54,475 epoch 9 - iter 576/1445 - loss 0.01013173 - time (sec): 28.53 - samples/sec: 2500.07 - lr: 0.000005 - momentum: 0.000000
2023-10-17 17:49:01,473 epoch 9 - iter 720/1445 - loss 0.00984675 - time (sec): 35.53 - samples/sec: 2469.71 - lr: 0.000005 - momentum: 0.000000
2023-10-17 17:49:08,496 epoch 9 - iter 864/1445 - loss 0.00969007 - time (sec): 42.55 - samples/sec: 2481.24 - lr: 0.000005 - momentum: 0.000000
2023-10-17 17:49:15,662 epoch 9 - iter 1008/1445 - loss 0.00977286 - time (sec): 49.71 - samples/sec: 2478.89 - lr: 0.000004 - momentum: 0.000000
2023-10-17 17:49:24,103 epoch 9 - iter 1152/1445 - loss 0.01047676 - time (sec): 58.16 - samples/sec: 2431.23 - lr: 0.000004 - momentum: 0.000000
2023-10-17 17:49:31,147 epoch 9 - iter 1296/1445 - loss 0.00999327 - time (sec): 65.20 - samples/sec: 2423.11 - lr: 0.000004 - momentum: 0.000000
2023-10-17 17:49:38,579 epoch 9 - iter 1440/1445 - loss 0.00998262 - time (sec): 72.63 - samples/sec: 2418.14 - lr: 0.000003 - momentum: 0.000000
2023-10-17 17:49:38,821 ----------------------------------------------------------------------------------------------------
2023-10-17 17:49:38,821 EPOCH 9 done: loss 0.0100 - lr: 0.000003
2023-10-17 17:49:42,238 DEV : loss 0.14191032946109772 - f1-score (micro avg) 0.8661
2023-10-17 17:49:42,256 ----------------------------------------------------------------------------------------------------
2023-10-17 17:49:49,582 epoch 10 - iter 144/1445 - loss 0.00635513 - time (sec): 7.33 - samples/sec: 2531.30 - lr: 0.000003 - momentum: 0.000000
2023-10-17 17:49:56,970 epoch 10 - iter 288/1445 - loss 0.00533428 - time (sec): 14.71 - samples/sec: 2412.38 - lr: 0.000003 - momentum: 0.000000
2023-10-17 17:50:04,167 epoch 10 - iter 432/1445 - loss 0.00667599 - time (sec): 21.91 - samples/sec: 2408.36 - lr: 0.000002 - momentum: 0.000000
2023-10-17 17:50:11,207 epoch 10 - iter 576/1445 - loss 0.00622294 - time (sec): 28.95 - samples/sec: 2412.67 - lr: 0.000002 - momentum: 0.000000
2023-10-17 17:50:18,623 epoch 10 - iter 720/1445 - loss 0.00658126 - time (sec): 36.37 - samples/sec: 2430.04 - lr: 0.000002 - momentum: 0.000000
2023-10-17 17:50:25,770 epoch 10 - iter 864/1445 - loss 0.00769149 - time (sec): 43.51 - samples/sec: 2457.20 - lr: 0.000001 - momentum: 0.000000
2023-10-17 17:50:32,601 epoch 10 - iter 1008/1445 - loss 0.00793393 - time (sec): 50.34 - samples/sec: 2466.53 - lr: 0.000001 - momentum: 0.000000
2023-10-17 17:50:39,315 epoch 10 - iter 1152/1445 - loss 0.00724425 - time (sec): 57.06 - samples/sec: 2475.80 - lr: 0.000001 - momentum: 0.000000
2023-10-17 17:50:46,386 epoch 10 - iter 1296/1445 - loss 0.00766193 - time (sec): 64.13 - samples/sec: 2487.64 - lr: 0.000000 - momentum: 0.000000
2023-10-17 17:50:53,224 epoch 10 - iter 1440/1445 - loss 0.00738923 - time (sec): 70.97 - samples/sec: 2472.81 - lr: 0.000000 - momentum: 0.000000
2023-10-17 17:50:53,477 ----------------------------------------------------------------------------------------------------
2023-10-17 17:50:53,477 EPOCH 10 done: loss 0.0074 - lr: 0.000000
2023-10-17 17:50:56,778 DEV : loss 0.14468367397785187 - f1-score (micro avg) 0.8679
2023-10-17 17:50:57,176 ----------------------------------------------------------------------------------------------------
2023-10-17 17:50:57,177 Loading model from best epoch ...
2023-10-17 17:50:58,557 SequenceTagger predicts: Dictionary with 13 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG
2023-10-17 17:51:01,444 Results:
- F-score (micro) 0.8493
- F-score (macro) 0.7512
- Accuracy 0.7473

By class:
              precision    recall  f1-score   support

         PER     0.8596    0.8382    0.8487       482
         LOC     0.9236    0.8712    0.8966       458
         ORG     0.5849    0.4493    0.5082        69

   micro avg     0.8733    0.8266    0.8493      1009
   macro avg     0.7894    0.7195    0.7512      1009
weighted avg     0.8699    0.8266    0.8472      1009

2023-10-17 17:51:01,445 ----------------------------------------------------------------------------------------------------
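The configuration logged above (Dutch ICDAR-Europeana NER corpus, hmteams/teams-base-historic-multilingual-discriminator embeddings with first-subtoken pooling of the last layer, mini-batch size 4, 10 epochs, peak learning rate 3e-05 with a linear schedule and 10% warmup, no CRF) corresponds to a Flair fine-tuning setup roughly like the sketch below. This is a minimal, hedged reconstruction assuming a recent Flair release; the exact script used for this run is not part of the log, and `hidden_size` is only passed because the `SequenceTagger` constructor requires it.

```python
from flair.datasets import NER_ICDAR_EUROPEANA
from flair.embeddings import TransformerWordEmbeddings
from flair.models import SequenceTagger
from flair.trainers import ModelTrainer

# Dutch split of the ICDAR Europeana NER corpus
# (downloaded to ~/.flair/datasets/ner_icdar_europeana/nl, as in the log above)
corpus = NER_ICDAR_EUROPEANA(language="nl")
label_dict = corpus.make_label_dictionary(label_type="ner")

# "poolingfirst" / "layers-1" in the base path: first-subtoken pooling over the last layer
embeddings = TransformerWordEmbeddings(
    model="hmteams/teams-base-historic-multilingual-discriminator",
    layers="-1",
    subtoken_pooling="first",
    fine_tune=True,
)

# "crfFalse": a plain linear classification head over the 13 BIOES tags, no CRF or RNN
tagger = SequenceTagger(
    hidden_size=256,  # required argument; unused without an RNN on top
    embeddings=embeddings,
    tag_dictionary=label_dict,
    tag_type="ner",
    use_crf=False,
    use_rnn=False,
    reproject_embeddings=False,
)

trainer = ModelTrainer(tagger, corpus)
# fine_tune() attaches the LinearScheduler seen in the plugin list; with warmup_fraction 0.1
# the LR climbs to 3e-05 over epoch 1 (1,445 of the 14,450 total steps), then decays to 0,
# matching the lr column logged above.
trainer.fine_tune(
    "hmbench-icdar/nl-hmteams/teams-base-historic-multilingual-discriminator-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-4",
    learning_rate=3e-05,
    mini_batch_size=4,
    max_epochs=10,
)
```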
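As a sanity check, the reported micro F-score follows from the micro precision and recall in the table: 2 · 0.8733 · 0.8266 / (0.8733 + 0.8266) ≈ 0.8493. The best checkpoint (best-model.pt) written to the training base path can then be used for inference with Flair's standard API. The snippet below is an illustrative sketch, not taken from the log: the Dutch example sentence is made up, and it assumes the checkpoint is available locally under the base path shown above.

```python
from flair.data import Sentence
from flair.models import SequenceTagger

# best-model.pt is saved under the training base path from the log
tagger = SequenceTagger.load(
    "hmbench-icdar/nl-hmteams/teams-base-historic-multilingual-discriminator-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-4/best-model.pt"
)

# illustrative Dutch sentence (not from the corpus)
sentence = Sentence("De vergadering vond plaats in Amsterdam onder leiding van Willem Drees .")
tagger.predict(sentence)

# predicted spans carry the decoded BIOES labels collapsed to PER / LOC / ORG entities
for span in sentence.get_spans("ner"):
    print(span.text, span.get_label("ner").value, span.get_label("ner").score)
```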