2023-10-25 12:04:43,420 ----------------------------------------------------------------------------------------------------
2023-10-25 12:04:43,421 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(64001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-25 12:04:43,421 ----------------------------------------------------------------------------------------------------
2023-10-25 12:04:43,421 MultiCorpus: 20847 train + 1123 dev + 3350 test sentences
 - NER_HIPE_2022 Corpus: 20847 train + 1123 dev + 3350 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/de/with_doc_seperator
2023-10-25 12:04:43,421 ----------------------------------------------------------------------------------------------------
2023-10-25 12:04:43,421 Train:  20847 sentences
2023-10-25 12:04:43,421         (train_with_dev=False, train_with_test=False)
2023-10-25 12:04:43,421 ----------------------------------------------------------------------------------------------------
2023-10-25 12:04:43,421 Training Params:
2023-10-25 12:04:43,421  - learning_rate: "5e-05"
2023-10-25 12:04:43,421  - mini_batch_size: "8"
2023-10-25 12:04:43,421  - max_epochs: "10"
2023-10-25 12:04:43,421  - shuffle: "True"
2023-10-25 12:04:43,421 ----------------------------------------------------------------------------------------------------
2023-10-25 12:04:43,421 Plugins:
2023-10-25 12:04:43,421  - TensorboardLogger
2023-10-25 12:04:43,421  - LinearScheduler | warmup_fraction: '0.1'
2023-10-25 12:04:43,421 ----------------------------------------------------------------------------------------------------
2023-10-25 12:04:43,422 Final evaluation on model from best epoch (best-model.pt)
2023-10-25 12:04:43,422  - metric: "('micro avg', 'f1-score')"
2023-10-25 12:04:43,422 ----------------------------------------------------------------------------------------------------
2023-10-25 12:04:43,422 Computation:
2023-10-25 12:04:43,422  - compute on device: cuda:0
2023-10-25 12:04:43,422  - embedding storage: none
2023-10-25 12:04:43,422 ----------------------------------------------------------------------------------------------------
2023-10-25 12:04:43,422 Model training base path: "hmbench-newseye/de-dbmdz/bert-base-historic-multilingual-64k-td-cased-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-2"
2023-10-25 12:04:43,422 ----------------------------------------------------------------------------------------------------
2023-10-25 12:04:43,422 ----------------------------------------------------------------------------------------------------
2023-10-25 12:04:43,422 Logging anything other than
scalars to TensorBoard is currently not supported.
2023-10-25 12:04:57,329 epoch 1 - iter 260/2606 - loss 1.33404298 - time (sec): 13.91 - samples/sec: 2676.54 - lr: 0.000005 - momentum: 0.000000
2023-10-25 12:05:11,911 epoch 1 - iter 520/2606 - loss 0.81750725 - time (sec): 28.49 - samples/sec: 2698.06 - lr: 0.000010 - momentum: 0.000000
2023-10-25 12:05:26,285 epoch 1 - iter 780/2606 - loss 0.63734009 - time (sec): 42.86 - samples/sec: 2662.27 - lr: 0.000015 - momentum: 0.000000
2023-10-25 12:05:40,386 epoch 1 - iter 1040/2606 - loss 0.54688049 - time (sec): 56.96 - samples/sec: 2609.80 - lr: 0.000020 - momentum: 0.000000
2023-10-25 12:05:54,652 epoch 1 - iter 1300/2606 - loss 0.48271010 - time (sec): 71.23 - samples/sec: 2585.48 - lr: 0.000025 - momentum: 0.000000
2023-10-25 12:06:09,224 epoch 1 - iter 1560/2606 - loss 0.43686814 - time (sec): 85.80 - samples/sec: 2591.06 - lr: 0.000030 - momentum: 0.000000
2023-10-25 12:06:22,977 epoch 1 - iter 1820/2606 - loss 0.40324464 - time (sec): 99.55 - samples/sec: 2576.82 - lr: 0.000035 - momentum: 0.000000
2023-10-25 12:06:37,534 epoch 1 - iter 2080/2606 - loss 0.37960994 - time (sec): 114.11 - samples/sec: 2560.25 - lr: 0.000040 - momentum: 0.000000
2023-10-25 12:06:51,663 epoch 1 - iter 2340/2606 - loss 0.35867330 - time (sec): 128.24 - samples/sec: 2557.80 - lr: 0.000045 - momentum: 0.000000
2023-10-25 12:07:06,164 epoch 1 - iter 2600/2606 - loss 0.33892838 - time (sec): 142.74 - samples/sec: 2569.97 - lr: 0.000050 - momentum: 0.000000
2023-10-25 12:07:06,496 ----------------------------------------------------------------------------------------------------
2023-10-25 12:07:06,496 EPOCH 1 done: loss 0.3387 - lr: 0.000050
2023-10-25 12:07:10,522 DEV : loss 0.1899159997701645 - f1-score (micro avg)  0.3206
2023-10-25 12:07:10,548 saving best model
2023-10-25 12:07:11,071 ----------------------------------------------------------------------------------------------------
2023-10-25 12:07:25,878 epoch 2 - iter 260/2606 - loss 0.18263844 - time (sec): 14.81 - samples/sec: 2609.92 - lr: 0.000049 - momentum: 0.000000
2023-10-25 12:07:40,312 epoch 2 - iter 520/2606 - loss 0.17196157 - time (sec): 29.24 - samples/sec: 2598.58 - lr: 0.000049 - momentum: 0.000000
2023-10-25 12:07:54,450 epoch 2 - iter 780/2606 - loss 0.16810378 - time (sec): 43.38 - samples/sec: 2626.37 - lr: 0.000048 - momentum: 0.000000
2023-10-25 12:08:08,143 epoch 2 - iter 1040/2606 - loss 0.16310966 - time (sec): 57.07 - samples/sec: 2587.48 - lr: 0.000048 - momentum: 0.000000
2023-10-25 12:08:22,199 epoch 2 - iter 1300/2606 - loss 0.16748312 - time (sec): 71.13 - samples/sec: 2565.37 - lr: 0.000047 - momentum: 0.000000
2023-10-25 12:08:36,287 epoch 2 - iter 1560/2606 - loss 0.17242041 - time (sec): 85.21 - samples/sec: 2564.40 - lr: 0.000047 - momentum: 0.000000
2023-10-25 12:08:51,010 epoch 2 - iter 1820/2606 - loss 0.18294843 - time (sec): 99.94 - samples/sec: 2559.73 - lr: 0.000046 - momentum: 0.000000
2023-10-25 12:09:05,058 epoch 2 - iter 2080/2606 - loss 0.18014163 - time (sec): 113.99 - samples/sec: 2562.12 - lr: 0.000046 - momentum: 0.000000
2023-10-25 12:09:19,987 epoch 2 - iter 2340/2606 - loss 0.17697195 - time (sec): 128.91 - samples/sec: 2554.90 - lr: 0.000045 - momentum: 0.000000
2023-10-25 12:09:34,480 epoch 2 - iter 2600/2606 - loss 0.17543410 - time (sec): 143.41 - samples/sec: 2557.33 - lr: 0.000044 - momentum: 0.000000
2023-10-25 12:09:34,754 ----------------------------------------------------------------------------------------------------
2023-10-25 12:09:34,754 EPOCH 2 done: loss 0.1757 - lr: 0.000044
2023-10-25 12:09:41,662 DEV : loss 0.12365195900201797 - f1-score (micro avg)  0.1991
2023-10-25 12:09:41,687 ----------------------------------------------------------------------------------------------------
2023-10-25 12:09:55,576 epoch 3 - iter 260/2606 - loss 0.13097015 - time (sec): 13.89 - samples/sec: 2422.06 - lr: 0.000044 - momentum: 0.000000
2023-10-25 12:10:09,421 epoch 3 - iter 520/2606 - loss 0.13232082 - time (sec): 27.73 - samples/sec: 2442.10 - lr: 0.000043 - momentum: 0.000000
2023-10-25 12:10:24,291 epoch 3 - iter 780/2606 - loss 0.11661536 - time (sec): 42.60 - samples/sec: 2529.83 - lr: 0.000043 - momentum: 0.000000
2023-10-25 12:10:38,435 epoch 3 - iter 1040/2606 - loss 0.12084986 - time (sec): 56.75 - samples/sec: 2521.57 - lr: 0.000042 - momentum: 0.000000
2023-10-25 12:10:52,440 epoch 3 - iter 1300/2606 - loss 0.11793891 - time (sec): 70.75 - samples/sec: 2546.11 - lr: 0.000042 - momentum: 0.000000
2023-10-25 12:11:06,700 epoch 3 - iter 1560/2606 - loss 0.11620440 - time (sec): 85.01 - samples/sec: 2552.64 - lr: 0.000041 - momentum: 0.000000
2023-10-25 12:11:21,546 epoch 3 - iter 1820/2606 - loss 0.11625199 - time (sec): 99.86 - samples/sec: 2574.35 - lr: 0.000041 - momentum: 0.000000
2023-10-25 12:11:36,104 epoch 3 - iter 2080/2606 - loss 0.11570847 - time (sec): 114.42 - samples/sec: 2571.76 - lr: 0.000040 - momentum: 0.000000
2023-10-25 12:11:50,815 epoch 3 - iter 2340/2606 - loss 0.11725099 - time (sec): 129.13 - samples/sec: 2573.15 - lr: 0.000039 - momentum: 0.000000
2023-10-25 12:12:04,853 epoch 3 - iter 2600/2606 - loss 0.11640683 - time (sec): 143.16 - samples/sec: 2561.04 - lr: 0.000039 - momentum: 0.000000
2023-10-25 12:12:05,150 ----------------------------------------------------------------------------------------------------
2023-10-25 12:12:05,150 EPOCH 3 done: loss 0.1164 - lr: 0.000039
2023-10-25 12:12:12,345 DEV : loss 0.18110370635986328 - f1-score (micro avg)  0.3487
2023-10-25 12:12:12,371 saving best model
2023-10-25 12:12:13,034 ----------------------------------------------------------------------------------------------------
2023-10-25 12:12:27,120 epoch 4 - iter 260/2606 - loss 0.07314164 - time (sec): 14.08 - samples/sec: 2617.27 - lr: 0.000038 - momentum: 0.000000
2023-10-25 12:12:41,122 epoch 4 - iter 520/2606 - loss 0.07274429 - time (sec): 28.09 - samples/sec: 2607.65 - lr: 0.000038 - momentum: 0.000000
2023-10-25 12:12:55,178 epoch 4 - iter 780/2606 - loss 0.07820023 - time (sec): 42.14 - samples/sec: 2600.12 - lr: 0.000037 - momentum: 0.000000
2023-10-25 12:13:09,293 epoch 4 - iter 1040/2606 - loss 0.08079671 - time (sec): 56.26 - samples/sec: 2534.91 - lr: 0.000037 - momentum: 0.000000
2023-10-25 12:13:24,132 epoch 4 - iter 1300/2606 - loss 0.08272013 - time (sec): 71.10 - samples/sec: 2541.48 - lr: 0.000036 - momentum: 0.000000
2023-10-25 12:13:38,910 epoch 4 - iter 1560/2606 - loss 0.08294572 - time (sec): 85.88 - samples/sec: 2523.61 - lr: 0.000036 - momentum: 0.000000
2023-10-25 12:13:53,194 epoch 4 - iter 1820/2606 - loss 0.08005595 - time (sec): 100.16 - samples/sec: 2534.43 - lr: 0.000035 - momentum: 0.000000
2023-10-25 12:14:08,011 epoch 4 - iter 2080/2606 - loss 0.08143243 - time (sec): 114.98 - samples/sec: 2551.54 - lr: 0.000034 - momentum: 0.000000
2023-10-25 12:14:22,431 epoch 4 - iter 2340/2606 - loss 0.08041802 - time (sec): 129.40 - samples/sec: 2532.25 - lr: 0.000034 - momentum: 0.000000
2023-10-25 12:14:37,281 epoch 4 - iter 2600/2606 - loss 0.08144784 - time (sec): 144.25 - samples/sec: 2539.74 - lr: 0.000033 - momentum: 0.000000
2023-10-25 12:14:37,596 ----------------------------------------------------------------------------------------------------
2023-10-25 12:14:37,597 EPOCH 4 done: loss 0.0813 - lr: 0.000033
2023-10-25 12:14:44,681 DEV : loss 0.24672392010688782 - f1-score (micro avg)  0.3258
2023-10-25 12:14:44,706 ----------------------------------------------------------------------------------------------------
2023-10-25 12:14:58,782 epoch 5 - iter 260/2606 - loss 0.04975046 - time (sec): 14.07 - samples/sec: 2514.00 - lr: 0.000033 - momentum: 0.000000
2023-10-25 12:15:13,022 epoch 5 - iter 520/2606 - loss 0.05842153 - time (sec): 28.31 - samples/sec: 2512.62 - lr: 0.000032 - momentum: 0.000000
2023-10-25 12:15:27,708 epoch 5 - iter 780/2606 - loss 0.05600138 - time (sec): 43.00 - samples/sec: 2555.65 - lr: 0.000032 - momentum: 0.000000
2023-10-25 12:15:42,374 epoch 5 - iter 1040/2606 - loss 0.05571986 - time (sec): 57.67 - samples/sec: 2559.74 - lr: 0.000031 - momentum: 0.000000
2023-10-25 12:15:56,321 epoch 5 - iter 1300/2606 - loss 0.05410019 - time (sec): 71.61 - samples/sec: 2592.02 - lr: 0.000031 - momentum: 0.000000
2023-10-25 12:16:11,032 epoch 5 - iter 1560/2606 - loss 0.05554265 - time (sec): 86.32 - samples/sec: 2600.43 - lr: 0.000030 - momentum: 0.000000
2023-10-25 12:16:25,229 epoch 5 - iter 1820/2606 - loss 0.05623815 - time (sec): 100.52 - samples/sec: 2591.86 - lr: 0.000029 - momentum: 0.000000
2023-10-25 12:16:39,142 epoch 5 - iter 2080/2606 - loss 0.05640579 - time (sec): 114.43 - samples/sec: 2577.66 - lr: 0.000029 - momentum: 0.000000
2023-10-25 12:16:52,969 epoch 5 - iter 2340/2606 - loss 0.05583642 - time (sec): 128.26 - samples/sec: 2566.52 - lr: 0.000028 - momentum: 0.000000
2023-10-25 12:17:07,359 epoch 5 - iter 2600/2606 - loss 0.05546860 - time (sec): 142.65 - samples/sec: 2568.74 - lr: 0.000028 - momentum: 0.000000
2023-10-25 12:17:07,671 ----------------------------------------------------------------------------------------------------
2023-10-25 12:17:07,671 EPOCH 5 done: loss 0.0554 - lr: 0.000028
2023-10-25 12:17:14,198 DEV : loss 0.3652787506580353 - f1-score (micro avg)  0.3473
2023-10-25 12:17:14,227 ----------------------------------------------------------------------------------------------------
2023-10-25 12:17:28,718 epoch 6 - iter 260/2606 - loss 0.04669705 - time (sec): 14.49 - samples/sec: 2618.65 - lr: 0.000027 - momentum: 0.000000
2023-10-25 12:17:44,097 epoch 6 - iter 520/2606 - loss 0.06386000 - time (sec): 29.87 - samples/sec: 2579.82 - lr: 0.000027 - momentum: 0.000000
2023-10-25 12:17:59,389 epoch 6 - iter 780/2606 - loss 0.05599133 - time (sec): 45.16 - samples/sec: 2534.51 - lr: 0.000026 - momentum: 0.000000
2023-10-25 12:18:13,860 epoch 6 - iter 1040/2606 - loss 0.05300823 - time (sec): 59.63 - samples/sec: 2554.23 - lr: 0.000026 - momentum: 0.000000
2023-10-25 12:18:27,592 epoch 6 - iter 1300/2606 - loss 0.05110818 - time (sec): 73.36 - samples/sec: 2537.33 - lr: 0.000025 - momentum: 0.000000
2023-10-25 12:18:41,721 epoch 6 - iter 1560/2606 - loss 0.04968908 - time (sec): 87.49 - samples/sec: 2540.03 - lr: 0.000024 - momentum: 0.000000
2023-10-25 12:18:56,223 epoch 6 - iter 1820/2606 - loss 0.04814738 - time (sec): 101.99 - samples/sec: 2516.29 - lr: 0.000024 - momentum: 0.000000
2023-10-25 12:19:10,794 epoch 6 - iter 2080/2606 - loss 0.04656793 - time (sec): 116.57 - samples/sec: 2511.11 - lr: 0.000023 - momentum: 0.000000
2023-10-25 12:19:24,885 epoch 6 - iter 2340/2606 - loss 0.04599298 - time (sec): 130.66 - samples/sec: 2519.24 - lr: 0.000023 - momentum: 0.000000
2023-10-25 12:19:39,483 epoch 6 - iter 2600/2606 - loss 0.04534244 - time (sec): 145.25 - samples/sec: 2523.96 - lr: 0.000022 - momentum: 0.000000
2023-10-25 12:19:39,796 ----------------------------------------------------------------------------------------------------
2023-10-25 12:19:39,796 EPOCH 6 done: loss 0.0453 - lr: 0.000022
2023-10-25 12:19:46,429 DEV : loss 0.36484959721565247 - f1-score (micro avg)  0.3757
2023-10-25 12:19:46,470 saving best model
2023-10-25 12:19:47,101 ----------------------------------------------------------------------------------------------------
2023-10-25 12:20:02,047 epoch 7 - iter 260/2606 - loss 0.02579681 - time (sec): 14.94 - samples/sec: 2481.46 - lr: 0.000022 - momentum: 0.000000
2023-10-25 12:20:17,070 epoch 7 - iter 520/2606 - loss 0.02723529 - time (sec): 29.97 - samples/sec: 2514.98 - lr: 0.000021 - momentum: 0.000000
2023-10-25 12:20:31,473 epoch 7 - iter 780/2606 - loss 0.03443679 - time (sec): 44.37 - samples/sec: 2487.49 - lr: 0.000021 - momentum: 0.000000
2023-10-25 12:20:46,443 epoch 7 - iter 1040/2606 - loss 0.05293747 - time (sec): 59.34 - samples/sec: 2490.37 - lr: 0.000020 - momentum: 0.000000
2023-10-25 12:21:01,062 epoch 7 - iter 1300/2606 - loss 0.05896085 - time (sec): 73.96 - samples/sec: 2498.42 - lr: 0.000019 - momentum: 0.000000
2023-10-25 12:21:15,728 epoch 7 - iter 1560/2606 - loss 0.05488902 - time (sec): 88.63 - samples/sec: 2525.43 - lr: 0.000019 - momentum: 0.000000
2023-10-25 12:21:30,405 epoch 7 - iter 1820/2606 - loss 0.05514632 - time (sec): 103.30 - samples/sec: 2519.10 - lr: 0.000018 - momentum: 0.000000
2023-10-25 12:21:44,288 epoch 7 - iter 2080/2606 - loss 0.05673720 - time (sec): 117.19 - samples/sec: 2518.16 - lr: 0.000018 - momentum: 0.000000
2023-10-25 12:21:58,440 epoch 7 - iter 2340/2606 - loss 0.06075104 - time (sec): 131.34 - samples/sec: 2518.27 - lr: 0.000017 - momentum: 0.000000
2023-10-25 12:22:12,005 epoch 7 - iter 2600/2606 - loss 0.07080521 - time (sec): 144.90 - samples/sec: 2527.51 - lr: 0.000017 - momentum: 0.000000
2023-10-25 12:22:12,429 ----------------------------------------------------------------------------------------------------
2023-10-25 12:22:12,430 EPOCH 7 done: loss 0.0709 - lr: 0.000017
2023-10-25 12:22:18,685 DEV : loss 0.28045564889907837 - f1-score (micro avg)  0.1776
2023-10-25 12:22:18,711 ----------------------------------------------------------------------------------------------------
2023-10-25 12:22:32,882 epoch 8 - iter 260/2606 - loss 0.10479596 - time (sec): 14.17 - samples/sec: 2620.27 - lr: 0.000016 - momentum: 0.000000
2023-10-25 12:22:47,009 epoch 8 - iter 520/2606 - loss 0.07922867 - time (sec): 28.30 - samples/sec: 2666.93 - lr: 0.000016 - momentum: 0.000000
2023-10-25 12:23:00,754 epoch 8 - iter 780/2606 - loss 0.08213498 - time (sec): 42.04 - samples/sec: 2670.80 - lr: 0.000015 - momentum: 0.000000
2023-10-25 12:23:14,774 epoch 8 - iter 1040/2606 - loss 0.08392793 - time (sec): 56.06 - samples/sec: 2676.38 - lr: 0.000014 - momentum: 0.000000
2023-10-25 12:23:29,316 epoch 8 - iter 1300/2606 - loss 0.09298697 - time (sec): 70.60 - samples/sec: 2692.58 - lr: 0.000014 - momentum: 0.000000
2023-10-25 12:23:43,269 epoch 8 - iter 1560/2606 - loss 0.09808489 - time (sec): 84.56 - samples/sec: 2660.91 - lr: 0.000013 - momentum: 0.000000
2023-10-25 12:23:57,806 epoch 8 - iter 1820/2606 - loss 0.09890906 - time (sec): 99.09 - samples/sec: 2606.94 - lr: 0.000013 - momentum: 0.000000
2023-10-25 12:24:12,049 epoch 8 - iter 2080/2606 - loss 0.10345234 - time (sec): 113.34 - samples/sec: 2615.34 - lr: 0.000012 - momentum: 0.000000
2023-10-25 12:24:26,187 epoch 8 - iter 2340/2606 - loss 0.10411809 - time (sec): 127.48 - samples/sec: 2608.57 - lr: 0.000012 - momentum: 0.000000
2023-10-25 12:24:40,125 epoch 8 - iter 2600/2606 - loss 0.10325893 - time (sec): 141.41 - samples/sec: 2593.05 - lr: 0.000011 - momentum: 0.000000
2023-10-25 12:24:40,426 ----------------------------------------------------------------------------------------------------
2023-10-25 12:24:40,426 EPOCH 8 done: loss 0.1032 - lr: 0.000011
2023-10-25 12:24:46,844 DEV : loss 0.3105942904949188 - f1-score (micro avg)  0.1121
2023-10-25 12:24:46,869 ----------------------------------------------------------------------------------------------------
2023-10-25 12:25:01,469 epoch 9 - iter 260/2606 - loss 0.12293943 - time (sec): 14.60 - samples/sec: 2468.64 - lr: 0.000011 - momentum: 0.000000
2023-10-25 12:25:15,333 epoch 9 - iter 520/2606 - loss 0.12263835 - time (sec): 28.46 - samples/sec: 2559.07 - lr: 0.000010 - momentum: 0.000000
2023-10-25 12:25:29,557 epoch 9 - iter 780/2606 - loss 0.14482561 - time (sec): 42.69 - samples/sec: 2559.99 - lr: 0.000009 - momentum: 0.000000
2023-10-25 12:25:43,465 epoch 9 - iter 1040/2606 - loss 0.15345955 - time (sec): 56.59 - samples/sec: 2570.82 - lr: 0.000009 - momentum: 0.000000
2023-10-25 12:25:58,521 epoch 9 - iter 1300/2606 - loss 0.15045690 - time (sec): 71.65 - samples/sec: 2560.35 - lr: 0.000008 - momentum: 0.000000
2023-10-25 12:26:12,756 epoch 9 - iter 1560/2606 - loss 0.14920787 - time (sec): 85.89 - samples/sec: 2568.73 - lr: 0.000008 - momentum: 0.000000
2023-10-25 12:26:26,678 epoch 9 - iter 1820/2606 - loss 0.14989750 - time (sec): 99.81 - samples/sec: 2586.72 - lr: 0.000007 - momentum: 0.000000
2023-10-25 12:26:40,643 epoch 9 - iter 2080/2606 - loss 0.15181262 - time (sec): 113.77 - samples/sec: 2578.12 - lr: 0.000007 - momentum: 0.000000
2023-10-25 12:26:55,783 epoch 9 - iter 2340/2606 - loss 0.15048265 - time (sec): 128.91 - samples/sec: 2565.28 - lr: 0.000006 - momentum: 0.000000
2023-10-25 12:27:09,724 epoch 9 - iter 2600/2606 - loss 0.14790385 - time (sec): 142.85 - samples/sec: 2564.79 - lr: 0.000006 - momentum: 0.000000
2023-10-25 12:27:10,104 ----------------------------------------------------------------------------------------------------
2023-10-25 12:27:10,104 EPOCH 9 done: loss 0.1478 - lr: 0.000006
2023-10-25 12:27:16,359 DEV : loss 0.2825768291950226 - f1-score (micro avg)  0.0264
2023-10-25 12:27:16,386 ----------------------------------------------------------------------------------------------------
2023-10-25 12:27:31,009 epoch 10 - iter 260/2606 - loss 0.12349381 - time (sec): 14.62 - samples/sec: 2544.38 - lr: 0.000005 - momentum: 0.000000
2023-10-25 12:27:44,544 epoch 10 - iter 520/2606 - loss 0.12212158 - time (sec): 28.16 - samples/sec: 2556.67 - lr: 0.000004 - momentum: 0.000000
2023-10-25 12:28:00,425 epoch 10 - iter 780/2606 - loss 0.12260170 - time (sec): 44.04 - samples/sec: 2533.39 - lr: 0.000004 - momentum: 0.000000
2023-10-25 12:28:14,921 epoch 10 - iter 1040/2606 - loss 0.12686123 - time (sec): 58.53 - samples/sec: 2549.96 - lr: 0.000003 - momentum: 0.000000
2023-10-25 12:28:28,874 epoch 10 - iter 1300/2606 - loss 0.12305199 - time (sec): 72.49 - samples/sec: 2528.69 - lr: 0.000003 - momentum: 0.000000
2023-10-25 12:28:44,113 epoch 10 - iter 1560/2606 - loss 0.12360049 - time (sec): 87.73 - samples/sec: 2501.90 - lr: 0.000002 - momentum: 0.000000
2023-10-25 12:28:58,489 epoch 10 - iter 1820/2606 - loss 0.12493764 - time (sec): 102.10 - samples/sec: 2499.93 - lr: 0.000002 - momentum: 0.000000
2023-10-25 12:29:13,302 epoch 10 - iter 2080/2606 - loss 0.12640882 - time (sec): 116.91 - samples/sec: 2502.36 - lr: 0.000001 - momentum: 0.000000
2023-10-25 12:29:27,892 epoch 10 - iter 2340/2606 - loss 0.12676932 - time (sec): 131.50 - samples/sec: 2510.93 - lr: 0.000001 - momentum: 0.000000
2023-10-25 12:29:41,917 epoch 10 - iter 2600/2606 - loss 0.12636628 - time (sec): 145.53 - samples/sec: 2521.08 - lr: 0.000000 - momentum: 0.000000
2023-10-25 12:29:42,199 ----------------------------------------------------------------------------------------------------
2023-10-25 12:29:42,199 EPOCH 10 done: loss 0.1263 - lr: 0.000000
2023-10-25 12:29:49,286 DEV : loss 0.30260512232780457 - f1-score (micro avg)  0.0731
2023-10-25 12:29:49,854 ----------------------------------------------------------------------------------------------------
2023-10-25 12:29:49,855 Loading model from best epoch ...
2023-10-25 12:29:51,548 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
2023-10-25 12:30:03,057 Results:
- F-score (micro) 0.451
- F-score (macro) 0.2998
- Accuracy 0.2947

By class:
              precision    recall  f1-score   support

         LOC     0.4629    0.5964    0.5212      1214
         PER     0.4000    0.4356    0.4171       808
         ORG     0.2843    0.2408    0.2607       353
   HumanProd     0.0000    0.0000    0.0000        15

   micro avg     0.4210    0.4858    0.4510      2390
   macro avg     0.2868    0.3182    0.2998      2390
weighted avg     0.4124    0.4858    0.4443      2390

2023-10-25 12:30:03,057 ----------------------------------------------------------------------------------------------------
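A note on reading runs like this: the dev micro-F1 above peaks at epoch 6 (0.3757) and then collapses (0.0264 by epoch 9), so the final test score comes from best-model.pt, not the last epoch. When comparing many such runs it is convenient to recover the per-epoch dev series directly from the log text. A minimal sketch of that (the regex and the `dev_scores` helper are illustrative, not part of Flair; it simply matches the `DEV : loss … - f1-score (micro avg) …` lines in the format shown above):

```python
import re

# Matches Flair dev-evaluation lines such as:
#   2023-10-25 12:07:10,522 DEV : loss 0.1899... - f1-score (micro avg)  0.3206
DEV_RE = re.compile(
    r"DEV : loss (?P<loss>[\d.]+) - f1-score \(micro avg\)\s+(?P<f1>[\d.]+)"
)

def dev_scores(log_text: str) -> list[tuple[float, float]]:
    """Return (dev_loss, micro_f1) pairs, one per epoch, in log order."""
    return [(float(m["loss"]), float(m["f1"])) for m in DEV_RE.finditer(log_text)]

sample = (
    "2023-10-25 12:07:10,522 DEV : loss 0.1899159997701645 - f1-score (micro avg)  0.3206\n"
    "2023-10-25 12:09:41,662 DEV : loss 0.12365195900201797 - f1-score (micro avg)  0.1991\n"
)
scores = dev_scores(sample)
best_epoch = max(range(len(scores)), key=lambda i: scores[i][1]) + 1
print(scores)      # [(0.1899159997701645, 0.3206), (0.12365195900201797, 0.1991)]
print(best_epoch)  # 1
```

Applied to the full log above, the argmax of the F1 series lands on epoch 6, matching the "saving best model" entries.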