2023-10-14 09:54:33,483 ----------------------------------------------------------------------------------------------------
2023-10-14 09:54:33,484 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-14 09:54:33,484 ----------------------------------------------------------------------------------------------------
2023-10-14 09:54:33,484 MultiCorpus: 5777 train + 722 dev + 723 test sentences
 - NER_ICDAR_EUROPEANA Corpus: 5777 train + 722 dev + 723 test sentences - /root/.flair/datasets/ner_icdar_europeana/nl
2023-10-14 09:54:33,484 ----------------------------------------------------------------------------------------------------
2023-10-14 09:54:33,484 Train: 5777 sentences
2023-10-14 09:54:33,484 (train_with_dev=False, train_with_test=False)
2023-10-14 09:54:33,485 ----------------------------------------------------------------------------------------------------
2023-10-14 09:54:33,485 Training Params:
2023-10-14 09:54:33,485 - learning_rate: "5e-05"
2023-10-14 09:54:33,485 - mini_batch_size: "4"
2023-10-14 09:54:33,485 - max_epochs: "10"
2023-10-14 09:54:33,485 - shuffle: "True"
2023-10-14 09:54:33,485 ----------------------------------------------------------------------------------------------------
2023-10-14 09:54:33,485 Plugins:
2023-10-14 09:54:33,485 - LinearScheduler | warmup_fraction: '0.1'
2023-10-14 09:54:33,485 ----------------------------------------------------------------------------------------------------
2023-10-14 09:54:33,485 Final evaluation on model from best epoch (best-model.pt)
2023-10-14 09:54:33,485 - metric: "('micro avg', 'f1-score')"
2023-10-14 09:54:33,485 ----------------------------------------------------------------------------------------------------
2023-10-14 09:54:33,485 Computation:
2023-10-14 09:54:33,485 - compute on device: cuda:0
2023-10-14 09:54:33,485 - embedding storage: none
2023-10-14 09:54:33,485 ----------------------------------------------------------------------------------------------------
2023-10-14 09:54:33,485 Model training base path: "hmbench-icdar/nl-dbmdz/bert-base-historic-multilingual-cased-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-3"
2023-10-14 09:54:33,485 ----------------------------------------------------------------------------------------------------
2023-10-14 09:54:33,485 ----------------------------------------------------------------------------------------------------
2023-10-14 09:54:40,631 epoch 1 - iter 144/1445 - loss 1.32151919 - time (sec): 7.14 - samples/sec: 2426.87 - lr: 0.000005 - momentum: 0.000000
2023-10-14 09:54:47,906 epoch 1 - iter 288/1445 - loss 0.79135229 - time (sec): 14.42 - samples/sec: 2414.73 - lr: 0.000010 - momentum: 0.000000
2023-10-14 09:54:54,888 epoch 1 - iter 432/1445 - loss 0.58960751 - time (sec): 21.40 - samples/sec: 2424.36 - lr: 0.000015 - momentum: 0.000000
2023-10-14 09:55:02,060 epoch 1 - iter 576/1445 - loss 0.49079879 - time (sec): 28.57 - samples/sec: 2430.02 - lr: 0.000020 - momentum: 0.000000
2023-10-14 09:55:09,558 epoch 1 - iter 720/1445 - loss 0.41600711 - time (sec): 36.07 - samples/sec: 2465.79 - lr: 0.000025 - momentum: 0.000000
2023-10-14 09:55:16,802 epoch 1 - iter 864/1445 - loss 0.37774428 - time (sec): 43.32 - samples/sec: 2443.10 - lr: 0.000030 - momentum: 0.000000
2023-10-14 09:55:23,834 epoch 1 - iter 1008/1445 - loss 0.34717009 - time (sec): 50.35 - samples/sec: 2433.37 - lr: 0.000035 - momentum: 0.000000
2023-10-14 09:55:31,157 epoch 1 - iter 1152/1445 - loss 0.32075520 - time (sec): 57.67 - samples/sec: 2436.34 - lr: 0.000040 - momentum: 0.000000
2023-10-14 09:55:38,446 epoch 1 - iter 1296/1445 - loss 0.29995277 - time (sec): 64.96 - samples/sec: 2440.65 - lr: 0.000045 - momentum: 0.000000
2023-10-14 09:55:45,688 epoch 1 - iter 1440/1445 - loss 0.28332814 - time (sec): 72.20 - samples/sec: 2436.27 - lr: 0.000050 - momentum: 0.000000
2023-10-14 09:55:45,899 ----------------------------------------------------------------------------------------------------
2023-10-14 09:55:45,900 EPOCH 1 done: loss 0.2833 - lr: 0.000050
2023-10-14 09:55:49,489 DEV : loss 0.11788433790206909 - f1-score (micro avg) 0.7468
2023-10-14 09:55:49,514 saving best model
2023-10-14 09:55:49,872 ----------------------------------------------------------------------------------------------------
2023-10-14 09:55:57,243 epoch 2 - iter 144/1445 - loss 0.13020671 - time (sec): 7.37 - samples/sec: 2424.95 - lr: 0.000049 - momentum: 0.000000
2023-10-14 09:56:04,400 epoch 2 - iter 288/1445 - loss 0.12131136 - time (sec): 14.53 - samples/sec: 2414.49 - lr: 0.000049 - momentum: 0.000000
2023-10-14 09:56:11,439 epoch 2 - iter 432/1445 - loss 0.11672384 - time (sec): 21.56 - samples/sec: 2425.04 - lr: 0.000048 - momentum: 0.000000
2023-10-14 09:56:18,951 epoch 2 - iter 576/1445 - loss 0.11924226 - time (sec): 29.08 - samples/sec: 2436.14 - lr: 0.000048 - momentum: 0.000000
2023-10-14 09:56:26,351 epoch 2 - iter 720/1445 - loss 0.11484030 - time (sec): 36.48 - samples/sec: 2450.06 - lr: 0.000047 - momentum: 0.000000
2023-10-14 09:56:33,706 epoch 2 - iter 864/1445 - loss 0.10994458 - time (sec): 43.83 - samples/sec: 2433.06 - lr: 0.000047 - momentum: 0.000000
2023-10-14 09:56:40,743 epoch 2 - iter 1008/1445 - loss 0.10976559 - time (sec): 50.87 - samples/sec: 2422.72 - lr: 0.000046 - momentum: 0.000000
2023-10-14 09:56:48,032 epoch 2 - iter 1152/1445 - loss 0.10785653 - time (sec): 58.16 - samples/sec: 2425.32 - lr: 0.000046 - momentum: 0.000000
2023-10-14 09:56:55,326 epoch 2 - iter 1296/1445 - loss 0.10867392 - time (sec): 65.45 - samples/sec: 2423.28 - lr: 0.000045 - momentum: 0.000000
2023-10-14 09:57:02,445 epoch 2 - iter 1440/1445 - loss 0.10774871 - time (sec): 72.57 - samples/sec: 2421.80 - lr: 0.000044 - momentum: 0.000000
2023-10-14 09:57:02,665 ----------------------------------------------------------------------------------------------------
2023-10-14 09:57:02,666 EPOCH 2 done: loss 0.1077 - lr: 0.000044
2023-10-14 09:57:07,129 DEV : loss 0.10633940249681473 - f1-score (micro avg) 0.74
2023-10-14 09:57:07,146 ----------------------------------------------------------------------------------------------------
2023-10-14 09:57:15,050 epoch 3 - iter 144/1445 - loss 0.09035149 - time (sec): 7.90 - samples/sec: 2218.66 - lr: 0.000044 - momentum: 0.000000
2023-10-14 09:57:22,733 epoch 3 - iter 288/1445 - loss 0.08638552 - time (sec): 15.59 - samples/sec: 2237.53 - lr: 0.000043 - momentum: 0.000000
2023-10-14 09:57:30,022 epoch 3 - iter 432/1445 - loss 0.07945969 - time (sec): 22.87 - samples/sec: 2313.62 - lr: 0.000043 - momentum: 0.000000
2023-10-14 09:57:37,519 epoch 3 - iter 576/1445 - loss 0.07900069 - time (sec): 30.37 - samples/sec: 2328.85 - lr: 0.000042 - momentum: 0.000000
2023-10-14 09:57:44,928 epoch 3 - iter 720/1445 - loss 0.07585140 - time (sec): 37.78 - samples/sec: 2331.64 - lr: 0.000042 - momentum: 0.000000
2023-10-14 09:57:52,073 epoch 3 - iter 864/1445 - loss 0.07259362 - time (sec): 44.93 - samples/sec: 2350.70 - lr: 0.000041 - momentum: 0.000000
2023-10-14 09:57:59,957 epoch 3 - iter 1008/1445 - loss 0.07342512 - time (sec): 52.81 - samples/sec: 2349.05 - lr: 0.000041 - momentum: 0.000000
2023-10-14 09:58:06,869 epoch 3 - iter 1152/1445 - loss 0.07356550 - time (sec): 59.72 - samples/sec: 2355.16 - lr: 0.000040 - momentum: 0.000000
2023-10-14 09:58:13,892 epoch 3 - iter 1296/1445 - loss 0.07388942 - time (sec): 66.74 - samples/sec: 2374.81 - lr: 0.000039 - momentum: 0.000000
2023-10-14 09:58:20,897 epoch 3 - iter 1440/1445 - loss 0.07387594 - time (sec): 73.75 - samples/sec: 2382.74 - lr: 0.000039 - momentum: 0.000000
2023-10-14 09:58:21,112 ----------------------------------------------------------------------------------------------------
2023-10-14 09:58:21,112 EPOCH 3 done: loss 0.0739 - lr: 0.000039
2023-10-14 09:58:24,726 DEV : loss 0.10955189168453217 - f1-score (micro avg) 0.7736
2023-10-14 09:58:24,753 saving best model
2023-10-14 09:58:25,237 ----------------------------------------------------------------------------------------------------
2023-10-14 09:58:33,415 epoch 4 - iter 144/1445 - loss 0.05575292 - time (sec): 8.17 - samples/sec: 2111.86 - lr: 0.000038 - momentum: 0.000000
2023-10-14 09:58:40,972 epoch 4 - iter 288/1445 - loss 0.05433728 - time (sec): 15.73 - samples/sec: 2209.10 - lr: 0.000038 - momentum: 0.000000
2023-10-14 09:58:48,740 epoch 4 - iter 432/1445 - loss 0.05126783 - time (sec): 23.50 - samples/sec: 2174.54 - lr: 0.000037 - momentum: 0.000000
2023-10-14 09:58:55,970 epoch 4 - iter 576/1445 - loss 0.05255992 - time (sec): 30.73 - samples/sec: 2256.80 - lr: 0.000037 - momentum: 0.000000
2023-10-14 09:59:03,172 epoch 4 - iter 720/1445 - loss 0.05383510 - time (sec): 37.93 - samples/sec: 2293.78 - lr: 0.000036 - momentum: 0.000000
2023-10-14 09:59:10,572 epoch 4 - iter 864/1445 - loss 0.05800646 - time (sec): 45.33 - samples/sec: 2327.58 - lr: 0.000036 - momentum: 0.000000
2023-10-14 09:59:17,841 epoch 4 - iter 1008/1445 - loss 0.05787781 - time (sec): 52.60 - samples/sec: 2357.32 - lr: 0.000035 - momentum: 0.000000
2023-10-14 09:59:24,960 epoch 4 - iter 1152/1445 - loss 0.05816878 - time (sec): 59.72 - samples/sec: 2351.87 - lr: 0.000034 - momentum: 0.000000
2023-10-14 09:59:31,975 epoch 4 - iter 1296/1445 - loss 0.05645199 - time (sec): 66.73 - samples/sec: 2360.65 - lr: 0.000034 - momentum: 0.000000
2023-10-14 09:59:39,299 epoch 4 - iter 1440/1445 - loss 0.05795378 - time (sec): 74.06 - samples/sec: 2374.45 - lr: 0.000033 - momentum: 0.000000
2023-10-14 09:59:39,517 ----------------------------------------------------------------------------------------------------
2023-10-14 09:59:39,517 EPOCH 4 done: loss 0.0579 - lr: 0.000033
2023-10-14 09:59:43,265 DEV : loss 0.13582421839237213 - f1-score (micro avg) 0.7781
2023-10-14 09:59:43,289 saving best model
2023-10-14 09:59:43,862 ----------------------------------------------------------------------------------------------------
2023-10-14 09:59:52,276 epoch 5 - iter 144/1445 - loss 0.03810618 - time (sec): 8.41 - samples/sec: 2223.59 - lr: 0.000033 - momentum: 0.000000
2023-10-14 10:00:00,013 epoch 5 - iter 288/1445 - loss 0.03940126 - time (sec): 16.15 - samples/sec: 2221.95 - lr: 0.000032 - momentum: 0.000000
2023-10-14 10:00:07,583 epoch 5 - iter 432/1445 - loss 0.04118564 - time (sec): 23.72 - samples/sec: 2276.97 - lr: 0.000032 - momentum: 0.000000
2023-10-14 10:00:14,965 epoch 5 - iter 576/1445 - loss 0.04133400 - time (sec): 31.10 - samples/sec: 2304.36 - lr: 0.000031 - momentum: 0.000000
2023-10-14 10:00:22,239 epoch 5 - iter 720/1445 - loss 0.04096234 - time (sec): 38.37 - samples/sec: 2322.89 - lr: 0.000031 - momentum: 0.000000
2023-10-14 10:00:29,460 epoch 5 - iter 864/1445 - loss 0.04169367 - time (sec): 45.60 - samples/sec: 2339.62 - lr: 0.000030 - momentum: 0.000000
2023-10-14 10:00:36,493 epoch 5 - iter 1008/1445 - loss 0.04137275 - time (sec): 52.63 - samples/sec: 2344.75 - lr: 0.000029 - momentum: 0.000000
2023-10-14 10:00:43,616 epoch 5 - iter 1152/1445 - loss 0.04174783 - time (sec): 59.75 - samples/sec: 2357.54 - lr: 0.000029 - momentum: 0.000000
2023-10-14 10:00:50,767 epoch 5 - iter 1296/1445 - loss 0.04150397 - time (sec): 66.90 - samples/sec: 2368.59 - lr: 0.000028 - momentum: 0.000000
2023-10-14 10:00:58,092 epoch 5 - iter 1440/1445 - loss 0.04328645 - time (sec): 74.23 - samples/sec: 2366.61 - lr: 0.000028 - momentum: 0.000000
2023-10-14 10:00:58,319 ----------------------------------------------------------------------------------------------------
2023-10-14 10:00:58,319 EPOCH 5 done: loss 0.0432 - lr: 0.000028
2023-10-14 10:01:02,496 DEV : loss 0.13541918992996216 - f1-score (micro avg) 0.8024
2023-10-14 10:01:02,522 saving best model
2023-10-14 10:01:03,190 ----------------------------------------------------------------------------------------------------
2023-10-14 10:01:11,193 epoch 6 - iter 144/1445 - loss 0.02701007 - time (sec): 8.00 - samples/sec: 2186.87 - lr: 0.000027 - momentum: 0.000000
2023-10-14 10:01:19,114 epoch 6 - iter 288/1445 - loss 0.02878942 - time (sec): 15.92 - samples/sec: 2281.52 - lr: 0.000027 - momentum: 0.000000
2023-10-14 10:01:26,397 epoch 6 - iter 432/1445 - loss 0.02954965 - time (sec): 23.21 - samples/sec: 2310.29 - lr: 0.000026 - momentum: 0.000000
2023-10-14 10:01:33,678 epoch 6 - iter 576/1445 - loss 0.03123260 - time (sec): 30.49 - samples/sec: 2330.56 - lr: 0.000026 - momentum: 0.000000
2023-10-14 10:01:40,900 epoch 6 - iter 720/1445 - loss 0.03333774 - time (sec): 37.71 - samples/sec: 2343.36 - lr: 0.000025 - momentum: 0.000000
2023-10-14 10:01:48,262 epoch 6 - iter 864/1445 - loss 0.03315317 - time (sec): 45.07 - samples/sec: 2370.89 - lr: 0.000024 - momentum: 0.000000
2023-10-14 10:01:55,480 epoch 6 - iter 1008/1445 - loss 0.03407127 - time (sec): 52.29 - samples/sec: 2374.76 - lr: 0.000024 - momentum: 0.000000
2023-10-14 10:02:02,421 epoch 6 - iter 1152/1445 - loss 0.03360862 - time (sec): 59.23 - samples/sec: 2376.83 - lr: 0.000023 - momentum: 0.000000
2023-10-14 10:02:09,545 epoch 6 - iter 1296/1445 - loss 0.03318720 - time (sec): 66.35 - samples/sec: 2369.79 - lr: 0.000023 - momentum: 0.000000
2023-10-14 10:02:16,875 epoch 6 - iter 1440/1445 - loss 0.03384080 - time (sec): 73.68 - samples/sec: 2382.28 - lr: 0.000022 - momentum: 0.000000
2023-10-14 10:02:17,142 ----------------------------------------------------------------------------------------------------
2023-10-14 10:02:17,142 EPOCH 6 done: loss 0.0337 - lr: 0.000022
2023-10-14 10:02:20,761 DEV : loss 0.14951762557029724 - f1-score (micro avg) 0.8055
2023-10-14 10:02:20,782 saving best model
2023-10-14 10:02:21,326 ----------------------------------------------------------------------------------------------------
2023-10-14 10:02:28,530 epoch 7 - iter 144/1445 - loss 0.01859679 - time (sec): 7.20 - samples/sec: 2409.56 - lr: 0.000022 - momentum: 0.000000
2023-10-14 10:02:36,495 epoch 7 - iter 288/1445 - loss 0.01862360 - time (sec): 15.17 - samples/sec: 2313.65 - lr: 0.000021 - momentum: 0.000000
2023-10-14 10:02:43,691 epoch 7 - iter 432/1445 - loss 0.02122236 - time (sec): 22.36 - samples/sec: 2339.13 - lr: 0.000021 - momentum: 0.000000
2023-10-14 10:02:51,023 epoch 7 - iter 576/1445 - loss 0.02018797 - time (sec): 29.69 - samples/sec: 2366.08 - lr: 0.000020 - momentum: 0.000000
2023-10-14 10:02:58,172 epoch 7 - iter 720/1445 - loss 0.02085339 - time (sec): 36.84 - samples/sec: 2377.25 - lr: 0.000019 - momentum: 0.000000
2023-10-14 10:03:05,594 epoch 7 - iter 864/1445 - loss 0.02188189 - time (sec): 44.27 - samples/sec: 2386.72 - lr: 0.000019 - momentum: 0.000000
2023-10-14 10:03:12,809 epoch 7 - iter 1008/1445 - loss 0.02164479 - time (sec): 51.48 - samples/sec: 2388.79 - lr: 0.000018 - momentum: 0.000000
2023-10-14 10:03:19,872 epoch 7 - iter 1152/1445 - loss 0.02215737 - time (sec): 58.54 - samples/sec: 2389.64 - lr: 0.000018 - momentum: 0.000000
2023-10-14 10:03:27,552 epoch 7 - iter 1296/1445 - loss 0.02159639 - time (sec): 66.22 - samples/sec: 2387.54 - lr: 0.000017 - momentum: 0.000000
2023-10-14 10:03:34,922 epoch 7 - iter 1440/1445 - loss 0.02175986 - time (sec): 73.59 - samples/sec: 2388.59 - lr: 0.000017 - momentum: 0.000000
2023-10-14 10:03:35,171 ----------------------------------------------------------------------------------------------------
2023-10-14 10:03:35,172 EPOCH 7 done: loss 0.0218 - lr: 0.000017
2023-10-14 10:03:38,829 DEV : loss 0.1777421534061432 - f1-score (micro avg) 0.8178
2023-10-14 10:03:38,851 saving best model
2023-10-14 10:03:39,474 ----------------------------------------------------------------------------------------------------
2023-10-14 10:03:46,871 epoch 8 - iter 144/1445 - loss 0.01548083 - time (sec): 7.40 - samples/sec: 2445.84 - lr: 0.000016 - momentum: 0.000000
2023-10-14 10:03:53,941 epoch 8 - iter 288/1445 - loss 0.01427568 - time (sec): 14.47 - samples/sec: 2431.92 - lr: 0.000016 - momentum: 0.000000
2023-10-14 10:04:01,854 epoch 8 - iter 432/1445 - loss 0.01655715 - time (sec): 22.38 - samples/sec: 2449.47 - lr: 0.000015 - momentum: 0.000000
2023-10-14 10:04:08,682 epoch 8 - iter 576/1445 - loss 0.01523833 - time (sec): 29.21 - samples/sec: 2379.70 - lr: 0.000014 - momentum: 0.000000
2023-10-14 10:04:16,110 epoch 8 - iter 720/1445 - loss 0.01549776 - time (sec): 36.63 - samples/sec: 2408.93 - lr: 0.000014 - momentum: 0.000000
2023-10-14 10:04:23,692 epoch 8 - iter 864/1445 - loss 0.01513458 - time (sec): 44.22 - samples/sec: 2413.23 - lr: 0.000013 - momentum: 0.000000
2023-10-14 10:04:30,915 epoch 8 - iter 1008/1445 - loss 0.01430193 - time (sec): 51.44 - samples/sec: 2407.19 - lr: 0.000013 - momentum: 0.000000
2023-10-14 10:04:38,088 epoch 8 - iter 1152/1445 - loss 0.01514747 - time (sec): 58.61 - samples/sec: 2405.46 - lr: 0.000012 - momentum: 0.000000
2023-10-14 10:04:45,354 epoch 8 - iter 1296/1445 - loss 0.01503901 - time (sec): 65.88 - samples/sec: 2412.79 - lr: 0.000012 - momentum: 0.000000
2023-10-14 10:04:52,563 epoch 8 - iter 1440/1445 - loss 0.01540928 - time (sec): 73.09 - samples/sec: 2404.82 - lr: 0.000011 - momentum: 0.000000
2023-10-14 10:04:52,787 ----------------------------------------------------------------------------------------------------
2023-10-14 10:04:52,787 EPOCH 8 done: loss 0.0154 - lr: 0.000011
2023-10-14 10:04:56,876 DEV : loss 0.17469239234924316 - f1-score (micro avg) 0.8195
2023-10-14 10:04:56,901 saving best model
2023-10-14 10:04:57,410 ----------------------------------------------------------------------------------------------------
2023-10-14 10:05:04,739 epoch 9 - iter 144/1445 - loss 0.00719715 - time (sec): 7.32 - samples/sec: 2462.89 - lr: 0.000011 - momentum: 0.000000
2023-10-14 10:05:11,984 epoch 9 - iter 288/1445 - loss 0.00849542 - time (sec): 14.57 - samples/sec: 2445.28 - lr: 0.000010 - momentum: 0.000000
2023-10-14 10:05:19,098 epoch 9 - iter 432/1445 - loss 0.00729865 - time (sec): 21.68 - samples/sec: 2425.52 - lr: 0.000009 - momentum: 0.000000
2023-10-14 10:05:26,573 epoch 9 - iter 576/1445 - loss 0.00850589 - time (sec): 29.15 - samples/sec: 2435.29 - lr: 0.000009 - momentum: 0.000000
2023-10-14 10:05:33,750 epoch 9 - iter 720/1445 - loss 0.00957931 - time (sec): 36.33 - samples/sec: 2419.21 - lr: 0.000008 - momentum: 0.000000
2023-10-14 10:05:41,317 epoch 9 - iter 864/1445 - loss 0.00998317 - time (sec): 43.90 - samples/sec: 2439.16 - lr: 0.000008 - momentum: 0.000000
2023-10-14 10:05:48,361 epoch 9 - iter 1008/1445 - loss 0.00949987 - time (sec): 50.94 - samples/sec: 2424.28 - lr: 0.000007 - momentum: 0.000000
2023-10-14 10:05:55,632 epoch 9 - iter 1152/1445 - loss 0.00972046 - time (sec): 58.21 - samples/sec: 2432.51 - lr: 0.000007 - momentum: 0.000000
2023-10-14 10:06:02,678 epoch 9 - iter 1296/1445 - loss 0.01012022 - time (sec): 65.26 - samples/sec: 2429.75 - lr: 0.000006 - momentum: 0.000000
2023-10-14 10:06:09,749 epoch 9 - iter 1440/1445 - loss 0.00980864 - time (sec): 72.33 - samples/sec: 2431.32 - lr: 0.000006 - momentum: 0.000000
2023-10-14 10:06:09,984 ----------------------------------------------------------------------------------------------------
2023-10-14 10:06:09,984 EPOCH 9 done: loss 0.0098 - lr: 0.000006
2023-10-14 10:06:13,690 DEV : loss 0.17753612995147705 - f1-score (micro avg) 0.8164
2023-10-14 10:06:13,708 ----------------------------------------------------------------------------------------------------
2023-10-14 10:06:20,952 epoch 10 - iter 144/1445 - loss 0.00578931 - time (sec): 7.24 - samples/sec: 2299.68 - lr: 0.000005 - momentum: 0.000000
2023-10-14 10:06:28,812 epoch 10 - iter 288/1445 - loss 0.00515456 - time (sec): 15.10 - samples/sec: 2349.36 - lr: 0.000004 - momentum: 0.000000
2023-10-14 10:06:36,007 epoch 10 - iter 432/1445 - loss 0.00857074 - time (sec): 22.30 - samples/sec: 2386.36 - lr: 0.000004 - momentum: 0.000000
2023-10-14 10:06:43,072 epoch 10 - iter 576/1445 - loss 0.00818790 - time (sec): 29.36 - samples/sec: 2377.72 - lr: 0.000003 - momentum: 0.000000
2023-10-14 10:06:50,686 epoch 10 - iter 720/1445 - loss 0.00811613 - time (sec): 36.98 - samples/sec: 2386.90 - lr: 0.000003 - momentum: 0.000000
2023-10-14 10:06:58,142 epoch 10 - iter 864/1445 - loss 0.00732934 - time (sec): 44.43 - samples/sec: 2373.05 - lr: 0.000002 - momentum: 0.000000
2023-10-14 10:07:05,597 epoch 10 - iter 1008/1445 - loss 0.00736235 - time (sec): 51.89 - samples/sec: 2389.94 - lr: 0.000002 - momentum: 0.000000
2023-10-14 10:07:12,759 epoch 10 - iter 1152/1445 - loss 0.00677836 - time (sec): 59.05 - samples/sec: 2393.33 - lr: 0.000001 - momentum: 0.000000
2023-10-14 10:07:19,810 epoch 10 - iter 1296/1445 - loss 0.00657795 - time (sec): 66.10 - samples/sec: 2392.12 - lr: 0.000001 - momentum: 0.000000
2023-10-14 10:07:27,162 epoch 10 - iter 1440/1445 - loss 0.00682360 - time (sec): 73.45 - samples/sec: 2388.96 - lr: 0.000000 - momentum: 0.000000
2023-10-14 10:07:27,463 ----------------------------------------------------------------------------------------------------
2023-10-14 10:07:27,464 EPOCH 10 done: loss 0.0068 - lr: 0.000000
2023-10-14 10:07:31,044 DEV : loss 0.18386265635490417 - f1-score (micro avg) 0.8243
2023-10-14 10:07:31,069 saving best model
2023-10-14 10:07:32,182 ----------------------------------------------------------------------------------------------------
2023-10-14 10:07:32,184 Loading model from best epoch ...
2023-10-14 10:07:33,955 SequenceTagger predicts: Dictionary with 13 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG
2023-10-14 10:07:37,400 Results:
- F-score (micro) 0.818
- F-score (macro) 0.7206
- Accuracy 0.7037

By class:
              precision    recall  f1-score   support

         PER     0.8142    0.8091    0.8117       482
         LOC     0.9071    0.8319    0.8679       458
         ORG     0.6279    0.3913    0.4821        69

   micro avg     0.8471    0.7909    0.8180      1009
   macro avg     0.7831    0.6774    0.7206      1009
weighted avg     0.8436    0.7909    0.8146      1009

2023-10-14 10:07:37,400 ----------------------------------------------------------------------------------------------------
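The lr column in the log traces the LinearScheduler plugin: with warmup_fraction 0.1, peak learning rate 5e-05, and 1445 iterations x 10 epochs = 14450 total steps, the rate climbs linearly for the first 1445 steps and then decays linearly to zero. A minimal sketch that reproduces the logged values; `linear_schedule_lr` is a hypothetical helper, not Flair's internal scheduler code:

```python
def linear_schedule_lr(step, total_steps=14450, peak_lr=5e-5, warmup_fraction=0.1):
    """Linear warmup to peak_lr, then linear decay to zero.

    Hypothetical re-derivation of the lr values printed in the log;
    not Flair's actual LinearScheduler implementation.
    """
    warmup_steps = int(total_steps * warmup_fraction)  # 1445 steps here
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)


# Matches the log: epoch 1 iter 144 -> ~0.000005, iter 1440 -> ~0.000050,
# epoch 2 iter 1440 (global step 2885) -> ~0.000044, final step -> 0.000000
for step in (144, 1440, 2885, 14450):
    print(step, round(linear_schedule_lr(step), 6))
```

Under these assumptions the warmup phase ends exactly at the last iteration of epoch 1, which is why the lr peaks at 0.000050 there and decreases monotonically afterwards.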
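The gap between the final micro F1 (0.8180) and macro F1 (0.7206) is driven by the small, poorly recognized ORG class (support 69, F1 0.4821): macro averaging weights it equally with PER and LOC, while micro averaging pools the counts and is dominated by the larger classes. A sketch of the two averages, with per-class (TP, FP, FN) counts reconstructed from the logged precision/recall/support values (the counts themselves are an assumption derived from the rounded table, not part of the log):

```python
# (true positives, false positives, false negatives) per entity class,
# reconstructed from the precision/recall/support columns in the log.
counts = {
    "PER": (390, 89, 92),   # support 482
    "LOC": (381, 39, 77),   # support 458
    "ORG": (27, 16, 42),    # support 69
}

def f1(tp, fp, fn):
    p = tp / (tp + fp)
    r = tp / (tp + fn)
    return 2 * p * r / (p + r)

# Macro F1: unweighted mean of per-class F1 -- the tiny ORG class
# counts as much as PER and LOC, dragging the average down.
macro = sum(f1(*c) for c in counts.values()) / len(counts)

# Micro F1: pool all counts first -- dominated by the large classes.
tp = sum(c[0] for c in counts.values())
fp = sum(c[1] for c in counts.values())
fn = sum(c[2] for c in counts.values())
micro = f1(tp, fp, fn)

print(f"micro F1 = {micro:.4f}, macro F1 = {macro:.4f}")
# prints: micro F1 = 0.8180, macro F1 = 0.7206
```

This is why a headline micro F1 of 0.818 can coexist with ORG performance below 0.5; reporting both averages, as the log does, makes the imbalance visible.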