2023-10-14 09:54:33,483 ----------------------------------------------------------------------------------------------------
2023-10-14 09:54:33,484 Model: "SequenceTagger(
(embeddings): TransformerWordEmbeddings(
(model): BertModel(
(embeddings): BertEmbeddings(
(word_embeddings): Embedding(32001, 768)
(position_embeddings): Embedding(512, 768)
(token_type_embeddings): Embedding(2, 768)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(encoder): BertEncoder(
(layer): ModuleList(
(0-11): 12 x BertLayer(
(attention): BertAttention(
(self): BertSelfAttention(
(query): Linear(in_features=768, out_features=768, bias=True)
(key): Linear(in_features=768, out_features=768, bias=True)
(value): Linear(in_features=768, out_features=768, bias=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(output): BertSelfOutput(
(dense): Linear(in_features=768, out_features=768, bias=True)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
(intermediate): BertIntermediate(
(dense): Linear(in_features=768, out_features=3072, bias=True)
(intermediate_act_fn): GELUActivation()
)
(output): BertOutput(
(dense): Linear(in_features=3072, out_features=768, bias=True)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
)
)
(pooler): BertPooler(
(dense): Linear(in_features=768, out_features=768, bias=True)
(activation): Tanh()
)
)
)
(locked_dropout): LockedDropout(p=0.5)
(linear): Linear(in_features=768, out_features=13, bias=True)
(loss_function): CrossEntropyLoss()
)"
2023-10-14 09:54:33,484 ----------------------------------------------------------------------------------------------------
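The model summary above ends with a `LockedDropout(p=0.5)` layer feeding a 768-to-13 linear head. Locked (variational) dropout samples one mask per feature dimension and reuses it across every timestep of a sequence, rather than drawing a fresh mask per token. A minimal pure-Python sketch of that masking behavior (sequences as lists of equal-length rows; not Flair's implementation):

```python
import random

def locked_dropout(seq, p=0.5, seed=None):
    """Variational ("locked") dropout: draw ONE mask over the feature
    dimension and apply it to every timestep, scaling kept units by 1/(1-p)."""
    rng = random.Random(seed)
    dim = len(seq[0])
    scale = 1.0 / (1.0 - p)
    mask = [0.0 if rng.random() < p else scale for _ in range(dim)]
    # The same mask is reused for every row (timestep) of the sequence.
    return [[x * m for x, m in zip(row, mask)] for row in seq]
```

Because the mask is shared, a feature dropped at one position is dropped at all positions, which is the property that distinguishes locked dropout from standard per-element dropout.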
2023-10-14 09:54:33,484 MultiCorpus: 5777 train + 722 dev + 723 test sentences
- NER_ICDAR_EUROPEANA Corpus: 5777 train + 722 dev + 723 test sentences - /root/.flair/datasets/ner_icdar_europeana/nl
2023-10-14 09:54:33,484 ----------------------------------------------------------------------------------------------------
2023-10-14 09:54:33,484 Train: 5777 sentences
2023-10-14 09:54:33,484 (train_with_dev=False, train_with_test=False)
2023-10-14 09:54:33,485 ----------------------------------------------------------------------------------------------------
2023-10-14 09:54:33,485 Training Params:
2023-10-14 09:54:33,485 - learning_rate: "5e-05"
2023-10-14 09:54:33,485 - mini_batch_size: "4"
2023-10-14 09:54:33,485 - max_epochs: "10"
2023-10-14 09:54:33,485 - shuffle: "True"
2023-10-14 09:54:33,485 ----------------------------------------------------------------------------------------------------
2023-10-14 09:54:33,485 Plugins:
2023-10-14 09:54:33,485 - LinearScheduler | warmup_fraction: '0.1'
2023-10-14 09:54:33,485 ----------------------------------------------------------------------------------------------------
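The `LinearScheduler` plugin with `warmup_fraction: 0.1` matches the learning rates logged below: with 1445 batches per epoch and 10 epochs (14,450 steps assumed total), the first 10% of steps, exactly epoch 1, ramp linearly from 0 to the peak of 5e-05, after which the rate decays linearly to 0. A small sketch of that schedule (parameter values taken from this log):

```python
def linear_schedule(step, total_steps=14_450, peak_lr=5e-5, warmup_fraction=0.1):
    """Linear warmup over the first warmup_fraction of steps, then linear
    decay to zero. total_steps assumes 1445 batches/epoch x 10 epochs."""
    warmup_steps = int(total_steps * warmup_fraction)  # 1445 = exactly epoch 1
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)
```

For example, at global step 2885 (epoch 2, iter 1440) this gives roughly 0.000044, matching the `lr:` column in the epoch-2 log lines.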
2023-10-14 09:54:33,485 Final evaluation on model from best epoch (best-model.pt)
2023-10-14 09:54:33,485 - metric: "('micro avg', 'f1-score')"
2023-10-14 09:54:33,485 ----------------------------------------------------------------------------------------------------
2023-10-14 09:54:33,485 Computation:
2023-10-14 09:54:33,485 - compute on device: cuda:0
2023-10-14 09:54:33,485 - embedding storage: none
2023-10-14 09:54:33,485 ----------------------------------------------------------------------------------------------------
2023-10-14 09:54:33,485 Model training base path: "hmbench-icdar/nl-dbmdz/bert-base-historic-multilingual-cased-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-3"
2023-10-14 09:54:33,485 ----------------------------------------------------------------------------------------------------
2023-10-14 09:54:33,485 ----------------------------------------------------------------------------------------------------
2023-10-14 09:54:40,631 epoch 1 - iter 144/1445 - loss 1.32151919 - time (sec): 7.14 - samples/sec: 2426.87 - lr: 0.000005 - momentum: 0.000000
2023-10-14 09:54:47,906 epoch 1 - iter 288/1445 - loss 0.79135229 - time (sec): 14.42 - samples/sec: 2414.73 - lr: 0.000010 - momentum: 0.000000
2023-10-14 09:54:54,888 epoch 1 - iter 432/1445 - loss 0.58960751 - time (sec): 21.40 - samples/sec: 2424.36 - lr: 0.000015 - momentum: 0.000000
2023-10-14 09:55:02,060 epoch 1 - iter 576/1445 - loss 0.49079879 - time (sec): 28.57 - samples/sec: 2430.02 - lr: 0.000020 - momentum: 0.000000
2023-10-14 09:55:09,558 epoch 1 - iter 720/1445 - loss 0.41600711 - time (sec): 36.07 - samples/sec: 2465.79 - lr: 0.000025 - momentum: 0.000000
2023-10-14 09:55:16,802 epoch 1 - iter 864/1445 - loss 0.37774428 - time (sec): 43.32 - samples/sec: 2443.10 - lr: 0.000030 - momentum: 0.000000
2023-10-14 09:55:23,834 epoch 1 - iter 1008/1445 - loss 0.34717009 - time (sec): 50.35 - samples/sec: 2433.37 - lr: 0.000035 - momentum: 0.000000
2023-10-14 09:55:31,157 epoch 1 - iter 1152/1445 - loss 0.32075520 - time (sec): 57.67 - samples/sec: 2436.34 - lr: 0.000040 - momentum: 0.000000
2023-10-14 09:55:38,446 epoch 1 - iter 1296/1445 - loss 0.29995277 - time (sec): 64.96 - samples/sec: 2440.65 - lr: 0.000045 - momentum: 0.000000
2023-10-14 09:55:45,688 epoch 1 - iter 1440/1445 - loss 0.28332814 - time (sec): 72.20 - samples/sec: 2436.27 - lr: 0.000050 - momentum: 0.000000
2023-10-14 09:55:45,899 ----------------------------------------------------------------------------------------------------
2023-10-14 09:55:45,900 EPOCH 1 done: loss 0.2833 - lr: 0.000050
2023-10-14 09:55:49,489 DEV : loss 0.11788433790206909 - f1-score (micro avg) 0.7468
2023-10-14 09:55:49,514 saving best model
2023-10-14 09:55:49,872 ----------------------------------------------------------------------------------------------------
2023-10-14 09:55:57,243 epoch 2 - iter 144/1445 - loss 0.13020671 - time (sec): 7.37 - samples/sec: 2424.95 - lr: 0.000049 - momentum: 0.000000
2023-10-14 09:56:04,400 epoch 2 - iter 288/1445 - loss 0.12131136 - time (sec): 14.53 - samples/sec: 2414.49 - lr: 0.000049 - momentum: 0.000000
2023-10-14 09:56:11,439 epoch 2 - iter 432/1445 - loss 0.11672384 - time (sec): 21.56 - samples/sec: 2425.04 - lr: 0.000048 - momentum: 0.000000
2023-10-14 09:56:18,951 epoch 2 - iter 576/1445 - loss 0.11924226 - time (sec): 29.08 - samples/sec: 2436.14 - lr: 0.000048 - momentum: 0.000000
2023-10-14 09:56:26,351 epoch 2 - iter 720/1445 - loss 0.11484030 - time (sec): 36.48 - samples/sec: 2450.06 - lr: 0.000047 - momentum: 0.000000
2023-10-14 09:56:33,706 epoch 2 - iter 864/1445 - loss 0.10994458 - time (sec): 43.83 - samples/sec: 2433.06 - lr: 0.000047 - momentum: 0.000000
2023-10-14 09:56:40,743 epoch 2 - iter 1008/1445 - loss 0.10976559 - time (sec): 50.87 - samples/sec: 2422.72 - lr: 0.000046 - momentum: 0.000000
2023-10-14 09:56:48,032 epoch 2 - iter 1152/1445 - loss 0.10785653 - time (sec): 58.16 - samples/sec: 2425.32 - lr: 0.000046 - momentum: 0.000000
2023-10-14 09:56:55,326 epoch 2 - iter 1296/1445 - loss 0.10867392 - time (sec): 65.45 - samples/sec: 2423.28 - lr: 0.000045 - momentum: 0.000000
2023-10-14 09:57:02,445 epoch 2 - iter 1440/1445 - loss 0.10774871 - time (sec): 72.57 - samples/sec: 2421.80 - lr: 0.000044 - momentum: 0.000000
2023-10-14 09:57:02,665 ----------------------------------------------------------------------------------------------------
2023-10-14 09:57:02,666 EPOCH 2 done: loss 0.1077 - lr: 0.000044
2023-10-14 09:57:07,129 DEV : loss 0.10633940249681473 - f1-score (micro avg) 0.74
2023-10-14 09:57:07,146 ----------------------------------------------------------------------------------------------------
2023-10-14 09:57:15,050 epoch 3 - iter 144/1445 - loss 0.09035149 - time (sec): 7.90 - samples/sec: 2218.66 - lr: 0.000044 - momentum: 0.000000
2023-10-14 09:57:22,733 epoch 3 - iter 288/1445 - loss 0.08638552 - time (sec): 15.59 - samples/sec: 2237.53 - lr: 0.000043 - momentum: 0.000000
2023-10-14 09:57:30,022 epoch 3 - iter 432/1445 - loss 0.07945969 - time (sec): 22.87 - samples/sec: 2313.62 - lr: 0.000043 - momentum: 0.000000
2023-10-14 09:57:37,519 epoch 3 - iter 576/1445 - loss 0.07900069 - time (sec): 30.37 - samples/sec: 2328.85 - lr: 0.000042 - momentum: 0.000000
2023-10-14 09:57:44,928 epoch 3 - iter 720/1445 - loss 0.07585140 - time (sec): 37.78 - samples/sec: 2331.64 - lr: 0.000042 - momentum: 0.000000
2023-10-14 09:57:52,073 epoch 3 - iter 864/1445 - loss 0.07259362 - time (sec): 44.93 - samples/sec: 2350.70 - lr: 0.000041 - momentum: 0.000000
2023-10-14 09:57:59,957 epoch 3 - iter 1008/1445 - loss 0.07342512 - time (sec): 52.81 - samples/sec: 2349.05 - lr: 0.000041 - momentum: 0.000000
2023-10-14 09:58:06,869 epoch 3 - iter 1152/1445 - loss 0.07356550 - time (sec): 59.72 - samples/sec: 2355.16 - lr: 0.000040 - momentum: 0.000000
2023-10-14 09:58:13,892 epoch 3 - iter 1296/1445 - loss 0.07388942 - time (sec): 66.74 - samples/sec: 2374.81 - lr: 0.000039 - momentum: 0.000000
2023-10-14 09:58:20,897 epoch 3 - iter 1440/1445 - loss 0.07387594 - time (sec): 73.75 - samples/sec: 2382.74 - lr: 0.000039 - momentum: 0.000000
2023-10-14 09:58:21,112 ----------------------------------------------------------------------------------------------------
2023-10-14 09:58:21,112 EPOCH 3 done: loss 0.0739 - lr: 0.000039
2023-10-14 09:58:24,726 DEV : loss 0.10955189168453217 - f1-score (micro avg) 0.7736
2023-10-14 09:58:24,753 saving best model
2023-10-14 09:58:25,237 ----------------------------------------------------------------------------------------------------
2023-10-14 09:58:33,415 epoch 4 - iter 144/1445 - loss 0.05575292 - time (sec): 8.17 - samples/sec: 2111.86 - lr: 0.000038 - momentum: 0.000000
2023-10-14 09:58:40,972 epoch 4 - iter 288/1445 - loss 0.05433728 - time (sec): 15.73 - samples/sec: 2209.10 - lr: 0.000038 - momentum: 0.000000
2023-10-14 09:58:48,740 epoch 4 - iter 432/1445 - loss 0.05126783 - time (sec): 23.50 - samples/sec: 2174.54 - lr: 0.000037 - momentum: 0.000000
2023-10-14 09:58:55,970 epoch 4 - iter 576/1445 - loss 0.05255992 - time (sec): 30.73 - samples/sec: 2256.80 - lr: 0.000037 - momentum: 0.000000
2023-10-14 09:59:03,172 epoch 4 - iter 720/1445 - loss 0.05383510 - time (sec): 37.93 - samples/sec: 2293.78 - lr: 0.000036 - momentum: 0.000000
2023-10-14 09:59:10,572 epoch 4 - iter 864/1445 - loss 0.05800646 - time (sec): 45.33 - samples/sec: 2327.58 - lr: 0.000036 - momentum: 0.000000
2023-10-14 09:59:17,841 epoch 4 - iter 1008/1445 - loss 0.05787781 - time (sec): 52.60 - samples/sec: 2357.32 - lr: 0.000035 - momentum: 0.000000
2023-10-14 09:59:24,960 epoch 4 - iter 1152/1445 - loss 0.05816878 - time (sec): 59.72 - samples/sec: 2351.87 - lr: 0.000034 - momentum: 0.000000
2023-10-14 09:59:31,975 epoch 4 - iter 1296/1445 - loss 0.05645199 - time (sec): 66.73 - samples/sec: 2360.65 - lr: 0.000034 - momentum: 0.000000
2023-10-14 09:59:39,299 epoch 4 - iter 1440/1445 - loss 0.05795378 - time (sec): 74.06 - samples/sec: 2374.45 - lr: 0.000033 - momentum: 0.000000
2023-10-14 09:59:39,517 ----------------------------------------------------------------------------------------------------
2023-10-14 09:59:39,517 EPOCH 4 done: loss 0.0579 - lr: 0.000033
2023-10-14 09:59:43,265 DEV : loss 0.13582421839237213 - f1-score (micro avg) 0.7781
2023-10-14 09:59:43,289 saving best model
2023-10-14 09:59:43,862 ----------------------------------------------------------------------------------------------------
2023-10-14 09:59:52,276 epoch 5 - iter 144/1445 - loss 0.03810618 - time (sec): 8.41 - samples/sec: 2223.59 - lr: 0.000033 - momentum: 0.000000
2023-10-14 10:00:00,013 epoch 5 - iter 288/1445 - loss 0.03940126 - time (sec): 16.15 - samples/sec: 2221.95 - lr: 0.000032 - momentum: 0.000000
2023-10-14 10:00:07,583 epoch 5 - iter 432/1445 - loss 0.04118564 - time (sec): 23.72 - samples/sec: 2276.97 - lr: 0.000032 - momentum: 0.000000
2023-10-14 10:00:14,965 epoch 5 - iter 576/1445 - loss 0.04133400 - time (sec): 31.10 - samples/sec: 2304.36 - lr: 0.000031 - momentum: 0.000000
2023-10-14 10:00:22,239 epoch 5 - iter 720/1445 - loss 0.04096234 - time (sec): 38.37 - samples/sec: 2322.89 - lr: 0.000031 - momentum: 0.000000
2023-10-14 10:00:29,460 epoch 5 - iter 864/1445 - loss 0.04169367 - time (sec): 45.60 - samples/sec: 2339.62 - lr: 0.000030 - momentum: 0.000000
2023-10-14 10:00:36,493 epoch 5 - iter 1008/1445 - loss 0.04137275 - time (sec): 52.63 - samples/sec: 2344.75 - lr: 0.000029 - momentum: 0.000000
2023-10-14 10:00:43,616 epoch 5 - iter 1152/1445 - loss 0.04174783 - time (sec): 59.75 - samples/sec: 2357.54 - lr: 0.000029 - momentum: 0.000000
2023-10-14 10:00:50,767 epoch 5 - iter 1296/1445 - loss 0.04150397 - time (sec): 66.90 - samples/sec: 2368.59 - lr: 0.000028 - momentum: 0.000000
2023-10-14 10:00:58,092 epoch 5 - iter 1440/1445 - loss 0.04328645 - time (sec): 74.23 - samples/sec: 2366.61 - lr: 0.000028 - momentum: 0.000000
2023-10-14 10:00:58,319 ----------------------------------------------------------------------------------------------------
2023-10-14 10:00:58,319 EPOCH 5 done: loss 0.0432 - lr: 0.000028
2023-10-14 10:01:02,496 DEV : loss 0.13541918992996216 - f1-score (micro avg) 0.8024
2023-10-14 10:01:02,522 saving best model
2023-10-14 10:01:03,190 ----------------------------------------------------------------------------------------------------
2023-10-14 10:01:11,193 epoch 6 - iter 144/1445 - loss 0.02701007 - time (sec): 8.00 - samples/sec: 2186.87 - lr: 0.000027 - momentum: 0.000000
2023-10-14 10:01:19,114 epoch 6 - iter 288/1445 - loss 0.02878942 - time (sec): 15.92 - samples/sec: 2281.52 - lr: 0.000027 - momentum: 0.000000
2023-10-14 10:01:26,397 epoch 6 - iter 432/1445 - loss 0.02954965 - time (sec): 23.21 - samples/sec: 2310.29 - lr: 0.000026 - momentum: 0.000000
2023-10-14 10:01:33,678 epoch 6 - iter 576/1445 - loss 0.03123260 - time (sec): 30.49 - samples/sec: 2330.56 - lr: 0.000026 - momentum: 0.000000
2023-10-14 10:01:40,900 epoch 6 - iter 720/1445 - loss 0.03333774 - time (sec): 37.71 - samples/sec: 2343.36 - lr: 0.000025 - momentum: 0.000000
2023-10-14 10:01:48,262 epoch 6 - iter 864/1445 - loss 0.03315317 - time (sec): 45.07 - samples/sec: 2370.89 - lr: 0.000024 - momentum: 0.000000
2023-10-14 10:01:55,480 epoch 6 - iter 1008/1445 - loss 0.03407127 - time (sec): 52.29 - samples/sec: 2374.76 - lr: 0.000024 - momentum: 0.000000
2023-10-14 10:02:02,421 epoch 6 - iter 1152/1445 - loss 0.03360862 - time (sec): 59.23 - samples/sec: 2376.83 - lr: 0.000023 - momentum: 0.000000
2023-10-14 10:02:09,545 epoch 6 - iter 1296/1445 - loss 0.03318720 - time (sec): 66.35 - samples/sec: 2369.79 - lr: 0.000023 - momentum: 0.000000
2023-10-14 10:02:16,875 epoch 6 - iter 1440/1445 - loss 0.03384080 - time (sec): 73.68 - samples/sec: 2382.28 - lr: 0.000022 - momentum: 0.000000
2023-10-14 10:02:17,142 ----------------------------------------------------------------------------------------------------
2023-10-14 10:02:17,142 EPOCH 6 done: loss 0.0337 - lr: 0.000022
2023-10-14 10:02:20,761 DEV : loss 0.14951762557029724 - f1-score (micro avg) 0.8055
2023-10-14 10:02:20,782 saving best model
2023-10-14 10:02:21,326 ----------------------------------------------------------------------------------------------------
2023-10-14 10:02:28,530 epoch 7 - iter 144/1445 - loss 0.01859679 - time (sec): 7.20 - samples/sec: 2409.56 - lr: 0.000022 - momentum: 0.000000
2023-10-14 10:02:36,495 epoch 7 - iter 288/1445 - loss 0.01862360 - time (sec): 15.17 - samples/sec: 2313.65 - lr: 0.000021 - momentum: 0.000000
2023-10-14 10:02:43,691 epoch 7 - iter 432/1445 - loss 0.02122236 - time (sec): 22.36 - samples/sec: 2339.13 - lr: 0.000021 - momentum: 0.000000
2023-10-14 10:02:51,023 epoch 7 - iter 576/1445 - loss 0.02018797 - time (sec): 29.69 - samples/sec: 2366.08 - lr: 0.000020 - momentum: 0.000000
2023-10-14 10:02:58,172 epoch 7 - iter 720/1445 - loss 0.02085339 - time (sec): 36.84 - samples/sec: 2377.25 - lr: 0.000019 - momentum: 0.000000
2023-10-14 10:03:05,594 epoch 7 - iter 864/1445 - loss 0.02188189 - time (sec): 44.27 - samples/sec: 2386.72 - lr: 0.000019 - momentum: 0.000000
2023-10-14 10:03:12,809 epoch 7 - iter 1008/1445 - loss 0.02164479 - time (sec): 51.48 - samples/sec: 2388.79 - lr: 0.000018 - momentum: 0.000000
2023-10-14 10:03:19,872 epoch 7 - iter 1152/1445 - loss 0.02215737 - time (sec): 58.54 - samples/sec: 2389.64 - lr: 0.000018 - momentum: 0.000000
2023-10-14 10:03:27,552 epoch 7 - iter 1296/1445 - loss 0.02159639 - time (sec): 66.22 - samples/sec: 2387.54 - lr: 0.000017 - momentum: 0.000000
2023-10-14 10:03:34,922 epoch 7 - iter 1440/1445 - loss 0.02175986 - time (sec): 73.59 - samples/sec: 2388.59 - lr: 0.000017 - momentum: 0.000000
2023-10-14 10:03:35,171 ----------------------------------------------------------------------------------------------------
2023-10-14 10:03:35,172 EPOCH 7 done: loss 0.0218 - lr: 0.000017
2023-10-14 10:03:38,829 DEV : loss 0.1777421534061432 - f1-score (micro avg) 0.8178
2023-10-14 10:03:38,851 saving best model
2023-10-14 10:03:39,474 ----------------------------------------------------------------------------------------------------
2023-10-14 10:03:46,871 epoch 8 - iter 144/1445 - loss 0.01548083 - time (sec): 7.40 - samples/sec: 2445.84 - lr: 0.000016 - momentum: 0.000000
2023-10-14 10:03:53,941 epoch 8 - iter 288/1445 - loss 0.01427568 - time (sec): 14.47 - samples/sec: 2431.92 - lr: 0.000016 - momentum: 0.000000
2023-10-14 10:04:01,854 epoch 8 - iter 432/1445 - loss 0.01655715 - time (sec): 22.38 - samples/sec: 2449.47 - lr: 0.000015 - momentum: 0.000000
2023-10-14 10:04:08,682 epoch 8 - iter 576/1445 - loss 0.01523833 - time (sec): 29.21 - samples/sec: 2379.70 - lr: 0.000014 - momentum: 0.000000
2023-10-14 10:04:16,110 epoch 8 - iter 720/1445 - loss 0.01549776 - time (sec): 36.63 - samples/sec: 2408.93 - lr: 0.000014 - momentum: 0.000000
2023-10-14 10:04:23,692 epoch 8 - iter 864/1445 - loss 0.01513458 - time (sec): 44.22 - samples/sec: 2413.23 - lr: 0.000013 - momentum: 0.000000
2023-10-14 10:04:30,915 epoch 8 - iter 1008/1445 - loss 0.01430193 - time (sec): 51.44 - samples/sec: 2407.19 - lr: 0.000013 - momentum: 0.000000
2023-10-14 10:04:38,088 epoch 8 - iter 1152/1445 - loss 0.01514747 - time (sec): 58.61 - samples/sec: 2405.46 - lr: 0.000012 - momentum: 0.000000
2023-10-14 10:04:45,354 epoch 8 - iter 1296/1445 - loss 0.01503901 - time (sec): 65.88 - samples/sec: 2412.79 - lr: 0.000012 - momentum: 0.000000
2023-10-14 10:04:52,563 epoch 8 - iter 1440/1445 - loss 0.01540928 - time (sec): 73.09 - samples/sec: 2404.82 - lr: 0.000011 - momentum: 0.000000
2023-10-14 10:04:52,787 ----------------------------------------------------------------------------------------------------
2023-10-14 10:04:52,787 EPOCH 8 done: loss 0.0154 - lr: 0.000011
2023-10-14 10:04:56,876 DEV : loss 0.17469239234924316 - f1-score (micro avg) 0.8195
2023-10-14 10:04:56,901 saving best model
2023-10-14 10:04:57,410 ----------------------------------------------------------------------------------------------------
2023-10-14 10:05:04,739 epoch 9 - iter 144/1445 - loss 0.00719715 - time (sec): 7.32 - samples/sec: 2462.89 - lr: 0.000011 - momentum: 0.000000
2023-10-14 10:05:11,984 epoch 9 - iter 288/1445 - loss 0.00849542 - time (sec): 14.57 - samples/sec: 2445.28 - lr: 0.000010 - momentum: 0.000000
2023-10-14 10:05:19,098 epoch 9 - iter 432/1445 - loss 0.00729865 - time (sec): 21.68 - samples/sec: 2425.52 - lr: 0.000009 - momentum: 0.000000
2023-10-14 10:05:26,573 epoch 9 - iter 576/1445 - loss 0.00850589 - time (sec): 29.15 - samples/sec: 2435.29 - lr: 0.000009 - momentum: 0.000000
2023-10-14 10:05:33,750 epoch 9 - iter 720/1445 - loss 0.00957931 - time (sec): 36.33 - samples/sec: 2419.21 - lr: 0.000008 - momentum: 0.000000
2023-10-14 10:05:41,317 epoch 9 - iter 864/1445 - loss 0.00998317 - time (sec): 43.90 - samples/sec: 2439.16 - lr: 0.000008 - momentum: 0.000000
2023-10-14 10:05:48,361 epoch 9 - iter 1008/1445 - loss 0.00949987 - time (sec): 50.94 - samples/sec: 2424.28 - lr: 0.000007 - momentum: 0.000000
2023-10-14 10:05:55,632 epoch 9 - iter 1152/1445 - loss 0.00972046 - time (sec): 58.21 - samples/sec: 2432.51 - lr: 0.000007 - momentum: 0.000000
2023-10-14 10:06:02,678 epoch 9 - iter 1296/1445 - loss 0.01012022 - time (sec): 65.26 - samples/sec: 2429.75 - lr: 0.000006 - momentum: 0.000000
2023-10-14 10:06:09,749 epoch 9 - iter 1440/1445 - loss 0.00980864 - time (sec): 72.33 - samples/sec: 2431.32 - lr: 0.000006 - momentum: 0.000000
2023-10-14 10:06:09,984 ----------------------------------------------------------------------------------------------------
2023-10-14 10:06:09,984 EPOCH 9 done: loss 0.0098 - lr: 0.000006
2023-10-14 10:06:13,690 DEV : loss 0.17753612995147705 - f1-score (micro avg) 0.8164
2023-10-14 10:06:13,708 ----------------------------------------------------------------------------------------------------
2023-10-14 10:06:20,952 epoch 10 - iter 144/1445 - loss 0.00578931 - time (sec): 7.24 - samples/sec: 2299.68 - lr: 0.000005 - momentum: 0.000000
2023-10-14 10:06:28,812 epoch 10 - iter 288/1445 - loss 0.00515456 - time (sec): 15.10 - samples/sec: 2349.36 - lr: 0.000004 - momentum: 0.000000
2023-10-14 10:06:36,007 epoch 10 - iter 432/1445 - loss 0.00857074 - time (sec): 22.30 - samples/sec: 2386.36 - lr: 0.000004 - momentum: 0.000000
2023-10-14 10:06:43,072 epoch 10 - iter 576/1445 - loss 0.00818790 - time (sec): 29.36 - samples/sec: 2377.72 - lr: 0.000003 - momentum: 0.000000
2023-10-14 10:06:50,686 epoch 10 - iter 720/1445 - loss 0.00811613 - time (sec): 36.98 - samples/sec: 2386.90 - lr: 0.000003 - momentum: 0.000000
2023-10-14 10:06:58,142 epoch 10 - iter 864/1445 - loss 0.00732934 - time (sec): 44.43 - samples/sec: 2373.05 - lr: 0.000002 - momentum: 0.000000
2023-10-14 10:07:05,597 epoch 10 - iter 1008/1445 - loss 0.00736235 - time (sec): 51.89 - samples/sec: 2389.94 - lr: 0.000002 - momentum: 0.000000
2023-10-14 10:07:12,759 epoch 10 - iter 1152/1445 - loss 0.00677836 - time (sec): 59.05 - samples/sec: 2393.33 - lr: 0.000001 - momentum: 0.000000
2023-10-14 10:07:19,810 epoch 10 - iter 1296/1445 - loss 0.00657795 - time (sec): 66.10 - samples/sec: 2392.12 - lr: 0.000001 - momentum: 0.000000
2023-10-14 10:07:27,162 epoch 10 - iter 1440/1445 - loss 0.00682360 - time (sec): 73.45 - samples/sec: 2388.96 - lr: 0.000000 - momentum: 0.000000
2023-10-14 10:07:27,463 ----------------------------------------------------------------------------------------------------
2023-10-14 10:07:27,464 EPOCH 10 done: loss 0.0068 - lr: 0.000000
2023-10-14 10:07:31,044 DEV : loss 0.18386265635490417 - f1-score (micro avg) 0.8243
2023-10-14 10:07:31,069 saving best model
2023-10-14 10:07:32,182 ----------------------------------------------------------------------------------------------------
2023-10-14 10:07:32,184 Loading model from best epoch ...
2023-10-14 10:07:33,955 SequenceTagger predicts: Dictionary with 13 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG
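The 13-tag dictionary above is a BIOES scheme: 3 entity types (LOC, PER, ORG) times 4 positional prefixes (S-ingle, B-egin, I-nside, E-nd), plus O. A minimal decoder from such a tag sequence to entity spans might look like this (a sketch, not Flair's own decoding code):

```python
def bioes_spans(tags):
    """Decode a BIOES tag sequence into (label, start, end_exclusive) spans."""
    spans, start = [], None
    for i, tag in enumerate(tags):
        if tag == "O":
            start = None
            continue
        prefix, label = tag.split("-", 1)
        if prefix == "S":                       # single-token entity
            spans.append((label, i, i + 1))
            start = None
        elif prefix == "B":                     # open a multi-token entity
            start = i
        elif prefix == "E" and start is not None:  # close it
            spans.append((label, start, i + 1))
            start = None
        # "I" simply continues an open span
    return spans
```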
2023-10-14 10:07:37,400
Results:
- F-score (micro) 0.818
- F-score (macro) 0.7206
- Accuracy 0.7037
By class:
              precision    recall  f1-score   support

         PER     0.8142    0.8091    0.8117       482
         LOC     0.9071    0.8319    0.8679       458
         ORG     0.6279    0.3913    0.4821        69

   micro avg     0.8471    0.7909    0.8180      1009
   macro avg     0.7831    0.6774    0.7206      1009
weighted avg     0.8436    0.7909    0.8146      1009
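The micro and macro averages in the table follow from the per-class counts: micro-F1 pools true positives, predictions, and gold spans across classes before computing F1, while macro-F1 averages the per-class F1 scores, which is why the small, low-recall ORG class (69 spans) drags the macro figure well below the micro one. The integer counts below are reconstructed from the rounded precision/recall/support values in the table (an assumption, not taken from the log itself):

```python
# (true positives, predicted spans, gold spans) per class,
# inferred from the rounded table values above.
counts = {
    "PER": (390, 479, 482),
    "LOC": (381, 420, 458),
    "ORG": (27, 43, 69),
}

def f1(tp, pred, gold):
    # F1 = 2*TP / (predicted + gold), equivalent to the harmonic mean of P and R.
    return 2 * tp / (pred + gold) if pred + gold else 0.0

tp = sum(c[0] for c in counts.values())
pred = sum(c[1] for c in counts.values())
gold = sum(c[2] for c in counts.values())

micro_f1 = f1(tp, pred, gold)                                   # pool counts first
macro_f1 = sum(f1(*c) for c in counts.values()) / len(counts)   # average class F1s
```

With these counts, `micro_f1` comes out to about 0.8180 and `macro_f1` to about 0.7206, reproducing the reported averages to four decimal places.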
2023-10-14 10:07:37,400 ----------------------------------------------------------------------------------------------------