2023-10-14 10:43:55,873 ----------------------------------------------------------------------------------------------------
2023-10-14 10:43:55,874 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-14 10:43:55,874 ----------------------------------------------------------------------------------------------------
2023-10-14 10:43:55,874 MultiCorpus: 5777 train + 722 dev + 723 test sentences
- NER_ICDAR_EUROPEANA Corpus: 5777 train + 722 dev + 723 test sentences - /root/.flair/datasets/ner_icdar_europeana/nl
2023-10-14 10:43:55,874 ----------------------------------------------------------------------------------------------------
2023-10-14 10:43:55,874 Train: 5777 sentences
2023-10-14 10:43:55,874 (train_with_dev=False, train_with_test=False)
2023-10-14 10:43:55,874 ----------------------------------------------------------------------------------------------------
2023-10-14 10:43:55,875 Training Params:
2023-10-14 10:43:55,875 - learning_rate: "5e-05"
2023-10-14 10:43:55,875 - mini_batch_size: "4"
2023-10-14 10:43:55,875 - max_epochs: "10"
2023-10-14 10:43:55,875 - shuffle: "True"
2023-10-14 10:43:55,875 ----------------------------------------------------------------------------------------------------
2023-10-14 10:43:55,875 Plugins:
2023-10-14 10:43:55,875 - LinearScheduler | warmup_fraction: '0.1'
2023-10-14 10:43:55,875 ----------------------------------------------------------------------------------------------------
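Note on the `lr` column in the iteration lines of this log: the logged values are consistent with a linear warmup over the first 10% of all steps followed by a linear decay to zero. The sketch below reproduces that shape from the parameters logged above (learning_rate 5e-05, mini_batch_size 4, max_epochs 10, warmup_fraction 0.1, 5777 training sentences). The function name `linear_warmup_decay_lr` is hypothetical and the formula is an assumption inferred from the logged numbers, not taken from Flair's source.

```python
import math


def linear_warmup_decay_lr(step, total_steps, peak_lr=5e-5, warmup_fraction=0.1):
    """Learning rate after `step` optimizer steps (assumed schedule shape).

    Linear ramp from 0 to peak_lr during warmup, then linear decay to 0.
    """
    warmup_steps = max(1, int(total_steps * warmup_fraction))
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


# Values from the log: 5777 sentences, batch size 4, 10 epochs.
steps_per_epoch = math.ceil(5777 / 4)  # matches the "iter .../1445" counter
total_steps = steps_per_epoch * 10

if __name__ == "__main__":
    # Compare against the logged lr at a few (epoch, iter) positions.
    for epoch, it in [(1, 144), (1, 1440), (2, 144), (2, 1440), (10, 1440)]:
        step = (epoch - 1) * steps_per_epoch + it
        print(epoch, it, f"{linear_warmup_decay_lr(step, total_steps):.6f}")
```

Under these assumptions the warmup phase spans exactly the first epoch, which is why the logged rate peaks at 0.000050 at the end of epoch 1 and falls afterwards.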
2023-10-14 10:43:55,875 Final evaluation on model from best epoch (best-model.pt)
2023-10-14 10:43:55,875 - metric: "('micro avg', 'f1-score')"
2023-10-14 10:43:55,875 ----------------------------------------------------------------------------------------------------
2023-10-14 10:43:55,875 Computation:
2023-10-14 10:43:55,875 - compute on device: cuda:0
2023-10-14 10:43:55,875 - embedding storage: none
2023-10-14 10:43:55,875 ----------------------------------------------------------------------------------------------------
2023-10-14 10:43:55,875 Model training base path: "hmbench-icdar/nl-dbmdz/bert-base-historic-multilingual-cased-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-4"
2023-10-14 10:43:55,875 ----------------------------------------------------------------------------------------------------
2023-10-14 10:43:55,875 ----------------------------------------------------------------------------------------------------
2023-10-14 10:44:03,371 epoch 1 - iter 144/1445 - loss 1.53614842 - time (sec): 7.49 - samples/sec: 2488.14 - lr: 0.000005 - momentum: 0.000000
2023-10-14 10:44:10,666 epoch 1 - iter 288/1445 - loss 0.92543100 - time (sec): 14.79 - samples/sec: 2442.39 - lr: 0.000010 - momentum: 0.000000
2023-10-14 10:44:18,047 epoch 1 - iter 432/1445 - loss 0.68772860 - time (sec): 22.17 - samples/sec: 2403.60 - lr: 0.000015 - momentum: 0.000000
2023-10-14 10:44:25,279 epoch 1 - iter 576/1445 - loss 0.56173958 - time (sec): 29.40 - samples/sec: 2387.31 - lr: 0.000020 - momentum: 0.000000
2023-10-14 10:44:32,595 epoch 1 - iter 720/1445 - loss 0.47974609 - time (sec): 36.72 - samples/sec: 2404.37 - lr: 0.000025 - momentum: 0.000000
2023-10-14 10:44:39,880 epoch 1 - iter 864/1445 - loss 0.42590462 - time (sec): 44.00 - samples/sec: 2411.14 - lr: 0.000030 - momentum: 0.000000
2023-10-14 10:44:47,180 epoch 1 - iter 1008/1445 - loss 0.38604397 - time (sec): 51.30 - samples/sec: 2409.43 - lr: 0.000035 - momentum: 0.000000
2023-10-14 10:44:54,827 epoch 1 - iter 1152/1445 - loss 0.35511393 - time (sec): 58.95 - samples/sec: 2419.50 - lr: 0.000040 - momentum: 0.000000
2023-10-14 10:45:01,879 epoch 1 - iter 1296/1445 - loss 0.33007466 - time (sec): 66.00 - samples/sec: 2423.12 - lr: 0.000045 - momentum: 0.000000
2023-10-14 10:45:08,740 epoch 1 - iter 1440/1445 - loss 0.31318374 - time (sec): 72.86 - samples/sec: 2410.77 - lr: 0.000050 - momentum: 0.000000
2023-10-14 10:45:08,985 ----------------------------------------------------------------------------------------------------
2023-10-14 10:45:08,986 EPOCH 1 done: loss 0.3125 - lr: 0.000050
2023-10-14 10:45:12,844 DEV : loss 0.1323496401309967 - f1-score (micro avg) 0.6425
2023-10-14 10:45:12,863 saving best model
2023-10-14 10:45:13,235 ----------------------------------------------------------------------------------------------------
2023-10-14 10:45:20,499 epoch 2 - iter 144/1445 - loss 0.12655175 - time (sec): 7.26 - samples/sec: 2233.65 - lr: 0.000049 - momentum: 0.000000
2023-10-14 10:45:28,359 epoch 2 - iter 288/1445 - loss 0.11664640 - time (sec): 15.12 - samples/sec: 2234.54 - lr: 0.000049 - momentum: 0.000000
2023-10-14 10:45:35,769 epoch 2 - iter 432/1445 - loss 0.11764083 - time (sec): 22.53 - samples/sec: 2297.07 - lr: 0.000048 - momentum: 0.000000
2023-10-14 10:45:43,380 epoch 2 - iter 576/1445 - loss 0.11288782 - time (sec): 30.14 - samples/sec: 2351.55 - lr: 0.000048 - momentum: 0.000000
2023-10-14 10:45:50,679 epoch 2 - iter 720/1445 - loss 0.11005752 - time (sec): 37.44 - samples/sec: 2372.40 - lr: 0.000047 - momentum: 0.000000
2023-10-14 10:45:57,858 epoch 2 - iter 864/1445 - loss 0.10861039 - time (sec): 44.62 - samples/sec: 2366.36 - lr: 0.000047 - momentum: 0.000000
2023-10-14 10:46:04,905 epoch 2 - iter 1008/1445 - loss 0.11009884 - time (sec): 51.67 - samples/sec: 2363.61 - lr: 0.000046 - momentum: 0.000000
2023-10-14 10:46:12,036 epoch 2 - iter 1152/1445 - loss 0.10756740 - time (sec): 58.80 - samples/sec: 2371.45 - lr: 0.000046 - momentum: 0.000000
2023-10-14 10:46:19,797 epoch 2 - iter 1296/1445 - loss 0.10553537 - time (sec): 66.56 - samples/sec: 2362.48 - lr: 0.000045 - momentum: 0.000000
2023-10-14 10:46:27,342 epoch 2 - iter 1440/1445 - loss 0.10547907 - time (sec): 74.10 - samples/sec: 2370.93 - lr: 0.000044 - momentum: 0.000000
2023-10-14 10:46:27,595 ----------------------------------------------------------------------------------------------------
2023-10-14 10:46:27,595 EPOCH 2 done: loss 0.1053 - lr: 0.000044
2023-10-14 10:46:31,969 DEV : loss 0.10467828810214996 - f1-score (micro avg) 0.7101
2023-10-14 10:46:31,987 saving best model
2023-10-14 10:46:32,513 ----------------------------------------------------------------------------------------------------
2023-10-14 10:46:40,845 epoch 3 - iter 144/1445 - loss 0.06046434 - time (sec): 8.33 - samples/sec: 2176.51 - lr: 0.000044 - momentum: 0.000000
2023-10-14 10:46:49,225 epoch 3 - iter 288/1445 - loss 0.06004263 - time (sec): 16.71 - samples/sec: 2131.19 - lr: 0.000043 - momentum: 0.000000
2023-10-14 10:46:56,492 epoch 3 - iter 432/1445 - loss 0.06632407 - time (sec): 23.98 - samples/sec: 2165.44 - lr: 0.000043 - momentum: 0.000000
2023-10-14 10:47:03,797 epoch 3 - iter 576/1445 - loss 0.06736150 - time (sec): 31.28 - samples/sec: 2213.57 - lr: 0.000042 - momentum: 0.000000
2023-10-14 10:47:11,200 epoch 3 - iter 720/1445 - loss 0.06752059 - time (sec): 38.68 - samples/sec: 2266.94 - lr: 0.000042 - momentum: 0.000000
2023-10-14 10:47:18,528 epoch 3 - iter 864/1445 - loss 0.07237618 - time (sec): 46.01 - samples/sec: 2285.16 - lr: 0.000041 - momentum: 0.000000
2023-10-14 10:47:26,048 epoch 3 - iter 1008/1445 - loss 0.07408386 - time (sec): 53.53 - samples/sec: 2314.03 - lr: 0.000041 - momentum: 0.000000
2023-10-14 10:47:33,156 epoch 3 - iter 1152/1445 - loss 0.07290332 - time (sec): 60.64 - samples/sec: 2314.08 - lr: 0.000040 - momentum: 0.000000
2023-10-14 10:47:40,398 epoch 3 - iter 1296/1445 - loss 0.07248165 - time (sec): 67.88 - samples/sec: 2319.56 - lr: 0.000039 - momentum: 0.000000
2023-10-14 10:47:47,740 epoch 3 - iter 1440/1445 - loss 0.07325238 - time (sec): 75.22 - samples/sec: 2331.61 - lr: 0.000039 - momentum: 0.000000
2023-10-14 10:47:48,047 ----------------------------------------------------------------------------------------------------
2023-10-14 10:47:48,048 EPOCH 3 done: loss 0.0730 - lr: 0.000039
2023-10-14 10:47:51,583 DEV : loss 0.09553560614585876 - f1-score (micro avg) 0.8021
2023-10-14 10:47:51,599 saving best model
2023-10-14 10:47:52,265 ----------------------------------------------------------------------------------------------------
2023-10-14 10:47:59,539 epoch 4 - iter 144/1445 - loss 0.04698177 - time (sec): 7.27 - samples/sec: 2415.48 - lr: 0.000038 - momentum: 0.000000
2023-10-14 10:48:07,133 epoch 4 - iter 288/1445 - loss 0.06031754 - time (sec): 14.87 - samples/sec: 2411.66 - lr: 0.000038 - momentum: 0.000000
2023-10-14 10:48:14,165 epoch 4 - iter 432/1445 - loss 0.06263096 - time (sec): 21.90 - samples/sec: 2410.81 - lr: 0.000037 - momentum: 0.000000
2023-10-14 10:48:21,491 epoch 4 - iter 576/1445 - loss 0.05725339 - time (sec): 29.22 - samples/sec: 2417.83 - lr: 0.000037 - momentum: 0.000000
2023-10-14 10:48:28,487 epoch 4 - iter 720/1445 - loss 0.05516418 - time (sec): 36.22 - samples/sec: 2406.40 - lr: 0.000036 - momentum: 0.000000
2023-10-14 10:48:36,217 epoch 4 - iter 864/1445 - loss 0.05343786 - time (sec): 43.95 - samples/sec: 2404.89 - lr: 0.000036 - momentum: 0.000000
2023-10-14 10:48:43,475 epoch 4 - iter 1008/1445 - loss 0.05296204 - time (sec): 51.21 - samples/sec: 2397.09 - lr: 0.000035 - momentum: 0.000000
2023-10-14 10:48:50,864 epoch 4 - iter 1152/1445 - loss 0.05396480 - time (sec): 58.60 - samples/sec: 2397.71 - lr: 0.000034 - momentum: 0.000000
2023-10-14 10:48:58,163 epoch 4 - iter 1296/1445 - loss 0.05465199 - time (sec): 65.90 - samples/sec: 2406.38 - lr: 0.000034 - momentum: 0.000000
2023-10-14 10:49:05,509 epoch 4 - iter 1440/1445 - loss 0.05379926 - time (sec): 73.24 - samples/sec: 2397.56 - lr: 0.000033 - momentum: 0.000000
2023-10-14 10:49:05,752 ----------------------------------------------------------------------------------------------------
2023-10-14 10:49:05,752 EPOCH 4 done: loss 0.0541 - lr: 0.000033
2023-10-14 10:49:09,309 DEV : loss 0.12118156254291534 - f1-score (micro avg) 0.7946
2023-10-14 10:49:09,326 ----------------------------------------------------------------------------------------------------
2023-10-14 10:49:16,954 epoch 5 - iter 144/1445 - loss 0.04673101 - time (sec): 7.63 - samples/sec: 2414.26 - lr: 0.000033 - momentum: 0.000000
2023-10-14 10:49:24,004 epoch 5 - iter 288/1445 - loss 0.04488133 - time (sec): 14.68 - samples/sec: 2414.51 - lr: 0.000032 - momentum: 0.000000
2023-10-14 10:49:31,520 epoch 5 - iter 432/1445 - loss 0.04016346 - time (sec): 22.19 - samples/sec: 2439.19 - lr: 0.000032 - momentum: 0.000000
2023-10-14 10:49:38,723 epoch 5 - iter 576/1445 - loss 0.04147102 - time (sec): 29.40 - samples/sec: 2409.52 - lr: 0.000031 - momentum: 0.000000
2023-10-14 10:49:46,016 epoch 5 - iter 720/1445 - loss 0.04117617 - time (sec): 36.69 - samples/sec: 2388.99 - lr: 0.000031 - momentum: 0.000000
2023-10-14 10:49:53,185 epoch 5 - iter 864/1445 - loss 0.04145618 - time (sec): 43.86 - samples/sec: 2366.07 - lr: 0.000030 - momentum: 0.000000
2023-10-14 10:50:00,848 epoch 5 - iter 1008/1445 - loss 0.03979828 - time (sec): 51.52 - samples/sec: 2367.21 - lr: 0.000029 - momentum: 0.000000
2023-10-14 10:50:08,189 epoch 5 - iter 1152/1445 - loss 0.03982370 - time (sec): 58.86 - samples/sec: 2374.01 - lr: 0.000029 - momentum: 0.000000
2023-10-14 10:50:15,584 epoch 5 - iter 1296/1445 - loss 0.04077806 - time (sec): 66.26 - samples/sec: 2388.64 - lr: 0.000028 - momentum: 0.000000
2023-10-14 10:50:23,148 epoch 5 - iter 1440/1445 - loss 0.03927038 - time (sec): 73.82 - samples/sec: 2380.09 - lr: 0.000028 - momentum: 0.000000
2023-10-14 10:50:23,369 ----------------------------------------------------------------------------------------------------
2023-10-14 10:50:23,369 EPOCH 5 done: loss 0.0393 - lr: 0.000028
2023-10-14 10:50:27,269 DEV : loss 0.15388108789920807 - f1-score (micro avg) 0.7917
2023-10-14 10:50:27,287 ----------------------------------------------------------------------------------------------------
2023-10-14 10:50:34,633 epoch 6 - iter 144/1445 - loss 0.03233428 - time (sec): 7.34 - samples/sec: 2362.50 - lr: 0.000027 - momentum: 0.000000
2023-10-14 10:50:41,959 epoch 6 - iter 288/1445 - loss 0.03501451 - time (sec): 14.67 - samples/sec: 2390.85 - lr: 0.000027 - momentum: 0.000000
2023-10-14 10:50:49,055 epoch 6 - iter 432/1445 - loss 0.03123834 - time (sec): 21.77 - samples/sec: 2411.58 - lr: 0.000026 - momentum: 0.000000
2023-10-14 10:50:56,443 epoch 6 - iter 576/1445 - loss 0.03274913 - time (sec): 29.16 - samples/sec: 2419.30 - lr: 0.000026 - momentum: 0.000000
2023-10-14 10:51:03,609 epoch 6 - iter 720/1445 - loss 0.03314690 - time (sec): 36.32 - samples/sec: 2406.97 - lr: 0.000025 - momentum: 0.000000
2023-10-14 10:51:10,863 epoch 6 - iter 864/1445 - loss 0.03210885 - time (sec): 43.58 - samples/sec: 2410.60 - lr: 0.000024 - momentum: 0.000000
2023-10-14 10:51:18,264 epoch 6 - iter 1008/1445 - loss 0.03115240 - time (sec): 50.98 - samples/sec: 2410.40 - lr: 0.000024 - momentum: 0.000000
2023-10-14 10:51:25,820 epoch 6 - iter 1152/1445 - loss 0.03008407 - time (sec): 58.53 - samples/sec: 2430.27 - lr: 0.000023 - momentum: 0.000000
2023-10-14 10:51:32,892 epoch 6 - iter 1296/1445 - loss 0.03095943 - time (sec): 65.60 - samples/sec: 2425.45 - lr: 0.000023 - momentum: 0.000000
2023-10-14 10:51:39,991 epoch 6 - iter 1440/1445 - loss 0.03162133 - time (sec): 72.70 - samples/sec: 2417.44 - lr: 0.000022 - momentum: 0.000000
2023-10-14 10:51:40,215 ----------------------------------------------------------------------------------------------------
2023-10-14 10:51:40,215 EPOCH 6 done: loss 0.0317 - lr: 0.000022
2023-10-14 10:51:43,805 DEV : loss 0.19756034016609192 - f1-score (micro avg) 0.8016
2023-10-14 10:51:43,821 ----------------------------------------------------------------------------------------------------
2023-10-14 10:51:51,088 epoch 7 - iter 144/1445 - loss 0.01816229 - time (sec): 7.27 - samples/sec: 2414.93 - lr: 0.000022 - momentum: 0.000000
2023-10-14 10:51:58,649 epoch 7 - iter 288/1445 - loss 0.01954064 - time (sec): 14.83 - samples/sec: 2471.92 - lr: 0.000021 - momentum: 0.000000
2023-10-14 10:52:05,759 epoch 7 - iter 432/1445 - loss 0.01867134 - time (sec): 21.94 - samples/sec: 2439.90 - lr: 0.000021 - momentum: 0.000000
2023-10-14 10:52:13,556 epoch 7 - iter 576/1445 - loss 0.01895690 - time (sec): 29.73 - samples/sec: 2418.92 - lr: 0.000020 - momentum: 0.000000
2023-10-14 10:52:20,679 epoch 7 - iter 720/1445 - loss 0.01836345 - time (sec): 36.86 - samples/sec: 2400.83 - lr: 0.000019 - momentum: 0.000000
2023-10-14 10:52:27,667 epoch 7 - iter 864/1445 - loss 0.01877869 - time (sec): 43.84 - samples/sec: 2396.78 - lr: 0.000019 - momentum: 0.000000
2023-10-14 10:52:35,009 epoch 7 - iter 1008/1445 - loss 0.02080686 - time (sec): 51.19 - samples/sec: 2419.50 - lr: 0.000018 - momentum: 0.000000
2023-10-14 10:52:42,373 epoch 7 - iter 1152/1445 - loss 0.02134693 - time (sec): 58.55 - samples/sec: 2423.15 - lr: 0.000018 - momentum: 0.000000
2023-10-14 10:52:49,576 epoch 7 - iter 1296/1445 - loss 0.02105516 - time (sec): 65.75 - samples/sec: 2420.11 - lr: 0.000017 - momentum: 0.000000
2023-10-14 10:52:56,887 epoch 7 - iter 1440/1445 - loss 0.02073299 - time (sec): 73.06 - samples/sec: 2406.17 - lr: 0.000017 - momentum: 0.000000
2023-10-14 10:52:57,114 ----------------------------------------------------------------------------------------------------
2023-10-14 10:52:57,115 EPOCH 7 done: loss 0.0209 - lr: 0.000017
2023-10-14 10:53:00,712 DEV : loss 0.18008936941623688 - f1-score (micro avg) 0.8011
2023-10-14 10:53:00,729 ----------------------------------------------------------------------------------------------------
2023-10-14 10:53:07,996 epoch 8 - iter 144/1445 - loss 0.01866497 - time (sec): 7.27 - samples/sec: 2300.59 - lr: 0.000016 - momentum: 0.000000
2023-10-14 10:53:15,313 epoch 8 - iter 288/1445 - loss 0.01466923 - time (sec): 14.58 - samples/sec: 2372.09 - lr: 0.000016 - momentum: 0.000000
2023-10-14 10:53:22,675 epoch 8 - iter 432/1445 - loss 0.01583407 - time (sec): 21.95 - samples/sec: 2384.78 - lr: 0.000015 - momentum: 0.000000
2023-10-14 10:53:29,992 epoch 8 - iter 576/1445 - loss 0.01493921 - time (sec): 29.26 - samples/sec: 2381.30 - lr: 0.000014 - momentum: 0.000000
2023-10-14 10:53:37,264 epoch 8 - iter 720/1445 - loss 0.01503871 - time (sec): 36.53 - samples/sec: 2407.26 - lr: 0.000014 - momentum: 0.000000
2023-10-14 10:53:44,308 epoch 8 - iter 864/1445 - loss 0.01515293 - time (sec): 43.58 - samples/sec: 2399.36 - lr: 0.000013 - momentum: 0.000000
2023-10-14 10:53:51,766 epoch 8 - iter 1008/1445 - loss 0.01576861 - time (sec): 51.04 - samples/sec: 2404.97 - lr: 0.000013 - momentum: 0.000000
2023-10-14 10:53:59,034 epoch 8 - iter 1152/1445 - loss 0.01561594 - time (sec): 58.30 - samples/sec: 2408.13 - lr: 0.000012 - momentum: 0.000000
2023-10-14 10:54:06,080 epoch 8 - iter 1296/1445 - loss 0.01534350 - time (sec): 65.35 - samples/sec: 2411.17 - lr: 0.000012 - momentum: 0.000000
2023-10-14 10:54:13,533 epoch 8 - iter 1440/1445 - loss 0.01509812 - time (sec): 72.80 - samples/sec: 2409.56 - lr: 0.000011 - momentum: 0.000000
2023-10-14 10:54:13,784 ----------------------------------------------------------------------------------------------------
2023-10-14 10:54:13,784 EPOCH 8 done: loss 0.0151 - lr: 0.000011
2023-10-14 10:54:17,824 DEV : loss 0.20430612564086914 - f1-score (micro avg) 0.7978
2023-10-14 10:54:17,841 ----------------------------------------------------------------------------------------------------
2023-10-14 10:54:25,386 epoch 9 - iter 144/1445 - loss 0.01055124 - time (sec): 7.54 - samples/sec: 2408.34 - lr: 0.000011 - momentum: 0.000000
2023-10-14 10:54:32,660 epoch 9 - iter 288/1445 - loss 0.00857411 - time (sec): 14.82 - samples/sec: 2384.81 - lr: 0.000010 - momentum: 0.000000
2023-10-14 10:54:39,915 epoch 9 - iter 432/1445 - loss 0.00889903 - time (sec): 22.07 - samples/sec: 2361.30 - lr: 0.000009 - momentum: 0.000000
2023-10-14 10:54:47,643 epoch 9 - iter 576/1445 - loss 0.01030940 - time (sec): 29.80 - samples/sec: 2394.86 - lr: 0.000009 - momentum: 0.000000
2023-10-14 10:54:54,820 epoch 9 - iter 720/1445 - loss 0.00993058 - time (sec): 36.98 - samples/sec: 2400.63 - lr: 0.000008 - momentum: 0.000000
2023-10-14 10:55:02,175 epoch 9 - iter 864/1445 - loss 0.00936448 - time (sec): 44.33 - samples/sec: 2406.24 - lr: 0.000008 - momentum: 0.000000
2023-10-14 10:55:09,201 epoch 9 - iter 1008/1445 - loss 0.00913328 - time (sec): 51.36 - samples/sec: 2404.99 - lr: 0.000007 - momentum: 0.000000
2023-10-14 10:55:16,575 epoch 9 - iter 1152/1445 - loss 0.00941487 - time (sec): 58.73 - samples/sec: 2408.84 - lr: 0.000007 - momentum: 0.000000
2023-10-14 10:55:23,825 epoch 9 - iter 1296/1445 - loss 0.01026341 - time (sec): 65.98 - samples/sec: 2401.83 - lr: 0.000006 - momentum: 0.000000
2023-10-14 10:55:31,036 epoch 9 - iter 1440/1445 - loss 0.01047293 - time (sec): 73.19 - samples/sec: 2400.81 - lr: 0.000006 - momentum: 0.000000
2023-10-14 10:55:31,280 ----------------------------------------------------------------------------------------------------
2023-10-14 10:55:31,280 EPOCH 9 done: loss 0.0105 - lr: 0.000006
2023-10-14 10:55:34,878 DEV : loss 0.1839500218629837 - f1-score (micro avg) 0.7974
2023-10-14 10:55:34,896 ----------------------------------------------------------------------------------------------------
2023-10-14 10:55:42,196 epoch 10 - iter 144/1445 - loss 0.00490526 - time (sec): 7.30 - samples/sec: 2287.22 - lr: 0.000005 - momentum: 0.000000
2023-10-14 10:55:49,779 epoch 10 - iter 288/1445 - loss 0.00524218 - time (sec): 14.88 - samples/sec: 2333.86 - lr: 0.000004 - momentum: 0.000000
2023-10-14 10:55:57,744 epoch 10 - iter 432/1445 - loss 0.00648842 - time (sec): 22.85 - samples/sec: 2293.26 - lr: 0.000004 - momentum: 0.000000
2023-10-14 10:56:05,487 epoch 10 - iter 576/1445 - loss 0.00656819 - time (sec): 30.59 - samples/sec: 2325.72 - lr: 0.000003 - momentum: 0.000000
2023-10-14 10:56:12,959 epoch 10 - iter 720/1445 - loss 0.00698945 - time (sec): 38.06 - samples/sec: 2354.45 - lr: 0.000003 - momentum: 0.000000
2023-10-14 10:56:20,142 epoch 10 - iter 864/1445 - loss 0.00778598 - time (sec): 45.24 - samples/sec: 2362.14 - lr: 0.000002 - momentum: 0.000000
2023-10-14 10:56:27,101 epoch 10 - iter 1008/1445 - loss 0.00737225 - time (sec): 52.20 - samples/sec: 2348.21 - lr: 0.000002 - momentum: 0.000000
2023-10-14 10:56:34,107 epoch 10 - iter 1152/1445 - loss 0.00697948 - time (sec): 59.21 - samples/sec: 2352.75 - lr: 0.000001 - momentum: 0.000000
2023-10-14 10:56:41,417 epoch 10 - iter 1296/1445 - loss 0.00700128 - time (sec): 66.52 - samples/sec: 2371.69 - lr: 0.000001 - momentum: 0.000000
2023-10-14 10:56:48,753 epoch 10 - iter 1440/1445 - loss 0.00696742 - time (sec): 73.86 - samples/sec: 2376.15 - lr: 0.000000 - momentum: 0.000000
2023-10-14 10:56:49,020 ----------------------------------------------------------------------------------------------------
2023-10-14 10:56:49,021 EPOCH 10 done: loss 0.0070 - lr: 0.000000
2023-10-14 10:56:52,539 DEV : loss 0.19384992122650146 - f1-score (micro avg) 0.8093
2023-10-14 10:56:52,555 saving best model
2023-10-14 10:56:53,410 ----------------------------------------------------------------------------------------------------
2023-10-14 10:56:53,411 Loading model from best epoch ...
2023-10-14 10:56:55,129 SequenceTagger predicts: Dictionary with 13 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG
2023-10-14 10:56:58,276
Results:
- F-score (micro) 0.7959
- F-score (macro) 0.6975
- Accuracy 0.6749
By class:
              precision    recall  f1-score   support

         PER     0.8184    0.7759    0.7966       482
         LOC     0.8949    0.7991    0.8443       458
         ORG     0.5091    0.4058    0.4516        69

   micro avg     0.8339    0.7611    0.7959      1009
   macro avg     0.7408    0.6603    0.6975      1009
weighted avg     0.8319    0.7611    0.7947      1009
2023-10-14 10:56:58,276 ----------------------------------------------------------------------------------------------------
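Note on the aggregate scores in the final report: they can be cross-checked against the per-class rows. The sketch below recomputes the macro and weighted F1 from the per-class values, and the micro F1 from the micro-averaged precision and recall. All numbers are copied from the report above; the helper names are hypothetical. Because the inputs are already rounded to four decimals, small deviations in the last digit are expected.

```python
# Per-class rows from the report: (precision, recall, f1, support).
per_class = {
    "PER": (0.8184, 0.7759, 0.7966, 482),
    "LOC": (0.8949, 0.7991, 0.8443, 458),
    "ORG": (0.5091, 0.4058, 0.4516, 69),
}


def macro_f1(report):
    """Unweighted mean of the per-class F1 scores."""
    return sum(f1 for _, _, f1, _ in report.values()) / len(report)


def weighted_f1(report):
    """Mean of the per-class F1 scores, weighted by class support."""
    total = sum(support for _, _, _, support in report.values())
    return sum(f1 * support for _, _, f1, support in report.values()) / total


def f1_from_pr(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    print(f"macro F1:    {macro_f1(per_class):.4f}")
    print(f"weighted F1: {weighted_f1(per_class):.4f}")
    # Micro-averaged precision and recall as logged in the "micro avg" row.
    print(f"micro F1:    {f1_from_pr(0.8339, 0.7611):.4f}")
```

The gap between the micro score (0.7959) and the macro score (0.6975) comes from the ORG class: it has only 69 of the 1009 test entities, so its low F1 barely moves the micro average but counts as a full third of the macro average.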