2023-10-13 23:55:15,562 ----------------------------------------------------------------------------------------------------
2023-10-13 23:55:15,563 Model: "SequenceTagger(
(embeddings): TransformerWordEmbeddings(
(model): BertModel(
(embeddings): BertEmbeddings(
(word_embeddings): Embedding(32001, 768)
(position_embeddings): Embedding(512, 768)
(token_type_embeddings): Embedding(2, 768)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(encoder): BertEncoder(
(layer): ModuleList(
(0-11): 12 x BertLayer(
(attention): BertAttention(
(self): BertSelfAttention(
(query): Linear(in_features=768, out_features=768, bias=True)
(key): Linear(in_features=768, out_features=768, bias=True)
(value): Linear(in_features=768, out_features=768, bias=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(output): BertSelfOutput(
(dense): Linear(in_features=768, out_features=768, bias=True)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
(intermediate): BertIntermediate(
(dense): Linear(in_features=768, out_features=3072, bias=True)
(intermediate_act_fn): GELUActivation()
)
(output): BertOutput(
(dense): Linear(in_features=3072, out_features=768, bias=True)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
)
)
(pooler): BertPooler(
(dense): Linear(in_features=768, out_features=768, bias=True)
(activation): Tanh()
)
)
)
(locked_dropout): LockedDropout(p=0.5)
(linear): Linear(in_features=768, out_features=13, bias=True)
(loss_function): CrossEntropyLoss()
)"
2023-10-13 23:55:15,563 ----------------------------------------------------------------------------------------------------
2023-10-13 23:55:15,564 MultiCorpus: 7936 train + 992 dev + 992 test sentences
- NER_ICDAR_EUROPEANA Corpus: 7936 train + 992 dev + 992 test sentences - /root/.flair/datasets/ner_icdar_europeana/fr
2023-10-13 23:55:15,564 ----------------------------------------------------------------------------------------------------
2023-10-13 23:55:15,564 Train: 7936 sentences
2023-10-13 23:55:15,564 (train_with_dev=False, train_with_test=False)
2023-10-13 23:55:15,564 ----------------------------------------------------------------------------------------------------
2023-10-13 23:55:15,564 Training Params:
2023-10-13 23:55:15,564 - learning_rate: "5e-05"
2023-10-13 23:55:15,564 - mini_batch_size: "4"
2023-10-13 23:55:15,564 - max_epochs: "10"
2023-10-13 23:55:15,564 - shuffle: "True"
2023-10-13 23:55:15,564 ----------------------------------------------------------------------------------------------------
2023-10-13 23:55:15,564 Plugins:
2023-10-13 23:55:15,564 - LinearScheduler | warmup_fraction: '0.1'
2023-10-13 23:55:15,564 ----------------------------------------------------------------------------------------------------
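The LinearScheduler plugin with warmup_fraction 0.1 explains the lr column in the iteration lines below: over 10 epochs × 1,984 mini-batches = 19,840 total steps, the learning rate climbs linearly to 5e-05 during the first ~10% of steps (all of epoch 1) and then decays linearly to zero. A minimal sketch of that schedule in plain Python (not Flair's actual implementation, just the same shape):

```python
def linear_schedule_lr(step, total_steps, max_lr, warmup_fraction=0.1):
    """Linear warmup to max_lr, then linear decay to 0."""
    warmup_steps = int(total_steps * warmup_fraction)
    if step < warmup_steps:
        return max_lr * step / warmup_steps
    return max_lr * (total_steps - step) / (total_steps - warmup_steps)

total_steps = 10 * 1984   # 10 epochs x 1984 mini-batches
max_lr = 5e-05

# Mid-way through epoch 1 the lr is still climbing
# (matches "iter 990/1984 ... lr: 0.000025" in the log):
print(linear_schedule_lr(992, total_steps, max_lr))    # 2.5e-05
# It peaks at the warmup boundary (end of epoch 1):
print(linear_schedule_lr(1984, total_steps, max_lr))   # 5e-05
# And decays to 0 by the final step (matches "lr: 0.000000" in epoch 10):
print(linear_schedule_lr(19840, total_steps, max_lr))  # 0.0
```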
2023-10-13 23:55:15,564 Final evaluation on model from best epoch (best-model.pt)
2023-10-13 23:55:15,564 - metric: "('micro avg', 'f1-score')"
2023-10-13 23:55:15,564 ----------------------------------------------------------------------------------------------------
2023-10-13 23:55:15,564 Computation:
2023-10-13 23:55:15,564 - compute on device: cuda:0
2023-10-13 23:55:15,564 - embedding storage: none
2023-10-13 23:55:15,564 ----------------------------------------------------------------------------------------------------
2023-10-13 23:55:15,564 Model training base path: "hmbench-icdar/fr-dbmdz/bert-base-historic-multilingual-cased-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-4"
2023-10-13 23:55:15,564 ----------------------------------------------------------------------------------------------------
2023-10-13 23:55:15,564 ----------------------------------------------------------------------------------------------------
2023-10-13 23:55:24,306 epoch 1 - iter 198/1984 - loss 1.50772186 - time (sec): 8.74 - samples/sec: 1863.43 - lr: 0.000005 - momentum: 0.000000
2023-10-13 23:55:32,982 epoch 1 - iter 396/1984 - loss 0.90184360 - time (sec): 17.42 - samples/sec: 1859.29 - lr: 0.000010 - momentum: 0.000000
2023-10-13 23:55:41,617 epoch 1 - iter 594/1984 - loss 0.67583760 - time (sec): 26.05 - samples/sec: 1849.23 - lr: 0.000015 - momentum: 0.000000
2023-10-13 23:55:50,732 epoch 1 - iter 792/1984 - loss 0.55353196 - time (sec): 35.17 - samples/sec: 1835.49 - lr: 0.000020 - momentum: 0.000000
2023-10-13 23:55:59,824 epoch 1 - iter 990/1984 - loss 0.47277969 - time (sec): 44.26 - samples/sec: 1838.05 - lr: 0.000025 - momentum: 0.000000
2023-10-13 23:56:08,934 epoch 1 - iter 1188/1984 - loss 0.41000215 - time (sec): 53.37 - samples/sec: 1862.91 - lr: 0.000030 - momentum: 0.000000
2023-10-13 23:56:17,850 epoch 1 - iter 1386/1984 - loss 0.37433987 - time (sec): 62.28 - samples/sec: 1853.85 - lr: 0.000035 - momentum: 0.000000
2023-10-13 23:56:27,010 epoch 1 - iter 1584/1984 - loss 0.34567962 - time (sec): 71.45 - samples/sec: 1845.25 - lr: 0.000040 - momentum: 0.000000
2023-10-13 23:56:36,022 epoch 1 - iter 1782/1984 - loss 0.32457592 - time (sec): 80.46 - samples/sec: 1832.92 - lr: 0.000045 - momentum: 0.000000
2023-10-13 23:56:44,992 epoch 1 - iter 1980/1984 - loss 0.30739042 - time (sec): 89.43 - samples/sec: 1829.76 - lr: 0.000050 - momentum: 0.000000
2023-10-13 23:56:45,174 ----------------------------------------------------------------------------------------------------
2023-10-13 23:56:45,174 EPOCH 1 done: loss 0.3074 - lr: 0.000050
2023-10-13 23:56:48,815 DEV : loss 0.1368168443441391 - f1-score (micro avg) 0.6857
2023-10-13 23:56:48,836 saving best model
2023-10-13 23:56:49,275 ----------------------------------------------------------------------------------------------------
2023-10-13 23:56:58,249 epoch 2 - iter 198/1984 - loss 0.12279295 - time (sec): 8.97 - samples/sec: 1784.86 - lr: 0.000049 - momentum: 0.000000
2023-10-13 23:57:07,264 epoch 2 - iter 396/1984 - loss 0.11883519 - time (sec): 17.99 - samples/sec: 1808.64 - lr: 0.000049 - momentum: 0.000000
2023-10-13 23:57:16,299 epoch 2 - iter 594/1984 - loss 0.12767893 - time (sec): 27.02 - samples/sec: 1811.49 - lr: 0.000048 - momentum: 0.000000
2023-10-13 23:57:25,354 epoch 2 - iter 792/1984 - loss 0.12653949 - time (sec): 36.08 - samples/sec: 1815.08 - lr: 0.000048 - momentum: 0.000000
2023-10-13 23:57:34,318 epoch 2 - iter 990/1984 - loss 0.12528597 - time (sec): 45.04 - samples/sec: 1818.90 - lr: 0.000047 - momentum: 0.000000
2023-10-13 23:57:43,288 epoch 2 - iter 1188/1984 - loss 0.12318635 - time (sec): 54.01 - samples/sec: 1823.91 - lr: 0.000047 - momentum: 0.000000
2023-10-13 23:57:52,593 epoch 2 - iter 1386/1984 - loss 0.12351316 - time (sec): 63.32 - samples/sec: 1813.04 - lr: 0.000046 - momentum: 0.000000
2023-10-13 23:58:01,570 epoch 2 - iter 1584/1984 - loss 0.12274190 - time (sec): 72.29 - samples/sec: 1811.65 - lr: 0.000046 - momentum: 0.000000
2023-10-13 23:58:10,839 epoch 2 - iter 1782/1984 - loss 0.11980283 - time (sec): 81.56 - samples/sec: 1809.58 - lr: 0.000045 - momentum: 0.000000
2023-10-13 23:58:19,789 epoch 2 - iter 1980/1984 - loss 0.11769240 - time (sec): 90.51 - samples/sec: 1808.94 - lr: 0.000044 - momentum: 0.000000
2023-10-13 23:58:19,967 ----------------------------------------------------------------------------------------------------
2023-10-13 23:58:19,967 EPOCH 2 done: loss 0.1178 - lr: 0.000044
2023-10-13 23:58:23,421 DEV : loss 0.1154676303267479 - f1-score (micro avg) 0.7134
2023-10-13 23:58:23,442 saving best model
2023-10-13 23:58:23,980 ----------------------------------------------------------------------------------------------------
2023-10-13 23:58:33,121 epoch 3 - iter 198/1984 - loss 0.08197092 - time (sec): 9.14 - samples/sec: 1751.59 - lr: 0.000044 - momentum: 0.000000
2023-10-13 23:58:41,994 epoch 3 - iter 396/1984 - loss 0.08662040 - time (sec): 18.01 - samples/sec: 1770.73 - lr: 0.000043 - momentum: 0.000000
2023-10-13 23:58:51,072 epoch 3 - iter 594/1984 - loss 0.08983954 - time (sec): 27.09 - samples/sec: 1801.95 - lr: 0.000043 - momentum: 0.000000
2023-10-13 23:59:00,163 epoch 3 - iter 792/1984 - loss 0.08922528 - time (sec): 36.18 - samples/sec: 1823.58 - lr: 0.000042 - momentum: 0.000000
2023-10-13 23:59:09,201 epoch 3 - iter 990/1984 - loss 0.09021301 - time (sec): 45.22 - samples/sec: 1817.72 - lr: 0.000042 - momentum: 0.000000
2023-10-13 23:59:18,138 epoch 3 - iter 1188/1984 - loss 0.09283341 - time (sec): 54.15 - samples/sec: 1810.17 - lr: 0.000041 - momentum: 0.000000
2023-10-13 23:59:27,086 epoch 3 - iter 1386/1984 - loss 0.09059581 - time (sec): 63.10 - samples/sec: 1810.95 - lr: 0.000041 - momentum: 0.000000
2023-10-13 23:59:36,138 epoch 3 - iter 1584/1984 - loss 0.08879431 - time (sec): 72.15 - samples/sec: 1817.41 - lr: 0.000040 - momentum: 0.000000
2023-10-13 23:59:44,813 epoch 3 - iter 1782/1984 - loss 0.08740617 - time (sec): 80.83 - samples/sec: 1824.42 - lr: 0.000039 - momentum: 0.000000
2023-10-13 23:59:53,461 epoch 3 - iter 1980/1984 - loss 0.08760937 - time (sec): 89.48 - samples/sec: 1830.63 - lr: 0.000039 - momentum: 0.000000
2023-10-13 23:59:53,633 ----------------------------------------------------------------------------------------------------
2023-10-13 23:59:53,633 EPOCH 3 done: loss 0.0877 - lr: 0.000039
2023-10-13 23:59:57,488 DEV : loss 0.11892345547676086 - f1-score (micro avg) 0.7438
2023-10-13 23:59:57,509 saving best model
2023-10-13 23:59:58,051 ----------------------------------------------------------------------------------------------------
2023-10-14 00:00:07,100 epoch 4 - iter 198/1984 - loss 0.06258502 - time (sec): 9.05 - samples/sec: 1759.68 - lr: 0.000038 - momentum: 0.000000
2023-10-14 00:00:16,056 epoch 4 - iter 396/1984 - loss 0.06650697 - time (sec): 18.00 - samples/sec: 1818.05 - lr: 0.000038 - momentum: 0.000000
2023-10-14 00:00:24,941 epoch 4 - iter 594/1984 - loss 0.06754876 - time (sec): 26.89 - samples/sec: 1775.71 - lr: 0.000037 - momentum: 0.000000
2023-10-14 00:00:33,992 epoch 4 - iter 792/1984 - loss 0.06932127 - time (sec): 35.94 - samples/sec: 1791.07 - lr: 0.000037 - momentum: 0.000000
2023-10-14 00:00:42,974 epoch 4 - iter 990/1984 - loss 0.06932810 - time (sec): 44.92 - samples/sec: 1802.23 - lr: 0.000036 - momentum: 0.000000
2023-10-14 00:00:52,014 epoch 4 - iter 1188/1984 - loss 0.07083132 - time (sec): 53.96 - samples/sec: 1813.38 - lr: 0.000036 - momentum: 0.000000
2023-10-14 00:01:01,168 epoch 4 - iter 1386/1984 - loss 0.07101762 - time (sec): 63.11 - samples/sec: 1805.21 - lr: 0.000035 - momentum: 0.000000
2023-10-14 00:01:10,193 epoch 4 - iter 1584/1984 - loss 0.07066029 - time (sec): 72.14 - samples/sec: 1799.01 - lr: 0.000034 - momentum: 0.000000
2023-10-14 00:01:19,160 epoch 4 - iter 1782/1984 - loss 0.07056362 - time (sec): 81.11 - samples/sec: 1804.70 - lr: 0.000034 - momentum: 0.000000
2023-10-14 00:01:28,184 epoch 4 - iter 1980/1984 - loss 0.06978300 - time (sec): 90.13 - samples/sec: 1816.23 - lr: 0.000033 - momentum: 0.000000
2023-10-14 00:01:28,365 ----------------------------------------------------------------------------------------------------
2023-10-14 00:01:28,365 EPOCH 4 done: loss 0.0697 - lr: 0.000033
2023-10-14 00:01:31,842 DEV : loss 0.15688975155353546 - f1-score (micro avg) 0.7467
2023-10-14 00:01:31,863 saving best model
2023-10-14 00:01:32,417 ----------------------------------------------------------------------------------------------------
2023-10-14 00:01:41,585 epoch 5 - iter 198/1984 - loss 0.05068501 - time (sec): 9.16 - samples/sec: 1761.62 - lr: 0.000033 - momentum: 0.000000
2023-10-14 00:01:50,548 epoch 5 - iter 396/1984 - loss 0.05451821 - time (sec): 18.13 - samples/sec: 1811.75 - lr: 0.000032 - momentum: 0.000000
2023-10-14 00:01:59,521 epoch 5 - iter 594/1984 - loss 0.05001998 - time (sec): 27.10 - samples/sec: 1843.55 - lr: 0.000032 - momentum: 0.000000
2023-10-14 00:02:08,376 epoch 5 - iter 792/1984 - loss 0.04929968 - time (sec): 35.96 - samples/sec: 1826.24 - lr: 0.000031 - momentum: 0.000000
2023-10-14 00:02:17,340 epoch 5 - iter 990/1984 - loss 0.04965478 - time (sec): 44.92 - samples/sec: 1813.73 - lr: 0.000031 - momentum: 0.000000
2023-10-14 00:02:26,433 epoch 5 - iter 1188/1984 - loss 0.04955068 - time (sec): 54.01 - samples/sec: 1816.53 - lr: 0.000030 - momentum: 0.000000
2023-10-14 00:02:35,499 epoch 5 - iter 1386/1984 - loss 0.04994146 - time (sec): 63.08 - samples/sec: 1821.68 - lr: 0.000029 - momentum: 0.000000
2023-10-14 00:02:44,636 epoch 5 - iter 1584/1984 - loss 0.05157217 - time (sec): 72.21 - samples/sec: 1829.33 - lr: 0.000029 - momentum: 0.000000
2023-10-14 00:02:53,623 epoch 5 - iter 1782/1984 - loss 0.04967315 - time (sec): 81.20 - samples/sec: 1822.14 - lr: 0.000028 - momentum: 0.000000
2023-10-14 00:03:02,598 epoch 5 - iter 1980/1984 - loss 0.04984569 - time (sec): 90.18 - samples/sec: 1813.98 - lr: 0.000028 - momentum: 0.000000
2023-10-14 00:03:02,783 ----------------------------------------------------------------------------------------------------
2023-10-14 00:03:02,783 EPOCH 5 done: loss 0.0501 - lr: 0.000028
2023-10-14 00:03:06,243 DEV : loss 0.18403209745883942 - f1-score (micro avg) 0.7202
2023-10-14 00:03:06,264 ----------------------------------------------------------------------------------------------------
2023-10-14 00:03:15,457 epoch 6 - iter 198/1984 - loss 0.04584627 - time (sec): 9.19 - samples/sec: 1898.94 - lr: 0.000027 - momentum: 0.000000
2023-10-14 00:03:24,449 epoch 6 - iter 396/1984 - loss 0.04186945 - time (sec): 18.18 - samples/sec: 1829.36 - lr: 0.000027 - momentum: 0.000000
2023-10-14 00:03:33,393 epoch 6 - iter 594/1984 - loss 0.03941596 - time (sec): 27.13 - samples/sec: 1800.46 - lr: 0.000026 - momentum: 0.000000
2023-10-14 00:03:42,411 epoch 6 - iter 792/1984 - loss 0.03879688 - time (sec): 36.15 - samples/sec: 1808.18 - lr: 0.000026 - momentum: 0.000000
2023-10-14 00:03:51,359 epoch 6 - iter 990/1984 - loss 0.03828610 - time (sec): 45.09 - samples/sec: 1799.34 - lr: 0.000025 - momentum: 0.000000
2023-10-14 00:04:00,278 epoch 6 - iter 1188/1984 - loss 0.03781164 - time (sec): 54.01 - samples/sec: 1796.84 - lr: 0.000024 - momentum: 0.000000
2023-10-14 00:04:09,404 epoch 6 - iter 1386/1984 - loss 0.03790898 - time (sec): 63.14 - samples/sec: 1807.80 - lr: 0.000024 - momentum: 0.000000
2023-10-14 00:04:18,592 epoch 6 - iter 1584/1984 - loss 0.03778105 - time (sec): 72.33 - samples/sec: 1803.97 - lr: 0.000023 - momentum: 0.000000
2023-10-14 00:04:27,635 epoch 6 - iter 1782/1984 - loss 0.03857277 - time (sec): 81.37 - samples/sec: 1807.22 - lr: 0.000023 - momentum: 0.000000
2023-10-14 00:04:37,083 epoch 6 - iter 1980/1984 - loss 0.03900797 - time (sec): 90.82 - samples/sec: 1802.90 - lr: 0.000022 - momentum: 0.000000
2023-10-14 00:04:37,255 ----------------------------------------------------------------------------------------------------
2023-10-14 00:04:37,255 EPOCH 6 done: loss 0.0390 - lr: 0.000022
2023-10-14 00:04:40,657 DEV : loss 0.18327702581882477 - f1-score (micro avg) 0.7536
2023-10-14 00:04:40,678 saving best model
2023-10-14 00:04:41,211 ----------------------------------------------------------------------------------------------------
2023-10-14 00:04:50,351 epoch 7 - iter 198/1984 - loss 0.02501832 - time (sec): 9.14 - samples/sec: 1831.27 - lr: 0.000022 - momentum: 0.000000
2023-10-14 00:04:59,277 epoch 7 - iter 396/1984 - loss 0.02322471 - time (sec): 18.06 - samples/sec: 1834.84 - lr: 0.000021 - momentum: 0.000000
2023-10-14 00:05:08,302 epoch 7 - iter 594/1984 - loss 0.02153630 - time (sec): 27.09 - samples/sec: 1838.56 - lr: 0.000021 - momentum: 0.000000
2023-10-14 00:05:17,224 epoch 7 - iter 792/1984 - loss 0.02269734 - time (sec): 36.01 - samples/sec: 1807.75 - lr: 0.000020 - momentum: 0.000000
2023-10-14 00:05:26,210 epoch 7 - iter 990/1984 - loss 0.02477186 - time (sec): 44.99 - samples/sec: 1823.24 - lr: 0.000019 - momentum: 0.000000
2023-10-14 00:05:35,217 epoch 7 - iter 1188/1984 - loss 0.02497735 - time (sec): 54.00 - samples/sec: 1816.37 - lr: 0.000019 - momentum: 0.000000
2023-10-14 00:05:44,152 epoch 7 - iter 1386/1984 - loss 0.02400522 - time (sec): 62.94 - samples/sec: 1812.54 - lr: 0.000018 - momentum: 0.000000
2023-10-14 00:05:53,131 epoch 7 - iter 1584/1984 - loss 0.02529192 - time (sec): 71.91 - samples/sec: 1811.38 - lr: 0.000018 - momentum: 0.000000
2023-10-14 00:06:02,211 epoch 7 - iter 1782/1984 - loss 0.02578243 - time (sec): 80.99 - samples/sec: 1813.40 - lr: 0.000017 - momentum: 0.000000
2023-10-14 00:06:11,197 epoch 7 - iter 1980/1984 - loss 0.02724683 - time (sec): 89.98 - samples/sec: 1819.80 - lr: 0.000017 - momentum: 0.000000
2023-10-14 00:06:11,373 ----------------------------------------------------------------------------------------------------
2023-10-14 00:06:11,373 EPOCH 7 done: loss 0.0272 - lr: 0.000017
2023-10-14 00:06:14,886 DEV : loss 0.19809222221374512 - f1-score (micro avg) 0.7694
2023-10-14 00:06:14,908 saving best model
2023-10-14 00:06:15,452 ----------------------------------------------------------------------------------------------------
2023-10-14 00:06:24,684 epoch 8 - iter 198/1984 - loss 0.02503918 - time (sec): 9.23 - samples/sec: 1846.98 - lr: 0.000016 - momentum: 0.000000
2023-10-14 00:06:33,673 epoch 8 - iter 396/1984 - loss 0.02450807 - time (sec): 18.22 - samples/sec: 1830.88 - lr: 0.000016 - momentum: 0.000000
2023-10-14 00:06:42,985 epoch 8 - iter 594/1984 - loss 0.02228166 - time (sec): 27.53 - samples/sec: 1833.48 - lr: 0.000015 - momentum: 0.000000
2023-10-14 00:06:51,897 epoch 8 - iter 792/1984 - loss 0.02090114 - time (sec): 36.44 - samples/sec: 1831.96 - lr: 0.000014 - momentum: 0.000000
2023-10-14 00:07:00,918 epoch 8 - iter 990/1984 - loss 0.02167482 - time (sec): 45.46 - samples/sec: 1805.64 - lr: 0.000014 - momentum: 0.000000
2023-10-14 00:07:10,067 epoch 8 - iter 1188/1984 - loss 0.02111459 - time (sec): 54.61 - samples/sec: 1802.67 - lr: 0.000013 - momentum: 0.000000
2023-10-14 00:07:19,289 epoch 8 - iter 1386/1984 - loss 0.02001664 - time (sec): 63.83 - samples/sec: 1799.08 - lr: 0.000013 - momentum: 0.000000
2023-10-14 00:07:28,458 epoch 8 - iter 1584/1984 - loss 0.01937469 - time (sec): 73.00 - samples/sec: 1800.62 - lr: 0.000012 - momentum: 0.000000
2023-10-14 00:07:37,598 epoch 8 - iter 1782/1984 - loss 0.01890756 - time (sec): 82.14 - samples/sec: 1803.09 - lr: 0.000012 - momentum: 0.000000
2023-10-14 00:07:46,695 epoch 8 - iter 1980/1984 - loss 0.01916413 - time (sec): 91.24 - samples/sec: 1793.56 - lr: 0.000011 - momentum: 0.000000
2023-10-14 00:07:46,880 ----------------------------------------------------------------------------------------------------
2023-10-14 00:07:46,880 EPOCH 8 done: loss 0.0191 - lr: 0.000011
2023-10-14 00:07:50,764 DEV : loss 0.20701323449611664 - f1-score (micro avg) 0.7508
2023-10-14 00:07:50,785 ----------------------------------------------------------------------------------------------------
2023-10-14 00:07:59,690 epoch 9 - iter 198/1984 - loss 0.01683448 - time (sec): 8.90 - samples/sec: 1786.85 - lr: 0.000011 - momentum: 0.000000
2023-10-14 00:08:08,744 epoch 9 - iter 396/1984 - loss 0.01424361 - time (sec): 17.96 - samples/sec: 1797.04 - lr: 0.000010 - momentum: 0.000000
2023-10-14 00:08:17,764 epoch 9 - iter 594/1984 - loss 0.01329647 - time (sec): 26.98 - samples/sec: 1831.07 - lr: 0.000009 - momentum: 0.000000
2023-10-14 00:08:26,715 epoch 9 - iter 792/1984 - loss 0.01361289 - time (sec): 35.93 - samples/sec: 1835.51 - lr: 0.000009 - momentum: 0.000000
2023-10-14 00:08:35,693 epoch 9 - iter 990/1984 - loss 0.01248769 - time (sec): 44.91 - samples/sec: 1828.13 - lr: 0.000008 - momentum: 0.000000
2023-10-14 00:08:44,665 epoch 9 - iter 1188/1984 - loss 0.01303478 - time (sec): 53.88 - samples/sec: 1828.78 - lr: 0.000008 - momentum: 0.000000
2023-10-14 00:08:53,901 epoch 9 - iter 1386/1984 - loss 0.01248985 - time (sec): 63.12 - samples/sec: 1814.35 - lr: 0.000007 - momentum: 0.000000
2023-10-14 00:09:02,963 epoch 9 - iter 1584/1984 - loss 0.01249750 - time (sec): 72.18 - samples/sec: 1813.72 - lr: 0.000007 - momentum: 0.000000
2023-10-14 00:09:12,129 epoch 9 - iter 1782/1984 - loss 0.01229768 - time (sec): 81.34 - samples/sec: 1806.58 - lr: 0.000006 - momentum: 0.000000
2023-10-14 00:09:21,106 epoch 9 - iter 1980/1984 - loss 0.01212657 - time (sec): 90.32 - samples/sec: 1811.73 - lr: 0.000006 - momentum: 0.000000
2023-10-14 00:09:21,297 ----------------------------------------------------------------------------------------------------
2023-10-14 00:09:21,297 EPOCH 9 done: loss 0.0122 - lr: 0.000006
2023-10-14 00:09:24,760 DEV : loss 0.24803374707698822 - f1-score (micro avg) 0.7598
2023-10-14 00:09:24,781 ----------------------------------------------------------------------------------------------------
2023-10-14 00:09:34,131 epoch 10 - iter 198/1984 - loss 0.01047072 - time (sec): 9.35 - samples/sec: 1835.03 - lr: 0.000005 - momentum: 0.000000
2023-10-14 00:09:43,100 epoch 10 - iter 396/1984 - loss 0.00852613 - time (sec): 18.32 - samples/sec: 1812.58 - lr: 0.000004 - momentum: 0.000000
2023-10-14 00:09:52,160 epoch 10 - iter 594/1984 - loss 0.00813734 - time (sec): 27.38 - samples/sec: 1818.87 - lr: 0.000004 - momentum: 0.000000
2023-10-14 00:10:01,309 epoch 10 - iter 792/1984 - loss 0.00748039 - time (sec): 36.53 - samples/sec: 1827.01 - lr: 0.000003 - momentum: 0.000000
2023-10-14 00:10:10,508 epoch 10 - iter 990/1984 - loss 0.00764780 - time (sec): 45.73 - samples/sec: 1837.44 - lr: 0.000003 - momentum: 0.000000
2023-10-14 00:10:19,392 epoch 10 - iter 1188/1984 - loss 0.00762812 - time (sec): 54.61 - samples/sec: 1832.58 - lr: 0.000002 - momentum: 0.000000
2023-10-14 00:10:28,228 epoch 10 - iter 1386/1984 - loss 0.00802417 - time (sec): 63.44 - samples/sec: 1820.05 - lr: 0.000002 - momentum: 0.000000
2023-10-14 00:10:37,140 epoch 10 - iter 1584/1984 - loss 0.00806189 - time (sec): 72.36 - samples/sec: 1809.06 - lr: 0.000001 - momentum: 0.000000
2023-10-14 00:10:46,228 epoch 10 - iter 1782/1984 - loss 0.00772889 - time (sec): 81.44 - samples/sec: 1803.08 - lr: 0.000001 - momentum: 0.000000
2023-10-14 00:10:55,215 epoch 10 - iter 1980/1984 - loss 0.00739086 - time (sec): 90.43 - samples/sec: 1810.01 - lr: 0.000000 - momentum: 0.000000
2023-10-14 00:10:55,397 ----------------------------------------------------------------------------------------------------
2023-10-14 00:10:55,397 EPOCH 10 done: loss 0.0074 - lr: 0.000000
2023-10-14 00:10:59,305 DEV : loss 0.24994361400604248 - f1-score (micro avg) 0.7672
2023-10-14 00:10:59,753 ----------------------------------------------------------------------------------------------------
2023-10-14 00:10:59,754 Loading model from best epoch ...
2023-10-14 00:11:01,093 SequenceTagger predicts: Dictionary with 13 tags: O, S-PER, B-PER, E-PER, I-PER, S-LOC, B-LOC, E-LOC, I-LOC, S-ORG, B-ORG, E-ORG, I-ORG
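The 13-tag dictionary above follows the BIOES scheme: S- marks a single-token entity, B-/I-/E- mark the beginning, inside, and end of a multi-token entity, and O is outside any entity. A minimal sketch of how such a tag sequence decodes into entity spans (illustrative helper, not Flair's internal decoder):

```python
def bioes_to_spans(tags):
    """Decode a BIOES tag sequence into (label, start, end_exclusive) spans."""
    spans, start, label = [], None, None
    for i, tag in enumerate(tags):
        if tag == "O":
            start = None
            continue
        prefix, lab = tag.split("-", 1)
        if prefix == "S":                 # single-token entity
            spans.append((lab, i, i + 1))
            start = None
        elif prefix == "B":               # entity begins
            start, label = i, lab
        elif prefix == "E" and start is not None and lab == label:
            spans.append((lab, start, i + 1))  # entity ends
            start = None
        # I- tags simply continue the current span
    return spans

print(bioes_to_spans(["S-LOC", "O", "B-PER", "I-PER", "E-PER"]))
# [('LOC', 0, 1), ('PER', 2, 5)]
```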
2023-10-14 00:11:04,428
Results:
- F-score (micro) 0.7713
- F-score (macro) 0.6774
- Accuracy 0.6506

By class:
              precision    recall  f1-score   support

         LOC     0.8082    0.8427    0.8251       655
         PER     0.6825    0.8386    0.7525       223
         ORG     0.6338    0.3543    0.4545       127

   micro avg     0.7626    0.7801    0.7713      1005
   macro avg     0.7082    0.6785    0.6774      1005
weighted avg     0.7583    0.7801    0.7622      1005
2023-10-14 00:11:04,428 ----------------------------------------------------------------------------------------------------
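The aggregate rows follow directly from the per-class rows: micro avg f1 is the harmonic mean of the pooled precision and recall, macro avg f1 is the unweighted mean of the per-class f1 scores, and weighted avg f1 weights each class by its support. A quick sanity check in plain Python, with the numbers copied from the table above:

```python
# (precision, recall, f1, support) per class, copied from the log
per_class = {
    "LOC": (0.8082, 0.8427, 0.8251, 655),
    "PER": (0.6825, 0.8386, 0.7525, 223),
    "ORG": (0.6338, 0.3543, 0.4545, 127),
}

# macro avg f1: unweighted mean over classes
macro_f1 = sum(f1 for _, _, f1, _ in per_class.values()) / len(per_class)
print(round(macro_f1, 4))     # 0.6774

# weighted avg f1: mean weighted by class support
total = sum(s for *_, s in per_class.values())
weighted_f1 = sum(f1 * s for _, _, f1, s in per_class.values()) / total
print(round(weighted_f1, 4))  # 0.7622

# micro avg f1: harmonic mean of pooled precision and recall
p, r = 0.7626, 0.7801
print(round(2 * p * r / (p + r), 4))  # 0.7713
```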