2023-10-16 23:37:15,287 ----------------------------------------------------------------------------------------------------
2023-10-16 23:37:15,288 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-16 23:37:15,288 ----------------------------------------------------------------------------------------------------
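The module shapes in the printout above can be sanity-checked by summing parameters per submodule. A minimal sketch in plain Python, derived only from the dimensions shown (32001 vocab, 768 hidden, 12 layers, 3072 intermediate); the ~110.6M total is consistent with a BERT-base encoder at this vocabulary size:

```python
# Parameter count reconstructed from the module shapes printed above.
V, H, P, T, I, L = 32001, 768, 512, 2, 3072, 12  # vocab, hidden, positions, token types, intermediate, layers

embeddings = (V + P + T) * H + 2 * H  # word/position/token-type tables + LayerNorm (weight + bias)
per_layer = (
    3 * (H * H + H)         # query/key/value projections
    + (H * H + H) + 2 * H   # attention output dense + LayerNorm
    + (H * I + I)           # intermediate dense
    + (I * H + H) + 2 * H   # output dense + LayerNorm
)
pooler = H * H + H
head = H * 13 + 13          # linear tagger head over the 13 tags

total = embeddings + L * per_layer + pooler
print(total, head)  # ~110.6M encoder parameters plus a tiny classification head
```

Dropout and activation modules contribute no parameters, which is why only the Embedding, Linear, and LayerNorm entries matter here.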
2023-10-16 23:37:15,288 MultiCorpus: 6183 train + 680 dev + 2113 test sentences
- NER_HIPE_2022 Corpus: 6183 train + 680 dev + 2113 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/topres19th/en/with_doc_seperator
2023-10-16 23:37:15,288 ----------------------------------------------------------------------------------------------------
2023-10-16 23:37:15,288 Train: 6183 sentences
2023-10-16 23:37:15,288 (train_with_dev=False, train_with_test=False)
2023-10-16 23:37:15,288 ----------------------------------------------------------------------------------------------------
2023-10-16 23:37:15,288 Training Params:
2023-10-16 23:37:15,288 - learning_rate: "3e-05"
2023-10-16 23:37:15,288 - mini_batch_size: "4"
2023-10-16 23:37:15,288 - max_epochs: "10"
2023-10-16 23:37:15,288 - shuffle: "True"
2023-10-16 23:37:15,288 ----------------------------------------------------------------------------------------------------
2023-10-16 23:37:15,288 Plugins:
2023-10-16 23:37:15,289 - LinearScheduler | warmup_fraction: '0.1'
2023-10-16 23:37:15,289 ----------------------------------------------------------------------------------------------------
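The LinearScheduler ramps the learning rate up linearly over the first 10% of all training steps and then decays it linearly to zero, which is exactly the pattern visible in the lr column of the progress lines: the peak of 3e-05 is reached at the end of epoch 1 (10% of 10 epochs), after which it falls steadily. A sketch of that schedule, assuming 1546 steps per epoch as logged (the plugin's exact rounding may differ slightly):

```python
def linear_schedule_lr(step, peak_lr=3e-5, steps_per_epoch=1546, max_epochs=10, warmup_fraction=0.1):
    """Linear warmup to peak_lr over the first warmup_fraction of steps, then linear decay to zero."""
    total = steps_per_epoch * max_epochs
    warmup = int(total * warmup_fraction)
    if step < warmup:
        return peak_lr * step / warmup
    return peak_lr * (total - step) / (total - warmup)

# Values match the log: lr ~3e-06 at epoch 1 iter 154, ~3e-05 at iter 1540,
# and ~2.7e-05 near the end of epoch 2.
print(linear_schedule_lr(154), linear_schedule_lr(1540), linear_schedule_lr(1546 + 1540))
```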
2023-10-16 23:37:15,289 Final evaluation on model from best epoch (best-model.pt)
2023-10-16 23:37:15,289 - metric: "('micro avg', 'f1-score')"
2023-10-16 23:37:15,289 ----------------------------------------------------------------------------------------------------
2023-10-16 23:37:15,289 Computation:
2023-10-16 23:37:15,289 - compute on device: cuda:0
2023-10-16 23:37:15,289 - embedding storage: none
2023-10-16 23:37:15,289 ----------------------------------------------------------------------------------------------------
2023-10-16 23:37:15,289 Model training base path: "hmbench-topres19th/en-dbmdz/bert-base-historic-multilingual-cased-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-5"
2023-10-16 23:37:15,289 ----------------------------------------------------------------------------------------------------
2023-10-16 23:37:15,289 ----------------------------------------------------------------------------------------------------
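For reference, a run with the configuration logged above can be set up with Flair's fine-tuning API roughly as follows. This is a sketch, not the exact script behind this log: the dataset constructor arguments and some defaults are assumptions, and running it requires downloading the corpus and model and a CUDA device.

```python
from flair.datasets import NER_HIPE_2022
from flair.embeddings import TransformerWordEmbeddings
from flair.models import SequenceTagger
from flair.trainers import ModelTrainer

# TopRes19th (English) subset of HIPE-2022, as in the corpus line above.
corpus = NER_HIPE_2022(dataset_name="topres19th", language="en")
label_dictionary = corpus.make_label_dictionary(label_type="ner")

# First-subtoken pooling over the last layer of hmBERT, fine-tuned end to end
# (poolingfirst, layers-1 in the base path above).
embeddings = TransformerWordEmbeddings(
    model="dbmdz/bert-base-historic-multilingual-cased",
    layers="-1",
    subtoken_pooling="first",
    fine_tune=True,
)

tagger = SequenceTagger(
    embeddings=embeddings,
    tag_dictionary=label_dictionary,
    tag_type="ner",
    use_crf=False,  # crfFalse in the base path above
    use_rnn=False,
    reproject_embeddings=False,
)

# fine_tune() applies a linear schedule with warmup_fraction=0.1 by default,
# matching the LinearScheduler plugin logged above.
ModelTrainer(tagger, corpus).fine_tune(
    "hmbench-topres19th/en-dbmdz/bert-base-historic-multilingual-cased-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-5",
    learning_rate=3e-5,
    mini_batch_size=4,
    max_epochs=10,
)
```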
2023-10-16 23:37:22,073 epoch 1 - iter 154/1546 - loss 1.92565161 - time (sec): 6.78 - samples/sec: 1898.22 - lr: 0.000003 - momentum: 0.000000
2023-10-16 23:37:28,810 epoch 1 - iter 308/1546 - loss 1.09983562 - time (sec): 13.52 - samples/sec: 1886.82 - lr: 0.000006 - momentum: 0.000000
2023-10-16 23:37:35,629 epoch 1 - iter 462/1546 - loss 0.79988559 - time (sec): 20.34 - samples/sec: 1855.62 - lr: 0.000009 - momentum: 0.000000
2023-10-16 23:37:42,402 epoch 1 - iter 616/1546 - loss 0.64149237 - time (sec): 27.11 - samples/sec: 1841.30 - lr: 0.000012 - momentum: 0.000000
2023-10-16 23:37:49,223 epoch 1 - iter 770/1546 - loss 0.53678387 - time (sec): 33.93 - samples/sec: 1842.84 - lr: 0.000015 - momentum: 0.000000
2023-10-16 23:37:56,159 epoch 1 - iter 924/1546 - loss 0.47121079 - time (sec): 40.87 - samples/sec: 1817.10 - lr: 0.000018 - momentum: 0.000000
2023-10-16 23:38:02,949 epoch 1 - iter 1078/1546 - loss 0.42227833 - time (sec): 47.66 - samples/sec: 1810.49 - lr: 0.000021 - momentum: 0.000000
2023-10-16 23:38:09,776 epoch 1 - iter 1232/1546 - loss 0.38574946 - time (sec): 54.49 - samples/sec: 1807.68 - lr: 0.000024 - momentum: 0.000000
2023-10-16 23:38:16,772 epoch 1 - iter 1386/1546 - loss 0.35423814 - time (sec): 61.48 - samples/sec: 1809.07 - lr: 0.000027 - momentum: 0.000000
2023-10-16 23:38:23,585 epoch 1 - iter 1540/1546 - loss 0.32780022 - time (sec): 68.30 - samples/sec: 1814.47 - lr: 0.000030 - momentum: 0.000000
2023-10-16 23:38:23,847 ----------------------------------------------------------------------------------------------------
2023-10-16 23:38:23,847 EPOCH 1 done: loss 0.3270 - lr: 0.000030
2023-10-16 23:38:25,866 DEV : loss 0.06755758821964264 - f1-score (micro avg) 0.7102
2023-10-16 23:38:25,894 saving best model
2023-10-16 23:38:26,224 ----------------------------------------------------------------------------------------------------
2023-10-16 23:38:33,203 epoch 2 - iter 154/1546 - loss 0.09083964 - time (sec): 6.98 - samples/sec: 1894.92 - lr: 0.000030 - momentum: 0.000000
2023-10-16 23:38:40,075 epoch 2 - iter 308/1546 - loss 0.08611689 - time (sec): 13.85 - samples/sec: 1864.36 - lr: 0.000029 - momentum: 0.000000
2023-10-16 23:38:46,893 epoch 2 - iter 462/1546 - loss 0.08435768 - time (sec): 20.67 - samples/sec: 1843.68 - lr: 0.000029 - momentum: 0.000000
2023-10-16 23:38:53,775 epoch 2 - iter 616/1546 - loss 0.08829396 - time (sec): 27.55 - samples/sec: 1815.83 - lr: 0.000029 - momentum: 0.000000
2023-10-16 23:39:00,586 epoch 2 - iter 770/1546 - loss 0.08874921 - time (sec): 34.36 - samples/sec: 1790.29 - lr: 0.000028 - momentum: 0.000000
2023-10-16 23:39:07,326 epoch 2 - iter 924/1546 - loss 0.08870459 - time (sec): 41.10 - samples/sec: 1801.42 - lr: 0.000028 - momentum: 0.000000
2023-10-16 23:39:14,168 epoch 2 - iter 1078/1546 - loss 0.08797980 - time (sec): 47.94 - samples/sec: 1809.00 - lr: 0.000028 - momentum: 0.000000
2023-10-16 23:39:21,230 epoch 2 - iter 1232/1546 - loss 0.08446684 - time (sec): 55.00 - samples/sec: 1812.34 - lr: 0.000027 - momentum: 0.000000
2023-10-16 23:39:28,023 epoch 2 - iter 1386/1546 - loss 0.08329303 - time (sec): 61.80 - samples/sec: 1803.61 - lr: 0.000027 - momentum: 0.000000
2023-10-16 23:39:34,880 epoch 2 - iter 1540/1546 - loss 0.08339448 - time (sec): 68.65 - samples/sec: 1805.87 - lr: 0.000027 - momentum: 0.000000
2023-10-16 23:39:35,138 ----------------------------------------------------------------------------------------------------
2023-10-16 23:39:35,138 EPOCH 2 done: loss 0.0833 - lr: 0.000027
2023-10-16 23:39:37,244 DEV : loss 0.06139129400253296 - f1-score (micro avg) 0.7623
2023-10-16 23:39:37,257 saving best model
2023-10-16 23:39:37,686 ----------------------------------------------------------------------------------------------------
2023-10-16 23:39:44,558 epoch 3 - iter 154/1546 - loss 0.04155378 - time (sec): 6.87 - samples/sec: 1859.19 - lr: 0.000026 - momentum: 0.000000
2023-10-16 23:39:51,450 epoch 3 - iter 308/1546 - loss 0.05960039 - time (sec): 13.76 - samples/sec: 1877.23 - lr: 0.000026 - momentum: 0.000000
2023-10-16 23:39:58,368 epoch 3 - iter 462/1546 - loss 0.05708094 - time (sec): 20.68 - samples/sec: 1894.00 - lr: 0.000026 - momentum: 0.000000
2023-10-16 23:40:05,183 epoch 3 - iter 616/1546 - loss 0.05432285 - time (sec): 27.50 - samples/sec: 1847.87 - lr: 0.000025 - momentum: 0.000000
2023-10-16 23:40:12,029 epoch 3 - iter 770/1546 - loss 0.05492115 - time (sec): 34.34 - samples/sec: 1830.11 - lr: 0.000025 - momentum: 0.000000
2023-10-16 23:40:18,815 epoch 3 - iter 924/1546 - loss 0.05565481 - time (sec): 41.13 - samples/sec: 1815.96 - lr: 0.000025 - momentum: 0.000000
2023-10-16 23:40:25,763 epoch 3 - iter 1078/1546 - loss 0.05690822 - time (sec): 48.08 - samples/sec: 1824.49 - lr: 0.000024 - momentum: 0.000000
2023-10-16 23:40:32,745 epoch 3 - iter 1232/1546 - loss 0.05538883 - time (sec): 55.06 - samples/sec: 1813.92 - lr: 0.000024 - momentum: 0.000000
2023-10-16 23:40:39,610 epoch 3 - iter 1386/1546 - loss 0.05683208 - time (sec): 61.92 - samples/sec: 1799.53 - lr: 0.000024 - momentum: 0.000000
2023-10-16 23:40:46,465 epoch 3 - iter 1540/1546 - loss 0.05646703 - time (sec): 68.78 - samples/sec: 1801.06 - lr: 0.000023 - momentum: 0.000000
2023-10-16 23:40:46,726 ----------------------------------------------------------------------------------------------------
2023-10-16 23:40:46,727 EPOCH 3 done: loss 0.0563 - lr: 0.000023
2023-10-16 23:40:49,096 DEV : loss 0.08413656055927277 - f1-score (micro avg) 0.7407
2023-10-16 23:40:49,109 ----------------------------------------------------------------------------------------------------
2023-10-16 23:40:56,043 epoch 4 - iter 154/1546 - loss 0.03911177 - time (sec): 6.93 - samples/sec: 1680.71 - lr: 0.000023 - momentum: 0.000000
2023-10-16 23:41:03,048 epoch 4 - iter 308/1546 - loss 0.03409785 - time (sec): 13.94 - samples/sec: 1698.01 - lr: 0.000023 - momentum: 0.000000
2023-10-16 23:41:09,927 epoch 4 - iter 462/1546 - loss 0.03599079 - time (sec): 20.82 - samples/sec: 1745.38 - lr: 0.000022 - momentum: 0.000000
2023-10-16 23:41:16,679 epoch 4 - iter 616/1546 - loss 0.03396381 - time (sec): 27.57 - samples/sec: 1762.13 - lr: 0.000022 - momentum: 0.000000
2023-10-16 23:41:23,254 epoch 4 - iter 770/1546 - loss 0.03546867 - time (sec): 34.14 - samples/sec: 1781.08 - lr: 0.000022 - momentum: 0.000000
2023-10-16 23:41:30,011 epoch 4 - iter 924/1546 - loss 0.03515207 - time (sec): 40.90 - samples/sec: 1780.02 - lr: 0.000021 - momentum: 0.000000
2023-10-16 23:41:36,908 epoch 4 - iter 1078/1546 - loss 0.03640270 - time (sec): 47.80 - samples/sec: 1787.74 - lr: 0.000021 - momentum: 0.000000
2023-10-16 23:41:43,799 epoch 4 - iter 1232/1546 - loss 0.03764293 - time (sec): 54.69 - samples/sec: 1786.11 - lr: 0.000021 - momentum: 0.000000
2023-10-16 23:41:50,658 epoch 4 - iter 1386/1546 - loss 0.03780300 - time (sec): 61.55 - samples/sec: 1794.32 - lr: 0.000020 - momentum: 0.000000
2023-10-16 23:41:57,608 epoch 4 - iter 1540/1546 - loss 0.03770330 - time (sec): 68.50 - samples/sec: 1805.56 - lr: 0.000020 - momentum: 0.000000
2023-10-16 23:41:57,872 ----------------------------------------------------------------------------------------------------
2023-10-16 23:41:57,872 EPOCH 4 done: loss 0.0376 - lr: 0.000020
2023-10-16 23:41:59,933 DEV : loss 0.08385952562093735 - f1-score (micro avg) 0.7728
2023-10-16 23:41:59,945 saving best model
2023-10-16 23:42:00,364 ----------------------------------------------------------------------------------------------------
2023-10-16 23:42:06,961 epoch 5 - iter 154/1546 - loss 0.01723144 - time (sec): 6.59 - samples/sec: 1877.96 - lr: 0.000020 - momentum: 0.000000
2023-10-16 23:42:13,855 epoch 5 - iter 308/1546 - loss 0.02130077 - time (sec): 13.49 - samples/sec: 1808.30 - lr: 0.000019 - momentum: 0.000000
2023-10-16 23:42:20,643 epoch 5 - iter 462/1546 - loss 0.02384876 - time (sec): 20.28 - samples/sec: 1816.13 - lr: 0.000019 - momentum: 0.000000
2023-10-16 23:42:27,408 epoch 5 - iter 616/1546 - loss 0.02637631 - time (sec): 27.04 - samples/sec: 1818.43 - lr: 0.000019 - momentum: 0.000000
2023-10-16 23:42:34,374 epoch 5 - iter 770/1546 - loss 0.02693432 - time (sec): 34.01 - samples/sec: 1824.58 - lr: 0.000018 - momentum: 0.000000
2023-10-16 23:42:41,338 epoch 5 - iter 924/1546 - loss 0.02773812 - time (sec): 40.97 - samples/sec: 1806.78 - lr: 0.000018 - momentum: 0.000000
2023-10-16 23:42:48,253 epoch 5 - iter 1078/1546 - loss 0.02756016 - time (sec): 47.89 - samples/sec: 1828.54 - lr: 0.000018 - momentum: 0.000000
2023-10-16 23:42:55,133 epoch 5 - iter 1232/1546 - loss 0.02690068 - time (sec): 54.77 - samples/sec: 1815.36 - lr: 0.000017 - momentum: 0.000000
2023-10-16 23:43:01,995 epoch 5 - iter 1386/1546 - loss 0.02760696 - time (sec): 61.63 - samples/sec: 1813.32 - lr: 0.000017 - momentum: 0.000000
2023-10-16 23:43:08,871 epoch 5 - iter 1540/1546 - loss 0.02778396 - time (sec): 68.50 - samples/sec: 1809.87 - lr: 0.000017 - momentum: 0.000000
2023-10-16 23:43:09,124 ----------------------------------------------------------------------------------------------------
2023-10-16 23:43:09,125 EPOCH 5 done: loss 0.0278 - lr: 0.000017
2023-10-16 23:43:11,166 DEV : loss 0.10244771093130112 - f1-score (micro avg) 0.7896
2023-10-16 23:43:11,178 saving best model
2023-10-16 23:43:11,591 ----------------------------------------------------------------------------------------------------
2023-10-16 23:43:18,310 epoch 6 - iter 154/1546 - loss 0.01205148 - time (sec): 6.72 - samples/sec: 1874.61 - lr: 0.000016 - momentum: 0.000000
2023-10-16 23:43:25,177 epoch 6 - iter 308/1546 - loss 0.01547760 - time (sec): 13.58 - samples/sec: 1868.15 - lr: 0.000016 - momentum: 0.000000
2023-10-16 23:43:32,027 epoch 6 - iter 462/1546 - loss 0.02030563 - time (sec): 20.44 - samples/sec: 1818.03 - lr: 0.000016 - momentum: 0.000000
2023-10-16 23:43:38,895 epoch 6 - iter 616/1546 - loss 0.02101298 - time (sec): 27.30 - samples/sec: 1818.55 - lr: 0.000015 - momentum: 0.000000
2023-10-16 23:43:45,757 epoch 6 - iter 770/1546 - loss 0.02068268 - time (sec): 34.16 - samples/sec: 1826.89 - lr: 0.000015 - momentum: 0.000000
2023-10-16 23:43:52,699 epoch 6 - iter 924/1546 - loss 0.02111511 - time (sec): 41.11 - samples/sec: 1806.61 - lr: 0.000015 - momentum: 0.000000
2023-10-16 23:43:59,574 epoch 6 - iter 1078/1546 - loss 0.01946876 - time (sec): 47.98 - samples/sec: 1811.25 - lr: 0.000014 - momentum: 0.000000
2023-10-16 23:44:06,428 epoch 6 - iter 1232/1546 - loss 0.01960139 - time (sec): 54.84 - samples/sec: 1782.03 - lr: 0.000014 - momentum: 0.000000
2023-10-16 23:44:13,375 epoch 6 - iter 1386/1546 - loss 0.01983819 - time (sec): 61.78 - samples/sec: 1791.03 - lr: 0.000014 - momentum: 0.000000
2023-10-16 23:44:20,351 epoch 6 - iter 1540/1546 - loss 0.01996004 - time (sec): 68.76 - samples/sec: 1802.90 - lr: 0.000013 - momentum: 0.000000
2023-10-16 23:44:20,625 ----------------------------------------------------------------------------------------------------
2023-10-16 23:44:20,625 EPOCH 6 done: loss 0.0200 - lr: 0.000013
2023-10-16 23:44:22,734 DEV : loss 0.10681257396936417 - f1-score (micro avg) 0.7824
2023-10-16 23:44:22,747 ----------------------------------------------------------------------------------------------------
2023-10-16 23:44:29,520 epoch 7 - iter 154/1546 - loss 0.01882747 - time (sec): 6.77 - samples/sec: 1707.23 - lr: 0.000013 - momentum: 0.000000
2023-10-16 23:44:36,364 epoch 7 - iter 308/1546 - loss 0.01786356 - time (sec): 13.62 - samples/sec: 1706.79 - lr: 0.000013 - momentum: 0.000000
2023-10-16 23:44:43,172 epoch 7 - iter 462/1546 - loss 0.01399142 - time (sec): 20.42 - samples/sec: 1720.07 - lr: 0.000012 - momentum: 0.000000
2023-10-16 23:44:50,128 epoch 7 - iter 616/1546 - loss 0.01287695 - time (sec): 27.38 - samples/sec: 1754.57 - lr: 0.000012 - momentum: 0.000000
2023-10-16 23:44:57,096 epoch 7 - iter 770/1546 - loss 0.01298667 - time (sec): 34.35 - samples/sec: 1771.84 - lr: 0.000012 - momentum: 0.000000
2023-10-16 23:45:04,180 epoch 7 - iter 924/1546 - loss 0.01203834 - time (sec): 41.43 - samples/sec: 1779.29 - lr: 0.000011 - momentum: 0.000000
2023-10-16 23:45:11,016 epoch 7 - iter 1078/1546 - loss 0.01282603 - time (sec): 48.27 - samples/sec: 1800.52 - lr: 0.000011 - momentum: 0.000000
2023-10-16 23:45:17,812 epoch 7 - iter 1232/1546 - loss 0.01248353 - time (sec): 55.06 - samples/sec: 1805.09 - lr: 0.000011 - momentum: 0.000000
2023-10-16 23:45:24,596 epoch 7 - iter 1386/1546 - loss 0.01265255 - time (sec): 61.85 - samples/sec: 1798.93 - lr: 0.000010 - momentum: 0.000000
2023-10-16 23:45:31,426 epoch 7 - iter 1540/1546 - loss 0.01287871 - time (sec): 68.68 - samples/sec: 1803.61 - lr: 0.000010 - momentum: 0.000000
2023-10-16 23:45:31,687 ----------------------------------------------------------------------------------------------------
2023-10-16 23:45:31,687 EPOCH 7 done: loss 0.0128 - lr: 0.000010
2023-10-16 23:45:33,866 DEV : loss 0.10830121487379074 - f1-score (micro avg) 0.8008
2023-10-16 23:45:33,880 saving best model
2023-10-16 23:45:34,323 ----------------------------------------------------------------------------------------------------
2023-10-16 23:45:41,641 epoch 8 - iter 154/1546 - loss 0.01179072 - time (sec): 7.32 - samples/sec: 1690.40 - lr: 0.000010 - momentum: 0.000000
2023-10-16 23:45:48,999 epoch 8 - iter 308/1546 - loss 0.01016285 - time (sec): 14.67 - samples/sec: 1759.76 - lr: 0.000009 - momentum: 0.000000
2023-10-16 23:45:55,984 epoch 8 - iter 462/1546 - loss 0.01018965 - time (sec): 21.66 - samples/sec: 1748.65 - lr: 0.000009 - momentum: 0.000000
2023-10-16 23:46:02,875 epoch 8 - iter 616/1546 - loss 0.00940054 - time (sec): 28.55 - samples/sec: 1776.30 - lr: 0.000009 - momentum: 0.000000
2023-10-16 23:46:09,813 epoch 8 - iter 770/1546 - loss 0.00901390 - time (sec): 35.49 - samples/sec: 1789.22 - lr: 0.000008 - momentum: 0.000000
2023-10-16 23:46:17,225 epoch 8 - iter 924/1546 - loss 0.00873861 - time (sec): 42.90 - samples/sec: 1782.49 - lr: 0.000008 - momentum: 0.000000
2023-10-16 23:46:24,038 epoch 8 - iter 1078/1546 - loss 0.00860591 - time (sec): 49.71 - samples/sec: 1770.95 - lr: 0.000008 - momentum: 0.000000
2023-10-16 23:46:30,861 epoch 8 - iter 1232/1546 - loss 0.00864071 - time (sec): 56.54 - samples/sec: 1759.08 - lr: 0.000007 - momentum: 0.000000
2023-10-16 23:46:37,745 epoch 8 - iter 1386/1546 - loss 0.00860292 - time (sec): 63.42 - samples/sec: 1768.59 - lr: 0.000007 - momentum: 0.000000
2023-10-16 23:46:44,604 epoch 8 - iter 1540/1546 - loss 0.00880895 - time (sec): 70.28 - samples/sec: 1762.21 - lr: 0.000007 - momentum: 0.000000
2023-10-16 23:46:44,874 ----------------------------------------------------------------------------------------------------
2023-10-16 23:46:44,875 EPOCH 8 done: loss 0.0088 - lr: 0.000007
2023-10-16 23:46:46,985 DEV : loss 0.11089599132537842 - f1-score (micro avg) 0.7927
2023-10-16 23:46:46,998 ----------------------------------------------------------------------------------------------------
2023-10-16 23:46:53,836 epoch 9 - iter 154/1546 - loss 0.01019047 - time (sec): 6.84 - samples/sec: 1789.49 - lr: 0.000006 - momentum: 0.000000
2023-10-16 23:47:00,712 epoch 9 - iter 308/1546 - loss 0.00780744 - time (sec): 13.71 - samples/sec: 1843.01 - lr: 0.000006 - momentum: 0.000000
2023-10-16 23:47:07,681 epoch 9 - iter 462/1546 - loss 0.00633308 - time (sec): 20.68 - samples/sec: 1856.35 - lr: 0.000006 - momentum: 0.000000
2023-10-16 23:47:14,504 epoch 9 - iter 616/1546 - loss 0.00589385 - time (sec): 27.51 - samples/sec: 1833.30 - lr: 0.000005 - momentum: 0.000000
2023-10-16 23:47:21,527 epoch 9 - iter 770/1546 - loss 0.00517419 - time (sec): 34.53 - samples/sec: 1816.10 - lr: 0.000005 - momentum: 0.000000
2023-10-16 23:47:28,487 epoch 9 - iter 924/1546 - loss 0.00498572 - time (sec): 41.49 - samples/sec: 1806.16 - lr: 0.000005 - momentum: 0.000000
2023-10-16 23:47:35,321 epoch 9 - iter 1078/1546 - loss 0.00495726 - time (sec): 48.32 - samples/sec: 1788.16 - lr: 0.000004 - momentum: 0.000000
2023-10-16 23:47:42,255 epoch 9 - iter 1232/1546 - loss 0.00485612 - time (sec): 55.26 - samples/sec: 1794.97 - lr: 0.000004 - momentum: 0.000000
2023-10-16 23:47:49,207 epoch 9 - iter 1386/1546 - loss 0.00481666 - time (sec): 62.21 - samples/sec: 1795.14 - lr: 0.000004 - momentum: 0.000000
2023-10-16 23:47:56,096 epoch 9 - iter 1540/1546 - loss 0.00484866 - time (sec): 69.10 - samples/sec: 1790.64 - lr: 0.000003 - momentum: 0.000000
2023-10-16 23:47:56,367 ----------------------------------------------------------------------------------------------------
2023-10-16 23:47:56,367 EPOCH 9 done: loss 0.0048 - lr: 0.000003
2023-10-16 23:47:58,475 DEV : loss 0.12125992029905319 - f1-score (micro avg) 0.7942
2023-10-16 23:47:58,488 ----------------------------------------------------------------------------------------------------
2023-10-16 23:48:05,559 epoch 10 - iter 154/1546 - loss 0.00554820 - time (sec): 7.07 - samples/sec: 1779.17 - lr: 0.000003 - momentum: 0.000000
2023-10-16 23:48:12,504 epoch 10 - iter 308/1546 - loss 0.00532913 - time (sec): 14.01 - samples/sec: 1782.70 - lr: 0.000003 - momentum: 0.000000
2023-10-16 23:48:19,380 epoch 10 - iter 462/1546 - loss 0.00538433 - time (sec): 20.89 - samples/sec: 1740.74 - lr: 0.000002 - momentum: 0.000000
2023-10-16 23:48:26,473 epoch 10 - iter 616/1546 - loss 0.00450212 - time (sec): 27.98 - samples/sec: 1758.70 - lr: 0.000002 - momentum: 0.000000
2023-10-16 23:48:33,548 epoch 10 - iter 770/1546 - loss 0.00405722 - time (sec): 35.06 - samples/sec: 1785.83 - lr: 0.000002 - momentum: 0.000000
2023-10-16 23:48:40,558 epoch 10 - iter 924/1546 - loss 0.00351806 - time (sec): 42.07 - samples/sec: 1799.92 - lr: 0.000001 - momentum: 0.000000
2023-10-16 23:48:47,503 epoch 10 - iter 1078/1546 - loss 0.00321229 - time (sec): 49.01 - samples/sec: 1795.61 - lr: 0.000001 - momentum: 0.000000
2023-10-16 23:48:54,315 epoch 10 - iter 1232/1546 - loss 0.00327509 - time (sec): 55.83 - samples/sec: 1784.03 - lr: 0.000001 - momentum: 0.000000
2023-10-16 23:49:01,187 epoch 10 - iter 1386/1546 - loss 0.00338661 - time (sec): 62.70 - samples/sec: 1784.67 - lr: 0.000000 - momentum: 0.000000
2023-10-16 23:49:08,070 epoch 10 - iter 1540/1546 - loss 0.00333392 - time (sec): 69.58 - samples/sec: 1779.95 - lr: 0.000000 - momentum: 0.000000
2023-10-16 23:49:08,338 ----------------------------------------------------------------------------------------------------
2023-10-16 23:49:08,338 EPOCH 10 done: loss 0.0033 - lr: 0.000000
2023-10-16 23:49:10,368 DEV : loss 0.12140633165836334 - f1-score (micro avg) 0.8065
2023-10-16 23:49:10,380 saving best model
2023-10-16 23:49:11,227 ----------------------------------------------------------------------------------------------------
2023-10-16 23:49:11,228 Loading model from best epoch ...
2023-10-16 23:49:12,839 SequenceTagger predicts: Dictionary with 13 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-BUILDING, B-BUILDING, E-BUILDING, I-BUILDING, S-STREET, B-STREET, E-STREET, I-STREET
2023-10-16 23:49:18,915
Results:
- F-score (micro) 0.798
- F-score (macro) 0.6998
- Accuracy 0.6823
By class:
              precision    recall  f1-score   support

         LOC     0.8416    0.8647    0.8530       946
    BUILDING     0.5440    0.5351    0.5395       185
      STREET     0.6833    0.7321    0.7069        56

   micro avg     0.7891    0.8071    0.7980      1187
   macro avg     0.6896    0.7107    0.6998      1187
weighted avg     0.7877    0.8071    0.7972      1187
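The aggregate rows are consistent with the per-class rows: recovering the true-positive and predicted-span counts from each class's recall, precision, and support reproduces the micro and macro averages. A quick check in plain Python:

```python
# (precision, recall, f1, support) per class, copied from the table above.
by_class = {
    "LOC":      (0.8416, 0.8647, 0.8530, 946),
    "BUILDING": (0.5440, 0.5351, 0.5395, 185),
    "STREET":   (0.6833, 0.7321, 0.7069, 56),
}

tp = pred = gold = 0
for p, r, _, support in by_class.values():
    class_tp = round(r * support)    # recall = tp / support
    tp += class_tp
    pred += round(class_tp / p)      # precision = tp / predicted
    gold += support

micro_p, micro_r = tp / pred, tp / gold
micro_f1 = 2 * micro_p * micro_r / (micro_p + micro_r)
macro_f1 = sum(f for _, _, f, _ in by_class.values()) / len(by_class)

print(round(micro_f1, 4), round(macro_f1, 4))  # 0.798 0.6998, matching the summary above
```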
2023-10-16 23:49:18,915 ----------------------------------------------------------------------------------------------------
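The `best-model.pt` checkpoint referenced above can be used for tagging with Flair's standard loading API. A sketch (the example sentence is invented, and running it requires the trained checkpoint on disk):

```python
from flair.data import Sentence
from flair.models import SequenceTagger

# Load the best checkpoint written during training (see "saving best model" above).
tagger = SequenceTagger.load(
    "hmbench-topres19th/en-dbmdz/bert-base-historic-multilingual-cased-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-5/best-model.pt"
)

sentence = Sentence("He was born in London and worked near Trafalgar Square .")
tagger.predict(sentence)

# Prints each detected span with its LOC/BUILDING/STREET tag.
for span in sentence.get_spans("ner"):
    print(span.text, span.tag)
```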