2023-10-13 23:00:26,509 ----------------------------------------------------------------------------------------------------
2023-10-13 23:00:26,510 Model: "SequenceTagger(
(embeddings): TransformerWordEmbeddings(
(model): BertModel(
(embeddings): BertEmbeddings(
(word_embeddings): Embedding(32001, 768)
(position_embeddings): Embedding(512, 768)
(token_type_embeddings): Embedding(2, 768)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(encoder): BertEncoder(
(layer): ModuleList(
(0-11): 12 x BertLayer(
(attention): BertAttention(
(self): BertSelfAttention(
(query): Linear(in_features=768, out_features=768, bias=True)
(key): Linear(in_features=768, out_features=768, bias=True)
(value): Linear(in_features=768, out_features=768, bias=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(output): BertSelfOutput(
(dense): Linear(in_features=768, out_features=768, bias=True)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
(intermediate): BertIntermediate(
(dense): Linear(in_features=768, out_features=3072, bias=True)
(intermediate_act_fn): GELUActivation()
)
(output): BertOutput(
(dense): Linear(in_features=3072, out_features=768, bias=True)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
)
)
(pooler): BertPooler(
(dense): Linear(in_features=768, out_features=768, bias=True)
(activation): Tanh()
)
)
)
(locked_dropout): LockedDropout(p=0.5)
(linear): Linear(in_features=768, out_features=13, bias=True)
(loss_function): CrossEntropyLoss()
)"
2023-10-13 23:00:26,510 ----------------------------------------------------------------------------------------------------
2023-10-13 23:00:26,510 MultiCorpus: 7936 train + 992 dev + 992 test sentences
- NER_ICDAR_EUROPEANA Corpus: 7936 train + 992 dev + 992 test sentences - /root/.flair/datasets/ner_icdar_europeana/fr
2023-10-13 23:00:26,510 ----------------------------------------------------------------------------------------------------
2023-10-13 23:00:26,510 Train: 7936 sentences
2023-10-13 23:00:26,510 (train_with_dev=False, train_with_test=False)
2023-10-13 23:00:26,510 ----------------------------------------------------------------------------------------------------
2023-10-13 23:00:26,510 Training Params:
2023-10-13 23:00:26,510 - learning_rate: "5e-05"
2023-10-13 23:00:26,510 - mini_batch_size: "4"
2023-10-13 23:00:26,510 - max_epochs: "10"
2023-10-13 23:00:26,510 - shuffle: "True"
2023-10-13 23:00:26,510 ----------------------------------------------------------------------------------------------------
2023-10-13 23:00:26,510 Plugins:
2023-10-13 23:00:26,510 - LinearScheduler | warmup_fraction: '0.1'
2023-10-13 23:00:26,511 ----------------------------------------------------------------------------------------------------
2023-10-13 23:00:26,511 Final evaluation on model from best epoch (best-model.pt)
2023-10-13 23:00:26,511 - metric: "('micro avg', 'f1-score')"
2023-10-13 23:00:26,511 ----------------------------------------------------------------------------------------------------
2023-10-13 23:00:26,511 Computation:
2023-10-13 23:00:26,511 - compute on device: cuda:0
2023-10-13 23:00:26,511 - embedding storage: none
2023-10-13 23:00:26,511 ----------------------------------------------------------------------------------------------------
2023-10-13 23:00:26,511 Model training base path: "hmbench-icdar/fr-dbmdz/bert-base-historic-multilingual-cased-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-3"
2023-10-13 23:00:26,511 ----------------------------------------------------------------------------------------------------
2023-10-13 23:00:26,511 ----------------------------------------------------------------------------------------------------
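The learning-rate column in the epoch logs below follows the LinearScheduler plugin listed above: linear warmup over the first 10% of steps (1984 of the 19840 total mini-batches), then linear decay to zero. A minimal sketch of that schedule, reconstructed from the logged values rather than taken from Flair's source:

```python
def linear_warmup_lr(step, max_lr=5e-5, total_steps=19840, warmup_fraction=0.1):
    """Linear warmup then linear decay to zero.

    A sketch of the schedule implied by the logged lr values
    (warmup_fraction 0.1, 1984 iters/epoch x 10 epochs), not Flair's exact code.
    """
    warmup_steps = int(total_steps * warmup_fraction)  # 1984 = all of epoch 1
    if step < warmup_steps:
        return max_lr * step / warmup_steps
    return max_lr * (total_steps - step) / (total_steps - warmup_steps)
```

This reproduces the log: lr ≈ 5e-06 at epoch 1 iter 198, the full 5e-05 at the end of epoch 1, and ~0 at the final step.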
2023-10-13 23:00:35,649 epoch 1 - iter 198/1984 - loss 1.53200532 - time (sec): 9.14 - samples/sec: 1823.26 - lr: 0.000005 - momentum: 0.000000
2023-10-13 23:00:44,754 epoch 1 - iter 396/1984 - loss 0.92240948 - time (sec): 18.24 - samples/sec: 1796.50 - lr: 0.000010 - momentum: 0.000000
2023-10-13 23:00:53,903 epoch 1 - iter 594/1984 - loss 0.67814623 - time (sec): 27.39 - samples/sec: 1807.56 - lr: 0.000015 - momentum: 0.000000
2023-10-13 23:01:02,885 epoch 1 - iter 792/1984 - loss 0.55782862 - time (sec): 36.37 - samples/sec: 1806.87 - lr: 0.000020 - momentum: 0.000000
2023-10-13 23:01:11,952 epoch 1 - iter 990/1984 - loss 0.47888698 - time (sec): 45.44 - samples/sec: 1813.93 - lr: 0.000025 - momentum: 0.000000
2023-10-13 23:01:20,896 epoch 1 - iter 1188/1984 - loss 0.42277515 - time (sec): 54.38 - samples/sec: 1820.52 - lr: 0.000030 - momentum: 0.000000
2023-10-13 23:01:29,837 epoch 1 - iter 1386/1984 - loss 0.38265139 - time (sec): 63.32 - samples/sec: 1819.70 - lr: 0.000035 - momentum: 0.000000
2023-10-13 23:01:39,012 epoch 1 - iter 1584/1984 - loss 0.35331171 - time (sec): 72.50 - samples/sec: 1817.71 - lr: 0.000040 - momentum: 0.000000
2023-10-13 23:01:47,956 epoch 1 - iter 1782/1984 - loss 0.33036547 - time (sec): 81.44 - samples/sec: 1814.81 - lr: 0.000045 - momentum: 0.000000
2023-10-13 23:01:56,804 epoch 1 - iter 1980/1984 - loss 0.31198950 - time (sec): 90.29 - samples/sec: 1813.61 - lr: 0.000050 - momentum: 0.000000
2023-10-13 23:01:56,982 ----------------------------------------------------------------------------------------------------
2023-10-13 23:01:56,982 EPOCH 1 done: loss 0.3119 - lr: 0.000050
2023-10-13 23:02:00,569 DEV : loss 0.10949771106243134 - f1-score (micro avg) 0.7302
2023-10-13 23:02:00,590 saving best model
2023-10-13 23:02:01,058 ----------------------------------------------------------------------------------------------------
2023-10-13 23:02:10,199 epoch 2 - iter 198/1984 - loss 0.14375295 - time (sec): 9.14 - samples/sec: 1926.56 - lr: 0.000049 - momentum: 0.000000
2023-10-13 23:02:19,752 epoch 2 - iter 396/1984 - loss 0.12984777 - time (sec): 18.69 - samples/sec: 1797.99 - lr: 0.000049 - momentum: 0.000000
2023-10-13 23:02:28,731 epoch 2 - iter 594/1984 - loss 0.12586745 - time (sec): 27.67 - samples/sec: 1811.01 - lr: 0.000048 - momentum: 0.000000
2023-10-13 23:02:37,904 epoch 2 - iter 792/1984 - loss 0.12549565 - time (sec): 36.84 - samples/sec: 1779.69 - lr: 0.000048 - momentum: 0.000000
2023-10-13 23:02:46,872 epoch 2 - iter 990/1984 - loss 0.12375022 - time (sec): 45.81 - samples/sec: 1783.17 - lr: 0.000047 - momentum: 0.000000
2023-10-13 23:02:55,929 epoch 2 - iter 1188/1984 - loss 0.12078530 - time (sec): 54.87 - samples/sec: 1782.05 - lr: 0.000047 - momentum: 0.000000
2023-10-13 23:03:04,904 epoch 2 - iter 1386/1984 - loss 0.11956853 - time (sec): 63.85 - samples/sec: 1782.93 - lr: 0.000046 - momentum: 0.000000
2023-10-13 23:03:13,896 epoch 2 - iter 1584/1984 - loss 0.11985721 - time (sec): 72.84 - samples/sec: 1793.85 - lr: 0.000046 - momentum: 0.000000
2023-10-13 23:03:22,849 epoch 2 - iter 1782/1984 - loss 0.11912594 - time (sec): 81.79 - samples/sec: 1787.41 - lr: 0.000045 - momentum: 0.000000
2023-10-13 23:03:32,027 epoch 2 - iter 1980/1984 - loss 0.11760525 - time (sec): 90.97 - samples/sec: 1800.06 - lr: 0.000044 - momentum: 0.000000
2023-10-13 23:03:32,205 ----------------------------------------------------------------------------------------------------
2023-10-13 23:03:32,205 EPOCH 2 done: loss 0.1176 - lr: 0.000044
2023-10-13 23:03:35,599 DEV : loss 0.11232058703899384 - f1-score (micro avg) 0.6962
2023-10-13 23:03:35,621 ----------------------------------------------------------------------------------------------------
2023-10-13 23:03:44,703 epoch 3 - iter 198/1984 - loss 0.08030450 - time (sec): 9.08 - samples/sec: 1799.35 - lr: 0.000044 - momentum: 0.000000
2023-10-13 23:03:53,632 epoch 3 - iter 396/1984 - loss 0.08304766 - time (sec): 18.01 - samples/sec: 1792.67 - lr: 0.000043 - momentum: 0.000000
2023-10-13 23:04:02,572 epoch 3 - iter 594/1984 - loss 0.08615871 - time (sec): 26.95 - samples/sec: 1794.60 - lr: 0.000043 - momentum: 0.000000
2023-10-13 23:04:11,544 epoch 3 - iter 792/1984 - loss 0.08894807 - time (sec): 35.92 - samples/sec: 1818.29 - lr: 0.000042 - momentum: 0.000000
2023-10-13 23:04:20,649 epoch 3 - iter 990/1984 - loss 0.09055154 - time (sec): 45.03 - samples/sec: 1825.12 - lr: 0.000042 - momentum: 0.000000
2023-10-13 23:04:29,740 epoch 3 - iter 1188/1984 - loss 0.09094457 - time (sec): 54.12 - samples/sec: 1815.30 - lr: 0.000041 - momentum: 0.000000
2023-10-13 23:04:39,034 epoch 3 - iter 1386/1984 - loss 0.09153909 - time (sec): 63.41 - samples/sec: 1807.72 - lr: 0.000041 - momentum: 0.000000
2023-10-13 23:04:48,123 epoch 3 - iter 1584/1984 - loss 0.09196429 - time (sec): 72.50 - samples/sec: 1800.25 - lr: 0.000040 - momentum: 0.000000
2023-10-13 23:04:57,084 epoch 3 - iter 1782/1984 - loss 0.09198296 - time (sec): 81.46 - samples/sec: 1803.02 - lr: 0.000039 - momentum: 0.000000
2023-10-13 23:05:06,038 epoch 3 - iter 1980/1984 - loss 0.09023290 - time (sec): 90.42 - samples/sec: 1810.05 - lr: 0.000039 - momentum: 0.000000
2023-10-13 23:05:06,220 ----------------------------------------------------------------------------------------------------
2023-10-13 23:05:06,220 EPOCH 3 done: loss 0.0902 - lr: 0.000039
2023-10-13 23:05:10,128 DEV : loss 0.12935516238212585 - f1-score (micro avg) 0.7379
2023-10-13 23:05:10,149 saving best model
2023-10-13 23:05:10,718 ----------------------------------------------------------------------------------------------------
2023-10-13 23:05:19,839 epoch 4 - iter 198/1984 - loss 0.05482600 - time (sec): 9.12 - samples/sec: 1868.83 - lr: 0.000038 - momentum: 0.000000
2023-10-13 23:05:28,950 epoch 4 - iter 396/1984 - loss 0.05830678 - time (sec): 18.23 - samples/sec: 1818.26 - lr: 0.000038 - momentum: 0.000000
2023-10-13 23:05:38,170 epoch 4 - iter 594/1984 - loss 0.06290395 - time (sec): 27.45 - samples/sec: 1782.50 - lr: 0.000037 - momentum: 0.000000
2023-10-13 23:05:47,220 epoch 4 - iter 792/1984 - loss 0.06423461 - time (sec): 36.50 - samples/sec: 1789.34 - lr: 0.000037 - momentum: 0.000000
2023-10-13 23:05:56,349 epoch 4 - iter 990/1984 - loss 0.06461218 - time (sec): 45.63 - samples/sec: 1787.16 - lr: 0.000036 - momentum: 0.000000
2023-10-13 23:06:05,421 epoch 4 - iter 1188/1984 - loss 0.06665001 - time (sec): 54.70 - samples/sec: 1793.26 - lr: 0.000036 - momentum: 0.000000
2023-10-13 23:06:14,380 epoch 4 - iter 1386/1984 - loss 0.06627246 - time (sec): 63.66 - samples/sec: 1800.70 - lr: 0.000035 - momentum: 0.000000
2023-10-13 23:06:23,383 epoch 4 - iter 1584/1984 - loss 0.06723070 - time (sec): 72.66 - samples/sec: 1798.85 - lr: 0.000034 - momentum: 0.000000
2023-10-13 23:06:32,353 epoch 4 - iter 1782/1984 - loss 0.06769783 - time (sec): 81.63 - samples/sec: 1807.58 - lr: 0.000034 - momentum: 0.000000
2023-10-13 23:06:41,405 epoch 4 - iter 1980/1984 - loss 0.06677309 - time (sec): 90.68 - samples/sec: 1806.89 - lr: 0.000033 - momentum: 0.000000
2023-10-13 23:06:41,584 ----------------------------------------------------------------------------------------------------
2023-10-13 23:06:41,584 EPOCH 4 done: loss 0.0668 - lr: 0.000033
2023-10-13 23:06:45,111 DEV : loss 0.17647738754749298 - f1-score (micro avg) 0.7355
2023-10-13 23:06:45,144 ----------------------------------------------------------------------------------------------------
2023-10-13 23:06:54,849 epoch 5 - iter 198/1984 - loss 0.04106947 - time (sec): 9.70 - samples/sec: 1748.03 - lr: 0.000033 - momentum: 0.000000
2023-10-13 23:07:03,858 epoch 5 - iter 396/1984 - loss 0.04655195 - time (sec): 18.71 - samples/sec: 1766.51 - lr: 0.000032 - momentum: 0.000000
2023-10-13 23:07:12,979 epoch 5 - iter 594/1984 - loss 0.04653325 - time (sec): 27.83 - samples/sec: 1812.33 - lr: 0.000032 - momentum: 0.000000
2023-10-13 23:07:21,950 epoch 5 - iter 792/1984 - loss 0.04794018 - time (sec): 36.80 - samples/sec: 1805.15 - lr: 0.000031 - momentum: 0.000000
2023-10-13 23:07:30,926 epoch 5 - iter 990/1984 - loss 0.04840007 - time (sec): 45.78 - samples/sec: 1811.22 - lr: 0.000031 - momentum: 0.000000
2023-10-13 23:07:39,949 epoch 5 - iter 1188/1984 - loss 0.05004291 - time (sec): 54.80 - samples/sec: 1804.77 - lr: 0.000030 - momentum: 0.000000
2023-10-13 23:07:48,863 epoch 5 - iter 1386/1984 - loss 0.05058820 - time (sec): 63.72 - samples/sec: 1796.18 - lr: 0.000029 - momentum: 0.000000
2023-10-13 23:07:57,887 epoch 5 - iter 1584/1984 - loss 0.04983358 - time (sec): 72.74 - samples/sec: 1795.40 - lr: 0.000029 - momentum: 0.000000
2023-10-13 23:08:06,980 epoch 5 - iter 1782/1984 - loss 0.05083169 - time (sec): 81.83 - samples/sec: 1808.28 - lr: 0.000028 - momentum: 0.000000
2023-10-13 23:08:15,845 epoch 5 - iter 1980/1984 - loss 0.05227801 - time (sec): 90.70 - samples/sec: 1802.51 - lr: 0.000028 - momentum: 0.000000
2023-10-13 23:08:16,081 ----------------------------------------------------------------------------------------------------
2023-10-13 23:08:16,081 EPOCH 5 done: loss 0.0523 - lr: 0.000028
2023-10-13 23:08:19,603 DEV : loss 0.1757950484752655 - f1-score (micro avg) 0.7443
2023-10-13 23:08:19,628 saving best model
2023-10-13 23:08:20,177 ----------------------------------------------------------------------------------------------------
2023-10-13 23:08:29,321 epoch 6 - iter 198/1984 - loss 0.03528770 - time (sec): 9.14 - samples/sec: 1719.36 - lr: 0.000027 - momentum: 0.000000
2023-10-13 23:08:38,245 epoch 6 - iter 396/1984 - loss 0.03920138 - time (sec): 18.06 - samples/sec: 1763.57 - lr: 0.000027 - momentum: 0.000000
2023-10-13 23:08:47,305 epoch 6 - iter 594/1984 - loss 0.03595233 - time (sec): 27.12 - samples/sec: 1781.96 - lr: 0.000026 - momentum: 0.000000
2023-10-13 23:08:56,590 epoch 6 - iter 792/1984 - loss 0.03547612 - time (sec): 36.41 - samples/sec: 1789.17 - lr: 0.000026 - momentum: 0.000000
2023-10-13 23:09:05,553 epoch 6 - iter 990/1984 - loss 0.03608526 - time (sec): 45.37 - samples/sec: 1796.82 - lr: 0.000025 - momentum: 0.000000
2023-10-13 23:09:14,598 epoch 6 - iter 1188/1984 - loss 0.03810704 - time (sec): 54.42 - samples/sec: 1800.27 - lr: 0.000024 - momentum: 0.000000
2023-10-13 23:09:24,131 epoch 6 - iter 1386/1984 - loss 0.03855158 - time (sec): 63.95 - samples/sec: 1795.20 - lr: 0.000024 - momentum: 0.000000
2023-10-13 23:09:33,149 epoch 6 - iter 1584/1984 - loss 0.03809866 - time (sec): 72.97 - samples/sec: 1796.32 - lr: 0.000023 - momentum: 0.000000
2023-10-13 23:09:42,072 epoch 6 - iter 1782/1984 - loss 0.03755127 - time (sec): 81.89 - samples/sec: 1789.44 - lr: 0.000023 - momentum: 0.000000
2023-10-13 23:09:51,030 epoch 6 - iter 1980/1984 - loss 0.03702409 - time (sec): 90.85 - samples/sec: 1799.62 - lr: 0.000022 - momentum: 0.000000
2023-10-13 23:09:51,217 ----------------------------------------------------------------------------------------------------
2023-10-13 23:09:51,217 EPOCH 6 done: loss 0.0370 - lr: 0.000022
2023-10-13 23:09:54,643 DEV : loss 0.19313980638980865 - f1-score (micro avg) 0.7517
2023-10-13 23:09:54,664 saving best model
2023-10-13 23:09:55,240 ----------------------------------------------------------------------------------------------------
2023-10-13 23:10:04,360 epoch 7 - iter 198/1984 - loss 0.02702124 - time (sec): 9.12 - samples/sec: 1881.77 - lr: 0.000022 - momentum: 0.000000
2023-10-13 23:10:13,373 epoch 7 - iter 396/1984 - loss 0.03084283 - time (sec): 18.13 - samples/sec: 1860.44 - lr: 0.000021 - momentum: 0.000000
2023-10-13 23:10:22,333 epoch 7 - iter 594/1984 - loss 0.02973583 - time (sec): 27.09 - samples/sec: 1859.23 - lr: 0.000021 - momentum: 0.000000
2023-10-13 23:10:31,305 epoch 7 - iter 792/1984 - loss 0.02758317 - time (sec): 36.06 - samples/sec: 1855.17 - lr: 0.000020 - momentum: 0.000000
2023-10-13 23:10:40,287 epoch 7 - iter 990/1984 - loss 0.02869188 - time (sec): 45.04 - samples/sec: 1847.92 - lr: 0.000019 - momentum: 0.000000
2023-10-13 23:10:49,298 epoch 7 - iter 1188/1984 - loss 0.02940933 - time (sec): 54.06 - samples/sec: 1857.29 - lr: 0.000019 - momentum: 0.000000
2023-10-13 23:10:58,316 epoch 7 - iter 1386/1984 - loss 0.02944737 - time (sec): 63.07 - samples/sec: 1836.08 - lr: 0.000018 - momentum: 0.000000
2023-10-13 23:11:07,290 epoch 7 - iter 1584/1984 - loss 0.02850037 - time (sec): 72.05 - samples/sec: 1830.44 - lr: 0.000018 - momentum: 0.000000
2023-10-13 23:11:16,230 epoch 7 - iter 1782/1984 - loss 0.02829835 - time (sec): 80.99 - samples/sec: 1830.09 - lr: 0.000017 - momentum: 0.000000
2023-10-13 23:11:25,052 epoch 7 - iter 1980/1984 - loss 0.02778118 - time (sec): 89.81 - samples/sec: 1822.07 - lr: 0.000017 - momentum: 0.000000
2023-10-13 23:11:25,233 ----------------------------------------------------------------------------------------------------
2023-10-13 23:11:25,233 EPOCH 7 done: loss 0.0277 - lr: 0.000017
2023-10-13 23:11:28,674 DEV : loss 0.21592207252979279 - f1-score (micro avg) 0.7685
2023-10-13 23:11:28,702 saving best model
2023-10-13 23:11:29,202 ----------------------------------------------------------------------------------------------------
2023-10-13 23:11:38,502 epoch 8 - iter 198/1984 - loss 0.03265154 - time (sec): 9.30 - samples/sec: 1784.08 - lr: 0.000016 - momentum: 0.000000
2023-10-13 23:11:47,488 epoch 8 - iter 396/1984 - loss 0.02692312 - time (sec): 18.28 - samples/sec: 1794.02 - lr: 0.000016 - momentum: 0.000000
2023-10-13 23:11:56,547 epoch 8 - iter 594/1984 - loss 0.02493041 - time (sec): 27.34 - samples/sec: 1802.68 - lr: 0.000015 - momentum: 0.000000
2023-10-13 23:12:05,472 epoch 8 - iter 792/1984 - loss 0.02398548 - time (sec): 36.27 - samples/sec: 1800.13 - lr: 0.000014 - momentum: 0.000000
2023-10-13 23:12:14,426 epoch 8 - iter 990/1984 - loss 0.02207354 - time (sec): 45.22 - samples/sec: 1797.01 - lr: 0.000014 - momentum: 0.000000
2023-10-13 23:12:23,548 epoch 8 - iter 1188/1984 - loss 0.02261054 - time (sec): 54.34 - samples/sec: 1804.03 - lr: 0.000013 - momentum: 0.000000
2023-10-13 23:12:33,150 epoch 8 - iter 1386/1984 - loss 0.02246274 - time (sec): 63.94 - samples/sec: 1797.62 - lr: 0.000013 - momentum: 0.000000
2023-10-13 23:12:42,232 epoch 8 - iter 1584/1984 - loss 0.02217940 - time (sec): 73.02 - samples/sec: 1793.62 - lr: 0.000012 - momentum: 0.000000
2023-10-13 23:12:51,634 epoch 8 - iter 1782/1984 - loss 0.02147734 - time (sec): 82.43 - samples/sec: 1783.43 - lr: 0.000012 - momentum: 0.000000
2023-10-13 23:13:00,581 epoch 8 - iter 1980/1984 - loss 0.02100390 - time (sec): 91.37 - samples/sec: 1790.99 - lr: 0.000011 - momentum: 0.000000
2023-10-13 23:13:00,764 ----------------------------------------------------------------------------------------------------
2023-10-13 23:13:00,764 EPOCH 8 done: loss 0.0211 - lr: 0.000011
2023-10-13 23:13:04,598 DEV : loss 0.21091753244400024 - f1-score (micro avg) 0.7553
2023-10-13 23:13:04,618 ----------------------------------------------------------------------------------------------------
2023-10-13 23:13:13,513 epoch 9 - iter 198/1984 - loss 0.00733228 - time (sec): 8.89 - samples/sec: 1730.22 - lr: 0.000011 - momentum: 0.000000
2023-10-13 23:13:22,696 epoch 9 - iter 396/1984 - loss 0.01112380 - time (sec): 18.08 - samples/sec: 1725.38 - lr: 0.000010 - momentum: 0.000000
2023-10-13 23:13:31,864 epoch 9 - iter 594/1984 - loss 0.01020664 - time (sec): 27.24 - samples/sec: 1773.39 - lr: 0.000009 - momentum: 0.000000
2023-10-13 23:13:40,951 epoch 9 - iter 792/1984 - loss 0.01052024 - time (sec): 36.33 - samples/sec: 1789.51 - lr: 0.000009 - momentum: 0.000000
2023-10-13 23:13:50,232 epoch 9 - iter 990/1984 - loss 0.01260968 - time (sec): 45.61 - samples/sec: 1813.60 - lr: 0.000008 - momentum: 0.000000
2023-10-13 23:13:59,464 epoch 9 - iter 1188/1984 - loss 0.01177559 - time (sec): 54.84 - samples/sec: 1806.82 - lr: 0.000008 - momentum: 0.000000
2023-10-13 23:14:08,417 epoch 9 - iter 1386/1984 - loss 0.01184590 - time (sec): 63.80 - samples/sec: 1806.92 - lr: 0.000007 - momentum: 0.000000
2023-10-13 23:14:17,430 epoch 9 - iter 1584/1984 - loss 0.01232048 - time (sec): 72.81 - samples/sec: 1806.87 - lr: 0.000007 - momentum: 0.000000
2023-10-13 23:14:26,329 epoch 9 - iter 1782/1984 - loss 0.01255026 - time (sec): 81.71 - samples/sec: 1801.94 - lr: 0.000006 - momentum: 0.000000
2023-10-13 23:14:35,399 epoch 9 - iter 1980/1984 - loss 0.01253169 - time (sec): 90.78 - samples/sec: 1803.20 - lr: 0.000006 - momentum: 0.000000
2023-10-13 23:14:35,581 ----------------------------------------------------------------------------------------------------
2023-10-13 23:14:35,581 EPOCH 9 done: loss 0.0126 - lr: 0.000006
2023-10-13 23:14:39,097 DEV : loss 0.22775374352931976 - f1-score (micro avg) 0.7603
2023-10-13 23:14:39,119 ----------------------------------------------------------------------------------------------------
2023-10-13 23:14:48,264 epoch 10 - iter 198/1984 - loss 0.00535780 - time (sec): 9.14 - samples/sec: 1683.98 - lr: 0.000005 - momentum: 0.000000
2023-10-13 23:14:57,214 epoch 10 - iter 396/1984 - loss 0.00645843 - time (sec): 18.09 - samples/sec: 1743.85 - lr: 0.000004 - momentum: 0.000000
2023-10-13 23:15:06,164 epoch 10 - iter 594/1984 - loss 0.00674720 - time (sec): 27.04 - samples/sec: 1745.58 - lr: 0.000004 - momentum: 0.000000
2023-10-13 23:15:15,417 epoch 10 - iter 792/1984 - loss 0.00765501 - time (sec): 36.30 - samples/sec: 1746.87 - lr: 0.000003 - momentum: 0.000000
2023-10-13 23:15:24,647 epoch 10 - iter 990/1984 - loss 0.00817023 - time (sec): 45.53 - samples/sec: 1751.12 - lr: 0.000003 - momentum: 0.000000
2023-10-13 23:15:33,679 epoch 10 - iter 1188/1984 - loss 0.00874557 - time (sec): 54.56 - samples/sec: 1774.18 - lr: 0.000002 - momentum: 0.000000
2023-10-13 23:15:42,787 epoch 10 - iter 1386/1984 - loss 0.00917449 - time (sec): 63.67 - samples/sec: 1798.37 - lr: 0.000002 - momentum: 0.000000
2023-10-13 23:15:51,803 epoch 10 - iter 1584/1984 - loss 0.00867989 - time (sec): 72.68 - samples/sec: 1805.44 - lr: 0.000001 - momentum: 0.000000
2023-10-13 23:16:00,811 epoch 10 - iter 1782/1984 - loss 0.00885292 - time (sec): 81.69 - samples/sec: 1803.59 - lr: 0.000001 - momentum: 0.000000
2023-10-13 23:16:09,829 epoch 10 - iter 1980/1984 - loss 0.00933134 - time (sec): 90.71 - samples/sec: 1804.64 - lr: 0.000000 - momentum: 0.000000
2023-10-13 23:16:10,010 ----------------------------------------------------------------------------------------------------
2023-10-13 23:16:10,010 EPOCH 10 done: loss 0.0093 - lr: 0.000000
2023-10-13 23:16:13,937 DEV : loss 0.2285892367362976 - f1-score (micro avg) 0.7591
2023-10-13 23:16:14,432 ----------------------------------------------------------------------------------------------------
2023-10-13 23:16:14,434 Loading model from best epoch ...
2023-10-13 23:16:15,912 SequenceTagger predicts: Dictionary with 13 tags: O, S-PER, B-PER, E-PER, I-PER, S-LOC, B-LOC, E-LOC, I-LOC, S-ORG, B-ORG, E-ORG, I-ORG
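The 13-tag dictionary printed above is the BIOES encoding of the corpus's three entity types (PER, LOC, ORG) plus the outside tag, which is also why the model's final linear layer has out_features=13. A small sketch of how that count arises (the tag order here is illustrative, not necessarily the dictionary's internal order):

```python
# BIOES scheme: each entity type gets Single/Begin/End/Inside tags,
# plus one shared "O" (outside) tag: 3 * 4 + 1 = 13.
entity_types = ["PER", "LOC", "ORG"]
tags = ["O"] + [
    f"{prefix}-{etype}"
    for etype in entity_types
    for prefix in ("S", "B", "E", "I")
]
```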
2023-10-13 23:16:19,423
Results:
- F-score (micro) 0.783
- F-score (macro) 0.676
- Accuracy 0.6661

By class:
              precision    recall  f1-score   support

         LOC     0.8229    0.8794    0.8502       655
         PER     0.7125    0.7668    0.7387       223
         ORG     0.5769    0.3543    0.4390       127

   micro avg     0.7780    0.7881    0.7830      1005
   macro avg     0.7041    0.6668    0.6760      1005
weighted avg     0.7673    0.7881    0.7735      1005
2023-10-13 23:16:19,423 ----------------------------------------------------------------------------------------------------
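The averaged rows of the table above can be reconstructed from the per-class scores alone: recovering true-positive and predicted counts from recall and precision gives the micro average, while the macro average is the unweighted mean of per-class F1. A self-contained check (numbers copied from the table; small rounding drift is expected):

```python
# Per-class (precision, recall, support) from the final evaluation table.
classes = {
    "LOC": (0.8229, 0.8794, 655),
    "PER": (0.7125, 0.7668, 223),
    "ORG": (0.5769, 0.3543, 127),
}

tp = fp = fn = 0.0
f1s = []
for p, r, support in classes.values():
    c_tp = r * support          # true positives, recovered from recall
    tp += c_tp
    fp += c_tp / p - c_tp       # predicted count minus true positives
    fn += support - c_tp        # gold count minus true positives
    f1s.append(2 * p * r / (p + r))

micro_p = tp / (tp + fp)
micro_r = tp / (tp + fn)
micro_f1 = 2 * micro_p * micro_r / (micro_p + micro_r)   # -> ~0.783
macro_f1 = sum(f1s) / len(f1s)                           # -> ~0.676
```

The micro F-score (0.783) is the metric the trainer used for model selection, per the "Final evaluation" header at the top of the log.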