2023-10-13 23:00:26,509 ----------------------------------------------------------------------------------------------------
2023-10-13 23:00:26,510 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-13 23:00:26,510 ----------------------------------------------------------------------------------------------------
2023-10-13 23:00:26,510 MultiCorpus: 7936 train + 992 dev + 992 test sentences
 - NER_ICDAR_EUROPEANA Corpus: 7936 train + 992 dev + 992 test sentences - /root/.flair/datasets/ner_icdar_europeana/fr
2023-10-13 23:00:26,510 ----------------------------------------------------------------------------------------------------
2023-10-13 23:00:26,510 Train: 7936 sentences
2023-10-13 23:00:26,510 (train_with_dev=False, train_with_test=False)
2023-10-13 23:00:26,510 ----------------------------------------------------------------------------------------------------
2023-10-13 23:00:26,510 Training Params:
2023-10-13 23:00:26,510  - learning_rate: "5e-05"
2023-10-13 23:00:26,510  - mini_batch_size: "4"
2023-10-13 23:00:26,510  - max_epochs: "10"
2023-10-13 23:00:26,510  - shuffle: "True"
2023-10-13 23:00:26,510 ----------------------------------------------------------------------------------------------------
2023-10-13 23:00:26,510 Plugins:
2023-10-13 23:00:26,510  - LinearScheduler | warmup_fraction: '0.1'
2023-10-13 23:00:26,511 ----------------------------------------------------------------------------------------------------
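The LinearScheduler plugin with warmup_fraction '0.1' explains the lr column in the epoch logs below: the learning rate ramps linearly from 0 to the peak 5e-05 over the first 10% of optimizer steps (here roughly the first of 10 epochs x 1984 mini-batches), then decays linearly to 0. A minimal sketch of that schedule, assuming total_steps = 10 * 1984 and that the plugin behaves like a plain linear warmup/decay (the exact step bookkeeping inside Flair may differ slightly):

```python
def linear_warmup_lr(step: int, peak_lr: float = 5e-5,
                     total_steps: int = 10 * 1984,
                     warmup_fraction: float = 0.1) -> float:
    """Linear warmup to peak_lr, then linear decay to 0."""
    warmup_steps = int(total_steps * warmup_fraction)
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)

# Matches the logged values: ~0.000005 at iter 198 of epoch 1,
# the 0.000050 peak at the end of epoch 1, and ~0 at the end of epoch 10.
```

Under these assumptions the peak is reached exactly when epoch 1 finishes, which is consistent with the lr column rising through epoch 1 and falling monotonically afterwards.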
2023-10-13 23:00:26,511 Final evaluation on model from best epoch (best-model.pt)
2023-10-13 23:00:26,511  - metric: "('micro avg', 'f1-score')"
2023-10-13 23:00:26,511 ----------------------------------------------------------------------------------------------------
2023-10-13 23:00:26,511 Computation:
2023-10-13 23:00:26,511  - compute on device: cuda:0
2023-10-13 23:00:26,511  - embedding storage: none
2023-10-13 23:00:26,511 ----------------------------------------------------------------------------------------------------
2023-10-13 23:00:26,511 Model training base path: "hmbench-icdar/fr-dbmdz/bert-base-historic-multilingual-cased-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-3"
2023-10-13 23:00:26,511 ----------------------------------------------------------------------------------------------------
2023-10-13 23:00:26,511 ----------------------------------------------------------------------------------------------------
2023-10-13 23:00:35,649 epoch 1 - iter 198/1984 - loss 1.53200532 - time (sec): 9.14 - samples/sec: 1823.26 - lr: 0.000005 - momentum: 0.000000
2023-10-13 23:00:44,754 epoch 1 - iter 396/1984 - loss 0.92240948 - time (sec): 18.24 - samples/sec: 1796.50 - lr: 0.000010 - momentum: 0.000000
2023-10-13 23:00:53,903 epoch 1 - iter 594/1984 - loss 0.67814623 - time (sec): 27.39 - samples/sec: 1807.56 - lr: 0.000015 - momentum: 0.000000
2023-10-13 23:01:02,885 epoch 1 - iter 792/1984 - loss 0.55782862 - time (sec): 36.37 - samples/sec: 1806.87 - lr: 0.000020 - momentum: 0.000000
2023-10-13 23:01:11,952 epoch 1 - iter 990/1984 - loss 0.47888698 - time (sec): 45.44 - samples/sec: 1813.93 - lr: 0.000025 - momentum: 0.000000
2023-10-13 23:01:20,896 epoch 1 - iter 1188/1984 - loss 0.42277515 - time (sec): 54.38 - samples/sec: 1820.52 - lr: 0.000030 - momentum: 0.000000
2023-10-13 23:01:29,837 epoch 1 - iter 1386/1984 - loss 0.38265139 - time (sec): 63.32 - samples/sec: 1819.70 - lr: 0.000035 - momentum: 0.000000
2023-10-13 23:01:39,012 epoch 1 - iter 1584/1984 - loss 0.35331171 - time (sec): 72.50 - samples/sec: 1817.71 - lr: 0.000040 - momentum: 0.000000
2023-10-13 23:01:47,956 epoch 1 - iter 1782/1984 - loss 0.33036547 - time (sec): 81.44 - samples/sec: 1814.81 - lr: 0.000045 - momentum: 0.000000
2023-10-13 23:01:56,804 epoch 1 - iter 1980/1984 - loss 0.31198950 - time (sec): 90.29 - samples/sec: 1813.61 - lr: 0.000050 - momentum: 0.000000
2023-10-13 23:01:56,982 ----------------------------------------------------------------------------------------------------
2023-10-13 23:01:56,982 EPOCH 1 done: loss 0.3119 - lr: 0.000050
2023-10-13 23:02:00,569 DEV : loss 0.10949771106243134 - f1-score (micro avg) 0.7302
2023-10-13 23:02:00,590 saving best model
2023-10-13 23:02:01,058 ----------------------------------------------------------------------------------------------------
2023-10-13 23:02:10,199 epoch 2 - iter 198/1984 - loss 0.14375295 - time (sec): 9.14 - samples/sec: 1926.56 - lr: 0.000049 - momentum: 0.000000
2023-10-13 23:02:19,752 epoch 2 - iter 396/1984 - loss 0.12984777 - time (sec): 18.69 - samples/sec: 1797.99 - lr: 0.000049 - momentum: 0.000000
2023-10-13 23:02:28,731 epoch 2 - iter 594/1984 - loss 0.12586745 - time (sec): 27.67 - samples/sec: 1811.01 - lr: 0.000048 - momentum: 0.000000
2023-10-13 23:02:37,904 epoch 2 - iter 792/1984 - loss 0.12549565 - time (sec): 36.84 - samples/sec: 1779.69 - lr: 0.000048 - momentum: 0.000000
2023-10-13 23:02:46,872 epoch 2 - iter 990/1984 - loss 0.12375022 - time (sec): 45.81 - samples/sec: 1783.17 - lr: 0.000047 - momentum: 0.000000
2023-10-13 23:02:55,929 epoch 2 - iter 1188/1984 - loss 0.12078530 - time (sec): 54.87 - samples/sec: 1782.05 - lr: 0.000047 - momentum: 0.000000
2023-10-13 23:03:04,904 epoch 2 - iter 1386/1984 - loss 0.11956853 - time (sec): 63.85 - samples/sec: 1782.93 - lr: 0.000046 - momentum: 0.000000
2023-10-13 23:03:13,896 epoch 2 - iter 1584/1984 - loss 0.11985721 - time (sec): 72.84 - samples/sec: 1793.85 - lr: 0.000046 - momentum: 0.000000
2023-10-13 23:03:22,849 epoch 2 - iter 1782/1984 - loss 0.11912594 - time (sec): 81.79 - samples/sec: 1787.41 - lr: 0.000045 - momentum: 0.000000
2023-10-13 23:03:32,027 epoch 2 - iter 1980/1984 - loss 0.11760525 - time (sec): 90.97 - samples/sec: 1800.06 - lr: 0.000044 - momentum: 0.000000
2023-10-13 23:03:32,205 ----------------------------------------------------------------------------------------------------
2023-10-13 23:03:32,205 EPOCH 2 done: loss 0.1176 - lr: 0.000044
2023-10-13 23:03:35,599 DEV : loss 0.11232058703899384 - f1-score (micro avg) 0.6962
2023-10-13 23:03:35,621 ----------------------------------------------------------------------------------------------------
2023-10-13 23:03:44,703 epoch 3 - iter 198/1984 - loss 0.08030450 - time (sec): 9.08 - samples/sec: 1799.35 - lr: 0.000044 - momentum: 0.000000
2023-10-13 23:03:53,632 epoch 3 - iter 396/1984 - loss 0.08304766 - time (sec): 18.01 - samples/sec: 1792.67 - lr: 0.000043 - momentum: 0.000000
2023-10-13 23:04:02,572 epoch 3 - iter 594/1984 - loss 0.08615871 - time (sec): 26.95 - samples/sec: 1794.60 - lr: 0.000043 - momentum: 0.000000
2023-10-13 23:04:11,544 epoch 3 - iter 792/1984 - loss 0.08894807 - time (sec): 35.92 - samples/sec: 1818.29 - lr: 0.000042 - momentum: 0.000000
2023-10-13 23:04:20,649 epoch 3 - iter 990/1984 - loss 0.09055154 - time (sec): 45.03 - samples/sec: 1825.12 - lr: 0.000042 - momentum: 0.000000
2023-10-13 23:04:29,740 epoch 3 - iter 1188/1984 - loss 0.09094457 - time (sec): 54.12 - samples/sec: 1815.30 - lr: 0.000041 - momentum: 0.000000
2023-10-13 23:04:39,034 epoch 3 - iter 1386/1984 - loss 0.09153909 - time (sec): 63.41 - samples/sec: 1807.72 - lr: 0.000041 - momentum: 0.000000
2023-10-13 23:04:48,123 epoch 3 - iter 1584/1984 - loss 0.09196429 - time (sec): 72.50 - samples/sec: 1800.25 - lr: 0.000040 - momentum: 0.000000
2023-10-13 23:04:57,084 epoch 3 - iter 1782/1984 - loss 0.09198296 - time (sec): 81.46 - samples/sec: 1803.02 - lr: 0.000039 - momentum: 0.000000
2023-10-13 23:05:06,038 epoch 3 - iter 1980/1984 - loss 0.09023290 - time (sec): 90.42 - samples/sec: 1810.05 - lr: 0.000039 - momentum: 0.000000
2023-10-13 23:05:06,220 ----------------------------------------------------------------------------------------------------
2023-10-13 23:05:06,220 EPOCH 3 done: loss 0.0902 - lr: 0.000039
2023-10-13 23:05:10,128 DEV : loss 0.12935516238212585 - f1-score (micro avg) 0.7379
2023-10-13 23:05:10,149 saving best model
2023-10-13 23:05:10,718 ----------------------------------------------------------------------------------------------------
2023-10-13 23:05:19,839 epoch 4 - iter 198/1984 - loss 0.05482600 - time (sec): 9.12 - samples/sec: 1868.83 - lr: 0.000038 - momentum: 0.000000
2023-10-13 23:05:28,950 epoch 4 - iter 396/1984 - loss 0.05830678 - time (sec): 18.23 - samples/sec: 1818.26 - lr: 0.000038 - momentum: 0.000000
2023-10-13 23:05:38,170 epoch 4 - iter 594/1984 - loss 0.06290395 - time (sec): 27.45 - samples/sec: 1782.50 - lr: 0.000037 - momentum: 0.000000
2023-10-13 23:05:47,220 epoch 4 - iter 792/1984 - loss 0.06423461 - time (sec): 36.50 - samples/sec: 1789.34 - lr: 0.000037 - momentum: 0.000000
2023-10-13 23:05:56,349 epoch 4 - iter 990/1984 - loss 0.06461218 - time (sec): 45.63 - samples/sec: 1787.16 - lr: 0.000036 - momentum: 0.000000
2023-10-13 23:06:05,421 epoch 4 - iter 1188/1984 - loss 0.06665001 - time (sec): 54.70 - samples/sec: 1793.26 - lr: 0.000036 - momentum: 0.000000
2023-10-13 23:06:14,380 epoch 4 - iter 1386/1984 - loss 0.06627246 - time (sec): 63.66 - samples/sec: 1800.70 - lr: 0.000035 - momentum: 0.000000
2023-10-13 23:06:23,383 epoch 4 - iter 1584/1984 - loss 0.06723070 - time (sec): 72.66 - samples/sec: 1798.85 - lr: 0.000034 - momentum: 0.000000
2023-10-13 23:06:32,353 epoch 4 - iter 1782/1984 - loss 0.06769783 - time (sec): 81.63 - samples/sec: 1807.58 - lr: 0.000034 - momentum: 0.000000
2023-10-13 23:06:41,405 epoch 4 - iter 1980/1984 - loss 0.06677309 - time (sec): 90.68 - samples/sec: 1806.89 - lr: 0.000033 - momentum: 0.000000
2023-10-13 23:06:41,584 ----------------------------------------------------------------------------------------------------
2023-10-13 23:06:41,584 EPOCH 4 done: loss 0.0668 - lr: 0.000033
2023-10-13 23:06:45,111 DEV : loss 0.17647738754749298 - f1-score (micro avg) 0.7355
2023-10-13 23:06:45,144 ----------------------------------------------------------------------------------------------------
2023-10-13 23:06:54,849 epoch 5 - iter 198/1984 - loss 0.04106947 - time (sec): 9.70 - samples/sec: 1748.03 - lr: 0.000033 - momentum: 0.000000
2023-10-13 23:07:03,858 epoch 5 - iter 396/1984 - loss 0.04655195 - time (sec): 18.71 - samples/sec: 1766.51 - lr: 0.000032 - momentum: 0.000000
2023-10-13 23:07:12,979 epoch 5 - iter 594/1984 - loss 0.04653325 - time (sec): 27.83 - samples/sec: 1812.33 - lr: 0.000032 - momentum: 0.000000
2023-10-13 23:07:21,950 epoch 5 - iter 792/1984 - loss 0.04794018 - time (sec): 36.80 - samples/sec: 1805.15 - lr: 0.000031 - momentum: 0.000000
2023-10-13 23:07:30,926 epoch 5 - iter 990/1984 - loss 0.04840007 - time (sec): 45.78 - samples/sec: 1811.22 - lr: 0.000031 - momentum: 0.000000
2023-10-13 23:07:39,949 epoch 5 - iter 1188/1984 - loss 0.05004291 - time (sec): 54.80 - samples/sec: 1804.77 - lr: 0.000030 - momentum: 0.000000
2023-10-13 23:07:48,863 epoch 5 - iter 1386/1984 - loss 0.05058820 - time (sec): 63.72 - samples/sec: 1796.18 - lr: 0.000029 - momentum: 0.000000
2023-10-13 23:07:57,887 epoch 5 - iter 1584/1984 - loss 0.04983358 - time (sec): 72.74 - samples/sec: 1795.40 - lr: 0.000029 - momentum: 0.000000
2023-10-13 23:08:06,980 epoch 5 - iter 1782/1984 - loss 0.05083169 - time (sec): 81.83 - samples/sec: 1808.28 - lr: 0.000028 - momentum: 0.000000
2023-10-13 23:08:15,845 epoch 5 - iter 1980/1984 - loss 0.05227801 - time (sec): 90.70 - samples/sec: 1802.51 - lr: 0.000028 - momentum: 0.000000
2023-10-13 23:08:16,081 ----------------------------------------------------------------------------------------------------
2023-10-13 23:08:16,081 EPOCH 5 done: loss 0.0523 - lr: 0.000028
2023-10-13 23:08:19,603 DEV : loss 0.1757950484752655 - f1-score (micro avg) 0.7443
2023-10-13 23:08:19,628 saving best model
2023-10-13 23:08:20,177 ----------------------------------------------------------------------------------------------------
2023-10-13 23:08:29,321 epoch 6 - iter 198/1984 - loss 0.03528770 - time (sec): 9.14 - samples/sec: 1719.36 - lr: 0.000027 - momentum: 0.000000
2023-10-13 23:08:38,245 epoch 6 - iter 396/1984 - loss 0.03920138 - time (sec): 18.06 - samples/sec: 1763.57 - lr: 0.000027 - momentum: 0.000000
2023-10-13 23:08:47,305 epoch 6 - iter 594/1984 - loss 0.03595233 - time (sec): 27.12 - samples/sec: 1781.96 - lr: 0.000026 - momentum: 0.000000
2023-10-13 23:08:56,590 epoch 6 - iter 792/1984 - loss 0.03547612 - time (sec): 36.41 - samples/sec: 1789.17 - lr: 0.000026 - momentum: 0.000000
2023-10-13 23:09:05,553 epoch 6 - iter 990/1984 - loss 0.03608526 - time (sec): 45.37 - samples/sec: 1796.82 - lr: 0.000025 - momentum: 0.000000
2023-10-13 23:09:14,598 epoch 6 - iter 1188/1984 - loss 0.03810704 - time (sec): 54.42 - samples/sec: 1800.27 - lr: 0.000024 - momentum: 0.000000
2023-10-13 23:09:24,131 epoch 6 - iter 1386/1984 - loss 0.03855158 - time (sec): 63.95 - samples/sec: 1795.20 - lr: 0.000024 - momentum: 0.000000
2023-10-13 23:09:33,149 epoch 6 - iter 1584/1984 - loss 0.03809866 - time (sec): 72.97 - samples/sec: 1796.32 - lr: 0.000023 - momentum: 0.000000
2023-10-13 23:09:42,072 epoch 6 - iter 1782/1984 - loss 0.03755127 - time (sec): 81.89 - samples/sec: 1789.44 - lr: 0.000023 - momentum: 0.000000
2023-10-13 23:09:51,030 epoch 6 - iter 1980/1984 - loss 0.03702409 - time (sec): 90.85 - samples/sec: 1799.62 - lr: 0.000022 - momentum: 0.000000
2023-10-13 23:09:51,217 ----------------------------------------------------------------------------------------------------
2023-10-13 23:09:51,217 EPOCH 6 done: loss 0.0370 - lr: 0.000022
2023-10-13 23:09:54,643 DEV : loss 0.19313980638980865 - f1-score (micro avg) 0.7517
2023-10-13 23:09:54,664 saving best model
2023-10-13 23:09:55,240 ----------------------------------------------------------------------------------------------------
2023-10-13 23:10:04,360 epoch 7 - iter 198/1984 - loss 0.02702124 - time (sec): 9.12 - samples/sec: 1881.77 - lr: 0.000022 - momentum: 0.000000
2023-10-13 23:10:13,373 epoch 7 - iter 396/1984 - loss 0.03084283 - time (sec): 18.13 - samples/sec: 1860.44 - lr: 0.000021 - momentum: 0.000000
2023-10-13 23:10:22,333 epoch 7 - iter 594/1984 - loss 0.02973583 - time (sec): 27.09 - samples/sec: 1859.23 - lr: 0.000021 - momentum: 0.000000
2023-10-13 23:10:31,305 epoch 7 - iter 792/1984 - loss 0.02758317 - time (sec): 36.06 - samples/sec: 1855.17 - lr: 0.000020 - momentum: 0.000000
2023-10-13 23:10:40,287 epoch 7 - iter 990/1984 - loss 0.02869188 - time (sec): 45.04 - samples/sec: 1847.92 - lr: 0.000019 - momentum: 0.000000
2023-10-13 23:10:49,298 epoch 7 - iter 1188/1984 - loss 0.02940933 - time (sec): 54.06 - samples/sec: 1857.29 - lr: 0.000019 - momentum: 0.000000
2023-10-13 23:10:58,316 epoch 7 - iter 1386/1984 - loss 0.02944737 - time (sec): 63.07 - samples/sec: 1836.08 - lr: 0.000018 - momentum: 0.000000
2023-10-13 23:11:07,290 epoch 7 - iter 1584/1984 - loss 0.02850037 - time (sec): 72.05 - samples/sec: 1830.44 - lr: 0.000018 - momentum: 0.000000
2023-10-13 23:11:16,230 epoch 7 - iter 1782/1984 - loss 0.02829835 - time (sec): 80.99 - samples/sec: 1830.09 - lr: 0.000017 - momentum: 0.000000
2023-10-13 23:11:25,052 epoch 7 - iter 1980/1984 - loss 0.02778118 - time (sec): 89.81 - samples/sec: 1822.07 - lr: 0.000017 - momentum: 0.000000
2023-10-13 23:11:25,233 ----------------------------------------------------------------------------------------------------
2023-10-13 23:11:25,233 EPOCH 7 done: loss 0.0277 - lr: 0.000017
2023-10-13 23:11:28,674 DEV : loss 0.21592207252979279 - f1-score (micro avg) 0.7685
2023-10-13 23:11:28,702 saving best model
2023-10-13 23:11:29,202 ----------------------------------------------------------------------------------------------------
2023-10-13 23:11:38,502 epoch 8 - iter 198/1984 - loss 0.03265154 - time (sec): 9.30 - samples/sec: 1784.08 - lr: 0.000016 - momentum: 0.000000
2023-10-13 23:11:47,488 epoch 8 - iter 396/1984 - loss 0.02692312 - time (sec): 18.28 - samples/sec: 1794.02 - lr: 0.000016 - momentum: 0.000000
2023-10-13 23:11:56,547 epoch 8 - iter 594/1984 - loss 0.02493041 - time (sec): 27.34 - samples/sec: 1802.68 - lr: 0.000015 - momentum: 0.000000
2023-10-13 23:12:05,472 epoch 8 - iter 792/1984 - loss 0.02398548 - time (sec): 36.27 - samples/sec: 1800.13 - lr: 0.000014 - momentum: 0.000000
2023-10-13 23:12:14,426 epoch 8 - iter 990/1984 - loss 0.02207354 - time (sec): 45.22 - samples/sec: 1797.01 - lr: 0.000014 - momentum: 0.000000
2023-10-13 23:12:23,548 epoch 8 - iter 1188/1984 - loss 0.02261054 - time (sec): 54.34 - samples/sec: 1804.03 - lr: 0.000013 - momentum: 0.000000
2023-10-13 23:12:33,150 epoch 8 - iter 1386/1984 - loss 0.02246274 - time (sec): 63.94 - samples/sec: 1797.62 - lr: 0.000013 - momentum: 0.000000
2023-10-13 23:12:42,232 epoch 8 - iter 1584/1984 - loss 0.02217940 - time (sec): 73.02 - samples/sec: 1793.62 - lr: 0.000012 - momentum: 0.000000
2023-10-13 23:12:51,634 epoch 8 - iter 1782/1984 - loss 0.02147734 - time (sec): 82.43 - samples/sec: 1783.43 - lr: 0.000012 - momentum: 0.000000
2023-10-13 23:13:00,581 epoch 8 - iter 1980/1984 - loss 0.02100390 - time (sec): 91.37 - samples/sec: 1790.99 - lr: 0.000011 - momentum: 0.000000
2023-10-13 23:13:00,764 ----------------------------------------------------------------------------------------------------
2023-10-13 23:13:00,764 EPOCH 8 done: loss 0.0211 - lr: 0.000011
2023-10-13 23:13:04,598 DEV : loss 0.21091753244400024 - f1-score (micro avg) 0.7553
2023-10-13 23:13:04,618 ----------------------------------------------------------------------------------------------------
2023-10-13 23:13:13,513 epoch 9 - iter 198/1984 - loss 0.00733228 - time (sec): 8.89 - samples/sec: 1730.22 - lr: 0.000011 - momentum: 0.000000
2023-10-13 23:13:22,696 epoch 9 - iter 396/1984 - loss 0.01112380 - time (sec): 18.08 - samples/sec: 1725.38 - lr: 0.000010 - momentum: 0.000000
2023-10-13 23:13:31,864 epoch 9 - iter 594/1984 - loss 0.01020664 - time (sec): 27.24 - samples/sec: 1773.39 - lr: 0.000009 - momentum: 0.000000
2023-10-13 23:13:40,951 epoch 9 - iter 792/1984 - loss 0.01052024 - time (sec): 36.33 - samples/sec: 1789.51 - lr: 0.000009 - momentum: 0.000000
2023-10-13 23:13:50,232 epoch 9 - iter 990/1984 - loss 0.01260968 - time (sec): 45.61 - samples/sec: 1813.60 - lr: 0.000008 - momentum: 0.000000
2023-10-13 23:13:59,464 epoch 9 - iter 1188/1984 - loss 0.01177559 - time (sec): 54.84 - samples/sec: 1806.82 - lr: 0.000008 - momentum: 0.000000
2023-10-13 23:14:08,417 epoch 9 - iter 1386/1984 - loss 0.01184590 - time (sec): 63.80 - samples/sec: 1806.92 - lr: 0.000007 - momentum: 0.000000
2023-10-13 23:14:17,430 epoch 9 - iter 1584/1984 - loss 0.01232048 - time (sec): 72.81 - samples/sec: 1806.87 - lr: 0.000007 - momentum: 0.000000
2023-10-13 23:14:26,329 epoch 9 - iter 1782/1984 - loss 0.01255026 - time (sec): 81.71 - samples/sec: 1801.94 - lr: 0.000006 - momentum: 0.000000
2023-10-13 23:14:35,399 epoch 9 - iter 1980/1984 - loss 0.01253169 - time (sec): 90.78 - samples/sec: 1803.20 - lr: 0.000006 - momentum: 0.000000
2023-10-13 23:14:35,581 ----------------------------------------------------------------------------------------------------
2023-10-13 23:14:35,581 EPOCH 9 done: loss 0.0126 - lr: 0.000006
2023-10-13 23:14:39,097 DEV : loss 0.22775374352931976 - f1-score (micro avg) 0.7603
2023-10-13 23:14:39,119 ----------------------------------------------------------------------------------------------------
2023-10-13 23:14:48,264 epoch 10 - iter 198/1984 - loss 0.00535780 - time (sec): 9.14 - samples/sec: 1683.98 - lr: 0.000005 - momentum: 0.000000
2023-10-13 23:14:57,214 epoch 10 - iter 396/1984 - loss 0.00645843 - time (sec): 18.09 - samples/sec: 1743.85 - lr: 0.000004 - momentum: 0.000000
2023-10-13 23:15:06,164 epoch 10 - iter 594/1984 - loss 0.00674720 - time (sec): 27.04 - samples/sec: 1745.58 - lr: 0.000004 - momentum: 0.000000
2023-10-13 23:15:15,417 epoch 10 - iter 792/1984 - loss 0.00765501 - time (sec): 36.30 - samples/sec: 1746.87 - lr: 0.000003 - momentum: 0.000000
2023-10-13 23:15:24,647 epoch 10 - iter 990/1984 - loss 0.00817023 - time (sec): 45.53 - samples/sec: 1751.12 - lr: 0.000003 - momentum: 0.000000
2023-10-13 23:15:33,679 epoch 10 - iter 1188/1984 - loss 0.00874557 - time (sec): 54.56 - samples/sec: 1774.18 - lr: 0.000002 - momentum: 0.000000
2023-10-13 23:15:42,787 epoch 10 - iter 1386/1984 - loss 0.00917449 - time (sec): 63.67 - samples/sec: 1798.37 - lr: 0.000002 - momentum: 0.000000
2023-10-13 23:15:51,803 epoch 10 - iter 1584/1984 - loss 0.00867989 - time (sec): 72.68 - samples/sec: 1805.44 - lr: 0.000001 - momentum: 0.000000
2023-10-13 23:16:00,811 epoch 10 - iter 1782/1984 - loss 0.00885292 - time (sec): 81.69 - samples/sec: 1803.59 - lr: 0.000001 - momentum: 0.000000
2023-10-13 23:16:09,829 epoch 10 - iter 1980/1984 - loss 0.00933134 - time (sec): 90.71 - samples/sec: 1804.64 - lr: 0.000000 - momentum: 0.000000
2023-10-13 23:16:10,010 ----------------------------------------------------------------------------------------------------
2023-10-13 23:16:10,010 EPOCH 10 done: loss 0.0093 - lr: 0.000000
2023-10-13 23:16:13,937 DEV : loss 0.2285892367362976 - f1-score (micro avg) 0.7591
2023-10-13 23:16:14,432 ----------------------------------------------------------------------------------------------------
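The per-iteration lines above follow a fixed format, so the loss, throughput, and learning-rate curves can be recovered from the log with a small parser. An illustrative sketch (the regex targets the exact line format shown above; it is not part of Flair):

```python
import re

# Matches e.g. "epoch 10 - iter 1980/1984 - loss 0.00933134 - time (sec): 90.71
#               - samples/sec: 1804.64 - lr: 0.000000 - momentum: 0.000000"
LINE = re.compile(
    r"epoch (?P<epoch>\d+) - iter (?P<iter>\d+)/(?P<total>\d+) - "
    r"loss (?P<loss>[\d.]+) - time \(sec\): (?P<time>[\d.]+) - "
    r"samples/sec: (?P<sps>[\d.]+) - lr: (?P<lr>[\d.]+)"
)

def parse(line):
    """Return the numeric fields of one iteration line, or None if it is not one."""
    m = LINE.search(line)
    return {k: float(v) for k, v in m.groupdict().items()} if m else None

row = parse("2023-10-13 23:16:09,829 epoch 10 - iter 1980/1984 - loss 0.00933134 "
            "- time (sec): 90.71 - samples/sec: 1804.64 - lr: 0.000000 - momentum: 0.000000")
```

Applying `parse` to every line of the file and dropping the `None`s yields a table ready for plotting loss against the linear lr schedule.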
2023-10-13 23:16:14,434 Loading model from best epoch ...
2023-10-13 23:16:15,912 SequenceTagger predicts: Dictionary with 13 tags: O, S-PER, B-PER, E-PER, I-PER, S-LOC, B-LOC, E-LOC, I-LOC, S-ORG, B-ORG, E-ORG, I-ORG
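The 13-tag dictionary is a BIOES scheme (Single, Begin, Inside, End, Outside) over the PER, LOC, and ORG labels, which also accounts for the 13 output units of the linear layer in the model summary. Decoding a predicted tag sequence into entity spans can be sketched as follows (an illustrative helper, not Flair's internal decoder):

```python
def bioes_to_spans(tags):
    """Convert a BIOES tag sequence into (label, start, end) spans, end exclusive."""
    spans, start, label = [], None, None
    for i, tag in enumerate(tags):
        prefix, _, lab = tag.partition("-")
        if prefix == "S":                       # single-token entity
            spans.append((lab, i, i + 1))
            start, label = None, None
        elif prefix == "B":                     # entity opens
            start, label = i, lab
        elif prefix == "E" and start is not None and lab == label:
            spans.append((label, start, i + 1))  # entity closes
            start, label = None, None
        elif prefix != "I":                     # "O" or a malformed sequence: reset
            start, label = None, None
    return spans

# ["S-LOC", "O", "B-PER", "I-PER", "E-PER"] -> [("LOC", 0, 1), ("PER", 2, 5)]
```

Span-level evaluation like the table below counts a prediction as correct only when the whole decoded span and its label match the gold span.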
2023-10-13 23:16:19,423
Results:
- F-score (micro) 0.783
- F-score (macro) 0.676
- Accuracy 0.6661

By class:
              precision    recall  f1-score   support

         LOC     0.8229    0.8794    0.8502       655
         PER     0.7125    0.7668    0.7387       223
         ORG     0.5769    0.3543    0.4390       127

   micro avg     0.7780    0.7881    0.7830      1005
   macro avg     0.7041    0.6668    0.6760      1005
weighted avg     0.7673    0.7881    0.7735      1005

2023-10-13 23:16:19,423 ----------------------------------------------------------------------------------------------------
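The averages in the table follow from the per-class counts: micro averaging pools true positives, predictions, and gold spans across classes, while macro averaging takes the unweighted mean of the per-class F1 scores. With supports of 655 (LOC), 223 (PER), and 127 (ORG), the per-class precision/recall values imply roughly 576, 171, and 45 true positives against 700, 240, and 78 predicted spans. A quick cross-check (the counts are inferred from the table, so treat them as approximate):

```python
# Per class: (true_positives, predicted_spans, gold_spans), inferred from the table
counts = {"LOC": (576, 700, 655), "PER": (171, 240, 223), "ORG": (45, 78, 127)}

tp = sum(c[0] for c in counts.values())      # pooled true positives
pred = sum(c[1] for c in counts.values())    # pooled predictions
gold = sum(c[2] for c in counts.values())    # pooled gold spans (support)

micro_p, micro_r = tp / pred, tp / gold
micro_f1 = 2 * micro_p * micro_r / (micro_p + micro_r)

def f1(t, p, g):
    prec, rec = t / p, t / g
    return 2 * prec * rec / (prec + rec)

macro_f1 = sum(f1(*c) for c in counts.values()) / len(counts)

print(round(micro_f1, 3), round(macro_f1, 3))  # 0.783 0.676
```

The gap between the two averages (0.783 vs 0.676) is driven by ORG: with only 127 gold spans and recall 0.3543 it drags the unweighted macro mean down, while the LOC-dominated micro average barely notices.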