2023-10-17 12:30:13,946 ----------------------------------------------------------------------------------------------------
2023-10-17 12:30:13,947 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-17 12:30:13,947 ----------------------------------------------------------------------------------------------------
2023-10-17 12:30:13,947 MultiCorpus: 7936 train + 992 dev + 992 test sentences
 - NER_ICDAR_EUROPEANA Corpus: 7936 train + 992 dev + 992 test sentences - /root/.flair/datasets/ner_icdar_europeana/fr
2023-10-17 12:30:13,947 ----------------------------------------------------------------------------------------------------
2023-10-17 12:30:13,947 Train: 7936 sentences
2023-10-17 12:30:13,947 (train_with_dev=False, train_with_test=False)
2023-10-17 12:30:13,947 ----------------------------------------------------------------------------------------------------
2023-10-17 12:30:13,948 Training Params:
2023-10-17 12:30:13,948 - learning_rate: "3e-05"
2023-10-17 12:30:13,948 - mini_batch_size: "4"
2023-10-17 12:30:13,948 - max_epochs: "10"
2023-10-17 12:30:13,948 - shuffle: "True"
2023-10-17 12:30:13,948 ----------------------------------------------------------------------------------------------------
2023-10-17 12:30:13,948 Plugins:
2023-10-17 12:30:13,948 - TensorboardLogger
2023-10-17 12:30:13,948 - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 12:30:13,948 ----------------------------------------------------------------------------------------------------
2023-10-17 12:30:13,948 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 12:30:13,948 - metric: "('micro avg', 'f1-score')"
2023-10-17 12:30:13,948 ----------------------------------------------------------------------------------------------------
2023-10-17 12:30:13,948 Computation:
2023-10-17 12:30:13,948 - compute on device: cuda:0
2023-10-17 12:30:13,948 - embedding storage: none
2023-10-17 12:30:13,948 ----------------------------------------------------------------------------------------------------
2023-10-17 12:30:13,948 Model training base path: "hmbench-icdar/fr-hmteams/teams-base-historic-multilingual-discriminator-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-3"
2023-10-17 12:30:13,948 ----------------------------------------------------------------------------------------------------
2023-10-17 12:30:13,948 ----------------------------------------------------------------------------------------------------
2023-10-17 12:30:13,948 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-17 12:30:23,424 epoch 1 - iter 198/1984 - loss 2.38234386 - time (sec): 9.47 - samples/sec: 1802.33 - lr: 0.000003 - momentum: 0.000000
2023-10-17 12:30:32,782 epoch 1 - iter 396/1984 - loss 1.42791489 - time (sec): 18.83 - samples/sec: 1754.21 - lr: 0.000006 - momentum: 0.000000
2023-10-17 12:30:42,378 epoch 1 - iter 594/1984 - loss 1.04332496 - time (sec): 28.43 - samples/sec: 1760.25 - lr: 0.000009 - momentum: 0.000000
2023-10-17 12:30:51,670 epoch 1 - iter 792/1984 - loss 0.84433293 - time (sec): 37.72 - samples/sec: 1746.56 - lr: 0.000012 - momentum: 0.000000
2023-10-17 12:31:01,111 epoch 1 - iter 990/1984 - loss 0.70291310 - time (sec): 47.16 - samples/sec: 1760.29 - lr: 0.000015 - momentum: 0.000000
2023-10-17 12:31:10,437 epoch 1 - iter 1188/1984 - loss 0.60723553 - time (sec): 56.49 - samples/sec: 1779.05 - lr: 0.000018 - momentum: 0.000000
2023-10-17 12:31:19,931 epoch 1 - iter 1386/1984 - loss 0.53958743 - time (sec): 65.98 - samples/sec: 1783.04 - lr: 0.000021 - momentum: 0.000000
2023-10-17 12:31:29,162 epoch 1 - iter 1584/1984 - loss 0.49563480 - time (sec): 75.21 - samples/sec: 1767.27 - lr: 0.000024 - momentum: 0.000000
2023-10-17 12:31:38,675 epoch 1 - iter 1782/1984 - loss 0.46032639 - time (sec): 84.73 - samples/sec: 1747.00 - lr: 0.000027 - momentum: 0.000000
2023-10-17 12:31:48,550 epoch 1 - iter 1980/1984 - loss 0.42801054 - time (sec): 94.60 - samples/sec: 1730.52 - lr: 0.000030 - momentum: 0.000000
2023-10-17 12:31:48,732 ----------------------------------------------------------------------------------------------------
2023-10-17 12:31:48,733 EPOCH 1 done: loss 0.4277 - lr: 0.000030
2023-10-17 12:31:52,355 DEV : loss 0.08888775110244751 - f1-score (micro avg) 0.7141
2023-10-17 12:31:52,388 saving best model
2023-10-17 12:31:52,812 ----------------------------------------------------------------------------------------------------
2023-10-17 12:32:02,080 epoch 2 - iter 198/1984 - loss 0.12840545 - time (sec): 9.27 - samples/sec: 1627.82 - lr: 0.000030 - momentum: 0.000000
2023-10-17 12:32:11,544 epoch 2 - iter 396/1984 - loss 0.12995343 - time (sec): 18.73 - samples/sec: 1685.97 - lr: 0.000029 - momentum: 0.000000
2023-10-17 12:32:21,075 epoch 2 - iter 594/1984 - loss 0.13177287 - time (sec): 28.26 - samples/sec: 1661.53 - lr: 0.000029 - momentum: 0.000000
2023-10-17 12:32:30,451 epoch 2 - iter 792/1984 - loss 0.12718319 - time (sec): 37.64 - samples/sec: 1686.76 - lr: 0.000029 - momentum: 0.000000
2023-10-17 12:32:39,752 epoch 2 - iter 990/1984 - loss 0.12715837 - time (sec): 46.94 - samples/sec: 1710.38 - lr: 0.000028 - momentum: 0.000000
2023-10-17 12:32:48,979 epoch 2 - iter 1188/1984 - loss 0.12525218 - time (sec): 56.17 - samples/sec: 1721.81 - lr: 0.000028 - momentum: 0.000000
2023-10-17 12:32:58,260 epoch 2 - iter 1386/1984 - loss 0.12111745 - time (sec): 65.45 - samples/sec: 1721.22 - lr: 0.000028 - momentum: 0.000000
2023-10-17 12:33:07,791 epoch 2 - iter 1584/1984 - loss 0.12050247 - time (sec): 74.98 - samples/sec: 1725.69 - lr: 0.000027 - momentum: 0.000000
2023-10-17 12:33:17,055 epoch 2 - iter 1782/1984 - loss 0.11986072 - time (sec): 84.24 - samples/sec: 1742.62 - lr: 0.000027 - momentum: 0.000000
2023-10-17 12:33:26,450 epoch 2 - iter 1980/1984 - loss 0.11870868 - time (sec): 93.64 - samples/sec: 1748.11 - lr: 0.000027 - momentum: 0.000000
2023-10-17 12:33:26,633 ----------------------------------------------------------------------------------------------------
2023-10-17 12:33:26,633 EPOCH 2 done: loss 0.1185 - lr: 0.000027
2023-10-17 12:33:30,335 DEV : loss 0.10451896488666534 - f1-score (micro avg) 0.7209
2023-10-17 12:33:30,357 saving best model
2023-10-17 12:33:31,401 ----------------------------------------------------------------------------------------------------
2023-10-17 12:33:40,719 epoch 3 - iter 198/1984 - loss 0.08984418 - time (sec): 9.31 - samples/sec: 1711.83 - lr: 0.000026 - momentum: 0.000000
2023-10-17 12:33:50,267 epoch 3 - iter 396/1984 - loss 0.08580776 - time (sec): 18.86 - samples/sec: 1733.27 - lr: 0.000026 - momentum: 0.000000
2023-10-17 12:33:59,560 epoch 3 - iter 594/1984 - loss 0.08686906 - time (sec): 28.15 - samples/sec: 1740.78 - lr: 0.000026 - momentum: 0.000000
2023-10-17 12:34:08,951 epoch 3 - iter 792/1984 - loss 0.08819915 - time (sec): 37.54 - samples/sec: 1738.69 - lr: 0.000025 - momentum: 0.000000
2023-10-17 12:34:18,364 epoch 3 - iter 990/1984 - loss 0.08990454 - time (sec): 46.95 - samples/sec: 1721.05 - lr: 0.000025 - momentum: 0.000000
2023-10-17 12:34:27,571 epoch 3 - iter 1188/1984 - loss 0.08992539 - time (sec): 56.16 - samples/sec: 1725.40 - lr: 0.000025 - momentum: 0.000000
2023-10-17 12:34:36,738 epoch 3 - iter 1386/1984 - loss 0.08943909 - time (sec): 65.33 - samples/sec: 1739.46 - lr: 0.000024 - momentum: 0.000000
2023-10-17 12:34:46,042 epoch 3 - iter 1584/1984 - loss 0.08839958 - time (sec): 74.63 - samples/sec: 1763.54 - lr: 0.000024 - momentum: 0.000000
2023-10-17 12:34:55,179 epoch 3 - iter 1782/1984 - loss 0.08821887 - time (sec): 83.77 - samples/sec: 1763.87 - lr: 0.000024 - momentum: 0.000000
2023-10-17 12:35:04,417 epoch 3 - iter 1980/1984 - loss 0.08832888 - time (sec): 93.01 - samples/sec: 1759.91 - lr: 0.000023 - momentum: 0.000000
2023-10-17 12:35:04,605 ----------------------------------------------------------------------------------------------------
2023-10-17 12:35:04,606 EPOCH 3 done: loss 0.0882 - lr: 0.000023
2023-10-17 12:35:08,254 DEV : loss 0.11844022572040558 - f1-score (micro avg) 0.7373
2023-10-17 12:35:08,282 saving best model
2023-10-17 12:35:08,809 ----------------------------------------------------------------------------------------------------
2023-10-17 12:35:18,197 epoch 4 - iter 198/1984 - loss 0.07187323 - time (sec): 9.39 - samples/sec: 1824.51 - lr: 0.000023 - momentum: 0.000000
2023-10-17 12:35:27,464 epoch 4 - iter 396/1984 - loss 0.06422557 - time (sec): 18.65 - samples/sec: 1765.13 - lr: 0.000023 - momentum: 0.000000
2023-10-17 12:35:36,725 epoch 4 - iter 594/1984 - loss 0.05702430 - time (sec): 27.91 - samples/sec: 1754.53 - lr: 0.000022 - momentum: 0.000000
2023-10-17 12:35:46,077 epoch 4 - iter 792/1984 - loss 0.05974918 - time (sec): 37.27 - samples/sec: 1745.22 - lr: 0.000022 - momentum: 0.000000
2023-10-17 12:35:55,423 epoch 4 - iter 990/1984 - loss 0.05976304 - time (sec): 46.61 - samples/sec: 1756.14 - lr: 0.000022 - momentum: 0.000000
2023-10-17 12:36:04,760 epoch 4 - iter 1188/1984 - loss 0.06152848 - time (sec): 55.95 - samples/sec: 1743.51 - lr: 0.000021 - momentum: 0.000000
2023-10-17 12:36:14,344 epoch 4 - iter 1386/1984 - loss 0.06363359 - time (sec): 65.53 - samples/sec: 1746.73 - lr: 0.000021 - momentum: 0.000000
2023-10-17 12:36:23,697 epoch 4 - iter 1584/1984 - loss 0.06505279 - time (sec): 74.89 - samples/sec: 1749.25 - lr: 0.000021 - momentum: 0.000000
2023-10-17 12:36:33,075 epoch 4 - iter 1782/1984 - loss 0.06419124 - time (sec): 84.26 - samples/sec: 1749.67 - lr: 0.000020 - momentum: 0.000000
2023-10-17 12:36:42,532 epoch 4 - iter 1980/1984 - loss 0.06501747 - time (sec): 93.72 - samples/sec: 1746.19 - lr: 0.000020 - momentum: 0.000000
2023-10-17 12:36:42,710 ----------------------------------------------------------------------------------------------------
2023-10-17 12:36:42,711 EPOCH 4 done: loss 0.0650 - lr: 0.000020
2023-10-17 12:36:46,392 DEV : loss 0.1447092890739441 - f1-score (micro avg) 0.74
2023-10-17 12:36:46,416 saving best model
2023-10-17 12:36:47,006 ----------------------------------------------------------------------------------------------------
2023-10-17 12:36:56,365 epoch 5 - iter 198/1984 - loss 0.04826648 - time (sec): 9.36 - samples/sec: 1762.56 - lr: 0.000020 - momentum: 0.000000
2023-10-17 12:37:05,582 epoch 5 - iter 396/1984 - loss 0.04460051 - time (sec): 18.57 - samples/sec: 1778.58 - lr: 0.000019 - momentum: 0.000000
2023-10-17 12:37:14,761 epoch 5 - iter 594/1984 - loss 0.04859344 - time (sec): 27.75 - samples/sec: 1754.84 - lr: 0.000019 - momentum: 0.000000
2023-10-17 12:37:24,133 epoch 5 - iter 792/1984 - loss 0.04917495 - time (sec): 37.13 - samples/sec: 1747.55 - lr: 0.000019 - momentum: 0.000000
2023-10-17 12:37:33,311 epoch 5 - iter 990/1984 - loss 0.05024337 - time (sec): 46.30 - samples/sec: 1758.51 - lr: 0.000018 - momentum: 0.000000
2023-10-17 12:37:42,390 epoch 5 - iter 1188/1984 - loss 0.04949250 - time (sec): 55.38 - samples/sec: 1746.11 - lr: 0.000018 - momentum: 0.000000
2023-10-17 12:37:51,559 epoch 5 - iter 1386/1984 - loss 0.04899176 - time (sec): 64.55 - samples/sec: 1763.07 - lr: 0.000018 - momentum: 0.000000
2023-10-17 12:38:01,063 epoch 5 - iter 1584/1984 - loss 0.04885371 - time (sec): 74.06 - samples/sec: 1757.50 - lr: 0.000017 - momentum: 0.000000
2023-10-17 12:38:11,039 epoch 5 - iter 1782/1984 - loss 0.04868603 - time (sec): 84.03 - samples/sec: 1746.55 - lr: 0.000017 - momentum: 0.000000
2023-10-17 12:38:20,399 epoch 5 - iter 1980/1984 - loss 0.04905152 - time (sec): 93.39 - samples/sec: 1752.32 - lr: 0.000017 - momentum: 0.000000
2023-10-17 12:38:20,600 ----------------------------------------------------------------------------------------------------
2023-10-17 12:38:20,600 EPOCH 5 done: loss 0.0490 - lr: 0.000017
2023-10-17 12:38:24,317 DEV : loss 0.16858862340450287 - f1-score (micro avg) 0.7723
2023-10-17 12:38:24,353 saving best model
2023-10-17 12:38:24,882 ----------------------------------------------------------------------------------------------------
2023-10-17 12:38:34,941 epoch 6 - iter 198/1984 - loss 0.03580324 - time (sec): 10.05 - samples/sec: 1626.98 - lr: 0.000016 - momentum: 0.000000
2023-10-17 12:38:44,376 epoch 6 - iter 396/1984 - loss 0.03921053 - time (sec): 19.49 - samples/sec: 1675.47 - lr: 0.000016 - momentum: 0.000000
2023-10-17 12:38:53,527 epoch 6 - iter 594/1984 - loss 0.04029851 - time (sec): 28.64 - samples/sec: 1719.10 - lr: 0.000016 - momentum: 0.000000
2023-10-17 12:39:02,801 epoch 6 - iter 792/1984 - loss 0.03891771 - time (sec): 37.91 - samples/sec: 1733.83 - lr: 0.000015 - momentum: 0.000000
2023-10-17 12:39:12,060 epoch 6 - iter 990/1984 - loss 0.03830666 - time (sec): 47.17 - samples/sec: 1727.45 - lr: 0.000015 - momentum: 0.000000
2023-10-17 12:39:21,102 epoch 6 - iter 1188/1984 - loss 0.03802511 - time (sec): 56.21 - samples/sec: 1716.50 - lr: 0.000015 - momentum: 0.000000
2023-10-17 12:39:30,203 epoch 6 - iter 1386/1984 - loss 0.03724256 - time (sec): 65.32 - samples/sec: 1728.66 - lr: 0.000014 - momentum: 0.000000
2023-10-17 12:39:39,417 epoch 6 - iter 1584/1984 - loss 0.03704625 - time (sec): 74.53 - samples/sec: 1749.30 - lr: 0.000014 - momentum: 0.000000
2023-10-17 12:39:48,403 epoch 6 - iter 1782/1984 - loss 0.03731915 - time (sec): 83.52 - samples/sec: 1751.47 - lr: 0.000014 - momentum: 0.000000
2023-10-17 12:39:57,620 epoch 6 - iter 1980/1984 - loss 0.03699281 - time (sec): 92.73 - samples/sec: 1764.63 - lr: 0.000013 - momentum: 0.000000
2023-10-17 12:39:57,804 ----------------------------------------------------------------------------------------------------
2023-10-17 12:39:57,804 EPOCH 6 done: loss 0.0370 - lr: 0.000013
2023-10-17 12:40:01,835 DEV : loss 0.18626417219638824 - f1-score (micro avg) 0.7649
2023-10-17 12:40:01,856 ----------------------------------------------------------------------------------------------------
2023-10-17 12:40:10,895 epoch 7 - iter 198/1984 - loss 0.02448812 - time (sec): 9.04 - samples/sec: 1728.08 - lr: 0.000013 - momentum: 0.000000
2023-10-17 12:40:20,200 epoch 7 - iter 396/1984 - loss 0.02596145 - time (sec): 18.34 - samples/sec: 1808.38 - lr: 0.000013 - momentum: 0.000000
2023-10-17 12:40:29,294 epoch 7 - iter 594/1984 - loss 0.02560325 - time (sec): 27.44 - samples/sec: 1795.83 - lr: 0.000012 - momentum: 0.000000
2023-10-17 12:40:38,357 epoch 7 - iter 792/1984 - loss 0.02967938 - time (sec): 36.50 - samples/sec: 1784.59 - lr: 0.000012 - momentum: 0.000000
2023-10-17 12:40:47,456 epoch 7 - iter 990/1984 - loss 0.02859136 - time (sec): 45.60 - samples/sec: 1786.09 - lr: 0.000012 - momentum: 0.000000
2023-10-17 12:40:56,662 epoch 7 - iter 1188/1984 - loss 0.02775822 - time (sec): 54.81 - samples/sec: 1792.79 - lr: 0.000011 - momentum: 0.000000
2023-10-17 12:41:05,904 epoch 7 - iter 1386/1984 - loss 0.02726516 - time (sec): 64.05 - samples/sec: 1799.31 - lr: 0.000011 - momentum: 0.000000
2023-10-17 12:41:15,602 epoch 7 - iter 1584/1984 - loss 0.02669701 - time (sec): 73.75 - samples/sec: 1781.74 - lr: 0.000011 - momentum: 0.000000
2023-10-17 12:41:24,753 epoch 7 - iter 1782/1984 - loss 0.02701107 - time (sec): 82.90 - samples/sec: 1779.19 - lr: 0.000010 - momentum: 0.000000
2023-10-17 12:41:33,839 epoch 7 - iter 1980/1984 - loss 0.02701648 - time (sec): 91.98 - samples/sec: 1779.84 - lr: 0.000010 - momentum: 0.000000
2023-10-17 12:41:34,032 ----------------------------------------------------------------------------------------------------
2023-10-17 12:41:34,033 EPOCH 7 done: loss 0.0270 - lr: 0.000010
2023-10-17 12:41:37,645 DEV : loss 0.21058925986289978 - f1-score (micro avg) 0.7653
2023-10-17 12:41:37,668 ----------------------------------------------------------------------------------------------------
2023-10-17 12:41:46,762 epoch 8 - iter 198/1984 - loss 0.01705926 - time (sec): 9.09 - samples/sec: 1768.31 - lr: 0.000010 - momentum: 0.000000
2023-10-17 12:41:56,004 epoch 8 - iter 396/1984 - loss 0.01786307 - time (sec): 18.33 - samples/sec: 1751.47 - lr: 0.000009 - momentum: 0.000000
2023-10-17 12:42:05,265 epoch 8 - iter 594/1984 - loss 0.01741473 - time (sec): 27.60 - samples/sec: 1751.47 - lr: 0.000009 - momentum: 0.000000
2023-10-17 12:42:14,328 epoch 8 - iter 792/1984 - loss 0.01765238 - time (sec): 36.66 - samples/sec: 1770.48 - lr: 0.000009 - momentum: 0.000000
2023-10-17 12:42:23,346 epoch 8 - iter 990/1984 - loss 0.01703398 - time (sec): 45.68 - samples/sec: 1791.83 - lr: 0.000008 - momentum: 0.000000
2023-10-17 12:42:32,394 epoch 8 - iter 1188/1984 - loss 0.01847245 - time (sec): 54.72 - samples/sec: 1777.38 - lr: 0.000008 - momentum: 0.000000
2023-10-17 12:42:41,750 epoch 8 - iter 1386/1984 - loss 0.01810602 - time (sec): 64.08 - samples/sec: 1784.70 - lr: 0.000008 - momentum: 0.000000
2023-10-17 12:42:50,814 epoch 8 - iter 1584/1984 - loss 0.01757736 - time (sec): 73.14 - samples/sec: 1797.17 - lr: 0.000007 - momentum: 0.000000
2023-10-17 12:43:00,069 epoch 8 - iter 1782/1984 - loss 0.01826162 - time (sec): 82.40 - samples/sec: 1793.46 - lr: 0.000007 - momentum: 0.000000
2023-10-17 12:43:09,366 epoch 8 - iter 1980/1984 - loss 0.01810935 - time (sec): 91.70 - samples/sec: 1784.56 - lr: 0.000007 - momentum: 0.000000
2023-10-17 12:43:09,566 ----------------------------------------------------------------------------------------------------
2023-10-17 12:43:09,566 EPOCH 8 done: loss 0.0181 - lr: 0.000007
2023-10-17 12:43:13,069 DEV : loss 0.2294856458902359 - f1-score (micro avg) 0.7625
2023-10-17 12:43:13,092 ----------------------------------------------------------------------------------------------------
2023-10-17 12:43:22,433 epoch 9 - iter 198/1984 - loss 0.01158194 - time (sec): 9.34 - samples/sec: 1781.47 - lr: 0.000006 - momentum: 0.000000
2023-10-17 12:43:31,554 epoch 9 - iter 396/1984 - loss 0.01046138 - time (sec): 18.46 - samples/sec: 1815.26 - lr: 0.000006 - momentum: 0.000000
2023-10-17 12:43:40,615 epoch 9 - iter 594/1984 - loss 0.01195194 - time (sec): 27.52 - samples/sec: 1778.57 - lr: 0.000006 - momentum: 0.000000
2023-10-17 12:43:49,584 epoch 9 - iter 792/1984 - loss 0.01270015 - time (sec): 36.49 - samples/sec: 1793.86 - lr: 0.000005 - momentum: 0.000000
2023-10-17 12:43:58,665 epoch 9 - iter 990/1984 - loss 0.01275360 - time (sec): 45.57 - samples/sec: 1798.68 - lr: 0.000005 - momentum: 0.000000
2023-10-17 12:44:08,177 epoch 9 - iter 1188/1984 - loss 0.01293569 - time (sec): 55.08 - samples/sec: 1798.77 - lr: 0.000005 - momentum: 0.000000
2023-10-17 12:44:17,253 epoch 9 - iter 1386/1984 - loss 0.01193801 - time (sec): 64.16 - samples/sec: 1785.93 - lr: 0.000004 - momentum: 0.000000
2023-10-17 12:44:26,657 epoch 9 - iter 1584/1984 - loss 0.01171246 - time (sec): 73.56 - samples/sec: 1779.80 - lr: 0.000004 - momentum: 0.000000
2023-10-17 12:44:35,793 epoch 9 - iter 1782/1984 - loss 0.01204442 - time (sec): 82.70 - samples/sec: 1778.13 - lr: 0.000004 - momentum: 0.000000
2023-10-17 12:44:44,963 epoch 9 - iter 1980/1984 - loss 0.01199424 - time (sec): 91.87 - samples/sec: 1779.72 - lr: 0.000003 - momentum: 0.000000
2023-10-17 12:44:45,158 ----------------------------------------------------------------------------------------------------
2023-10-17 12:44:45,158 EPOCH 9 done: loss 0.0120 - lr: 0.000003
2023-10-17 12:44:48,773 DEV : loss 0.239894837141037 - f1-score (micro avg) 0.766
2023-10-17 12:44:48,802 ----------------------------------------------------------------------------------------------------
2023-10-17 12:44:58,606 epoch 10 - iter 198/1984 - loss 0.00778125 - time (sec): 9.80 - samples/sec: 1640.50 - lr: 0.000003 - momentum: 0.000000
2023-10-17 12:45:07,977 epoch 10 - iter 396/1984 - loss 0.01047784 - time (sec): 19.17 - samples/sec: 1692.10 - lr: 0.000003 - momentum: 0.000000
2023-10-17 12:45:17,019 epoch 10 - iter 594/1984 - loss 0.00869626 - time (sec): 28.21 - samples/sec: 1768.46 - lr: 0.000002 - momentum: 0.000000
2023-10-17 12:45:26,416 epoch 10 - iter 792/1984 - loss 0.00853870 - time (sec): 37.61 - samples/sec: 1758.41 - lr: 0.000002 - momentum: 0.000000
2023-10-17 12:45:35,450 epoch 10 - iter 990/1984 - loss 0.00825831 - time (sec): 46.65 - samples/sec: 1761.77 - lr: 0.000002 - momentum: 0.000000
2023-10-17 12:45:44,585 epoch 10 - iter 1188/1984 - loss 0.00873510 - time (sec): 55.78 - samples/sec: 1765.22 - lr: 0.000001 - momentum: 0.000000
2023-10-17 12:45:53,971 epoch 10 - iter 1386/1984 - loss 0.00826154 - time (sec): 65.17 - samples/sec: 1758.30 - lr: 0.000001 - momentum: 0.000000
2023-10-17 12:46:03,208 epoch 10 - iter 1584/1984 - loss 0.00813699 - time (sec): 74.40 - samples/sec: 1771.11 - lr: 0.000001 - momentum: 0.000000
2023-10-17 12:46:12,401 epoch 10 - iter 1782/1984 - loss 0.00820321 - time (sec): 83.60 - samples/sec: 1769.26 - lr: 0.000000 - momentum: 0.000000
2023-10-17 12:46:21,338 epoch 10 - iter 1980/1984 - loss 0.00880791 - time (sec): 92.53 - samples/sec: 1769.43 - lr: 0.000000 - momentum: 0.000000
2023-10-17 12:46:21,524 ----------------------------------------------------------------------------------------------------
2023-10-17 12:46:21,524 EPOCH 10 done: loss 0.0089 - lr: 0.000000
2023-10-17 12:46:25,165 DEV : loss 0.24632978439331055 - f1-score (micro avg) 0.7697
2023-10-17 12:46:25,612 ----------------------------------------------------------------------------------------------------
2023-10-17 12:46:25,614 Loading model from best epoch ...
2023-10-17 12:46:27,982 SequenceTagger predicts: Dictionary with 13 tags: O, S-PER, B-PER, E-PER, I-PER, S-LOC, B-LOC, E-LOC, I-LOC, S-ORG, B-ORG, E-ORG, I-ORG
2023-10-17 12:46:31,129 Results:
- F-score (micro) 0.788
- F-score (macro) 0.7077
- Accuracy 0.6683

By class:
              precision    recall  f1-score   support

         LOC     0.8242    0.8733    0.8480       655
         PER     0.7200    0.8072    0.7611       223
         ORG     0.5246    0.5039    0.5141       127

   micro avg     0.7655    0.8119    0.7880      1005
   macro avg     0.6896    0.7281    0.7077      1005
weighted avg     0.7632    0.8119    0.7865      1005

2023-10-17 12:46:31,129 ----------------------------------------------------------------------------------------------------
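
Reproduction sketch (not part of the original log): a minimal Flair script that approximates the configuration recorded in the header above, i.e. the hmteams/teams-base-historic-multilingual-discriminator backbone with last-layer, first-subtoken pooling, no CRF, mini-batch size 4, learning rate 3e-05, 10 epochs and 10% linear warmup. The corpus loader name NER_ICDAR_EUROPEANA and all keyword arguments are assumptions based on the Flair API and may need adjusting for your Flair version; the TensorboardLogger plugin from the log is omitted here.

# reproduce_icdar_fr.py -- hedged sketch, not the original training script
from flair.datasets import NER_ICDAR_EUROPEANA
from flair.embeddings import TransformerWordEmbeddings
from flair.models import SequenceTagger
from flair.trainers import ModelTrainer

# ICDAR-Europeana NER corpus, French split (7936 train / 992 dev / 992 test sentences);
# constructor argument is an assumption -- check your Flair version.
corpus = NER_ICDAR_EUROPEANA(language="fr")
label_dict = corpus.make_label_dictionary(label_type="ner")

# last transformer layer only, first-subtoken pooling, fine-tuned end to end
embeddings = TransformerWordEmbeddings(
    model="hmteams/teams-base-historic-multilingual-discriminator",
    layers="-1",
    subtoken_pooling="first",
    fine_tune=True,
)

# plain linear head on top of the embeddings, no CRF and no RNN ("crfFalse" in the run name)
tagger = SequenceTagger(
    embeddings=embeddings,
    tag_dictionary=label_dict,
    tag_type="ner",
    use_crf=False,
    use_rnn=False,
    reproject_embeddings=False,
)

# fine_tune() uses AdamW with a linear schedule and warmup, matching the
# LinearScheduler | warmup_fraction: '0.1' plugin and momentum 0.0 in the log
trainer = ModelTrainer(tagger, corpus)
trainer.fine_tune(
    "hmbench-icdar/fr-hmteams/teams-base-historic-multilingual-discriminator-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-3",
    learning_rate=3e-05,
    mini_batch_size=4,
    max_epochs=10,
)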
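
Usage sketch: the best-model.pt checkpoint referenced in the final evaluation lives under the training base path and can be loaded for tagging with the standard SequenceTagger.load / predict API. The example sentence below is arbitrary and only illustrates the call sequence; Flair decodes the BIOES tags listed above into PER/LOC/ORG spans.

# tag_example.py -- minimal inference sketch
from flair.data import Sentence
from flair.models import SequenceTagger

# path is relative to the training base path shown in the log header
tagger = SequenceTagger.load("best-model.pt")

# arbitrary example input, not taken from the corpus
sentence = Sentence("Le maire de Marseille a rencontré M. Dupont à Paris .")
tagger.predict(sentence)

# print the recognized named-entity spans with their labels and scores
for span in sentence.get_spans("ner"):
    print(span)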