2023-10-17 14:38:20,069 ----------------------------------------------------------------------------------------------------
2023-10-17 14:38:20,070 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-17 14:38:20,070 ----------------------------------------------------------------------------------------------------
2023-10-17 14:38:20,071 MultiCorpus: 7936 train + 992 dev + 992 test sentences
 - NER_ICDAR_EUROPEANA Corpus: 7936 train + 992 dev + 992 test sentences - /root/.flair/datasets/ner_icdar_europeana/fr
2023-10-17 14:38:20,071 ----------------------------------------------------------------------------------------------------
2023-10-17 14:38:20,071 Train: 7936 sentences
2023-10-17 14:38:20,071 (train_with_dev=False, train_with_test=False)
2023-10-17 14:38:20,071 ----------------------------------------------------------------------------------------------------
2023-10-17 14:38:20,071 Training Params:
2023-10-17 14:38:20,071 - learning_rate: "5e-05"
2023-10-17 14:38:20,071 - mini_batch_size: "4"
2023-10-17 14:38:20,071 - max_epochs: "10"
2023-10-17 14:38:20,071 - shuffle: "True"
2023-10-17 14:38:20,071 ----------------------------------------------------------------------------------------------------
2023-10-17 14:38:20,071 Plugins:
2023-10-17 14:38:20,071 - TensorboardLogger
2023-10-17 14:38:20,071 - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 14:38:20,071 ----------------------------------------------------------------------------------------------------
2023-10-17 14:38:20,071 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 14:38:20,071 - metric: "('micro avg', 'f1-score')"
2023-10-17 14:38:20,071 ----------------------------------------------------------------------------------------------------
2023-10-17 14:38:20,071 Computation:
2023-10-17 14:38:20,071 - compute on device: cuda:0
2023-10-17 14:38:20,071 - embedding storage: none
2023-10-17 14:38:20,071 ----------------------------------------------------------------------------------------------------
2023-10-17 14:38:20,071 Model training base path: "hmbench-icdar/fr-hmteams/teams-base-historic-multilingual-discriminator-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-5"
2023-10-17 14:38:20,071 ----------------------------------------------------------------------------------------------------
2023-10-17 14:38:20,071 ----------------------------------------------------------------------------------------------------
2023-10-17 14:38:20,071 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-17 14:38:29,225 epoch 1 - iter 198/1984 - loss 1.95545665 - time (sec): 9.15 - samples/sec: 1762.25 - lr: 0.000005 - momentum: 0.000000
2023-10-17 14:38:38,397 epoch 1 - iter 396/1984 - loss 1.12220599 - time (sec): 18.32 - samples/sec: 1831.40 - lr: 0.000010 - momentum: 0.000000
2023-10-17 14:38:47,433 epoch 1 - iter 594/1984 - loss 0.83052688 - time (sec): 27.36 - samples/sec: 1800.67 - lr: 0.000015 - momentum: 0.000000
2023-10-17 14:38:56,632 epoch 1 - iter 792/1984 - loss 0.67249115 - time (sec): 36.56 - samples/sec: 1784.22 - lr: 0.000020 - momentum: 0.000000
2023-10-17 14:39:05,428 epoch 1 - iter 990/1984 - loss 0.57894481 - time (sec): 45.36 - samples/sec: 1777.03 - lr: 0.000025 - momentum: 0.000000
2023-10-17 14:39:13,948 epoch 1 - iter 1188/1984 - loss 0.50617737 - time (sec): 53.88 - samples/sec: 1799.57 - lr: 0.000030 - momentum: 0.000000
2023-10-17 14:39:23,095 epoch 1 - iter 1386/1984 - loss 0.45160565 - time (sec): 63.02 - samples/sec: 1807.03 - lr: 0.000035 - momentum: 0.000000
2023-10-17 14:39:32,404 epoch 1 - iter 1584/1984 - loss 0.41081686 - time (sec): 72.33 - samples/sec: 1804.38 - lr: 0.000040 - momentum: 0.000000
2023-10-17 14:39:42,888 epoch 1 - iter 1782/1984 - loss 0.38018847 - time (sec): 82.82 - samples/sec: 1775.37 - lr: 0.000045 - momentum: 0.000000
2023-10-17 14:39:53,047 epoch 1 - iter 1980/1984 - loss 0.35425717 - time (sec): 92.97 - samples/sec: 1760.16 - lr: 0.000050 - momentum: 0.000000
2023-10-17 14:39:53,226 ----------------------------------------------------------------------------------------------------
2023-10-17 14:39:53,227 EPOCH 1 done: loss 0.3537 - lr: 0.000050
2023-10-17 14:39:56,547 DEV : loss 0.11494190245866776 - f1-score (micro avg) 0.6519
2023-10-17 14:39:56,569 saving best model
2023-10-17 14:39:56,912 ----------------------------------------------------------------------------------------------------
2023-10-17 14:40:05,963 epoch 2 - iter 198/1984 - loss 0.11040149 - time (sec): 9.05 - samples/sec: 1883.05 - lr: 0.000049 - momentum: 0.000000
2023-10-17 14:40:14,960 epoch 2 - iter 396/1984 - loss 0.11430138 - time (sec): 18.05 - samples/sec: 1851.91 - lr: 0.000049 - momentum: 0.000000
2023-10-17 14:40:23,983 epoch 2 - iter 594/1984 - loss 0.12555273 - time (sec): 27.07 - samples/sec: 1837.77 - lr: 0.000048 - momentum: 0.000000
2023-10-17 14:40:33,040 epoch 2 - iter 792/1984 - loss 0.13261370 - time (sec): 36.13 - samples/sec: 1806.71 - lr: 0.000048 - momentum: 0.000000
2023-10-17 14:40:42,279 epoch 2 - iter 990/1984 - loss 0.13507266 - time (sec): 45.37 - samples/sec: 1809.43 - lr: 0.000047 - momentum: 0.000000
2023-10-17 14:40:51,368 epoch 2 - iter 1188/1984 - loss 0.13131044 - time (sec): 54.45 - samples/sec: 1798.92 - lr: 0.000047 - momentum: 0.000000
2023-10-17 14:41:00,548 epoch 2 - iter 1386/1984 - loss 0.13302616 - time (sec): 63.63 - samples/sec: 1791.07 - lr: 0.000046 - momentum: 0.000000
2023-10-17 14:41:09,476 epoch 2 - iter 1584/1984 - loss 0.13118869 - time (sec): 72.56 - samples/sec: 1793.98 - lr: 0.000046 - momentum: 0.000000
2023-10-17 14:41:18,829 epoch 2 - iter 1782/1984 - loss 0.13064899 - time (sec): 81.92 - samples/sec: 1797.43 - lr: 0.000045 - momentum: 0.000000
2023-10-17 14:41:28,044 epoch 2 - iter 1980/1984 - loss 0.13048013 - time (sec): 91.13 - samples/sec: 1796.27 - lr: 0.000044 - momentum: 0.000000
2023-10-17 14:41:28,226 ----------------------------------------------------------------------------------------------------
2023-10-17 14:41:28,226 EPOCH 2 done: loss 0.1303 - lr: 0.000044
2023-10-17 14:41:32,192 DEV : loss 0.08985628187656403 - f1-score (micro avg) 0.7683
2023-10-17 14:41:32,213 saving best model
2023-10-17 14:41:32,698 ----------------------------------------------------------------------------------------------------
2023-10-17 14:41:41,476 epoch 3 - iter 198/1984 - loss 0.09836698 - time (sec): 8.78 - samples/sec: 1731.66 - lr: 0.000044 - momentum: 0.000000
2023-10-17 14:41:50,622 epoch 3 - iter 396/1984 - loss 0.09517513 - time (sec): 17.92 - samples/sec: 1757.29 - lr: 0.000043 - momentum: 0.000000
2023-10-17 14:41:59,657 epoch 3 - iter 594/1984 - loss 0.09416783 - time (sec): 26.96 - samples/sec: 1794.17 - lr: 0.000043 - momentum: 0.000000
2023-10-17 14:42:08,870 epoch 3 - iter 792/1984 - loss 0.09163933 - time (sec): 36.17 - samples/sec: 1801.78 - lr: 0.000042 - momentum: 0.000000
2023-10-17 14:42:18,170 epoch 3 - iter 990/1984 - loss 0.09142898 - time (sec): 45.47 - samples/sec: 1793.64 - lr: 0.000042 - momentum: 0.000000
2023-10-17 14:42:28,214 epoch 3 - iter 1188/1984 - loss 0.09272082 - time (sec): 55.51 - samples/sec: 1776.56 - lr: 0.000041 - momentum: 0.000000
2023-10-17 14:42:38,994 epoch 3 - iter 1386/1984 - loss 0.09277418 - time (sec): 66.29 - samples/sec: 1725.49 - lr: 0.000041 - momentum: 0.000000
2023-10-17 14:42:49,340 epoch 3 - iter 1584/1984 - loss 0.09372745 - time (sec): 76.64 - samples/sec: 1702.85 - lr: 0.000040 - momentum: 0.000000
2023-10-17 14:42:58,868 epoch 3 - iter 1782/1984 - loss 0.09155199 - time (sec): 86.17 - samples/sec: 1712.32 - lr: 0.000039 - momentum: 0.000000
2023-10-17 14:43:08,051 epoch 3 - iter 1980/1984 - loss 0.09272496 - time (sec): 95.35 - samples/sec: 1716.29 - lr: 0.000039 - momentum: 0.000000
2023-10-17 14:43:08,231 ----------------------------------------------------------------------------------------------------
2023-10-17 14:43:08,231 EPOCH 3 done: loss 0.0926 - lr: 0.000039
2023-10-17 14:43:12,013 DEV : loss 0.11382433772087097 - f1-score (micro avg) 0.7468
2023-10-17 14:43:12,043 ----------------------------------------------------------------------------------------------------
2023-10-17 14:43:21,936 epoch 4 - iter 198/1984 - loss 0.07511519 - time (sec): 9.89 - samples/sec: 1594.72 - lr: 0.000038 - momentum: 0.000000
2023-10-17 14:43:31,544 epoch 4 - iter 396/1984 - loss 0.07526871 - time (sec): 19.50 - samples/sec: 1670.87 - lr: 0.000038 - momentum: 0.000000
2023-10-17 14:43:41,930 epoch 4 - iter 594/1984 - loss 0.07569617 - time (sec): 29.89 - samples/sec: 1621.54 - lr: 0.000037 - momentum: 0.000000
2023-10-17 14:43:51,570 epoch 4 - iter 792/1984 - loss 0.07339105 - time (sec): 39.53 - samples/sec: 1647.73 - lr: 0.000037 - momentum: 0.000000
2023-10-17 14:44:00,490 epoch 4 - iter 990/1984 - loss 0.07413341 - time (sec): 48.45 - samples/sec: 1685.78 - lr: 0.000036 - momentum: 0.000000
2023-10-17 14:44:09,434 epoch 4 - iter 1188/1984 - loss 0.07541001 - time (sec): 57.39 - samples/sec: 1703.83 - lr: 0.000036 - momentum: 0.000000
2023-10-17 14:44:18,710 epoch 4 - iter 1386/1984 - loss 0.07462380 - time (sec): 66.66 - samples/sec: 1714.27 - lr: 0.000035 - momentum: 0.000000
2023-10-17 14:44:28,053 epoch 4 - iter 1584/1984 - loss 0.07558972 - time (sec): 76.01 - samples/sec: 1716.80 - lr: 0.000034 - momentum: 0.000000
2023-10-17 14:44:37,952 epoch 4 - iter 1782/1984 - loss 0.07546181 - time (sec): 85.91 - samples/sec: 1714.79 - lr: 0.000034 - momentum: 0.000000
2023-10-17 14:44:48,067 epoch 4 - iter 1980/1984 - loss 0.07339469 - time (sec): 96.02 - samples/sec: 1704.56 - lr: 0.000033 - momentum: 0.000000
2023-10-17 14:44:48,248 ----------------------------------------------------------------------------------------------------
2023-10-17 14:44:48,248 EPOCH 4 done: loss 0.0736 - lr: 0.000033
2023-10-17 14:44:51,859 DEV : loss 0.17536717653274536 - f1-score (micro avg) 0.737
2023-10-17 14:44:51,882 ----------------------------------------------------------------------------------------------------
2023-10-17 14:45:00,911 epoch 5 - iter 198/1984 - loss 0.05079454 - time (sec): 9.03 - samples/sec: 1773.03 - lr: 0.000033 - momentum: 0.000000
2023-10-17 14:45:10,378 epoch 5 - iter 396/1984 - loss 0.05612183 - time (sec): 18.49 - samples/sec: 1774.34 - lr: 0.000032 - momentum: 0.000000
2023-10-17 14:45:20,089 epoch 5 - iter 594/1984 - loss 0.05327248 - time (sec): 28.21 - samples/sec: 1777.20 - lr: 0.000032 - momentum: 0.000000
2023-10-17 14:45:29,458 epoch 5 - iter 792/1984 - loss 0.05519592 - time (sec): 37.57 - samples/sec: 1764.35 - lr: 0.000031 - momentum: 0.000000
2023-10-17 14:45:38,763 epoch 5 - iter 990/1984 - loss 0.05814401 - time (sec): 46.88 - samples/sec: 1784.98 - lr: 0.000031 - momentum: 0.000000
2023-10-17 14:45:48,192 epoch 5 - iter 1188/1984 - loss 0.05928625 - time (sec): 56.31 - samples/sec: 1766.93 - lr: 0.000030 - momentum: 0.000000
2023-10-17 14:45:57,790 epoch 5 - iter 1386/1984 - loss 0.05868169 - time (sec): 65.91 - samples/sec: 1765.22 - lr: 0.000029 - momentum: 0.000000
2023-10-17 14:46:06,665 epoch 5 - iter 1584/1984 - loss 0.05899803 - time (sec): 74.78 - samples/sec: 1764.21 - lr: 0.000029 - momentum: 0.000000
2023-10-17 14:46:15,619 epoch 5 - iter 1782/1984 - loss 0.05870097 - time (sec): 83.73 - samples/sec: 1763.35 - lr: 0.000028 - momentum: 0.000000
2023-10-17 14:46:24,650 epoch 5 - iter 1980/1984 - loss 0.05779940 - time (sec): 92.77 - samples/sec: 1764.14 - lr: 0.000028 - momentum: 0.000000
2023-10-17 14:46:24,834 ----------------------------------------------------------------------------------------------------
2023-10-17 14:46:24,834 EPOCH 5 done: loss 0.0577 - lr: 0.000028
2023-10-17 14:46:28,553 DEV : loss 0.1740676909685135 - f1-score (micro avg) 0.7693
2023-10-17 14:46:28,576 saving best model
2023-10-17 14:46:29,022 ----------------------------------------------------------------------------------------------------
2023-10-17 14:46:38,264 epoch 6 - iter 198/1984 - loss 0.03690461 - time (sec): 9.24 - samples/sec: 1792.31 - lr: 0.000027 - momentum: 0.000000
2023-10-17 14:46:47,589 epoch 6 - iter 396/1984 - loss 0.03910030 - time (sec): 18.57 - samples/sec: 1802.43 - lr: 0.000027 - momentum: 0.000000
2023-10-17 14:46:56,854 epoch 6 - iter 594/1984 - loss 0.03925035 - time (sec): 27.83 - samples/sec: 1802.69 - lr: 0.000026 - momentum: 0.000000
2023-10-17 14:47:07,108 epoch 6 - iter 792/1984 - loss 0.03792416 - time (sec): 38.08 - samples/sec: 1754.85 - lr: 0.000026 - momentum: 0.000000
2023-10-17 14:47:16,414 epoch 6 - iter 990/1984 - loss 0.03876486 - time (sec): 47.39 - samples/sec: 1755.06 - lr: 0.000025 - momentum: 0.000000
2023-10-17 14:47:25,956 epoch 6 - iter 1188/1984 - loss 0.03957422 - time (sec): 56.93 - samples/sec: 1764.00 - lr: 0.000024 - momentum: 0.000000
2023-10-17 14:47:36,075 epoch 6 - iter 1386/1984 - loss 0.04093735 - time (sec): 67.05 - samples/sec: 1728.86 - lr: 0.000024 - momentum: 0.000000
2023-10-17 14:47:45,389 epoch 6 - iter 1584/1984 - loss 0.04172085 - time (sec): 76.36 - samples/sec: 1722.22 - lr: 0.000023 - momentum: 0.000000
2023-10-17 14:47:54,555 epoch 6 - iter 1782/1984 - loss 0.04124955 - time (sec): 85.53 - samples/sec: 1727.67 - lr: 0.000023 - momentum: 0.000000
2023-10-17 14:48:03,703 epoch 6 - iter 1980/1984 - loss 0.04232429 - time (sec): 94.68 - samples/sec: 1728.57 - lr: 0.000022 - momentum: 0.000000
2023-10-17 14:48:03,877 ----------------------------------------------------------------------------------------------------
2023-10-17 14:48:03,877 EPOCH 6 done: loss 0.0424 - lr: 0.000022
2023-10-17 14:48:07,504 DEV : loss 0.20554155111312866 - f1-score (micro avg) 0.7584
2023-10-17 14:48:07,528 ----------------------------------------------------------------------------------------------------
2023-10-17 14:48:16,741 epoch 7 - iter 198/1984 - loss 0.02950479 - time (sec): 9.21 - samples/sec: 1724.69 - lr: 0.000022 - momentum: 0.000000
2023-10-17 14:48:26,412 epoch 7 - iter 396/1984 - loss 0.03032516 - time (sec): 18.88 - samples/sec: 1722.14 - lr: 0.000021 - momentum: 0.000000
2023-10-17 14:48:35,780 epoch 7 - iter 594/1984 - loss 0.03120726 - time (sec): 28.25 - samples/sec: 1730.28 - lr: 0.000021 - momentum: 0.000000
2023-10-17 14:48:45,211 epoch 7 - iter 792/1984 - loss 0.03102425 - time (sec): 37.68 - samples/sec: 1762.04 - lr: 0.000020 - momentum: 0.000000
2023-10-17 14:48:54,671 epoch 7 - iter 990/1984 - loss 0.03038851 - time (sec): 47.14 - samples/sec: 1788.78 - lr: 0.000019 - momentum: 0.000000
2023-10-17 14:49:03,867 epoch 7 - iter 1188/1984 - loss 0.02992404 - time (sec): 56.34 - samples/sec: 1783.21 - lr: 0.000019 - momentum: 0.000000
2023-10-17 14:49:12,915 epoch 7 - iter 1386/1984 - loss 0.02931818 - time (sec): 65.39 - samples/sec: 1777.03 - lr: 0.000018 - momentum: 0.000000
2023-10-17 14:49:22,413 epoch 7 - iter 1584/1984 - loss 0.02916391 - time (sec): 74.88 - samples/sec: 1765.34 - lr: 0.000018 - momentum: 0.000000
2023-10-17 14:49:31,394 epoch 7 - iter 1782/1984 - loss 0.03026648 - time (sec): 83.87 - samples/sec: 1767.12 - lr: 0.000017 - momentum: 0.000000
2023-10-17 14:49:40,661 epoch 7 - iter 1980/1984 - loss 0.02921112 - time (sec): 93.13 - samples/sec: 1756.70 - lr: 0.000017 - momentum: 0.000000
2023-10-17 14:49:40,851 ----------------------------------------------------------------------------------------------------
2023-10-17 14:49:40,851 EPOCH 7 done: loss 0.0292 - lr: 0.000017
2023-10-17 14:49:45,123 DEV : loss 0.2290586233139038 - f1-score (micro avg) 0.7505
2023-10-17 14:49:45,146 ----------------------------------------------------------------------------------------------------
2023-10-17 14:49:54,826 epoch 8 - iter 198/1984 - loss 0.01513390 - time (sec): 9.68 - samples/sec: 1680.82 - lr: 0.000016 - momentum: 0.000000
2023-10-17 14:50:04,078 epoch 8 - iter 396/1984 - loss 0.01803984 - time (sec): 18.93 - samples/sec: 1706.25 - lr: 0.000016 - momentum: 0.000000
2023-10-17 14:50:13,291 epoch 8 - iter 594/1984 - loss 0.01882815 - time (sec): 28.14 - samples/sec: 1733.21 - lr: 0.000015 - momentum: 0.000000
2023-10-17 14:50:22,747 epoch 8 - iter 792/1984 - loss 0.01927639 - time (sec): 37.60 - samples/sec: 1736.55 - lr: 0.000014 - momentum: 0.000000
2023-10-17 14:50:32,072 epoch 8 - iter 990/1984 - loss 0.01968799 - time (sec): 46.92 - samples/sec: 1735.11 - lr: 0.000014 - momentum: 0.000000
2023-10-17 14:50:41,324 epoch 8 - iter 1188/1984 - loss 0.01995151 - time (sec): 56.18 - samples/sec: 1729.33 - lr: 0.000013 - momentum: 0.000000
2023-10-17 14:50:50,348 epoch 8 - iter 1386/1984 - loss 0.02024781 - time (sec): 65.20 - samples/sec: 1743.15 - lr: 0.000013 - momentum: 0.000000
2023-10-17 14:50:59,511 epoch 8 - iter 1584/1984 - loss 0.02068668 - time (sec): 74.36 - samples/sec: 1753.81 - lr: 0.000012 - momentum: 0.000000
2023-10-17 14:51:09,432 epoch 8 - iter 1782/1984 - loss 0.02062173 - time (sec): 84.28 - samples/sec: 1735.43 - lr: 0.000012 - momentum: 0.000000
2023-10-17 14:51:18,868 epoch 8 - iter 1980/1984 - loss 0.02039143 - time (sec): 93.72 - samples/sec: 1745.87 - lr: 0.000011 - momentum: 0.000000
2023-10-17 14:51:19,051 ----------------------------------------------------------------------------------------------------
2023-10-17 14:51:19,051 EPOCH 8 done: loss 0.0204 - lr: 0.000011
2023-10-17 14:51:22,854 DEV : loss 0.24203334748744965 - f1-score (micro avg) 0.754
2023-10-17 14:51:22,894 ----------------------------------------------------------------------------------------------------
2023-10-17 14:51:32,128 epoch 9 - iter 198/1984 - loss 0.01084987 - time (sec): 9.23 - samples/sec: 1823.34 - lr: 0.000011 - momentum: 0.000000
2023-10-17 14:51:41,212 epoch 9 - iter 396/1984 - loss 0.01141896 - time (sec): 18.32 - samples/sec: 1843.65 - lr: 0.000010 - momentum: 0.000000
2023-10-17 14:51:50,845 epoch 9 - iter 594/1984 - loss 0.01119116 - time (sec): 27.95 - samples/sec: 1795.12 - lr: 0.000009 - momentum: 0.000000
2023-10-17 14:52:00,215 epoch 9 - iter 792/1984 - loss 0.01278662 - time (sec): 37.32 - samples/sec: 1777.27 - lr: 0.000009 - momentum: 0.000000
2023-10-17 14:52:09,971 epoch 9 - iter 990/1984 - loss 0.01250870 - time (sec): 47.08 - samples/sec: 1768.96 - lr: 0.000008 - momentum: 0.000000
2023-10-17 14:52:19,310 epoch 9 - iter 1188/1984 - loss 0.01336412 - time (sec): 56.41 - samples/sec: 1757.02 - lr: 0.000008 - momentum: 0.000000
2023-10-17 14:52:28,724 epoch 9 - iter 1386/1984 - loss 0.01389645 - time (sec): 65.83 - samples/sec: 1756.38 - lr: 0.000007 - momentum: 0.000000
2023-10-17 14:52:37,914 epoch 9 - iter 1584/1984 - loss 0.01428654 - time (sec): 75.02 - samples/sec: 1756.92 - lr: 0.000007 - momentum: 0.000000
2023-10-17 14:52:47,117 epoch 9 - iter 1782/1984 - loss 0.01382085 - time (sec): 84.22 - samples/sec: 1754.81 - lr: 0.000006 - momentum: 0.000000
2023-10-17 14:52:56,208 epoch 9 - iter 1980/1984 - loss 0.01418063 - time (sec): 93.31 - samples/sec: 1754.30 - lr: 0.000006 - momentum: 0.000000
2023-10-17 14:52:56,394 ----------------------------------------------------------------------------------------------------
2023-10-17 14:52:56,394 EPOCH 9 done: loss 0.0142 - lr: 0.000006
2023-10-17 14:52:59,962 DEV : loss 0.25322186946868896 - f1-score (micro avg) 0.7667
2023-10-17 14:52:59,990 ----------------------------------------------------------------------------------------------------
2023-10-17 14:53:09,490 epoch 10 - iter 198/1984 - loss 0.01078046 - time (sec): 9.50 - samples/sec: 1780.74 - lr: 0.000005 - momentum: 0.000000
2023-10-17 14:53:18,596 epoch 10 - iter 396/1984 - loss 0.01032270 - time (sec): 18.61 - samples/sec: 1793.76 - lr: 0.000004 - momentum: 0.000000
2023-10-17 14:53:28,045 epoch 10 - iter 594/1984 - loss 0.01021630 - time (sec): 28.05 - samples/sec: 1788.38 - lr: 0.000004 - momentum: 0.000000
2023-10-17 14:53:38,161 epoch 10 - iter 792/1984 - loss 0.00952196 - time (sec): 38.17 - samples/sec: 1717.08 - lr: 0.000003 - momentum: 0.000000
2023-10-17 14:53:47,799 epoch 10 - iter 990/1984 - loss 0.01024037 - time (sec): 47.81 - samples/sec: 1726.62 - lr: 0.000003 - momentum: 0.000000
2023-10-17 14:53:57,540 epoch 10 - iter 1188/1984 - loss 0.01046950 - time (sec): 57.55 - samples/sec: 1708.01 - lr: 0.000002 - momentum: 0.000000
2023-10-17 14:54:06,228 epoch 10 - iter 1386/1984 - loss 0.01083885 - time (sec): 66.24 - samples/sec: 1730.07 - lr: 0.000002 - momentum: 0.000000
2023-10-17 14:54:15,164 epoch 10 - iter 1584/1984 - loss 0.01034329 - time (sec): 75.17 - samples/sec: 1735.55 - lr: 0.000001 - momentum: 0.000000
2023-10-17 14:54:23,815 epoch 10 - iter 1782/1984 - loss 0.00998575 - time (sec): 83.82 - samples/sec: 1742.01 - lr: 0.000001 - momentum: 0.000000
2023-10-17 14:54:33,267 epoch 10 - iter 1980/1984 - loss 0.00987809 - time (sec): 93.28 - samples/sec: 1755.56 - lr: 0.000000 - momentum: 0.000000
2023-10-17 14:54:33,445 ----------------------------------------------------------------------------------------------------
2023-10-17 14:54:33,445 EPOCH 10 done: loss 0.0099 - lr: 0.000000
2023-10-17 14:54:37,004 DEV : loss 0.2603410482406616 - f1-score (micro avg) 0.7629
2023-10-17 14:54:37,394 ----------------------------------------------------------------------------------------------------
2023-10-17 14:54:37,395 Loading model from best epoch ...
2023-10-17 14:54:38,770 SequenceTagger predicts: Dictionary with 13 tags: O, S-PER, B-PER, E-PER, I-PER, S-LOC, B-LOC, E-LOC, I-LOC, S-ORG, B-ORG, E-ORG, I-ORG
2023-10-17 14:54:42,106
Results:
- F-score (micro) 0.7645
- F-score (macro) 0.6852
- Accuracy 0.6492

By class:
              precision    recall  f1-score   support

         LOC     0.7888    0.8611    0.8234       655
         PER     0.7399    0.7399    0.7399       223
         ORG     0.4884    0.4961    0.4922       127

   micro avg     0.7423    0.7881    0.7645      1005
   macro avg     0.6724    0.6990    0.6852      1005
weighted avg     0.7400    0.7881    0.7630      1005

2023-10-17 14:54:42,106 ----------------------------------------------------------------------------------------------------