2023-10-17 13:42:53,754 ----------------------------------------------------------------------------------------------------
2023-10-17 13:42:53,755 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-17 13:42:53,755 ----------------------------------------------------------------------------------------------------
2023-10-17 13:42:53,755 MultiCorpus: 7936 train + 992 dev + 992 test sentences
- NER_ICDAR_EUROPEANA Corpus: 7936 train + 992 dev + 992 test sentences - /root/.flair/datasets/ner_icdar_europeana/fr
2023-10-17 13:42:53,756 ----------------------------------------------------------------------------------------------------
2023-10-17 13:42:53,756 Train: 7936 sentences
2023-10-17 13:42:53,756 (train_with_dev=False, train_with_test=False)
2023-10-17 13:42:53,756 ----------------------------------------------------------------------------------------------------
2023-10-17 13:42:53,756 Training Params:
2023-10-17 13:42:53,756 - learning_rate: "5e-05"
2023-10-17 13:42:53,756 - mini_batch_size: "4"
2023-10-17 13:42:53,756 - max_epochs: "10"
2023-10-17 13:42:53,756 - shuffle: "True"
2023-10-17 13:42:53,756 ----------------------------------------------------------------------------------------------------
2023-10-17 13:42:53,756 Plugins:
2023-10-17 13:42:53,756 - TensorboardLogger
2023-10-17 13:42:53,756 - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 13:42:53,756 ----------------------------------------------------------------------------------------------------
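Note on the schedule: the parameters above (learning_rate 5e-05, warmup_fraction 0.1, 10 epochs of 1984 iterations each) are consistent with a linear warmup followed by a linear decay to zero. The sketch below is a minimal reconstruction under that assumption. The function name `lr_at` and the exact step at which the scheduler is advanced are illustrative guesses, not taken from Flair's source.

```python
def lr_at(step: int, total_steps: int, peak_lr: float, warmup_fraction: float) -> float:
    """Linear warmup to peak_lr, then linear decay to 0 (assumed schedule shape)."""
    warmup_steps = round(total_steps * warmup_fraction)
    if step < warmup_steps:
        # Warmup phase: the rate grows proportionally to the step count.
        return peak_lr * step / max(1, warmup_steps)
    # Decay phase: the rate falls linearly and reaches 0 at total_steps.
    remaining = total_steps - step
    return peak_lr * max(0.0, remaining / max(1, total_steps - warmup_steps))


ITERS_PER_EPOCH = 1984               # 7936 training sentences / mini_batch_size 4
TOTAL_STEPS = ITERS_PER_EPOCH * 10   # max_epochs = 10
PEAK_LR = 5e-05
WARMUP = 0.1

# Global step = completed epochs * iterations per epoch + iteration within the epoch.
lr_epoch1_iter198 = lr_at(198, TOTAL_STEPS, PEAK_LR, WARMUP)
lr_epoch1_iter1980 = lr_at(1980, TOTAL_STEPS, PEAK_LR, WARMUP)
lr_epoch2_iter1980 = lr_at(ITERS_PER_EPOCH + 1980, TOTAL_STEPS, PEAK_LR, WARMUP)
lr_epoch10_iter1980 = lr_at(9 * ITERS_PER_EPOCH + 1980, TOTAL_STEPS, PEAK_LR, WARMUP)

for name, value in [
    ("epoch 1, iter 198", lr_epoch1_iter198),
    ("epoch 1, iter 1980", lr_epoch1_iter1980),
    ("epoch 2, iter 1980", lr_epoch2_iter1980),
    ("epoch 10, iter 1980", lr_epoch10_iter1980),
]:
    print(f"{name}: lr {value:.6f}")
```

With these inputs the warmup covers exactly the first epoch (1984 of 19840 steps), so the rate should peak at the end of epoch 1 and decay over the remaining nine epochs.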
2023-10-17 13:42:53,756 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 13:42:53,756 - metric: "('micro avg', 'f1-score')"
2023-10-17 13:42:53,756 ----------------------------------------------------------------------------------------------------
2023-10-17 13:42:53,756 Computation:
2023-10-17 13:42:53,756 - compute on device: cuda:0
2023-10-17 13:42:53,756 - embedding storage: none
2023-10-17 13:42:53,756 ----------------------------------------------------------------------------------------------------
2023-10-17 13:42:53,756 Model training base path: "hmbench-icdar/fr-hmteams/teams-base-historic-multilingual-discriminator-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-4"
2023-10-17 13:42:53,756 ----------------------------------------------------------------------------------------------------
2023-10-17 13:42:53,756 ----------------------------------------------------------------------------------------------------
2023-10-17 13:42:53,756 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-17 13:43:02,902 epoch 1 - iter 198/1984 - loss 1.99194816 - time (sec): 9.14 - samples/sec: 1812.16 - lr: 0.000005 - momentum: 0.000000
2023-10-17 13:43:12,040 epoch 1 - iter 396/1984 - loss 1.12529108 - time (sec): 18.28 - samples/sec: 1834.45 - lr: 0.000010 - momentum: 0.000000
2023-10-17 13:43:20,571 epoch 1 - iter 594/1984 - loss 0.84075243 - time (sec): 26.81 - samples/sec: 1828.97 - lr: 0.000015 - momentum: 0.000000
2023-10-17 13:43:29,636 epoch 1 - iter 792/1984 - loss 0.67948072 - time (sec): 35.88 - samples/sec: 1809.73 - lr: 0.000020 - momentum: 0.000000
2023-10-17 13:43:38,961 epoch 1 - iter 990/1984 - loss 0.56666412 - time (sec): 45.20 - samples/sec: 1828.99 - lr: 0.000025 - momentum: 0.000000
2023-10-17 13:43:47,711 epoch 1 - iter 1188/1984 - loss 0.50179248 - time (sec): 53.95 - samples/sec: 1829.62 - lr: 0.000030 - momentum: 0.000000
2023-10-17 13:43:56,331 epoch 1 - iter 1386/1984 - loss 0.45691609 - time (sec): 62.57 - samples/sec: 1836.93 - lr: 0.000035 - momentum: 0.000000
2023-10-17 13:44:05,242 epoch 1 - iter 1584/1984 - loss 0.41865425 - time (sec): 71.48 - samples/sec: 1841.02 - lr: 0.000040 - momentum: 0.000000
2023-10-17 13:44:14,524 epoch 1 - iter 1782/1984 - loss 0.38769661 - time (sec): 80.77 - samples/sec: 1834.44 - lr: 0.000045 - momentum: 0.000000
2023-10-17 13:44:23,569 epoch 1 - iter 1980/1984 - loss 0.36361201 - time (sec): 89.81 - samples/sec: 1822.66 - lr: 0.000050 - momentum: 0.000000
2023-10-17 13:44:23,744 ----------------------------------------------------------------------------------------------------
2023-10-17 13:44:23,744 EPOCH 1 done: loss 0.3635 - lr: 0.000050
2023-10-17 13:44:26,902 DEV : loss 0.1107097640633583 - f1-score (micro avg) 0.7147
2023-10-17 13:44:26,923 saving best model
2023-10-17 13:44:27,399 ----------------------------------------------------------------------------------------------------
2023-10-17 13:44:36,392 epoch 2 - iter 198/1984 - loss 0.12198282 - time (sec): 8.99 - samples/sec: 1816.59 - lr: 0.000049 - momentum: 0.000000
2023-10-17 13:44:45,686 epoch 2 - iter 396/1984 - loss 0.12288858 - time (sec): 18.29 - samples/sec: 1802.37 - lr: 0.000049 - momentum: 0.000000
2023-10-17 13:44:54,485 epoch 2 - iter 594/1984 - loss 0.12158637 - time (sec): 27.08 - samples/sec: 1790.77 - lr: 0.000048 - momentum: 0.000000
2023-10-17 13:45:03,453 epoch 2 - iter 792/1984 - loss 0.12378321 - time (sec): 36.05 - samples/sec: 1793.41 - lr: 0.000048 - momentum: 0.000000
2023-10-17 13:45:12,572 epoch 2 - iter 990/1984 - loss 0.11981922 - time (sec): 45.17 - samples/sec: 1793.52 - lr: 0.000047 - momentum: 0.000000
2023-10-17 13:45:21,725 epoch 2 - iter 1188/1984 - loss 0.11892503 - time (sec): 54.32 - samples/sec: 1795.43 - lr: 0.000047 - momentum: 0.000000
2023-10-17 13:45:30,864 epoch 2 - iter 1386/1984 - loss 0.11995066 - time (sec): 63.46 - samples/sec: 1807.05 - lr: 0.000046 - momentum: 0.000000
2023-10-17 13:45:40,092 epoch 2 - iter 1584/1984 - loss 0.11971087 - time (sec): 72.69 - samples/sec: 1801.71 - lr: 0.000046 - momentum: 0.000000
2023-10-17 13:45:49,162 epoch 2 - iter 1782/1984 - loss 0.11939199 - time (sec): 81.76 - samples/sec: 1797.41 - lr: 0.000045 - momentum: 0.000000
2023-10-17 13:45:58,300 epoch 2 - iter 1980/1984 - loss 0.11867051 - time (sec): 90.90 - samples/sec: 1802.10 - lr: 0.000044 - momentum: 0.000000
2023-10-17 13:45:58,474 ----------------------------------------------------------------------------------------------------
2023-10-17 13:45:58,474 EPOCH 2 done: loss 0.1188 - lr: 0.000044
2023-10-17 13:46:02,349 DEV : loss 0.09292253851890564 - f1-score (micro avg) 0.7461
2023-10-17 13:46:02,372 saving best model
2023-10-17 13:46:02,884 ----------------------------------------------------------------------------------------------------
2023-10-17 13:46:12,137 epoch 3 - iter 198/1984 - loss 0.08093006 - time (sec): 9.25 - samples/sec: 1833.94 - lr: 0.000044 - momentum: 0.000000
2023-10-17 13:46:21,419 epoch 3 - iter 396/1984 - loss 0.08776510 - time (sec): 18.53 - samples/sec: 1800.42 - lr: 0.000043 - momentum: 0.000000
2023-10-17 13:46:30,417 epoch 3 - iter 594/1984 - loss 0.08770434 - time (sec): 27.53 - samples/sec: 1795.44 - lr: 0.000043 - momentum: 0.000000
2023-10-17 13:46:39,552 epoch 3 - iter 792/1984 - loss 0.08713512 - time (sec): 36.67 - samples/sec: 1797.16 - lr: 0.000042 - momentum: 0.000000
2023-10-17 13:46:48,582 epoch 3 - iter 990/1984 - loss 0.08915958 - time (sec): 45.70 - samples/sec: 1808.80 - lr: 0.000042 - momentum: 0.000000
2023-10-17 13:46:57,705 epoch 3 - iter 1188/1984 - loss 0.08975126 - time (sec): 54.82 - samples/sec: 1817.72 - lr: 0.000041 - momentum: 0.000000
2023-10-17 13:47:06,969 epoch 3 - iter 1386/1984 - loss 0.08933047 - time (sec): 64.08 - samples/sec: 1822.52 - lr: 0.000041 - momentum: 0.000000
2023-10-17 13:47:16,293 epoch 3 - iter 1584/1984 - loss 0.08968171 - time (sec): 73.41 - samples/sec: 1814.20 - lr: 0.000040 - momentum: 0.000000
2023-10-17 13:47:25,592 epoch 3 - iter 1782/1984 - loss 0.09022258 - time (sec): 82.71 - samples/sec: 1798.16 - lr: 0.000039 - momentum: 0.000000
2023-10-17 13:47:34,674 epoch 3 - iter 1980/1984 - loss 0.08989172 - time (sec): 91.79 - samples/sec: 1783.16 - lr: 0.000039 - momentum: 0.000000
2023-10-17 13:47:34,856 ----------------------------------------------------------------------------------------------------
2023-10-17 13:47:34,856 EPOCH 3 done: loss 0.0899 - lr: 0.000039
2023-10-17 13:47:38,254 DEV : loss 0.11529310792684555 - f1-score (micro avg) 0.7554
2023-10-17 13:47:38,275 saving best model
2023-10-17 13:47:38,842 ----------------------------------------------------------------------------------------------------
2023-10-17 13:47:47,509 epoch 4 - iter 198/1984 - loss 0.05901225 - time (sec): 8.66 - samples/sec: 1941.15 - lr: 0.000038 - momentum: 0.000000
2023-10-17 13:47:56,529 epoch 4 - iter 396/1984 - loss 0.07172775 - time (sec): 17.68 - samples/sec: 1856.82 - lr: 0.000038 - momentum: 0.000000
2023-10-17 13:48:05,245 epoch 4 - iter 594/1984 - loss 0.06965845 - time (sec): 26.40 - samples/sec: 1845.67 - lr: 0.000037 - momentum: 0.000000
2023-10-17 13:48:13,933 epoch 4 - iter 792/1984 - loss 0.07214150 - time (sec): 35.09 - samples/sec: 1873.43 - lr: 0.000037 - momentum: 0.000000
2023-10-17 13:48:22,554 epoch 4 - iter 990/1984 - loss 0.07156799 - time (sec): 43.71 - samples/sec: 1878.28 - lr: 0.000036 - momentum: 0.000000
2023-10-17 13:48:31,704 epoch 4 - iter 1188/1984 - loss 0.07472710 - time (sec): 52.86 - samples/sec: 1874.11 - lr: 0.000036 - momentum: 0.000000
2023-10-17 13:48:40,821 epoch 4 - iter 1386/1984 - loss 0.07315463 - time (sec): 61.97 - samples/sec: 1855.64 - lr: 0.000035 - momentum: 0.000000
2023-10-17 13:48:49,969 epoch 4 - iter 1584/1984 - loss 0.07446253 - time (sec): 71.12 - samples/sec: 1846.18 - lr: 0.000034 - momentum: 0.000000
2023-10-17 13:48:59,484 epoch 4 - iter 1782/1984 - loss 0.07318315 - time (sec): 80.64 - samples/sec: 1826.73 - lr: 0.000034 - momentum: 0.000000
2023-10-17 13:49:09,059 epoch 4 - iter 1980/1984 - loss 0.07161599 - time (sec): 90.21 - samples/sec: 1815.30 - lr: 0.000033 - momentum: 0.000000
2023-10-17 13:49:09,239 ----------------------------------------------------------------------------------------------------
2023-10-17 13:49:09,239 EPOCH 4 done: loss 0.0716 - lr: 0.000033
2023-10-17 13:49:12,748 DEV : loss 0.16965167224407196 - f1-score (micro avg) 0.7562
2023-10-17 13:49:12,770 saving best model
2023-10-17 13:49:13,358 ----------------------------------------------------------------------------------------------------
2023-10-17 13:49:22,504 epoch 5 - iter 198/1984 - loss 0.05142115 - time (sec): 9.14 - samples/sec: 1744.82 - lr: 0.000033 - momentum: 0.000000
2023-10-17 13:49:31,639 epoch 5 - iter 396/1984 - loss 0.05326253 - time (sec): 18.28 - samples/sec: 1783.51 - lr: 0.000032 - momentum: 0.000000
2023-10-17 13:49:40,726 epoch 5 - iter 594/1984 - loss 0.05234995 - time (sec): 27.36 - samples/sec: 1766.40 - lr: 0.000032 - momentum: 0.000000
2023-10-17 13:49:49,869 epoch 5 - iter 792/1984 - loss 0.05493024 - time (sec): 36.51 - samples/sec: 1766.69 - lr: 0.000031 - momentum: 0.000000
2023-10-17 13:49:59,222 epoch 5 - iter 990/1984 - loss 0.05547255 - time (sec): 45.86 - samples/sec: 1768.21 - lr: 0.000031 - momentum: 0.000000
2023-10-17 13:50:08,703 epoch 5 - iter 1188/1984 - loss 0.05535415 - time (sec): 55.34 - samples/sec: 1775.28 - lr: 0.000030 - momentum: 0.000000
2023-10-17 13:50:17,849 epoch 5 - iter 1386/1984 - loss 0.05553716 - time (sec): 64.49 - samples/sec: 1780.41 - lr: 0.000029 - momentum: 0.000000
2023-10-17 13:50:27,041 epoch 5 - iter 1584/1984 - loss 0.05438664 - time (sec): 73.68 - samples/sec: 1785.59 - lr: 0.000029 - momentum: 0.000000
2023-10-17 13:50:36,058 epoch 5 - iter 1782/1984 - loss 0.05420342 - time (sec): 82.70 - samples/sec: 1785.78 - lr: 0.000028 - momentum: 0.000000
2023-10-17 13:50:45,233 epoch 5 - iter 1980/1984 - loss 0.05491134 - time (sec): 91.87 - samples/sec: 1780.76 - lr: 0.000028 - momentum: 0.000000
2023-10-17 13:50:45,422 ----------------------------------------------------------------------------------------------------
2023-10-17 13:50:45,423 EPOCH 5 done: loss 0.0548 - lr: 0.000028
2023-10-17 13:50:48,854 DEV : loss 0.17186634242534637 - f1-score (micro avg) 0.7583
2023-10-17 13:50:48,875 saving best model
2023-10-17 13:50:49,394 ----------------------------------------------------------------------------------------------------
2023-10-17 13:50:58,709 epoch 6 - iter 198/1984 - loss 0.04210687 - time (sec): 9.31 - samples/sec: 1797.63 - lr: 0.000027 - momentum: 0.000000
2023-10-17 13:51:07,945 epoch 6 - iter 396/1984 - loss 0.04558327 - time (sec): 18.55 - samples/sec: 1786.69 - lr: 0.000027 - momentum: 0.000000
2023-10-17 13:51:17,296 epoch 6 - iter 594/1984 - loss 0.04411621 - time (sec): 27.90 - samples/sec: 1776.05 - lr: 0.000026 - momentum: 0.000000
2023-10-17 13:51:26,833 epoch 6 - iter 792/1984 - loss 0.04189826 - time (sec): 37.44 - samples/sec: 1766.81 - lr: 0.000026 - momentum: 0.000000
2023-10-17 13:51:35,758 epoch 6 - iter 990/1984 - loss 0.04186020 - time (sec): 46.36 - samples/sec: 1790.47 - lr: 0.000025 - momentum: 0.000000
2023-10-17 13:51:44,892 epoch 6 - iter 1188/1984 - loss 0.04179555 - time (sec): 55.50 - samples/sec: 1784.39 - lr: 0.000024 - momentum: 0.000000
2023-10-17 13:51:54,014 epoch 6 - iter 1386/1984 - loss 0.04062803 - time (sec): 64.62 - samples/sec: 1802.35 - lr: 0.000024 - momentum: 0.000000
2023-10-17 13:52:03,187 epoch 6 - iter 1584/1984 - loss 0.04047753 - time (sec): 73.79 - samples/sec: 1794.79 - lr: 0.000023 - momentum: 0.000000
2023-10-17 13:52:12,464 epoch 6 - iter 1782/1984 - loss 0.04096866 - time (sec): 83.07 - samples/sec: 1783.58 - lr: 0.000023 - momentum: 0.000000
2023-10-17 13:52:21,511 epoch 6 - iter 1980/1984 - loss 0.04137447 - time (sec): 92.11 - samples/sec: 1776.94 - lr: 0.000022 - momentum: 0.000000
2023-10-17 13:52:21,692 ----------------------------------------------------------------------------------------------------
2023-10-17 13:52:21,692 EPOCH 6 done: loss 0.0414 - lr: 0.000022
2023-10-17 13:52:25,752 DEV : loss 0.1979617029428482 - f1-score (micro avg) 0.7635
2023-10-17 13:52:25,775 saving best model
2023-10-17 13:52:26,295 ----------------------------------------------------------------------------------------------------
2023-10-17 13:52:35,522 epoch 7 - iter 198/1984 - loss 0.02932924 - time (sec): 9.22 - samples/sec: 1697.08 - lr: 0.000022 - momentum: 0.000000
2023-10-17 13:52:45,099 epoch 7 - iter 396/1984 - loss 0.02557373 - time (sec): 18.80 - samples/sec: 1741.96 - lr: 0.000021 - momentum: 0.000000
2023-10-17 13:52:54,288 epoch 7 - iter 594/1984 - loss 0.02664735 - time (sec): 27.99 - samples/sec: 1753.83 - lr: 0.000021 - momentum: 0.000000
2023-10-17 13:53:03,396 epoch 7 - iter 792/1984 - loss 0.02794794 - time (sec): 37.10 - samples/sec: 1758.17 - lr: 0.000020 - momentum: 0.000000
2023-10-17 13:53:12,526 epoch 7 - iter 990/1984 - loss 0.02878967 - time (sec): 46.23 - samples/sec: 1767.18 - lr: 0.000019 - momentum: 0.000000
2023-10-17 13:53:21,604 epoch 7 - iter 1188/1984 - loss 0.02864416 - time (sec): 55.30 - samples/sec: 1761.33 - lr: 0.000019 - momentum: 0.000000
2023-10-17 13:53:30,974 epoch 7 - iter 1386/1984 - loss 0.02877632 - time (sec): 64.67 - samples/sec: 1751.38 - lr: 0.000018 - momentum: 0.000000
2023-10-17 13:53:40,189 epoch 7 - iter 1584/1984 - loss 0.02875581 - time (sec): 73.89 - samples/sec: 1754.20 - lr: 0.000018 - momentum: 0.000000
2023-10-17 13:53:49,370 epoch 7 - iter 1782/1984 - loss 0.02874451 - time (sec): 83.07 - samples/sec: 1755.94 - lr: 0.000017 - momentum: 0.000000
2023-10-17 13:53:58,828 epoch 7 - iter 1980/1984 - loss 0.02854042 - time (sec): 92.53 - samples/sec: 1768.83 - lr: 0.000017 - momentum: 0.000000
2023-10-17 13:53:59,013 ----------------------------------------------------------------------------------------------------
2023-10-17 13:53:59,014 EPOCH 7 done: loss 0.0285 - lr: 0.000017
2023-10-17 13:54:02,405 DEV : loss 0.21675720810890198 - f1-score (micro avg) 0.756
2023-10-17 13:54:02,426 ----------------------------------------------------------------------------------------------------
2023-10-17 13:54:11,532 epoch 8 - iter 198/1984 - loss 0.02131513 - time (sec): 9.10 - samples/sec: 1755.79 - lr: 0.000016 - momentum: 0.000000
2023-10-17 13:54:20,688 epoch 8 - iter 396/1984 - loss 0.02216696 - time (sec): 18.26 - samples/sec: 1771.05 - lr: 0.000016 - momentum: 0.000000
2023-10-17 13:54:29,689 epoch 8 - iter 594/1984 - loss 0.02326467 - time (sec): 27.26 - samples/sec: 1759.32 - lr: 0.000015 - momentum: 0.000000
2023-10-17 13:54:38,755 epoch 8 - iter 792/1984 - loss 0.02241368 - time (sec): 36.33 - samples/sec: 1763.98 - lr: 0.000014 - momentum: 0.000000
2023-10-17 13:54:47,967 epoch 8 - iter 990/1984 - loss 0.02005196 - time (sec): 45.54 - samples/sec: 1771.30 - lr: 0.000014 - momentum: 0.000000
2023-10-17 13:54:57,291 epoch 8 - iter 1188/1984 - loss 0.01961340 - time (sec): 54.86 - samples/sec: 1775.08 - lr: 0.000013 - momentum: 0.000000
2023-10-17 13:55:06,861 epoch 8 - iter 1386/1984 - loss 0.02145710 - time (sec): 64.43 - samples/sec: 1753.37 - lr: 0.000013 - momentum: 0.000000
2023-10-17 13:55:16,241 epoch 8 - iter 1584/1984 - loss 0.02064329 - time (sec): 73.81 - samples/sec: 1762.80 - lr: 0.000012 - momentum: 0.000000
2023-10-17 13:55:25,486 epoch 8 - iter 1782/1984 - loss 0.01981026 - time (sec): 83.06 - samples/sec: 1766.73 - lr: 0.000012 - momentum: 0.000000
2023-10-17 13:55:34,752 epoch 8 - iter 1980/1984 - loss 0.02031700 - time (sec): 92.32 - samples/sec: 1772.93 - lr: 0.000011 - momentum: 0.000000
2023-10-17 13:55:34,925 ----------------------------------------------------------------------------------------------------
2023-10-17 13:55:34,925 EPOCH 8 done: loss 0.0203 - lr: 0.000011
2023-10-17 13:55:38,336 DEV : loss 0.22816428542137146 - f1-score (micro avg) 0.7689
2023-10-17 13:55:38,358 saving best model
2023-10-17 13:55:38,938 ----------------------------------------------------------------------------------------------------
2023-10-17 13:55:47,982 epoch 9 - iter 198/1984 - loss 0.01192882 - time (sec): 9.04 - samples/sec: 1734.20 - lr: 0.000011 - momentum: 0.000000
2023-10-17 13:55:56,918 epoch 9 - iter 396/1984 - loss 0.01286841 - time (sec): 17.97 - samples/sec: 1798.99 - lr: 0.000010 - momentum: 0.000000
2023-10-17 13:56:05,849 epoch 9 - iter 594/1984 - loss 0.01438189 - time (sec): 26.91 - samples/sec: 1772.52 - lr: 0.000009 - momentum: 0.000000
2023-10-17 13:56:14,942 epoch 9 - iter 792/1984 - loss 0.01318238 - time (sec): 36.00 - samples/sec: 1770.21 - lr: 0.000009 - momentum: 0.000000
2023-10-17 13:56:24,030 epoch 9 - iter 990/1984 - loss 0.01351039 - time (sec): 45.09 - samples/sec: 1778.59 - lr: 0.000008 - momentum: 0.000000
2023-10-17 13:56:32,852 epoch 9 - iter 1188/1984 - loss 0.01319278 - time (sec): 53.91 - samples/sec: 1793.12 - lr: 0.000008 - momentum: 0.000000
2023-10-17 13:56:41,731 epoch 9 - iter 1386/1984 - loss 0.01309930 - time (sec): 62.79 - samples/sec: 1811.55 - lr: 0.000007 - momentum: 0.000000
2023-10-17 13:56:51,161 epoch 9 - iter 1584/1984 - loss 0.01255843 - time (sec): 72.22 - samples/sec: 1810.15 - lr: 0.000007 - momentum: 0.000000
2023-10-17 13:57:00,359 epoch 9 - iter 1782/1984 - loss 0.01307612 - time (sec): 81.42 - samples/sec: 1803.41 - lr: 0.000006 - momentum: 0.000000
2023-10-17 13:57:09,450 epoch 9 - iter 1980/1984 - loss 0.01305751 - time (sec): 90.51 - samples/sec: 1807.63 - lr: 0.000006 - momentum: 0.000000
2023-10-17 13:57:09,640 ----------------------------------------------------------------------------------------------------
2023-10-17 13:57:09,640 EPOCH 9 done: loss 0.0130 - lr: 0.000006
2023-10-17 13:57:13,058 DEV : loss 0.23943665623664856 - f1-score (micro avg) 0.7711
2023-10-17 13:57:13,081 saving best model
2023-10-17 13:57:13,702 ----------------------------------------------------------------------------------------------------
2023-10-17 13:57:22,874 epoch 10 - iter 198/1984 - loss 0.00790504 - time (sec): 9.17 - samples/sec: 1775.44 - lr: 0.000005 - momentum: 0.000000
2023-10-17 13:57:32,117 epoch 10 - iter 396/1984 - loss 0.00810567 - time (sec): 18.41 - samples/sec: 1819.26 - lr: 0.000004 - momentum: 0.000000
2023-10-17 13:57:41,199 epoch 10 - iter 594/1984 - loss 0.00780535 - time (sec): 27.49 - samples/sec: 1803.46 - lr: 0.000004 - momentum: 0.000000
2023-10-17 13:57:50,230 epoch 10 - iter 792/1984 - loss 0.00784589 - time (sec): 36.53 - samples/sec: 1788.64 - lr: 0.000003 - momentum: 0.000000
2023-10-17 13:57:59,203 epoch 10 - iter 990/1984 - loss 0.00744212 - time (sec): 45.50 - samples/sec: 1799.38 - lr: 0.000003 - momentum: 0.000000
2023-10-17 13:58:08,284 epoch 10 - iter 1188/1984 - loss 0.00828425 - time (sec): 54.58 - samples/sec: 1802.08 - lr: 0.000002 - momentum: 0.000000
2023-10-17 13:58:17,328 epoch 10 - iter 1386/1984 - loss 0.00823271 - time (sec): 63.62 - samples/sec: 1805.08 - lr: 0.000002 - momentum: 0.000000
2023-10-17 13:58:26,441 epoch 10 - iter 1584/1984 - loss 0.00807062 - time (sec): 72.74 - samples/sec: 1805.28 - lr: 0.000001 - momentum: 0.000000
2023-10-17 13:58:35,408 epoch 10 - iter 1782/1984 - loss 0.00842240 - time (sec): 81.70 - samples/sec: 1803.89 - lr: 0.000001 - momentum: 0.000000
2023-10-17 13:58:44,477 epoch 10 - iter 1980/1984 - loss 0.00880008 - time (sec): 90.77 - samples/sec: 1803.88 - lr: 0.000000 - momentum: 0.000000
2023-10-17 13:58:44,651 ----------------------------------------------------------------------------------------------------
2023-10-17 13:58:44,651 EPOCH 10 done: loss 0.0088 - lr: 0.000000
2023-10-17 13:58:48,192 DEV : loss 0.2471495419740677 - f1-score (micro avg) 0.7779
2023-10-17 13:58:48,221 saving best model
2023-10-17 13:58:49,148 ----------------------------------------------------------------------------------------------------
2023-10-17 13:58:49,150 Loading model from best epoch ...
2023-10-17 13:58:51,929 SequenceTagger predicts: Dictionary with 13 tags: O, S-PER, B-PER, E-PER, I-PER, S-LOC, B-LOC, E-LOC, I-LOC, S-ORG, B-ORG, E-ORG, I-ORG
2023-10-17 13:58:54,795
Results:
- F-score (micro) 0.7672
- F-score (macro) 0.6813
- Accuracy 0.653
By class:
              precision    recall  f1-score   support

         LOC     0.8318    0.8382    0.8350       655
         PER     0.6811    0.7758    0.7254       223
         ORG     0.5043    0.4646    0.4836       127

   micro avg     0.7575    0.7771    0.7672      1005
   macro avg     0.6724    0.6928    0.6813      1005
weighted avg     0.7570    0.7771    0.7663      1005
2023-10-17 13:58:54,795 ----------------------------------------------------------------------------------------------------
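Note on the averages: the summary rows in the results table can be recomputed from the per-class rows. The sketch below is a plain-Python check, assuming the standard definitions: macro is the unweighted mean over classes, weighted is the support-weighted mean, and micro F1 is the harmonic mean of micro precision and micro recall. The variable names are illustrative, and the inputs are the rounded values from the table, so agreement is only expected to four decimal places.

```python
# Per-class rows copied from the table: (precision, recall, f1, support).
per_class = {
    "LOC": (0.8318, 0.8382, 0.8350, 655),
    "PER": (0.6811, 0.7758, 0.7254, 223),
    "ORG": (0.5043, 0.4646, 0.4836, 127),
}


def harmonic_f1(precision: float, recall: float) -> float:
    """F1 as the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


total_support = sum(row[3] for row in per_class.values())

# Macro average: every class counts equally, regardless of support.
macro_f1 = sum(row[2] for row in per_class.values()) / len(per_class)

# Weighted average: each class contributes in proportion to its support.
weighted_f1 = sum(row[2] * row[3] for row in per_class.values()) / total_support

# Micro F1 from the micro-averaged precision and recall reported in the table.
micro_f1 = harmonic_f1(0.7575, 0.7771)

print(f"support      {total_support}")
print(f"micro F1     {micro_f1:.4f}")
print(f"macro F1     {macro_f1:.4f}")
print(f"weighted F1  {weighted_f1:.4f}")
```

The gap between micro and macro F1 is driven by ORG, which has the lowest per-class score and the smallest support, so it pulls the macro average down more than the micro average.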