2024-03-26 09:36:08,496 ----------------------------------------------------------------------------------------------------
2024-03-26 09:36:08,496 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(31103, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2024-03-26 09:36:08,496 ----------------------------------------------------------------------------------------------------
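Note: the printout above is Flair's standard fine-tuning architecture — transformer word embeddings feeding a plain linear head over 17 BIOES tags, with no CRF or RNN. A minimal sketch of how such a tagger could be assembled, assuming a deepset/gbert-base backbone (inferred from "gbert_base" in the run name) and illustrative CO-Fun column files; paths and file names are placeholders, not taken from this log:

```python
from flair.datasets import ColumnCorpus
from flair.embeddings import TransformerWordEmbeddings
from flair.models import SequenceTagger

# Assumption: the CO-Fun data is available as CoNLL-style column files.
corpus = ColumnCorpus(
    "data/co-funer",
    column_format={0: "text", 1: "ner"},
    train_file="train.txt",
    dev_file="dev.txt",
    test_file="test.txt",
)
label_dict = corpus.make_label_dictionary(label_type="ner")

# Assumption: the backbone is deepset/gbert-base.
embeddings = TransformerWordEmbeddings(
    model="deepset/gbert-base",
    layers="-1",               # last transformer layer only
    subtoken_pooling="first",  # one vector per word: its first subtoken
    fine_tune=True,            # backpropagate into BERT
)

tagger = SequenceTagger(
    hidden_size=256,
    embeddings=embeddings,
    tag_dictionary=label_dict,
    tag_type="ner",
    use_crf=False,             # linear head + CrossEntropyLoss, matching the printout
    use_rnn=False,
    reproject_embeddings=False,
)
```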
2024-03-26 09:36:08,496 Corpus: 758 train + 94 dev + 96 test sentences
2024-03-26 09:36:08,496 ----------------------------------------------------------------------------------------------------
2024-03-26 09:36:08,497 Train: 758 sentences
2024-03-26 09:36:08,497 (train_with_dev=False, train_with_test=False)
2024-03-26 09:36:08,497 ----------------------------------------------------------------------------------------------------
2024-03-26 09:36:08,497 Training Params:
2024-03-26 09:36:08,497 - learning_rate: "3e-05"
2024-03-26 09:36:08,497 - mini_batch_size: "8"
2024-03-26 09:36:08,497 - max_epochs: "10"
2024-03-26 09:36:08,497 - shuffle: "True"
2024-03-26 09:36:08,497 ----------------------------------------------------------------------------------------------------
2024-03-26 09:36:08,497 Plugins:
2024-03-26 09:36:08,497 - TensorboardLogger
2024-03-26 09:36:08,497 - LinearScheduler | warmup_fraction: '0.1'
2024-03-26 09:36:08,497 ----------------------------------------------------------------------------------------------------
2024-03-26 09:36:08,497 Final evaluation on model from best epoch (best-model.pt)
2024-03-26 09:36:08,497 - metric: "('micro avg', 'f1-score')"
2024-03-26 09:36:08,497 ----------------------------------------------------------------------------------------------------
2024-03-26 09:36:08,497 Computation:
2024-03-26 09:36:08,497 - compute on device: cuda:0
2024-03-26 09:36:08,497 - embedding storage: none
2024-03-26 09:36:08,497 ----------------------------------------------------------------------------------------------------
2024-03-26 09:36:08,497 Model training base path: "flair-co-funer-gbert_base-bs8-e10-lr3e-05-1"
2024-03-26 09:36:08,497 ----------------------------------------------------------------------------------------------------
2024-03-26 09:36:08,497 ----------------------------------------------------------------------------------------------------
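Note: the training parameters and plugins above correspond to Flair's fine-tuning entry point. A minimal sketch of the matching trainer call, reusing the tagger and corpus from the sketch above; the TensorBoard plugin is attached differently depending on the Flair version and is omitted here:

```python
from flair.trainers import ModelTrainer

trainer = ModelTrainer(tagger, corpus)

# fine_tune() defaults to AdamW with a linear learning-rate schedule and a
# 0.1 warmup fraction, matching the LinearScheduler plugin logged above.
trainer.fine_tune(
    "flair-co-funer-gbert_base-bs8-e10-lr3e-05-1",
    learning_rate=3e-5,
    mini_batch_size=8,
    max_epochs=10,
)
```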
2024-03-26 09:36:08,497 Logging anything other than scalars to TensorBoard is currently not supported.
2024-03-26 09:36:10,077 epoch 1 - iter 9/95 - loss 3.07430171 - time (sec): 1.58 - samples/sec: 1948.72 - lr: 0.000003 - momentum: 0.000000
2024-03-26 09:36:11,598 epoch 1 - iter 18/95 - loss 2.93863503 - time (sec): 3.10 - samples/sec: 2015.76 - lr: 0.000005 - momentum: 0.000000
2024-03-26 09:36:13,982 epoch 1 - iter 27/95 - loss 2.72796059 - time (sec): 5.48 - samples/sec: 1867.09 - lr: 0.000008 - momentum: 0.000000
2024-03-26 09:36:16,197 epoch 1 - iter 36/95 - loss 2.54599224 - time (sec): 7.70 - samples/sec: 1815.54 - lr: 0.000011 - momentum: 0.000000
2024-03-26 09:36:18,079 epoch 1 - iter 45/95 - loss 2.40451416 - time (sec): 9.58 - samples/sec: 1822.54 - lr: 0.000014 - momentum: 0.000000
2024-03-26 09:36:19,302 epoch 1 - iter 54/95 - loss 2.28598600 - time (sec): 10.80 - samples/sec: 1863.97 - lr: 0.000017 - momentum: 0.000000
2024-03-26 09:36:21,006 epoch 1 - iter 63/95 - loss 2.17685156 - time (sec): 12.51 - samples/sec: 1859.95 - lr: 0.000020 - momentum: 0.000000
2024-03-26 09:36:22,291 epoch 1 - iter 72/95 - loss 2.07889595 - time (sec): 13.79 - samples/sec: 1888.50 - lr: 0.000022 - momentum: 0.000000
2024-03-26 09:36:24,262 epoch 1 - iter 81/95 - loss 1.95823440 - time (sec): 15.76 - samples/sec: 1878.77 - lr: 0.000025 - momentum: 0.000000
2024-03-26 09:36:25,582 epoch 1 - iter 90/95 - loss 1.86194154 - time (sec): 17.08 - samples/sec: 1898.79 - lr: 0.000028 - momentum: 0.000000
2024-03-26 09:36:26,799 ----------------------------------------------------------------------------------------------------
2024-03-26 09:36:26,799 EPOCH 1 done: loss 1.7902 - lr: 0.000028
2024-03-26 09:36:27,629 DEV : loss 0.5363726615905762 - f1-score (micro avg) 0.6574
2024-03-26 09:36:27,631 saving best model
2024-03-26 09:36:27,890 ----------------------------------------------------------------------------------------------------
2024-03-26 09:36:29,937 epoch 2 - iter 9/95 - loss 0.60790463 - time (sec): 2.05 - samples/sec: 1804.37 - lr: 0.000030 - momentum: 0.000000
2024-03-26 09:36:31,613 epoch 2 - iter 18/95 - loss 0.61325425 - time (sec): 3.72 - samples/sec: 1948.78 - lr: 0.000029 - momentum: 0.000000
2024-03-26 09:36:33,424 epoch 2 - iter 27/95 - loss 0.57820829 - time (sec): 5.53 - samples/sec: 1863.01 - lr: 0.000029 - momentum: 0.000000
2024-03-26 09:36:35,185 epoch 2 - iter 36/95 - loss 0.55677353 - time (sec): 7.29 - samples/sec: 1832.91 - lr: 0.000029 - momentum: 0.000000
2024-03-26 09:36:37,083 epoch 2 - iter 45/95 - loss 0.52219029 - time (sec): 9.19 - samples/sec: 1842.27 - lr: 0.000028 - momentum: 0.000000
2024-03-26 09:36:39,277 epoch 2 - iter 54/95 - loss 0.48907984 - time (sec): 11.39 - samples/sec: 1813.28 - lr: 0.000028 - momentum: 0.000000
2024-03-26 09:36:40,585 epoch 2 - iter 63/95 - loss 0.48219691 - time (sec): 12.69 - samples/sec: 1855.74 - lr: 0.000028 - momentum: 0.000000
2024-03-26 09:36:41,911 epoch 2 - iter 72/95 - loss 0.46533736 - time (sec): 14.02 - samples/sec: 1886.75 - lr: 0.000028 - momentum: 0.000000
2024-03-26 09:36:43,701 epoch 2 - iter 81/95 - loss 0.45301630 - time (sec): 15.81 - samples/sec: 1872.24 - lr: 0.000027 - momentum: 0.000000
2024-03-26 09:36:45,350 epoch 2 - iter 90/95 - loss 0.44355465 - time (sec): 17.46 - samples/sec: 1869.23 - lr: 0.000027 - momentum: 0.000000
2024-03-26 09:36:46,274 ----------------------------------------------------------------------------------------------------
2024-03-26 09:36:46,274 EPOCH 2 done: loss 0.4355 - lr: 0.000027
2024-03-26 09:36:47,165 DEV : loss 0.2855934500694275 - f1-score (micro avg) 0.828
2024-03-26 09:36:47,166 saving best model
2024-03-26 09:36:47,592 ----------------------------------------------------------------------------------------------------
2024-03-26 09:36:49,533 epoch 3 - iter 9/95 - loss 0.34907051 - time (sec): 1.94 - samples/sec: 1730.91 - lr: 0.000026 - momentum: 0.000000
2024-03-26 09:36:51,457 epoch 3 - iter 18/95 - loss 0.29994757 - time (sec): 3.86 - samples/sec: 1741.95 - lr: 0.000026 - momentum: 0.000000
2024-03-26 09:36:52,799 epoch 3 - iter 27/95 - loss 0.27836668 - time (sec): 5.21 - samples/sec: 1837.71 - lr: 0.000026 - momentum: 0.000000
2024-03-26 09:36:55,264 epoch 3 - iter 36/95 - loss 0.27034037 - time (sec): 7.67 - samples/sec: 1762.67 - lr: 0.000025 - momentum: 0.000000
2024-03-26 09:36:57,482 epoch 3 - iter 45/95 - loss 0.25836891 - time (sec): 9.89 - samples/sec: 1795.48 - lr: 0.000025 - momentum: 0.000000
2024-03-26 09:36:58,646 epoch 3 - iter 54/95 - loss 0.25201034 - time (sec): 11.05 - samples/sec: 1853.93 - lr: 0.000025 - momentum: 0.000000
2024-03-26 09:37:00,551 epoch 3 - iter 63/95 - loss 0.24155981 - time (sec): 12.96 - samples/sec: 1838.34 - lr: 0.000025 - momentum: 0.000000
2024-03-26 09:37:02,161 epoch 3 - iter 72/95 - loss 0.23080357 - time (sec): 14.57 - samples/sec: 1843.73 - lr: 0.000024 - momentum: 0.000000
2024-03-26 09:37:03,897 epoch 3 - iter 81/95 - loss 0.23000286 - time (sec): 16.30 - samples/sec: 1834.77 - lr: 0.000024 - momentum: 0.000000
2024-03-26 09:37:06,051 epoch 3 - iter 90/95 - loss 0.22191158 - time (sec): 18.46 - samples/sec: 1804.80 - lr: 0.000024 - momentum: 0.000000
2024-03-26 09:37:06,522 ----------------------------------------------------------------------------------------------------
2024-03-26 09:37:06,522 EPOCH 3 done: loss 0.2220 - lr: 0.000024
2024-03-26 09:37:07,412 DEV : loss 0.24312740564346313 - f1-score (micro avg) 0.8552
2024-03-26 09:37:07,413 saving best model
2024-03-26 09:37:07,838 ----------------------------------------------------------------------------------------------------
2024-03-26 09:37:09,430 epoch 4 - iter 9/95 - loss 0.21131525 - time (sec): 1.59 - samples/sec: 2025.19 - lr: 0.000023 - momentum: 0.000000
2024-03-26 09:37:11,440 epoch 4 - iter 18/95 - loss 0.17490390 - time (sec): 3.60 - samples/sec: 1791.42 - lr: 0.000023 - momentum: 0.000000
2024-03-26 09:37:13,208 epoch 4 - iter 27/95 - loss 0.17565118 - time (sec): 5.37 - samples/sec: 1814.83 - lr: 0.000022 - momentum: 0.000000
2024-03-26 09:37:15,745 epoch 4 - iter 36/95 - loss 0.15298936 - time (sec): 7.91 - samples/sec: 1742.92 - lr: 0.000022 - momentum: 0.000000
2024-03-26 09:37:17,412 epoch 4 - iter 45/95 - loss 0.15724109 - time (sec): 9.57 - samples/sec: 1763.81 - lr: 0.000022 - momentum: 0.000000
2024-03-26 09:37:18,943 epoch 4 - iter 54/95 - loss 0.15626722 - time (sec): 11.10 - samples/sec: 1816.57 - lr: 0.000022 - momentum: 0.000000
2024-03-26 09:37:20,780 epoch 4 - iter 63/95 - loss 0.15817304 - time (sec): 12.94 - samples/sec: 1839.64 - lr: 0.000021 - momentum: 0.000000
2024-03-26 09:37:22,037 epoch 4 - iter 72/95 - loss 0.15929185 - time (sec): 14.20 - samples/sec: 1871.53 - lr: 0.000021 - momentum: 0.000000
2024-03-26 09:37:23,744 epoch 4 - iter 81/95 - loss 0.15813202 - time (sec): 15.90 - samples/sec: 1860.66 - lr: 0.000021 - momentum: 0.000000
2024-03-26 09:37:25,226 epoch 4 - iter 90/95 - loss 0.15491116 - time (sec): 17.39 - samples/sec: 1881.62 - lr: 0.000020 - momentum: 0.000000
2024-03-26 09:37:26,124 ----------------------------------------------------------------------------------------------------
2024-03-26 09:37:26,124 EPOCH 4 done: loss 0.1537 - lr: 0.000020
2024-03-26 09:37:27,018 DEV : loss 0.19133110344409943 - f1-score (micro avg) 0.8897
2024-03-26 09:37:27,019 saving best model
2024-03-26 09:37:27,449 ----------------------------------------------------------------------------------------------------
2024-03-26 09:37:29,172 epoch 5 - iter 9/95 - loss 0.10369406 - time (sec): 1.72 - samples/sec: 1839.54 - lr: 0.000020 - momentum: 0.000000
2024-03-26 09:37:31,302 epoch 5 - iter 18/95 - loss 0.11158756 - time (sec): 3.85 - samples/sec: 1740.67 - lr: 0.000019 - momentum: 0.000000
2024-03-26 09:37:32,861 epoch 5 - iter 27/95 - loss 0.10769290 - time (sec): 5.41 - samples/sec: 1792.98 - lr: 0.000019 - momentum: 0.000000
2024-03-26 09:37:34,526 epoch 5 - iter 36/95 - loss 0.10479897 - time (sec): 7.07 - samples/sec: 1783.06 - lr: 0.000019 - momentum: 0.000000
2024-03-26 09:37:36,200 epoch 5 - iter 45/95 - loss 0.11422202 - time (sec): 8.75 - samples/sec: 1833.65 - lr: 0.000019 - momentum: 0.000000
2024-03-26 09:37:37,790 epoch 5 - iter 54/95 - loss 0.11984042 - time (sec): 10.34 - samples/sec: 1881.06 - lr: 0.000018 - momentum: 0.000000
2024-03-26 09:37:39,623 epoch 5 - iter 63/95 - loss 0.11827238 - time (sec): 12.17 - samples/sec: 1861.37 - lr: 0.000018 - momentum: 0.000000
2024-03-26 09:37:41,837 epoch 5 - iter 72/95 - loss 0.10954916 - time (sec): 14.39 - samples/sec: 1886.33 - lr: 0.000018 - momentum: 0.000000
2024-03-26 09:37:43,075 epoch 5 - iter 81/95 - loss 0.11054361 - time (sec): 15.62 - samples/sec: 1906.03 - lr: 0.000017 - momentum: 0.000000
2024-03-26 09:37:45,210 epoch 5 - iter 90/95 - loss 0.10558766 - time (sec): 17.76 - samples/sec: 1864.66 - lr: 0.000017 - momentum: 0.000000
2024-03-26 09:37:45,834 ----------------------------------------------------------------------------------------------------
2024-03-26 09:37:45,834 EPOCH 5 done: loss 0.1061 - lr: 0.000017
2024-03-26 09:37:46,727 DEV : loss 0.19185248017311096 - f1-score (micro avg) 0.8844
2024-03-26 09:37:46,728 ----------------------------------------------------------------------------------------------------
2024-03-26 09:37:48,293 epoch 6 - iter 9/95 - loss 0.06238215 - time (sec): 1.56 - samples/sec: 1848.13 - lr: 0.000016 - momentum: 0.000000
2024-03-26 09:37:50,283 epoch 6 - iter 18/95 - loss 0.08071087 - time (sec): 3.55 - samples/sec: 1845.65 - lr: 0.000016 - momentum: 0.000000
2024-03-26 09:37:51,956 epoch 6 - iter 27/95 - loss 0.09114191 - time (sec): 5.23 - samples/sec: 1880.61 - lr: 0.000016 - momentum: 0.000000
2024-03-26 09:37:53,597 epoch 6 - iter 36/95 - loss 0.08852203 - time (sec): 6.87 - samples/sec: 1844.90 - lr: 0.000016 - momentum: 0.000000
2024-03-26 09:37:55,180 epoch 6 - iter 45/95 - loss 0.09173645 - time (sec): 8.45 - samples/sec: 1860.77 - lr: 0.000015 - momentum: 0.000000
2024-03-26 09:37:57,168 epoch 6 - iter 54/95 - loss 0.09174916 - time (sec): 10.44 - samples/sec: 1841.69 - lr: 0.000015 - momentum: 0.000000
2024-03-26 09:37:58,732 epoch 6 - iter 63/95 - loss 0.09285388 - time (sec): 12.00 - samples/sec: 1841.65 - lr: 0.000015 - momentum: 0.000000
2024-03-26 09:38:01,527 epoch 6 - iter 72/95 - loss 0.08563516 - time (sec): 14.80 - samples/sec: 1802.02 - lr: 0.000014 - momentum: 0.000000
2024-03-26 09:38:03,369 epoch 6 - iter 81/95 - loss 0.08246803 - time (sec): 16.64 - samples/sec: 1809.72 - lr: 0.000014 - momentum: 0.000000
2024-03-26 09:38:05,028 epoch 6 - iter 90/95 - loss 0.08342385 - time (sec): 18.30 - samples/sec: 1803.97 - lr: 0.000014 - momentum: 0.000000
2024-03-26 09:38:05,640 ----------------------------------------------------------------------------------------------------
2024-03-26 09:38:05,640 EPOCH 6 done: loss 0.0855 - lr: 0.000014
2024-03-26 09:38:06,540 DEV : loss 0.18254657089710236 - f1-score (micro avg) 0.9094
2024-03-26 09:38:06,541 saving best model
2024-03-26 09:38:06,964 ----------------------------------------------------------------------------------------------------
2024-03-26 09:38:08,276 epoch 7 - iter 9/95 - loss 0.11117729 - time (sec): 1.31 - samples/sec: 2256.04 - lr: 0.000013 - momentum: 0.000000
2024-03-26 09:38:09,896 epoch 7 - iter 18/95 - loss 0.09218612 - time (sec): 2.93 - samples/sec: 2003.58 - lr: 0.000013 - momentum: 0.000000
2024-03-26 09:38:11,674 epoch 7 - iter 27/95 - loss 0.08775884 - time (sec): 4.71 - samples/sec: 1941.48 - lr: 0.000013 - momentum: 0.000000
2024-03-26 09:38:13,520 epoch 7 - iter 36/95 - loss 0.07798637 - time (sec): 6.55 - samples/sec: 1908.65 - lr: 0.000012 - momentum: 0.000000
2024-03-26 09:38:15,791 epoch 7 - iter 45/95 - loss 0.07180718 - time (sec): 8.83 - samples/sec: 1856.84 - lr: 0.000012 - momentum: 0.000000
2024-03-26 09:38:16,761 epoch 7 - iter 54/95 - loss 0.07085060 - time (sec): 9.80 - samples/sec: 1934.14 - lr: 0.000012 - momentum: 0.000000
2024-03-26 09:38:18,601 epoch 7 - iter 63/95 - loss 0.06618084 - time (sec): 11.64 - samples/sec: 1933.38 - lr: 0.000011 - momentum: 0.000000
2024-03-26 09:38:20,495 epoch 7 - iter 72/95 - loss 0.06265646 - time (sec): 13.53 - samples/sec: 1893.11 - lr: 0.000011 - momentum: 0.000000
2024-03-26 09:38:22,414 epoch 7 - iter 81/95 - loss 0.06409786 - time (sec): 15.45 - samples/sec: 1889.95 - lr: 0.000011 - momentum: 0.000000
2024-03-26 09:38:24,340 epoch 7 - iter 90/95 - loss 0.06399918 - time (sec): 17.37 - samples/sec: 1892.31 - lr: 0.000010 - momentum: 0.000000
2024-03-26 09:38:25,163 ----------------------------------------------------------------------------------------------------
2024-03-26 09:38:25,163 EPOCH 7 done: loss 0.0631 - lr: 0.000010
2024-03-26 09:38:26,061 DEV : loss 0.18480655550956726 - f1-score (micro avg) 0.9115
2024-03-26 09:38:26,062 saving best model
2024-03-26 09:38:26,481 ----------------------------------------------------------------------------------------------------
2024-03-26 09:38:28,080 epoch 8 - iter 9/95 - loss 0.05686573 - time (sec): 1.60 - samples/sec: 1872.62 - lr: 0.000010 - momentum: 0.000000
2024-03-26 09:38:30,077 epoch 8 - iter 18/95 - loss 0.04837444 - time (sec): 3.59 - samples/sec: 1691.75 - lr: 0.000010 - momentum: 0.000000
2024-03-26 09:38:31,629 epoch 8 - iter 27/95 - loss 0.05493262 - time (sec): 5.15 - samples/sec: 1788.84 - lr: 0.000009 - momentum: 0.000000
2024-03-26 09:38:33,343 epoch 8 - iter 36/95 - loss 0.05918769 - time (sec): 6.86 - samples/sec: 1835.16 - lr: 0.000009 - momentum: 0.000000
2024-03-26 09:38:35,650 epoch 8 - iter 45/95 - loss 0.05018846 - time (sec): 9.17 - samples/sec: 1813.62 - lr: 0.000009 - momentum: 0.000000
2024-03-26 09:38:37,948 epoch 8 - iter 54/95 - loss 0.05231928 - time (sec): 11.47 - samples/sec: 1816.24 - lr: 0.000008 - momentum: 0.000000
2024-03-26 09:38:39,897 epoch 8 - iter 63/95 - loss 0.05413433 - time (sec): 13.41 - samples/sec: 1820.13 - lr: 0.000008 - momentum: 0.000000
2024-03-26 09:38:40,977 epoch 8 - iter 72/95 - loss 0.05401199 - time (sec): 14.49 - samples/sec: 1852.57 - lr: 0.000008 - momentum: 0.000000
2024-03-26 09:38:42,638 epoch 8 - iter 81/95 - loss 0.05202463 - time (sec): 16.15 - samples/sec: 1837.26 - lr: 0.000007 - momentum: 0.000000
2024-03-26 09:38:44,006 epoch 8 - iter 90/95 - loss 0.05195576 - time (sec): 17.52 - samples/sec: 1852.37 - lr: 0.000007 - momentum: 0.000000
2024-03-26 09:38:45,221 ----------------------------------------------------------------------------------------------------
2024-03-26 09:38:45,221 EPOCH 8 done: loss 0.0532 - lr: 0.000007
2024-03-26 09:38:46,118 DEV : loss 0.1893010288476944 - f1-score (micro avg) 0.9151
2024-03-26 09:38:46,119 saving best model
2024-03-26 09:38:46,543 ----------------------------------------------------------------------------------------------------
2024-03-26 09:38:48,300 epoch 9 - iter 9/95 - loss 0.02851503 - time (sec): 1.75 - samples/sec: 1979.54 - lr: 0.000007 - momentum: 0.000000
2024-03-26 09:38:50,234 epoch 9 - iter 18/95 - loss 0.02569522 - time (sec): 3.69 - samples/sec: 1831.60 - lr: 0.000006 - momentum: 0.000000
2024-03-26 09:38:52,062 epoch 9 - iter 27/95 - loss 0.02821932 - time (sec): 5.52 - samples/sec: 1780.97 - lr: 0.000006 - momentum: 0.000000
2024-03-26 09:38:53,993 epoch 9 - iter 36/95 - loss 0.03965945 - time (sec): 7.45 - samples/sec: 1807.56 - lr: 0.000006 - momentum: 0.000000
2024-03-26 09:38:55,880 epoch 9 - iter 45/95 - loss 0.03698830 - time (sec): 9.33 - samples/sec: 1786.27 - lr: 0.000005 - momentum: 0.000000
2024-03-26 09:38:57,728 epoch 9 - iter 54/95 - loss 0.03810645 - time (sec): 11.18 - samples/sec: 1819.00 - lr: 0.000005 - momentum: 0.000000
2024-03-26 09:38:59,600 epoch 9 - iter 63/95 - loss 0.03945291 - time (sec): 13.06 - samples/sec: 1818.91 - lr: 0.000005 - momentum: 0.000000
2024-03-26 09:39:01,180 epoch 9 - iter 72/95 - loss 0.04280555 - time (sec): 14.63 - samples/sec: 1829.37 - lr: 0.000004 - momentum: 0.000000
2024-03-26 09:39:02,877 epoch 9 - iter 81/95 - loss 0.04475982 - time (sec): 16.33 - samples/sec: 1820.69 - lr: 0.000004 - momentum: 0.000000
2024-03-26 09:39:04,627 epoch 9 - iter 90/95 - loss 0.04242681 - time (sec): 18.08 - samples/sec: 1838.42 - lr: 0.000004 - momentum: 0.000000
2024-03-26 09:39:05,120 ----------------------------------------------------------------------------------------------------
2024-03-26 09:39:05,120 EPOCH 9 done: loss 0.0436 - lr: 0.000004
2024-03-26 09:39:06,018 DEV : loss 0.18302294611930847 - f1-score (micro avg) 0.928
2024-03-26 09:39:06,019 saving best model
2024-03-26 09:39:06,442 ----------------------------------------------------------------------------------------------------
2024-03-26 09:39:07,911 epoch 10 - iter 9/95 - loss 0.01430248 - time (sec): 1.47 - samples/sec: 1892.14 - lr: 0.000003 - momentum: 0.000000
2024-03-26 09:39:09,716 epoch 10 - iter 18/95 - loss 0.02440600 - time (sec): 3.27 - samples/sec: 1847.50 - lr: 0.000003 - momentum: 0.000000
2024-03-26 09:39:11,840 epoch 10 - iter 27/95 - loss 0.03100064 - time (sec): 5.40 - samples/sec: 1791.55 - lr: 0.000003 - momentum: 0.000000
2024-03-26 09:39:13,686 epoch 10 - iter 36/95 - loss 0.03796915 - time (sec): 7.24 - samples/sec: 1810.96 - lr: 0.000002 - momentum: 0.000000
2024-03-26 09:39:14,849 epoch 10 - iter 45/95 - loss 0.03903990 - time (sec): 8.40 - samples/sec: 1864.85 - lr: 0.000002 - momentum: 0.000000
2024-03-26 09:39:16,746 epoch 10 - iter 54/95 - loss 0.04257694 - time (sec): 10.30 - samples/sec: 1848.08 - lr: 0.000002 - momentum: 0.000000
2024-03-26 09:39:18,118 epoch 10 - iter 63/95 - loss 0.04294533 - time (sec): 11.67 - samples/sec: 1861.40 - lr: 0.000001 - momentum: 0.000000
2024-03-26 09:39:20,341 epoch 10 - iter 72/95 - loss 0.03823652 - time (sec): 13.90 - samples/sec: 1843.12 - lr: 0.000001 - momentum: 0.000000
2024-03-26 09:39:22,628 epoch 10 - iter 81/95 - loss 0.04078223 - time (sec): 16.18 - samples/sec: 1824.51 - lr: 0.000001 - momentum: 0.000000
2024-03-26 09:39:24,462 epoch 10 - iter 90/95 - loss 0.03877567 - time (sec): 18.02 - samples/sec: 1816.73 - lr: 0.000000 - momentum: 0.000000
2024-03-26 09:39:25,470 ----------------------------------------------------------------------------------------------------
2024-03-26 09:39:25,470 EPOCH 10 done: loss 0.0376 - lr: 0.000000
2024-03-26 09:39:26,370 DEV : loss 0.1856098622083664 - f1-score (micro avg) 0.927
2024-03-26 09:39:26,654 ----------------------------------------------------------------------------------------------------
2024-03-26 09:39:26,655 Loading model from best epoch ...
2024-03-26 09:39:27,522 SequenceTagger predicts: Dictionary with 17 tags: O, S-Unternehmen, B-Unternehmen, E-Unternehmen, I-Unternehmen, S-Auslagerung, B-Auslagerung, E-Auslagerung, I-Auslagerung, S-Ort, B-Ort, E-Ort, I-Ort, S-Software, B-Software, E-Software, I-Software
2024-03-26 09:39:28,274
Results:
- F-score (micro) 0.9126
- F-score (macro) 0.6926
- Accuracy 0.8452
By class:
              precision    recall  f1-score   support

 Unternehmen     0.9331    0.8910    0.9115       266
 Auslagerung     0.8626    0.9076    0.8845       249
         Ort     0.9635    0.9851    0.9742       134
    Software     0.0000    0.0000    0.0000         0

   micro avg     0.9084    0.9168    0.9126       649
   macro avg     0.6898    0.6959    0.6926       649
weighted avg     0.9123    0.9168    0.9141       649
2024-03-26 09:39:28,274 ----------------------------------------------------------------------------------------------------
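Note: the final evaluation above is run on best-model.pt, the checkpoint kept from the best dev epoch (micro F1 0.928 at epoch 9). A minimal inference sketch, assuming the label type is "ner"; the example sentence is illustrative only:

```python
from flair.data import Sentence
from flair.models import SequenceTagger

# Load the best checkpoint from the base path logged above.
tagger = SequenceTagger.load(
    "flair-co-funer-gbert_base-bs8-e10-lr3e-05-1/best-model.pt"
)

# The model tags Unternehmen, Auslagerung, Ort and Software spans.
sentence = Sentence("Die Beispielbank lagert ihr Rechenzentrum an die Muster GmbH in Frankfurt aus.")
tagger.predict(sentence)

for span in sentence.get_spans("ner"):
    print(span.text, span.tag, f"{span.score:.2f}")
```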