2024-03-26 15:23:48,594 ----------------------------------------------------------------------------------------------------
2024-03-26 15:23:48,594 Model: "SequenceTagger(
(embeddings): TransformerWordEmbeddings(
(model): BertModel(
(embeddings): BertEmbeddings(
(word_embeddings): Embedding(31103, 768)
(position_embeddings): Embedding(512, 768)
(token_type_embeddings): Embedding(2, 768)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(encoder): BertEncoder(
(layer): ModuleList(
(0-11): 12 x BertLayer(
(attention): BertAttention(
(self): BertSelfAttention(
(query): Linear(in_features=768, out_features=768, bias=True)
(key): Linear(in_features=768, out_features=768, bias=True)
(value): Linear(in_features=768, out_features=768, bias=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(output): BertSelfOutput(
(dense): Linear(in_features=768, out_features=768, bias=True)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
(intermediate): BertIntermediate(
(dense): Linear(in_features=768, out_features=3072, bias=True)
(intermediate_act_fn): GELUActivation()
)
(output): BertOutput(
(dense): Linear(in_features=3072, out_features=768, bias=True)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
)
)
(pooler): BertPooler(
(dense): Linear(in_features=768, out_features=768, bias=True)
(activation): Tanh()
)
)
)
(locked_dropout): LockedDropout(p=0.5)
(linear): Linear(in_features=768, out_features=17, bias=True)
(loss_function): CrossEntropyLoss()
)"
2024-03-26 15:23:48,594 ----------------------------------------------------------------------------------------------------
2024-03-26 15:23:48,594 Corpus: 758 train + 94 dev + 96 test sentences
2024-03-26 15:23:48,594 ----------------------------------------------------------------------------------------------------
2024-03-26 15:23:48,594 Train: 758 sentences
2024-03-26 15:23:48,594 (train_with_dev=False, train_with_test=False)
2024-03-26 15:23:48,594 ----------------------------------------------------------------------------------------------------
2024-03-26 15:23:48,594 Training Params:
2024-03-26 15:23:48,594 - learning_rate: "3e-05"
2024-03-26 15:23:48,594 - mini_batch_size: "8"
2024-03-26 15:23:48,594 - max_epochs: "10"
2024-03-26 15:23:48,594 - shuffle: "True"
2024-03-26 15:23:48,594 ----------------------------------------------------------------------------------------------------
2024-03-26 15:23:48,594 Plugins:
2024-03-26 15:23:48,594 - TensorboardLogger
2024-03-26 15:23:48,594 - LinearScheduler | warmup_fraction: '0.1'
2024-03-26 15:23:48,594 ----------------------------------------------------------------------------------------------------
2024-03-26 15:23:48,595 Final evaluation on model from best epoch (best-model.pt)
2024-03-26 15:23:48,595 - metric: "('micro avg', 'f1-score')"
2024-03-26 15:23:48,595 ----------------------------------------------------------------------------------------------------
2024-03-26 15:23:48,595 Computation:
2024-03-26 15:23:48,595 - compute on device: cuda:0
2024-03-26 15:23:48,595 - embedding storage: none
2024-03-26 15:23:48,595 ----------------------------------------------------------------------------------------------------
2024-03-26 15:23:48,595 Model training base path: "flair-co-funer-german_dbmdz_bert_base-bs8-e10-lr3e-05-1"
2024-03-26 15:23:48,595 ----------------------------------------------------------------------------------------------------
2024-03-26 15:23:48,595 ----------------------------------------------------------------------------------------------------
2024-03-26 15:23:48,595 Logging anything other than scalars to TensorBoard is currently not supported.
2024-03-26 15:23:50,174 epoch 1 - iter 9/95 - loss 3.14979084 - time (sec): 1.58 - samples/sec: 1949.86 - lr: 0.000003 - momentum: 0.000000
2024-03-26 15:23:51,703 epoch 1 - iter 18/95 - loss 3.04307356 - time (sec): 3.11 - samples/sec: 2011.38 - lr: 0.000005 - momentum: 0.000000
2024-03-26 15:23:54,095 epoch 1 - iter 27/95 - loss 2.85325540 - time (sec): 5.50 - samples/sec: 1861.76 - lr: 0.000008 - momentum: 0.000000
2024-03-26 15:23:56,319 epoch 1 - iter 36/95 - loss 2.65209783 - time (sec): 7.72 - samples/sec: 1809.88 - lr: 0.000011 - momentum: 0.000000
2024-03-26 15:23:58,209 epoch 1 - iter 45/95 - loss 2.47670116 - time (sec): 9.61 - samples/sec: 1816.34 - lr: 0.000014 - momentum: 0.000000
2024-03-26 15:23:59,433 epoch 1 - iter 54/95 - loss 2.34499787 - time (sec): 10.84 - samples/sec: 1858.15 - lr: 0.000017 - momentum: 0.000000
2024-03-26 15:24:01,141 epoch 1 - iter 63/95 - loss 2.21603300 - time (sec): 12.55 - samples/sec: 1854.34 - lr: 0.000020 - momentum: 0.000000
2024-03-26 15:24:02,426 epoch 1 - iter 72/95 - loss 2.11033026 - time (sec): 13.83 - samples/sec: 1883.35 - lr: 0.000022 - momentum: 0.000000
2024-03-26 15:24:04,396 epoch 1 - iter 81/95 - loss 1.97356414 - time (sec): 15.80 - samples/sec: 1874.46 - lr: 0.000025 - momentum: 0.000000
2024-03-26 15:24:05,710 epoch 1 - iter 90/95 - loss 1.87082078 - time (sec): 17.12 - samples/sec: 1895.41 - lr: 0.000028 - momentum: 0.000000
2024-03-26 15:24:06,924 ----------------------------------------------------------------------------------------------------
2024-03-26 15:24:06,925 EPOCH 1 done: loss 1.7919 - lr: 0.000028
2024-03-26 15:24:07,760 DEV : loss 0.5088863968849182 - f1-score (micro avg) 0.6482
2024-03-26 15:24:07,761 saving best model
2024-03-26 15:24:08,026 ----------------------------------------------------------------------------------------------------
2024-03-26 15:24:10,065 epoch 2 - iter 9/95 - loss 0.50823411 - time (sec): 2.04 - samples/sec: 1811.64 - lr: 0.000030 - momentum: 0.000000
2024-03-26 15:24:11,741 epoch 2 - iter 18/95 - loss 0.53694131 - time (sec): 3.71 - samples/sec: 1953.10 - lr: 0.000029 - momentum: 0.000000
2024-03-26 15:24:13,548 epoch 2 - iter 27/95 - loss 0.50533491 - time (sec): 5.52 - samples/sec: 1867.09 - lr: 0.000029 - momentum: 0.000000
2024-03-26 15:24:15,312 epoch 2 - iter 36/95 - loss 0.48259154 - time (sec): 7.29 - samples/sec: 1835.30 - lr: 0.000029 - momentum: 0.000000
2024-03-26 15:24:17,204 epoch 2 - iter 45/95 - loss 0.45474310 - time (sec): 9.18 - samples/sec: 1845.27 - lr: 0.000028 - momentum: 0.000000
2024-03-26 15:24:19,404 epoch 2 - iter 54/95 - loss 0.42654159 - time (sec): 11.38 - samples/sec: 1814.83 - lr: 0.000028 - momentum: 0.000000
2024-03-26 15:24:20,715 epoch 2 - iter 63/95 - loss 0.42677167 - time (sec): 12.69 - samples/sec: 1856.69 - lr: 0.000028 - momentum: 0.000000
2024-03-26 15:24:22,043 epoch 2 - iter 72/95 - loss 0.41373855 - time (sec): 14.02 - samples/sec: 1887.33 - lr: 0.000028 - momentum: 0.000000
2024-03-26 15:24:23,845 epoch 2 - iter 81/95 - loss 0.40294984 - time (sec): 15.82 - samples/sec: 1871.34 - lr: 0.000027 - momentum: 0.000000
2024-03-26 15:24:25,501 epoch 2 - iter 90/95 - loss 0.39552323 - time (sec): 17.47 - samples/sec: 1867.73 - lr: 0.000027 - momentum: 0.000000
2024-03-26 15:24:26,434 ----------------------------------------------------------------------------------------------------
2024-03-26 15:24:26,434 EPOCH 2 done: loss 0.3899 - lr: 0.000027
2024-03-26 15:24:27,345 DEV : loss 0.2915351390838623 - f1-score (micro avg) 0.8051
2024-03-26 15:24:27,347 saving best model
2024-03-26 15:24:27,809 ----------------------------------------------------------------------------------------------------
2024-03-26 15:24:29,750 epoch 3 - iter 9/95 - loss 0.33132299 - time (sec): 1.94 - samples/sec: 1730.47 - lr: 0.000026 - momentum: 0.000000
2024-03-26 15:24:31,672 epoch 3 - iter 18/95 - loss 0.27912755 - time (sec): 3.86 - samples/sec: 1742.68 - lr: 0.000026 - momentum: 0.000000
2024-03-26 15:24:33,024 epoch 3 - iter 27/95 - loss 0.26573965 - time (sec): 5.21 - samples/sec: 1835.11 - lr: 0.000026 - momentum: 0.000000
2024-03-26 15:24:35,491 epoch 3 - iter 36/95 - loss 0.25234694 - time (sec): 7.68 - samples/sec: 1760.44 - lr: 0.000025 - momentum: 0.000000
2024-03-26 15:24:37,721 epoch 3 - iter 45/95 - loss 0.24175513 - time (sec): 9.91 - samples/sec: 1791.44 - lr: 0.000025 - momentum: 0.000000
2024-03-26 15:24:38,903 epoch 3 - iter 54/95 - loss 0.23630673 - time (sec): 11.09 - samples/sec: 1847.22 - lr: 0.000025 - momentum: 0.000000
2024-03-26 15:24:40,821 epoch 3 - iter 63/95 - loss 0.22822548 - time (sec): 13.01 - samples/sec: 1830.87 - lr: 0.000025 - momentum: 0.000000
2024-03-26 15:24:42,439 epoch 3 - iter 72/95 - loss 0.21717616 - time (sec): 14.63 - samples/sec: 1836.05 - lr: 0.000024 - momentum: 0.000000
2024-03-26 15:24:44,180 epoch 3 - iter 81/95 - loss 0.21748434 - time (sec): 16.37 - samples/sec: 1827.41 - lr: 0.000024 - momentum: 0.000000
2024-03-26 15:24:46,347 epoch 3 - iter 90/95 - loss 0.20897001 - time (sec): 18.54 - samples/sec: 1797.14 - lr: 0.000024 - momentum: 0.000000
2024-03-26 15:24:46,823 ----------------------------------------------------------------------------------------------------
2024-03-26 15:24:46,823 EPOCH 3 done: loss 0.2085 - lr: 0.000024
2024-03-26 15:24:47,721 DEV : loss 0.2427646666765213 - f1-score (micro avg) 0.8686
2024-03-26 15:24:47,722 saving best model
2024-03-26 15:24:48,166 ----------------------------------------------------------------------------------------------------
2024-03-26 15:24:49,762 epoch 4 - iter 9/95 - loss 0.17196514 - time (sec): 1.60 - samples/sec: 2018.98 - lr: 0.000023 - momentum: 0.000000
2024-03-26 15:24:51,788 epoch 4 - iter 18/95 - loss 0.14921918 - time (sec): 3.62 - samples/sec: 1780.76 - lr: 0.000023 - momentum: 0.000000
2024-03-26 15:24:53,569 epoch 4 - iter 27/95 - loss 0.15312969 - time (sec): 5.40 - samples/sec: 1803.35 - lr: 0.000022 - momentum: 0.000000
2024-03-26 15:24:56,120 epoch 4 - iter 36/95 - loss 0.13017501 - time (sec): 7.95 - samples/sec: 1732.26 - lr: 0.000022 - momentum: 0.000000
2024-03-26 15:24:57,807 epoch 4 - iter 45/95 - loss 0.13818847 - time (sec): 9.64 - samples/sec: 1751.27 - lr: 0.000022 - momentum: 0.000000
2024-03-26 15:24:59,342 epoch 4 - iter 54/95 - loss 0.13889360 - time (sec): 11.18 - samples/sec: 1804.85 - lr: 0.000022 - momentum: 0.000000
2024-03-26 15:25:01,194 epoch 4 - iter 63/95 - loss 0.14186955 - time (sec): 13.03 - samples/sec: 1827.36 - lr: 0.000021 - momentum: 0.000000
2024-03-26 15:25:02,476 epoch 4 - iter 72/95 - loss 0.14233703 - time (sec): 14.31 - samples/sec: 1856.74 - lr: 0.000021 - momentum: 0.000000
2024-03-26 15:25:04,195 epoch 4 - iter 81/95 - loss 0.14163043 - time (sec): 16.03 - samples/sec: 1846.28 - lr: 0.000021 - momentum: 0.000000
2024-03-26 15:25:05,685 epoch 4 - iter 90/95 - loss 0.14066665 - time (sec): 17.52 - samples/sec: 1867.47 - lr: 0.000020 - momentum: 0.000000
2024-03-26 15:25:06,586 ----------------------------------------------------------------------------------------------------
2024-03-26 15:25:06,586 EPOCH 4 done: loss 0.1405 - lr: 0.000020
2024-03-26 15:25:07,483 DEV : loss 0.19904547929763794 - f1-score (micro avg) 0.8939
2024-03-26 15:25:07,484 saving best model
2024-03-26 15:25:07,946 ----------------------------------------------------------------------------------------------------
2024-03-26 15:25:09,696 epoch 5 - iter 9/95 - loss 0.10174251 - time (sec): 1.75 - samples/sec: 1809.09 - lr: 0.000020 - momentum: 0.000000
2024-03-26 15:25:11,832 epoch 5 - iter 18/95 - loss 0.10059690 - time (sec): 3.89 - samples/sec: 1725.18 - lr: 0.000019 - momentum: 0.000000
2024-03-26 15:25:13,395 epoch 5 - iter 27/95 - loss 0.09581650 - time (sec): 5.45 - samples/sec: 1780.52 - lr: 0.000019 - momentum: 0.000000
2024-03-26 15:25:15,068 epoch 5 - iter 36/95 - loss 0.09741828 - time (sec): 7.12 - samples/sec: 1771.45 - lr: 0.000019 - momentum: 0.000000
2024-03-26 15:25:16,736 epoch 5 - iter 45/95 - loss 0.11306233 - time (sec): 8.79 - samples/sec: 1825.26 - lr: 0.000019 - momentum: 0.000000
2024-03-26 15:25:18,335 epoch 5 - iter 54/95 - loss 0.11499609 - time (sec): 10.39 - samples/sec: 1872.32 - lr: 0.000018 - momentum: 0.000000
2024-03-26 15:25:20,176 epoch 5 - iter 63/95 - loss 0.11069119 - time (sec): 12.23 - samples/sec: 1852.58 - lr: 0.000018 - momentum: 0.000000
2024-03-26 15:25:22,399 epoch 5 - iter 72/95 - loss 0.10254022 - time (sec): 14.45 - samples/sec: 1877.67 - lr: 0.000018 - momentum: 0.000000
2024-03-26 15:25:23,644 epoch 5 - iter 81/95 - loss 0.10177281 - time (sec): 15.70 - samples/sec: 1897.07 - lr: 0.000017 - momentum: 0.000000
2024-03-26 15:25:25,787 epoch 5 - iter 90/95 - loss 0.09762038 - time (sec): 17.84 - samples/sec: 1856.20 - lr: 0.000017 - momentum: 0.000000
2024-03-26 15:25:26,417 ----------------------------------------------------------------------------------------------------
2024-03-26 15:25:26,417 EPOCH 5 done: loss 0.0982 - lr: 0.000017
2024-03-26 15:25:27,324 DEV : loss 0.18307699263095856 - f1-score (micro avg) 0.9073
2024-03-26 15:25:27,325 saving best model
2024-03-26 15:25:27,804 ----------------------------------------------------------------------------------------------------
2024-03-26 15:25:29,385 epoch 6 - iter 9/95 - loss 0.03895202 - time (sec): 1.58 - samples/sec: 1829.62 - lr: 0.000016 - momentum: 0.000000
2024-03-26 15:25:31,383 epoch 6 - iter 18/95 - loss 0.05992809 - time (sec): 3.58 - samples/sec: 1833.81 - lr: 0.000016 - momentum: 0.000000
2024-03-26 15:25:33,061 epoch 6 - iter 27/95 - loss 0.06934296 - time (sec): 5.26 - samples/sec: 1870.23 - lr: 0.000016 - momentum: 0.000000
2024-03-26 15:25:34,708 epoch 6 - iter 36/95 - loss 0.06556657 - time (sec): 6.90 - samples/sec: 1835.76 - lr: 0.000016 - momentum: 0.000000
2024-03-26 15:25:36,306 epoch 6 - iter 45/95 - loss 0.06874063 - time (sec): 8.50 - samples/sec: 1849.94 - lr: 0.000015 - momentum: 0.000000
2024-03-26 15:25:38,304 epoch 6 - iter 54/95 - loss 0.07341583 - time (sec): 10.50 - samples/sec: 1831.20 - lr: 0.000015 - momentum: 0.000000
2024-03-26 15:25:39,875 epoch 6 - iter 63/95 - loss 0.07526510 - time (sec): 12.07 - samples/sec: 1831.58 - lr: 0.000015 - momentum: 0.000000
2024-03-26 15:25:42,666 epoch 6 - iter 72/95 - loss 0.07039873 - time (sec): 14.86 - samples/sec: 1794.44 - lr: 0.000014 - momentum: 0.000000
2024-03-26 15:25:44,514 epoch 6 - iter 81/95 - loss 0.07000601 - time (sec): 16.71 - samples/sec: 1802.30 - lr: 0.000014 - momentum: 0.000000
2024-03-26 15:25:46,183 epoch 6 - iter 90/95 - loss 0.07083690 - time (sec): 18.38 - samples/sec: 1796.30 - lr: 0.000014 - momentum: 0.000000
2024-03-26 15:25:46,793 ----------------------------------------------------------------------------------------------------
2024-03-26 15:25:46,793 EPOCH 6 done: loss 0.0725 - lr: 0.000014
2024-03-26 15:25:47,703 DEV : loss 0.18871811032295227 - f1-score (micro avg) 0.9027
2024-03-26 15:25:47,704 ----------------------------------------------------------------------------------------------------
2024-03-26 15:25:49,029 epoch 7 - iter 9/95 - loss 0.08815916 - time (sec): 1.32 - samples/sec: 2233.22 - lr: 0.000013 - momentum: 0.000000
2024-03-26 15:25:50,646 epoch 7 - iter 18/95 - loss 0.07169953 - time (sec): 2.94 - samples/sec: 1996.51 - lr: 0.000013 - momentum: 0.000000
2024-03-26 15:25:52,445 epoch 7 - iter 27/95 - loss 0.07522697 - time (sec): 4.74 - samples/sec: 1928.44 - lr: 0.000013 - momentum: 0.000000
2024-03-26 15:25:54,312 epoch 7 - iter 36/95 - loss 0.06714080 - time (sec): 6.61 - samples/sec: 1893.53 - lr: 0.000012 - momentum: 0.000000
2024-03-26 15:25:56,599 epoch 7 - iter 45/95 - loss 0.06055062 - time (sec): 8.89 - samples/sec: 1842.49 - lr: 0.000012 - momentum: 0.000000
2024-03-26 15:25:57,578 epoch 7 - iter 54/95 - loss 0.06119999 - time (sec): 9.87 - samples/sec: 1918.84 - lr: 0.000012 - momentum: 0.000000
2024-03-26 15:25:59,426 epoch 7 - iter 63/95 - loss 0.05722868 - time (sec): 11.72 - samples/sec: 1919.19 - lr: 0.000011 - momentum: 0.000000
2024-03-26 15:26:01,331 epoch 7 - iter 72/95 - loss 0.05507776 - time (sec): 13.63 - samples/sec: 1879.65 - lr: 0.000011 - momentum: 0.000000
2024-03-26 15:26:03,267 epoch 7 - iter 81/95 - loss 0.05448703 - time (sec): 15.56 - samples/sec: 1876.13 - lr: 0.000011 - momentum: 0.000000
2024-03-26 15:26:05,199 epoch 7 - iter 90/95 - loss 0.05399226 - time (sec): 17.49 - samples/sec: 1879.34 - lr: 0.000010 - momentum: 0.000000
2024-03-26 15:26:06,024 ----------------------------------------------------------------------------------------------------
2024-03-26 15:26:06,025 EPOCH 7 done: loss 0.0539 - lr: 0.000010
2024-03-26 15:26:06,957 DEV : loss 0.18794356286525726 - f1-score (micro avg) 0.9148
2024-03-26 15:26:06,958 saving best model
2024-03-26 15:26:07,419 ----------------------------------------------------------------------------------------------------
2024-03-26 15:26:09,010 epoch 8 - iter 9/95 - loss 0.05739097 - time (sec): 1.59 - samples/sec: 1880.34 - lr: 0.000010 - momentum: 0.000000
2024-03-26 15:26:11,015 epoch 8 - iter 18/95 - loss 0.05214964 - time (sec): 3.60 - samples/sec: 1691.16 - lr: 0.000010 - momentum: 0.000000
2024-03-26 15:26:12,574 epoch 8 - iter 27/95 - loss 0.05666378 - time (sec): 5.15 - samples/sec: 1785.67 - lr: 0.000009 - momentum: 0.000000
2024-03-26 15:26:14,284 epoch 8 - iter 36/95 - loss 0.05500534 - time (sec): 6.87 - samples/sec: 1833.93 - lr: 0.000009 - momentum: 0.000000
2024-03-26 15:26:16,576 epoch 8 - iter 45/95 - loss 0.04735151 - time (sec): 9.16 - samples/sec: 1815.50 - lr: 0.000009 - momentum: 0.000000
2024-03-26 15:26:18,867 epoch 8 - iter 54/95 - loss 0.04751933 - time (sec): 11.45 - samples/sec: 1819.00 - lr: 0.000008 - momentum: 0.000000
2024-03-26 15:26:20,811 epoch 8 - iter 63/95 - loss 0.04909725 - time (sec): 13.39 - samples/sec: 1823.14 - lr: 0.000008 - momentum: 0.000000
2024-03-26 15:26:21,890 epoch 8 - iter 72/95 - loss 0.04820220 - time (sec): 14.47 - samples/sec: 1855.66 - lr: 0.000008 - momentum: 0.000000
2024-03-26 15:26:23,542 epoch 8 - iter 81/95 - loss 0.04660140 - time (sec): 16.12 - samples/sec: 1840.90 - lr: 0.000007 - momentum: 0.000000
2024-03-26 15:26:24,903 epoch 8 - iter 90/95 - loss 0.04558973 - time (sec): 17.48 - samples/sec: 1856.55 - lr: 0.000007 - momentum: 0.000000
2024-03-26 15:26:26,112 ----------------------------------------------------------------------------------------------------
2024-03-26 15:26:26,112 EPOCH 8 done: loss 0.0478 - lr: 0.000007
2024-03-26 15:26:27,022 DEV : loss 0.1870705783367157 - f1-score (micro avg) 0.924
2024-03-26 15:26:27,025 saving best model
2024-03-26 15:26:27,485 ----------------------------------------------------------------------------------------------------
2024-03-26 15:26:29,237 epoch 9 - iter 9/95 - loss 0.02106672 - time (sec): 1.75 - samples/sec: 1983.41 - lr: 0.000007 - momentum: 0.000000
2024-03-26 15:26:31,152 epoch 9 - iter 18/95 - loss 0.02225839 - time (sec): 3.67 - samples/sec: 1843.01 - lr: 0.000006 - momentum: 0.000000
2024-03-26 15:26:32,991 epoch 9 - iter 27/95 - loss 0.02596204 - time (sec): 5.51 - samples/sec: 1784.61 - lr: 0.000006 - momentum: 0.000000
2024-03-26 15:26:34,917 epoch 9 - iter 36/95 - loss 0.03410280 - time (sec): 7.43 - samples/sec: 1811.69 - lr: 0.000006 - momentum: 0.000000
2024-03-26 15:26:36,789 epoch 9 - iter 45/95 - loss 0.03413443 - time (sec): 9.30 - samples/sec: 1792.29 - lr: 0.000005 - momentum: 0.000000
2024-03-26 15:26:38,645 epoch 9 - iter 54/95 - loss 0.03408301 - time (sec): 11.16 - samples/sec: 1822.88 - lr: 0.000005 - momentum: 0.000000
2024-03-26 15:26:40,516 epoch 9 - iter 63/95 - loss 0.03420543 - time (sec): 13.03 - samples/sec: 1822.30 - lr: 0.000005 - momentum: 0.000000
2024-03-26 15:26:42,088 epoch 9 - iter 72/95 - loss 0.03587913 - time (sec): 14.60 - samples/sec: 1833.40 - lr: 0.000004 - momentum: 0.000000
2024-03-26 15:26:43,788 epoch 9 - iter 81/95 - loss 0.03734644 - time (sec): 16.30 - samples/sec: 1824.04 - lr: 0.000004 - momentum: 0.000000
2024-03-26 15:26:45,538 epoch 9 - iter 90/95 - loss 0.03562347 - time (sec): 18.05 - samples/sec: 1841.38 - lr: 0.000004 - momentum: 0.000000
2024-03-26 15:26:46,038 ----------------------------------------------------------------------------------------------------
2024-03-26 15:26:46,038 EPOCH 9 done: loss 0.0360 - lr: 0.000004
2024-03-26 15:26:46,937 DEV : loss 0.194667786359787 - f1-score (micro avg) 0.9249
2024-03-26 15:26:46,938 saving best model
2024-03-26 15:26:47,393 ----------------------------------------------------------------------------------------------------
2024-03-26 15:26:48,863 epoch 10 - iter 9/95 - loss 0.01764166 - time (sec): 1.47 - samples/sec: 1891.64 - lr: 0.000003 - momentum: 0.000000
2024-03-26 15:26:50,677 epoch 10 - iter 18/95 - loss 0.02086608 - time (sec): 3.28 - samples/sec: 1841.60 - lr: 0.000003 - momentum: 0.000000
2024-03-26 15:26:52,806 epoch 10 - iter 27/95 - loss 0.02960285 - time (sec): 5.41 - samples/sec: 1786.52 - lr: 0.000003 - momentum: 0.000000
2024-03-26 15:26:54,657 epoch 10 - iter 36/95 - loss 0.03318762 - time (sec): 7.26 - samples/sec: 1806.19 - lr: 0.000002 - momentum: 0.000000
2024-03-26 15:26:55,822 epoch 10 - iter 45/95 - loss 0.03215798 - time (sec): 8.43 - samples/sec: 1860.01 - lr: 0.000002 - momentum: 0.000000
2024-03-26 15:26:57,715 epoch 10 - iter 54/95 - loss 0.03308007 - time (sec): 10.32 - samples/sec: 1844.85 - lr: 0.000002 - momentum: 0.000000
2024-03-26 15:26:59,093 epoch 10 - iter 63/95 - loss 0.03432878 - time (sec): 11.70 - samples/sec: 1857.55 - lr: 0.000001 - momentum: 0.000000
2024-03-26 15:27:01,335 epoch 10 - iter 72/95 - loss 0.02991145 - time (sec): 13.94 - samples/sec: 1837.36 - lr: 0.000001 - momentum: 0.000000
2024-03-26 15:27:03,633 epoch 10 - iter 81/95 - loss 0.03419502 - time (sec): 16.24 - samples/sec: 1818.46 - lr: 0.000001 - momentum: 0.000000
2024-03-26 15:27:05,471 epoch 10 - iter 90/95 - loss 0.03200818 - time (sec): 18.08 - samples/sec: 1810.90 - lr: 0.000000 - momentum: 0.000000
2024-03-26 15:27:06,480 ----------------------------------------------------------------------------------------------------
2024-03-26 15:27:06,480 EPOCH 10 done: loss 0.0310 - lr: 0.000000
2024-03-26 15:27:07,403 DEV : loss 0.1904587596654892 - f1-score (micro avg) 0.9336
2024-03-26 15:27:07,404 saving best model
2024-03-26 15:27:08,189 ----------------------------------------------------------------------------------------------------
2024-03-26 15:27:08,189 Loading model from best epoch ...
2024-03-26 15:27:09,080 SequenceTagger predicts: Dictionary with 17 tags: O, S-Unternehmen, B-Unternehmen, E-Unternehmen, I-Unternehmen, S-Auslagerung, B-Auslagerung, E-Auslagerung, I-Auslagerung, S-Ort, B-Ort, E-Ort, I-Ort, S-Software, B-Software, E-Software, I-Software
2024-03-26 15:27:09,836
Results:
- F-score (micro) 0.9121
- F-score (macro) 0.6924
- Accuracy 0.8408
By class:
              precision    recall  f1-score   support

 Unternehmen     0.9147    0.8872    0.9008       266
 Auslagerung     0.8707    0.9197    0.8945       249
         Ort     0.9635    0.9851    0.9742       134
    Software     0.0000    0.0000    0.0000         0

   micro avg     0.9045    0.9199    0.9121       649
   macro avg     0.6872    0.6980    0.6924       649
weighted avg     0.9079    0.9199    0.9135       649
2024-03-26 15:27:09,836 ----------------------------------------------------------------------------------------------------