2024-03-26 11:04:36,871 ----------------------------------------------------------------------------------------------------
2024-03-26 11:04:36,871 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(30001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2024-03-26 11:04:36,871 ----------------------------------------------------------------------------------------------------
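The module tree above can be reproduced with a few lines of Flair. The sketch below is a hedged reconstruction, not the author's actual script: the backbone checkpoint name ("bert-base-german-cased") and the "ner" label type are guesses based on the run name and the tag list at the end of this log; in a recent Flair release the four entity classes are expanded into the 17 BIOES tags that the Linear(768 -> 17) decoder predicts.

from flair.data import Dictionary
from flair.embeddings import TransformerWordEmbeddings
from flair.models import SequenceTagger

# The four entity classes of the CO-Fun data; Flair expands them into the
# 17 BIOES tags listed at the end of this log.
label_dict = Dictionary(add_unk=False)
for entity in ("Unternehmen", "Auslagerung", "Ort", "Software"):
    label_dict.add_item(entity)

# Assumed backbone: a German BERT base model (768-dim encoder, ~30k vocabulary).
embeddings = TransformerWordEmbeddings("bert-base-german-cased", fine_tune=True)

# Mirrors the printed structure: no RNN, no CRF, no reprojection layer,
# LockedDropout(p=0.5) and a Linear(768 -> 17) decoder with cross-entropy loss.
tagger = SequenceTagger(
    embeddings=embeddings,
    tag_dictionary=label_dict,
    tag_type="ner",            # assumed label type
    use_rnn=False,
    use_crf=False,
    reproject_embeddings=False,
    word_dropout=0.0,          # no WordDropout module appears in the printed model
    locked_dropout=0.5,
)
print(tagger)                  # prints a module tree like the one above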
2024-03-26 11:04:36,871 Corpus: 758 train + 94 dev + 96 test sentences
2024-03-26 11:04:36,871 ----------------------------------------------------------------------------------------------------
2024-03-26 11:04:36,871 Train: 758 sentences
2024-03-26 11:04:36,871 (train_with_dev=False, train_with_test=False)
2024-03-26 11:04:36,871 ----------------------------------------------------------------------------------------------------
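For reference, a corpus with these split sizes would typically be loaded as a CoNLL-style ColumnCorpus. Folder, file names and column layout below are assumptions; the log records only the split sizes.

from flair.datasets import ColumnCorpus

corpus = ColumnCorpus(
    data_folder="data/co-funer",          # hypothetical location of the CO-Fun files
    column_format={0: "text", 1: "ner"},  # assumed two-column CoNLL layout
    train_file="train.txt",
    dev_file="dev.txt",
    test_file="test.txt",
)
print(corpus)  # should report 758 train + 94 dev + 96 test sentences

# Base label dictionary (Unternehmen, Auslagerung, Ort, Software), which the
# tagger later expands into BIOES tags.
label_dict = corpus.make_label_dictionary(label_type="ner")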
2024-03-26 11:04:36,871 Training Params:
2024-03-26 11:04:36,871 - learning_rate: "3e-05"
2024-03-26 11:04:36,871 - mini_batch_size: "8"
2024-03-26 11:04:36,871 - max_epochs: "10"
2024-03-26 11:04:36,871 - shuffle: "True"
2024-03-26 11:04:36,871 ----------------------------------------------------------------------------------------------------
2024-03-26 11:04:36,871 Plugins:
2024-03-26 11:04:36,871 - TensorboardLogger
2024-03-26 11:04:36,871 - LinearScheduler | warmup_fraction: '0.1'
2024-03-26 11:04:36,871 ----------------------------------------------------------------------------------------------------
2024-03-26 11:04:36,871 Final evaluation on model from best epoch (best-model.pt)
2024-03-26 11:04:36,871 - metric: "('micro avg', 'f1-score')"
2024-03-26 11:04:36,871 ----------------------------------------------------------------------------------------------------
2024-03-26 11:04:36,871 Computation:
2024-03-26 11:04:36,871 - compute on device: cuda:0
2024-03-26 11:04:36,871 - embedding storage: none
2024-03-26 11:04:36,871 ----------------------------------------------------------------------------------------------------
2024-03-26 11:04:36,871 Model training base path: "flair-co-funer-german_bert_base-bs8-e10-lr3e-05-1"
2024-03-26 11:04:36,871 ----------------------------------------------------------------------------------------------------
2024-03-26 11:04:36,871 ----------------------------------------------------------------------------------------------------
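The training parameters, scheduler and evaluation metric listed above correspond to Flair's fine_tune() entry point. A minimal sketch, continuing from the corpus and tagger sketches above; only the hyperparameters echoed in this log are fixed, everything else follows fine_tune() defaults.

from flair.trainers import ModelTrainer

# `tagger` and `corpus` are the objects from the sketches above.
trainer = ModelTrainer(tagger, corpus)

trainer.fine_tune(
    "flair-co-funer-german_bert_base-bs8-e10-lr3e-05-1",  # base path printed above
    learning_rate=3e-5,
    mini_batch_size=8,
    max_epochs=10,
    main_evaluation_metric=("micro avg", "f1-score"),     # selects best-model.pt
)
# fine_tune() trains with AdamW and a linear LR schedule (warmup fraction 0.1 by
# default), which matches the rising-then-falling lr and the zero momentum in the
# iteration lines below. TensorBoard logging was additionally enabled via a
# trainer plugin, as the "Plugins" block above shows.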
2024-03-26 11:04:36,871 Logging anything other than scalars to TensorBoard is currently not supported.
2024-03-26 11:04:38,523 epoch 1 - iter 9/95 - loss 3.40933677 - time (sec): 1.65 - samples/sec: 1864.22 - lr: 0.000003 - momentum: 0.000000
2024-03-26 11:04:40,110 epoch 1 - iter 18/95 - loss 3.26210809 - time (sec): 3.24 - samples/sec: 1930.43 - lr: 0.000005 - momentum: 0.000000
2024-03-26 11:04:42,628 epoch 1 - iter 27/95 - loss 3.07575274 - time (sec): 5.76 - samples/sec: 1778.83 - lr: 0.000008 - momentum: 0.000000
2024-03-26 11:04:44,926 epoch 1 - iter 36/95 - loss 2.85331391 - time (sec): 8.05 - samples/sec: 1735.76 - lr: 0.000011 - momentum: 0.000000
2024-03-26 11:04:46,899 epoch 1 - iter 45/95 - loss 2.65661608 - time (sec): 10.03 - samples/sec: 1741.50 - lr: 0.000014 - momentum: 0.000000
2024-03-26 11:04:48,188 epoch 1 - iter 54/95 - loss 2.51764366 - time (sec): 11.32 - samples/sec: 1779.71 - lr: 0.000017 - momentum: 0.000000
2024-03-26 11:04:49,969 epoch 1 - iter 63/95 - loss 2.36976031 - time (sec): 13.10 - samples/sec: 1776.34 - lr: 0.000020 - momentum: 0.000000
2024-03-26 11:04:51,327 epoch 1 - iter 72/95 - loss 2.25457005 - time (sec): 14.46 - samples/sec: 1802.05 - lr: 0.000022 - momentum: 0.000000
2024-03-26 11:04:53,398 epoch 1 - iter 81/95 - loss 2.11328222 - time (sec): 16.53 - samples/sec: 1792.24 - lr: 0.000025 - momentum: 0.000000
2024-03-26 11:04:54,764 epoch 1 - iter 90/95 - loss 2.00924946 - time (sec): 17.89 - samples/sec: 1813.13 - lr: 0.000028 - momentum: 0.000000
2024-03-26 11:04:56,048 ----------------------------------------------------------------------------------------------------
2024-03-26 11:04:56,048 EPOCH 1 done: loss 1.9314 - lr: 0.000028
2024-03-26 11:04:56,903 DEV : loss 0.5613349676132202 - f1-score (micro avg) 0.6334
2024-03-26 11:04:56,904 saving best model
2024-03-26 11:04:57,197 ----------------------------------------------------------------------------------------------------
2024-03-26 11:04:59,380 epoch 2 - iter 9/95 - loss 0.57723266 - time (sec): 2.18 - samples/sec: 1692.36 - lr: 0.000030 - momentum: 0.000000
2024-03-26 11:05:01,132 epoch 2 - iter 18/95 - loss 0.59299330 - time (sec): 3.93 - samples/sec: 1843.91 - lr: 0.000029 - momentum: 0.000000
2024-03-26 11:05:02,998 epoch 2 - iter 27/95 - loss 0.56105796 - time (sec): 5.80 - samples/sec: 1777.29 - lr: 0.000029 - momentum: 0.000000
2024-03-26 11:05:04,798 epoch 2 - iter 36/95 - loss 0.52648710 - time (sec): 7.60 - samples/sec: 1759.18 - lr: 0.000029 - momentum: 0.000000
2024-03-26 11:05:06,726 epoch 2 - iter 45/95 - loss 0.49076722 - time (sec): 9.53 - samples/sec: 1777.21 - lr: 0.000028 - momentum: 0.000000
2024-03-26 11:05:08,976 epoch 2 - iter 54/95 - loss 0.45479843 - time (sec): 11.78 - samples/sec: 1752.91 - lr: 0.000028 - momentum: 0.000000
2024-03-26 11:05:10,325 epoch 2 - iter 63/95 - loss 0.44881549 - time (sec): 13.13 - samples/sec: 1794.48 - lr: 0.000028 - momentum: 0.000000
2024-03-26 11:05:11,692 epoch 2 - iter 72/95 - loss 0.43506525 - time (sec): 14.49 - samples/sec: 1824.98 - lr: 0.000028 - momentum: 0.000000
2024-03-26 11:05:13,510 epoch 2 - iter 81/95 - loss 0.42206134 - time (sec): 16.31 - samples/sec: 1814.64 - lr: 0.000027 - momentum: 0.000000
2024-03-26 11:05:15,178 epoch 2 - iter 90/95 - loss 0.41737343 - time (sec): 17.98 - samples/sec: 1815.09 - lr: 0.000027 - momentum: 0.000000
2024-03-26 11:05:16,110 ----------------------------------------------------------------------------------------------------
2024-03-26 11:05:16,110 EPOCH 2 done: loss 0.4112 - lr: 0.000027
2024-03-26 11:05:17,069 DEV : loss 0.31459930539131165 - f1-score (micro avg) 0.807
2024-03-26 11:05:17,070 saving best model
2024-03-26 11:05:17,555 ----------------------------------------------------------------------------------------------------
2024-03-26 11:05:19,551 epoch 3 - iter 9/95 - loss 0.31440666 - time (sec): 2.00 - samples/sec: 1681.68 - lr: 0.000026 - momentum: 0.000000
2024-03-26 11:05:21,633 epoch 3 - iter 18/95 - loss 0.28335954 - time (sec): 4.08 - samples/sec: 1650.51 - lr: 0.000026 - momentum: 0.000000
2024-03-26 11:05:23,006 epoch 3 - iter 27/95 - loss 0.27402111 - time (sec): 5.45 - samples/sec: 1755.03 - lr: 0.000026 - momentum: 0.000000
2024-03-26 11:05:25,527 epoch 3 - iter 36/95 - loss 0.26965058 - time (sec): 7.97 - samples/sec: 1696.07 - lr: 0.000025 - momentum: 0.000000
2024-03-26 11:05:27,809 epoch 3 - iter 45/95 - loss 0.25296598 - time (sec): 10.25 - samples/sec: 1731.51 - lr: 0.000025 - momentum: 0.000000
2024-03-26 11:05:29,018 epoch 3 - iter 54/95 - loss 0.24175864 - time (sec): 11.46 - samples/sec: 1787.67 - lr: 0.000025 - momentum: 0.000000
2024-03-26 11:05:31,057 epoch 3 - iter 63/95 - loss 0.22898676 - time (sec): 13.50 - samples/sec: 1764.24 - lr: 0.000025 - momentum: 0.000000
2024-03-26 11:05:32,704 epoch 3 - iter 72/95 - loss 0.21672189 - time (sec): 15.15 - samples/sec: 1773.05 - lr: 0.000024 - momentum: 0.000000
2024-03-26 11:05:34,527 epoch 3 - iter 81/95 - loss 0.21563889 - time (sec): 16.97 - samples/sec: 1762.62 - lr: 0.000024 - momentum: 0.000000
2024-03-26 11:05:36,799 epoch 3 - iter 90/95 - loss 0.20715295 - time (sec): 19.24 - samples/sec: 1731.07 - lr: 0.000024 - momentum: 0.000000
2024-03-26 11:05:37,291 ----------------------------------------------------------------------------------------------------
2024-03-26 11:05:37,291 EPOCH 3 done: loss 0.2084 - lr: 0.000024
2024-03-26 11:05:38,237 DEV : loss 0.24972011148929596 - f1-score (micro avg) 0.8552
2024-03-26 11:05:38,238 saving best model
2024-03-26 11:05:38,713 ----------------------------------------------------------------------------------------------------
2024-03-26 11:05:40,338 epoch 4 - iter 9/95 - loss 0.16002693 - time (sec): 1.62 - samples/sec: 1983.32 - lr: 0.000023 - momentum: 0.000000
2024-03-26 11:05:42,433 epoch 4 - iter 18/95 - loss 0.13594050 - time (sec): 3.72 - samples/sec: 1734.08 - lr: 0.000023 - momentum: 0.000000
2024-03-26 11:05:44,241 epoch 4 - iter 27/95 - loss 0.14400724 - time (sec): 5.53 - samples/sec: 1762.62 - lr: 0.000022 - momentum: 0.000000
2024-03-26 11:05:46,829 epoch 4 - iter 36/95 - loss 0.12468849 - time (sec): 8.11 - samples/sec: 1697.88 - lr: 0.000022 - momentum: 0.000000
2024-03-26 11:05:48,532 epoch 4 - iter 45/95 - loss 0.13305017 - time (sec): 9.82 - samples/sec: 1719.67 - lr: 0.000022 - momentum: 0.000000
2024-03-26 11:05:50,109 epoch 4 - iter 54/95 - loss 0.14059580 - time (sec): 11.40 - samples/sec: 1769.99 - lr: 0.000022 - momentum: 0.000000
2024-03-26 11:05:52,067 epoch 4 - iter 63/95 - loss 0.14104997 - time (sec): 13.35 - samples/sec: 1782.75 - lr: 0.000021 - momentum: 0.000000
2024-03-26 11:05:53,365 epoch 4 - iter 72/95 - loss 0.14056154 - time (sec): 14.65 - samples/sec: 1813.54 - lr: 0.000021 - momentum: 0.000000
2024-03-26 11:05:55,134 epoch 4 - iter 81/95 - loss 0.13741921 - time (sec): 16.42 - samples/sec: 1802.26 - lr: 0.000021 - momentum: 0.000000
2024-03-26 11:05:56,677 epoch 4 - iter 90/95 - loss 0.13423455 - time (sec): 17.96 - samples/sec: 1821.25 - lr: 0.000020 - momentum: 0.000000
2024-03-26 11:05:57,609 ----------------------------------------------------------------------------------------------------
2024-03-26 11:05:57,609 EPOCH 4 done: loss 0.1333 - lr: 0.000020
2024-03-26 11:05:58,553 DEV : loss 0.2319207489490509 - f1-score (micro avg) 0.8966
2024-03-26 11:05:58,554 saving best model
2024-03-26 11:05:59,032 ----------------------------------------------------------------------------------------------------
2024-03-26 11:06:00,701 epoch 5 - iter 9/95 - loss 0.09106046 - time (sec): 1.67 - samples/sec: 1896.96 - lr: 0.000020 - momentum: 0.000000
2024-03-26 11:06:02,974 epoch 5 - iter 18/95 - loss 0.11214134 - time (sec): 3.94 - samples/sec: 1700.70 - lr: 0.000019 - momentum: 0.000000
2024-03-26 11:06:04,590 epoch 5 - iter 27/95 - loss 0.10946660 - time (sec): 5.56 - samples/sec: 1745.51 - lr: 0.000019 - momentum: 0.000000
2024-03-26 11:06:06,311 epoch 5 - iter 36/95 - loss 0.10398039 - time (sec): 7.28 - samples/sec: 1733.22 - lr: 0.000019 - momentum: 0.000000
2024-03-26 11:06:08,070 epoch 5 - iter 45/95 - loss 0.11325312 - time (sec): 9.04 - samples/sec: 1775.03 - lr: 0.000019 - momentum: 0.000000
2024-03-26 11:06:09,736 epoch 5 - iter 54/95 - loss 0.11530607 - time (sec): 10.70 - samples/sec: 1817.06 - lr: 0.000018 - momentum: 0.000000
2024-03-26 11:06:11,620 epoch 5 - iter 63/95 - loss 0.10923260 - time (sec): 12.59 - samples/sec: 1799.95 - lr: 0.000018 - momentum: 0.000000
2024-03-26 11:06:13,849 epoch 5 - iter 72/95 - loss 0.10006288 - time (sec): 14.82 - samples/sec: 1831.56 - lr: 0.000018 - momentum: 0.000000
2024-03-26 11:06:15,138 epoch 5 - iter 81/95 - loss 0.10120984 - time (sec): 16.11 - samples/sec: 1849.01 - lr: 0.000017 - momentum: 0.000000
2024-03-26 11:06:17,349 epoch 5 - iter 90/95 - loss 0.09734972 - time (sec): 18.32 - samples/sec: 1807.97 - lr: 0.000017 - momentum: 0.000000
2024-03-26 11:06:17,987 ----------------------------------------------------------------------------------------------------
2024-03-26 11:06:17,987 EPOCH 5 done: loss 0.0979 - lr: 0.000017
2024-03-26 11:06:18,961 DEV : loss 0.2127046436071396 - f1-score (micro avg) 0.8974
2024-03-26 11:06:18,963 saving best model
2024-03-26 11:06:19,438 ----------------------------------------------------------------------------------------------------
2024-03-26 11:06:21,034 epoch 6 - iter 9/95 - loss 0.04200073 - time (sec): 1.59 - samples/sec: 1813.67 - lr: 0.000016 - momentum: 0.000000
2024-03-26 11:06:23,107 epoch 6 - iter 18/95 - loss 0.05948374 - time (sec): 3.67 - samples/sec: 1788.89 - lr: 0.000016 - momentum: 0.000000
2024-03-26 11:06:24,828 epoch 6 - iter 27/95 - loss 0.06671877 - time (sec): 5.39 - samples/sec: 1824.30 - lr: 0.000016 - momentum: 0.000000
2024-03-26 11:06:26,526 epoch 6 - iter 36/95 - loss 0.06392011 - time (sec): 7.09 - samples/sec: 1788.22 - lr: 0.000016 - momentum: 0.000000
2024-03-26 11:06:28,148 epoch 6 - iter 45/95 - loss 0.07242741 - time (sec): 8.71 - samples/sec: 1805.81 - lr: 0.000015 - momentum: 0.000000
2024-03-26 11:06:30,233 epoch 6 - iter 54/95 - loss 0.07406039 - time (sec): 10.79 - samples/sec: 1781.25 - lr: 0.000015 - momentum: 0.000000
2024-03-26 11:06:31,868 epoch 6 - iter 63/95 - loss 0.07635195 - time (sec): 12.43 - samples/sec: 1778.68 - lr: 0.000015 - momentum: 0.000000
2024-03-26 11:06:34,770 epoch 6 - iter 72/95 - loss 0.07127343 - time (sec): 15.33 - samples/sec: 1739.55 - lr: 0.000014 - momentum: 0.000000
2024-03-26 11:06:36,712 epoch 6 - iter 81/95 - loss 0.06987351 - time (sec): 17.27 - samples/sec: 1743.49 - lr: 0.000014 - momentum: 0.000000
2024-03-26 11:06:38,441 epoch 6 - iter 90/95 - loss 0.07065652 - time (sec): 19.00 - samples/sec: 1737.38 - lr: 0.000014 - momentum: 0.000000
2024-03-26 11:06:39,053 ----------------------------------------------------------------------------------------------------
2024-03-26 11:06:39,053 EPOCH 6 done: loss 0.0710 - lr: 0.000014
2024-03-26 11:06:40,007 DEV : loss 0.2133890837430954 - f1-score (micro avg) 0.9046
2024-03-26 11:06:40,008 saving best model
2024-03-26 11:06:40,490 ----------------------------------------------------------------------------------------------------
2024-03-26 11:06:41,856 epoch 7 - iter 9/95 - loss 0.08524458 - time (sec): 1.37 - samples/sec: 2165.50 - lr: 0.000013 - momentum: 0.000000
2024-03-26 11:06:43,572 epoch 7 - iter 18/95 - loss 0.07552488 - time (sec): 3.08 - samples/sec: 1906.20 - lr: 0.000013 - momentum: 0.000000
2024-03-26 11:06:45,409 epoch 7 - iter 27/95 - loss 0.07187309 - time (sec): 4.92 - samples/sec: 1858.85 - lr: 0.000013 - momentum: 0.000000
2024-03-26 11:06:47,334 epoch 7 - iter 36/95 - loss 0.06534470 - time (sec): 6.84 - samples/sec: 1828.25 - lr: 0.000012 - momentum: 0.000000
2024-03-26 11:06:49,747 epoch 7 - iter 45/95 - loss 0.06072521 - time (sec): 9.26 - samples/sec: 1770.58 - lr: 0.000012 - momentum: 0.000000
2024-03-26 11:06:50,760 epoch 7 - iter 54/95 - loss 0.06119397 - time (sec): 10.27 - samples/sec: 1844.86 - lr: 0.000012 - momentum: 0.000000
2024-03-26 11:06:52,656 epoch 7 - iter 63/95 - loss 0.05777551 - time (sec): 12.16 - samples/sec: 1849.25 - lr: 0.000011 - momentum: 0.000000
2024-03-26 11:06:54,617 epoch 7 - iter 72/95 - loss 0.05549539 - time (sec): 14.13 - samples/sec: 1813.17 - lr: 0.000011 - momentum: 0.000000
2024-03-26 11:06:56,639 epoch 7 - iter 81/95 - loss 0.05670037 - time (sec): 16.15 - samples/sec: 1808.04 - lr: 0.000011 - momentum: 0.000000
2024-03-26 11:06:58,649 epoch 7 - iter 90/95 - loss 0.05591996 - time (sec): 18.16 - samples/sec: 1810.66 - lr: 0.000010 - momentum: 0.000000
2024-03-26 11:06:59,508 ----------------------------------------------------------------------------------------------------
2024-03-26 11:06:59,508 EPOCH 7 done: loss 0.0559 - lr: 0.000010
2024-03-26 11:07:00,471 DEV : loss 0.19283850491046906 - f1-score (micro avg) 0.9148
2024-03-26 11:07:00,473 saving best model
2024-03-26 11:07:00,949 ----------------------------------------------------------------------------------------------------
2024-03-26 11:07:02,622 epoch 8 - iter 9/95 - loss 0.04158058 - time (sec): 1.67 - samples/sec: 1788.85 - lr: 0.000010 - momentum: 0.000000
2024-03-26 11:07:04,677 epoch 8 - iter 18/95 - loss 0.03721012 - time (sec): 3.73 - samples/sec: 1631.56 - lr: 0.000010 - momentum: 0.000000
2024-03-26 11:07:06,284 epoch 8 - iter 27/95 - loss 0.03911622 - time (sec): 5.33 - samples/sec: 1725.71 - lr: 0.000009 - momentum: 0.000000
2024-03-26 11:07:08,047 epoch 8 - iter 36/95 - loss 0.04077943 - time (sec): 7.10 - samples/sec: 1773.99 - lr: 0.000009 - momentum: 0.000000
2024-03-26 11:07:10,485 epoch 8 - iter 45/95 - loss 0.03480845 - time (sec): 9.53 - samples/sec: 1743.66 - lr: 0.000009 - momentum: 0.000000
2024-03-26 11:07:12,891 epoch 8 - iter 54/95 - loss 0.03949966 - time (sec): 11.94 - samples/sec: 1743.95 - lr: 0.000008 - momentum: 0.000000
2024-03-26 11:07:14,890 epoch 8 - iter 63/95 - loss 0.04061254 - time (sec): 13.94 - samples/sec: 1751.54 - lr: 0.000008 - momentum: 0.000000
2024-03-26 11:07:15,995 epoch 8 - iter 72/95 - loss 0.04045703 - time (sec): 15.04 - samples/sec: 1784.82 - lr: 0.000008 - momentum: 0.000000
2024-03-26 11:07:17,702 epoch 8 - iter 81/95 - loss 0.04072728 - time (sec): 16.75 - samples/sec: 1771.85 - lr: 0.000007 - momentum: 0.000000
2024-03-26 11:07:19,135 epoch 8 - iter 90/95 - loss 0.04136720 - time (sec): 18.18 - samples/sec: 1784.94 - lr: 0.000007 - momentum: 0.000000
2024-03-26 11:07:20,399 ----------------------------------------------------------------------------------------------------
2024-03-26 11:07:20,399 EPOCH 8 done: loss 0.0431 - lr: 0.000007
2024-03-26 11:07:21,357 DEV : loss 0.22110594809055328 - f1-score (micro avg) 0.9234
2024-03-26 11:07:21,358 saving best model
2024-03-26 11:07:21,819 ----------------------------------------------------------------------------------------------------
2024-03-26 11:07:23,659 epoch 9 - iter 9/95 - loss 0.02117655 - time (sec): 1.84 - samples/sec: 1888.79 - lr: 0.000007 - momentum: 0.000000
2024-03-26 11:07:25,630 epoch 9 - iter 18/95 - loss 0.01735997 - time (sec): 3.81 - samples/sec: 1773.36 - lr: 0.000006 - momentum: 0.000000
2024-03-26 11:07:27,518 epoch 9 - iter 27/95 - loss 0.02071184 - time (sec): 5.70 - samples/sec: 1724.52 - lr: 0.000006 - momentum: 0.000000
2024-03-26 11:07:29,467 epoch 9 - iter 36/95 - loss 0.03209492 - time (sec): 7.65 - samples/sec: 1760.60 - lr: 0.000006 - momentum: 0.000000
2024-03-26 11:07:31,408 epoch 9 - iter 45/95 - loss 0.03322818 - time (sec): 9.59 - samples/sec: 1739.09 - lr: 0.000005 - momentum: 0.000000
2024-03-26 11:07:33,300 epoch 9 - iter 54/95 - loss 0.03241033 - time (sec): 11.48 - samples/sec: 1771.92 - lr: 0.000005 - momentum: 0.000000
2024-03-26 11:07:35,232 epoch 9 - iter 63/95 - loss 0.03415390 - time (sec): 13.41 - samples/sec: 1770.43 - lr: 0.000005 - momentum: 0.000000
2024-03-26 11:07:36,896 epoch 9 - iter 72/95 - loss 0.03629789 - time (sec): 15.08 - samples/sec: 1775.74 - lr: 0.000004 - momentum: 0.000000
2024-03-26 11:07:38,662 epoch 9 - iter 81/95 - loss 0.03751881 - time (sec): 16.84 - samples/sec: 1765.57 - lr: 0.000004 - momentum: 0.000000
2024-03-26 11:07:40,478 epoch 9 - iter 90/95 - loss 0.03567593 - time (sec): 18.66 - samples/sec: 1781.63 - lr: 0.000004 - momentum: 0.000000
2024-03-26 11:07:40,996 ----------------------------------------------------------------------------------------------------
2024-03-26 11:07:40,996 EPOCH 9 done: loss 0.0366 - lr: 0.000004
2024-03-26 11:07:41,962 DEV : loss 0.21862706542015076 - f1-score (micro avg) 0.9222
2024-03-26 11:07:41,963 ----------------------------------------------------------------------------------------------------
2024-03-26 11:07:43,480 epoch 10 - iter 9/95 - loss 0.01060765 - time (sec): 1.52 - samples/sec: 1831.02 - lr: 0.000003 - momentum: 0.000000
2024-03-26 11:07:45,382 epoch 10 - iter 18/95 - loss 0.01790097 - time (sec): 3.42 - samples/sec: 1768.27 - lr: 0.000003 - momentum: 0.000000
2024-03-26 11:07:47,560 epoch 10 - iter 27/95 - loss 0.02783328 - time (sec): 5.60 - samples/sec: 1727.46 - lr: 0.000003 - momentum: 0.000000
2024-03-26 11:07:49,513 epoch 10 - iter 36/95 - loss 0.03132828 - time (sec): 7.55 - samples/sec: 1737.31 - lr: 0.000002 - momentum: 0.000000
2024-03-26 11:07:50,728 epoch 10 - iter 45/95 - loss 0.03012425 - time (sec): 8.76 - samples/sec: 1788.43 - lr: 0.000002 - momentum: 0.000000
2024-03-26 11:07:52,708 epoch 10 - iter 54/95 - loss 0.03276909 - time (sec): 10.74 - samples/sec: 1772.15 - lr: 0.000002 - momentum: 0.000000
2024-03-26 11:07:54,137 epoch 10 - iter 63/95 - loss 0.03503644 - time (sec): 12.17 - samples/sec: 1785.04 - lr: 0.000001 - momentum: 0.000000
2024-03-26 11:07:56,445 epoch 10 - iter 72/95 - loss 0.03111766 - time (sec): 14.48 - samples/sec: 1768.76 - lr: 0.000001 - momentum: 0.000000
2024-03-26 11:07:58,830 epoch 10 - iter 81/95 - loss 0.03395486 - time (sec): 16.87 - samples/sec: 1750.74 - lr: 0.000001 - momentum: 0.000000
2024-03-26 11:08:00,685 epoch 10 - iter 90/95 - loss 0.03194792 - time (sec): 18.72 - samples/sec: 1748.52 - lr: 0.000000 - momentum: 0.000000
2024-03-26 11:08:01,734 ----------------------------------------------------------------------------------------------------
2024-03-26 11:08:01,735 EPOCH 10 done: loss 0.0309 - lr: 0.000000
2024-03-26 11:08:02,703 DEV : loss 0.22073255479335785 - f1-score (micro avg) 0.9226
2024-03-26 11:08:03,024 ----------------------------------------------------------------------------------------------------
2024-03-26 11:08:03,025 Loading model from best epoch ...
2024-03-26 11:08:03,994 SequenceTagger predicts: Dictionary with 17 tags: O, S-Unternehmen, B-Unternehmen, E-Unternehmen, I-Unternehmen, S-Auslagerung, B-Auslagerung, E-Auslagerung, I-Auslagerung, S-Ort, B-Ort, E-Ort, I-Ort, S-Software, B-Software, E-Software, I-Software
2024-03-26 11:08:04,882
Results:
- F-score (micro) 0.912
- F-score (macro) 0.6941
- Accuracy 0.8406
By class:
              precision    recall  f1-score   support

 Unternehmen     0.9008    0.8872    0.8939       266
 Auslagerung     0.8764    0.9116    0.8937       249
         Ort     0.9852    0.9925    0.9888       134
    Software     0.0000    0.0000    0.0000         0

   micro avg     0.9058    0.9183    0.9120       649
   macro avg     0.6906    0.6979    0.6941       649
weighted avg     0.9089    0.9183    0.9134       649
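The gap between the micro F-score (0.912) and the macro F-score (0.6941) comes from the Software class, which has zero support in this test set but still enters the unweighted macro average. A quick sanity check using only the numbers from the table above:

# Per-class (precision, recall, f1, support) rows from the table above.
rows = {
    "Unternehmen": (0.9008, 0.8872, 0.8939, 266),
    "Auslagerung": (0.8764, 0.9116, 0.8937, 249),
    "Ort":         (0.9852, 0.9925, 0.9888, 134),
    "Software":    (0.0000, 0.0000, 0.0000,   0),
}

# Macro F1: unweighted mean over all four classes, so the empty Software row
# pulls it down from roughly 0.92 to 0.69.
print(round(sum(f1 for _, _, f1, _ in rows.values()) / len(rows), 4))  # 0.6941

# Micro F1: harmonic mean of the pooled precision and recall reported above.
p, r = 0.9058, 0.9183
print(round(2 * p * r / (p + r), 4))  # 0.912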
2024-03-26 11:08:04,882 ----------------------------------------------------------------------------------------------------
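Once training has finished, the checkpoint selected by dev micro-F1 (best-model.pt under the base path above) can be reloaded for inference. A minimal usage sketch; the example sentence is hypothetical and the "ner" label type is an assumption.

from flair.data import Sentence
from flair.models import SequenceTagger

# Reload the best checkpoint saved during this run.
tagger = SequenceTagger.load(
    "flair-co-funer-german_bert_base-bs8-e10-lr3e-05-1/best-model.pt"
)

# Hypothetical German example; the model tags spans of the four classes
# Unternehmen, Auslagerung, Ort and Software.
sentence = Sentence("Die Musterbank AG lagert ihre IT-Infrastruktur nach Frankfurt aus.")
tagger.predict(sentence)

for span in sentence.get_spans("ner"):
    print(span.text, span.get_label("ner").value, round(span.get_label("ner").score, 2))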