2024-03-26 16:00:10,103 ----------------------------------------------------------------------------------------------------
2024-03-26 16:00:10,103 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(31103, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2024-03-26 16:00:10,103 ----------------------------------------------------------------------------------------------------
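The model dump above is a plain BERT encoder followed by LockedDropout and a 17-way linear classification head (no CRF, no RNN). The training script itself is not part of this log; as a minimal sketch, the embedding side could be built in Flair roughly as follows (the checkpoint name is only an assumption inferred from the base path "german_dbmdz_bert_base"):

# Sketch only, not the original training script; checkpoint name assumed.
from flair.embeddings import TransformerWordEmbeddings

embeddings = TransformerWordEmbeddings(
    model="dbmdz/bert-base-german-uncased",  # assumed German dbmdz BERT checkpoint
    layers="-1",                # last layer only, 768-dim, matching the linear head input
    subtoken_pooling="first",   # one embedding per word (first subtoken)
    fine_tune=True,             # transformer weights are updated during training
)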
2024-03-26 16:00:10,103 Corpus: 758 train + 94 dev + 96 test sentences
2024-03-26 16:00:10,103 ----------------------------------------------------------------------------------------------------
2024-03-26 16:00:10,103 Train: 758 sentences
2024-03-26 16:00:10,103 (train_with_dev=False, train_with_test=False)
2024-03-26 16:00:10,103 ----------------------------------------------------------------------------------------------------
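Continuing the sketch above, the corpus and tagger could be assembled roughly like this (data folder, column layout and label type are assumptions; with the default BIOES expansion, the four entity classes plus O yield the 17 output tags listed at the end of this log):

# Sketch only: hypothetical CoNLL-style column corpus and a matching tagger.
from flair.datasets import ColumnCorpus
from flair.models import SequenceTagger

corpus = ColumnCorpus("data/co-funer", column_format={0: "text", 1: "ner"})
label_dict = corpus.make_label_dictionary(label_type="ner")

tagger = SequenceTagger(
    hidden_size=256,            # unused, since use_rnn=False
    embeddings=embeddings,      # from the sketch above
    tag_dictionary=label_dict,
    tag_type="ner",
    use_crf=False,              # plain linear projection + CrossEntropyLoss, as in the dump
    use_rnn=False,
    reproject_embeddings=False,
)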
2024-03-26 16:00:10,103 Training Params:
2024-03-26 16:00:10,103 - learning_rate: "5e-05"
2024-03-26 16:00:10,103 - mini_batch_size: "8"
2024-03-26 16:00:10,103 - max_epochs: "10"
2024-03-26 16:00:10,103 - shuffle: "True"
2024-03-26 16:00:10,103 ----------------------------------------------------------------------------------------------------
2024-03-26 16:00:10,103 Plugins:
2024-03-26 16:00:10,103 - TensorboardLogger
2024-03-26 16:00:10,103 - LinearScheduler | warmup_fraction: '0.1'
2024-03-26 16:00:10,103 ----------------------------------------------------------------------------------------------------
2024-03-26 16:00:10,103 Final evaluation on model from best epoch (best-model.pt)
2024-03-26 16:00:10,103 - metric: "('micro avg', 'f1-score')"
2024-03-26 16:00:10,103 ----------------------------------------------------------------------------------------------------
2024-03-26 16:00:10,103 Computation:
2024-03-26 16:00:10,103 - compute on device: cuda:0
2024-03-26 16:00:10,103 - embedding storage: none
2024-03-26 16:00:10,103 ----------------------------------------------------------------------------------------------------
2024-03-26 16:00:10,103 Model training base path: "flair-co-funer-german_dbmdz_bert_base-bs8-e10-lr5e-05-3"
2024-03-26 16:00:10,103 ----------------------------------------------------------------------------------------------------
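The header above does not include the training call itself; it could look roughly like this (a sketch: the TensorboardLogger import path and the fine_tune keyword names reflect a recent Flair release and are assumptions):

# Sketch only: fine-tuning with the hyperparameters and plugins reported above.
import flair
import torch
from flair.trainers import ModelTrainer
from flair.trainers.plugins import TensorboardLogger  # import path assumed

flair.device = torch.device("cuda:0")    # "compute on device: cuda:0"

trainer = ModelTrainer(tagger, corpus)   # tagger and corpus from the sketches above

trainer.fine_tune(
    "flair-co-funer-german_dbmdz_bert_base-bs8-e10-lr5e-05-3",
    learning_rate=5e-05,
    mini_batch_size=8,
    max_epochs=10,
    shuffle=True,
    warmup_fraction=0.1,                               # LinearScheduler with 10% warmup
    embeddings_storage_mode="none",                    # "embedding storage: none"
    main_evaluation_metric=("micro avg", "f1-score"),  # metric used to select best-model.pt
    plugins=[TensorboardLogger()],
)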
2024-03-26 16:00:10,103 ----------------------------------------------------------------------------------------------------
2024-03-26 16:00:10,103 Logging anything other than scalars to TensorBoard is currently not supported.
2024-03-26 16:00:11,464 epoch 1 - iter 9/95 - loss 3.38937223 - time (sec): 1.36 - samples/sec: 2344.72 - lr: 0.000004 - momentum: 0.000000
2024-03-26 16:00:13,275 epoch 1 - iter 18/95 - loss 3.23427345 - time (sec): 3.17 - samples/sec: 1991.11 - lr: 0.000009 - momentum: 0.000000
2024-03-26 16:00:15,187 epoch 1 - iter 27/95 - loss 2.95063236 - time (sec): 5.08 - samples/sec: 1943.51 - lr: 0.000014 - momentum: 0.000000
2024-03-26 16:00:16,548 epoch 1 - iter 36/95 - loss 2.68191715 - time (sec): 6.44 - samples/sec: 1963.25 - lr: 0.000018 - momentum: 0.000000
2024-03-26 16:00:18,440 epoch 1 - iter 45/95 - loss 2.45788459 - time (sec): 8.34 - samples/sec: 1945.86 - lr: 0.000023 - momentum: 0.000000
2024-03-26 16:00:19,794 epoch 1 - iter 54/95 - loss 2.29457257 - time (sec): 9.69 - samples/sec: 1969.63 - lr: 0.000028 - momentum: 0.000000
2024-03-26 16:00:21,041 epoch 1 - iter 63/95 - loss 2.14298789 - time (sec): 10.94 - samples/sec: 1996.55 - lr: 0.000033 - momentum: 0.000000
2024-03-26 16:00:22,974 epoch 1 - iter 72/95 - loss 1.95332289 - time (sec): 12.87 - samples/sec: 1982.77 - lr: 0.000037 - momentum: 0.000000
2024-03-26 16:00:24,934 epoch 1 - iter 81/95 - loss 1.78824283 - time (sec): 14.83 - samples/sec: 1968.68 - lr: 0.000042 - momentum: 0.000000
2024-03-26 16:00:26,454 epoch 1 - iter 90/95 - loss 1.66300792 - time (sec): 16.35 - samples/sec: 1986.71 - lr: 0.000047 - momentum: 0.000000
2024-03-26 16:00:27,491 ----------------------------------------------------------------------------------------------------
2024-03-26 16:00:27,491 EPOCH 1 done: loss 1.5900 - lr: 0.000047
2024-03-26 16:00:28,295 DEV : loss 0.44570106267929077 - f1-score (micro avg) 0.687
2024-03-26 16:00:28,296 saving best model
2024-03-26 16:00:28,557 ----------------------------------------------------------------------------------------------------
2024-03-26 16:00:29,915 epoch 2 - iter 9/95 - loss 0.52755606 - time (sec): 1.36 - samples/sec: 2018.97 - lr: 0.000050 - momentum: 0.000000
2024-03-26 16:00:31,744 epoch 2 - iter 18/95 - loss 0.39877224 - time (sec): 3.19 - samples/sec: 1917.26 - lr: 0.000049 - momentum: 0.000000
2024-03-26 16:00:32,910 epoch 2 - iter 27/95 - loss 0.39086480 - time (sec): 4.35 - samples/sec: 1971.50 - lr: 0.000048 - momentum: 0.000000
2024-03-26 16:00:35,150 epoch 2 - iter 36/95 - loss 0.36615692 - time (sec): 6.59 - samples/sec: 1923.29 - lr: 0.000048 - momentum: 0.000000
2024-03-26 16:00:37,076 epoch 2 - iter 45/95 - loss 0.36188341 - time (sec): 8.52 - samples/sec: 1930.42 - lr: 0.000047 - momentum: 0.000000
2024-03-26 16:00:39,227 epoch 2 - iter 54/95 - loss 0.34938497 - time (sec): 10.67 - samples/sec: 1901.63 - lr: 0.000047 - momentum: 0.000000
2024-03-26 16:00:41,225 epoch 2 - iter 63/95 - loss 0.33873827 - time (sec): 12.67 - samples/sec: 1855.75 - lr: 0.000046 - momentum: 0.000000
2024-03-26 16:00:42,730 epoch 2 - iter 72/95 - loss 0.34109897 - time (sec): 14.17 - samples/sec: 1864.02 - lr: 0.000046 - momentum: 0.000000
2024-03-26 16:00:44,172 epoch 2 - iter 81/95 - loss 0.34624828 - time (sec): 15.61 - samples/sec: 1886.66 - lr: 0.000045 - momentum: 0.000000
2024-03-26 16:00:46,384 epoch 2 - iter 90/95 - loss 0.33363333 - time (sec): 17.83 - samples/sec: 1860.80 - lr: 0.000045 - momentum: 0.000000
2024-03-26 16:00:47,022 ----------------------------------------------------------------------------------------------------
2024-03-26 16:00:47,022 EPOCH 2 done: loss 0.3302 - lr: 0.000045
2024-03-26 16:00:47,911 DEV : loss 0.24601905047893524 - f1-score (micro avg) 0.8349
2024-03-26 16:00:47,912 saving best model
2024-03-26 16:00:48,351 ----------------------------------------------------------------------------------------------------
2024-03-26 16:00:49,982 epoch 3 - iter 9/95 - loss 0.17646890 - time (sec): 1.63 - samples/sec: 1833.52 - lr: 0.000044 - momentum: 0.000000
2024-03-26 16:00:51,766 epoch 3 - iter 18/95 - loss 0.16766463 - time (sec): 3.41 - samples/sec: 1855.01 - lr: 0.000043 - momentum: 0.000000
2024-03-26 16:00:52,956 epoch 3 - iter 27/95 - loss 0.18728041 - time (sec): 4.60 - samples/sec: 2030.12 - lr: 0.000043 - momentum: 0.000000
2024-03-26 16:00:54,515 epoch 3 - iter 36/95 - loss 0.18521714 - time (sec): 6.16 - samples/sec: 2016.24 - lr: 0.000042 - momentum: 0.000000
2024-03-26 16:00:55,916 epoch 3 - iter 45/95 - loss 0.19008186 - time (sec): 7.56 - samples/sec: 2026.74 - lr: 0.000042 - momentum: 0.000000
2024-03-26 16:00:57,902 epoch 3 - iter 54/95 - loss 0.18491639 - time (sec): 9.55 - samples/sec: 1979.28 - lr: 0.000041 - momentum: 0.000000
2024-03-26 16:00:59,891 epoch 3 - iter 63/95 - loss 0.18437887 - time (sec): 11.54 - samples/sec: 1929.51 - lr: 0.000041 - momentum: 0.000000
2024-03-26 16:01:01,728 epoch 3 - iter 72/95 - loss 0.18892877 - time (sec): 13.37 - samples/sec: 1912.41 - lr: 0.000040 - momentum: 0.000000
2024-03-26 16:01:03,733 epoch 3 - iter 81/95 - loss 0.18190291 - time (sec): 15.38 - samples/sec: 1885.10 - lr: 0.000040 - momentum: 0.000000
2024-03-26 16:01:05,682 epoch 3 - iter 90/95 - loss 0.19017568 - time (sec): 17.33 - samples/sec: 1886.81 - lr: 0.000039 - momentum: 0.000000
2024-03-26 16:01:06,770 ----------------------------------------------------------------------------------------------------
2024-03-26 16:01:06,770 EPOCH 3 done: loss 0.1854 - lr: 0.000039
2024-03-26 16:01:07,662 DEV : loss 0.17745639383792877 - f1-score (micro avg) 0.8735
2024-03-26 16:01:07,663 saving best model
2024-03-26 16:01:08,092 ----------------------------------------------------------------------------------------------------
2024-03-26 16:01:09,387 epoch 4 - iter 9/95 - loss 0.12389405 - time (sec): 1.29 - samples/sec: 2146.08 - lr: 0.000039 - momentum: 0.000000
2024-03-26 16:01:11,232 epoch 4 - iter 18/95 - loss 0.12111894 - time (sec): 3.14 - samples/sec: 1958.29 - lr: 0.000038 - momentum: 0.000000
2024-03-26 16:01:13,234 epoch 4 - iter 27/95 - loss 0.11487951 - time (sec): 5.14 - samples/sec: 1878.49 - lr: 0.000037 - momentum: 0.000000
2024-03-26 16:01:14,716 epoch 4 - iter 36/95 - loss 0.11435733 - time (sec): 6.62 - samples/sec: 1894.36 - lr: 0.000037 - momentum: 0.000000
2024-03-26 16:01:17,154 epoch 4 - iter 45/95 - loss 0.11185218 - time (sec): 9.06 - samples/sec: 1823.85 - lr: 0.000036 - momentum: 0.000000
2024-03-26 16:01:19,006 epoch 4 - iter 54/95 - loss 0.11138127 - time (sec): 10.91 - samples/sec: 1808.45 - lr: 0.000036 - momentum: 0.000000
2024-03-26 16:01:20,943 epoch 4 - iter 63/95 - loss 0.10945874 - time (sec): 12.85 - samples/sec: 1789.84 - lr: 0.000035 - momentum: 0.000000
2024-03-26 16:01:22,821 epoch 4 - iter 72/95 - loss 0.11053287 - time (sec): 14.73 - samples/sec: 1806.53 - lr: 0.000035 - momentum: 0.000000
2024-03-26 16:01:24,846 epoch 4 - iter 81/95 - loss 0.11929813 - time (sec): 16.75 - samples/sec: 1804.79 - lr: 0.000034 - momentum: 0.000000
2024-03-26 16:01:25,824 epoch 4 - iter 90/95 - loss 0.11963430 - time (sec): 17.73 - samples/sec: 1844.53 - lr: 0.000034 - momentum: 0.000000
2024-03-26 16:01:26,849 ----------------------------------------------------------------------------------------------------
2024-03-26 16:01:26,849 EPOCH 4 done: loss 0.1195 - lr: 0.000034
2024-03-26 16:01:27,742 DEV : loss 0.17124532163143158 - f1-score (micro avg) 0.9038
2024-03-26 16:01:27,743 saving best model
2024-03-26 16:01:28,185 ----------------------------------------------------------------------------------------------------
2024-03-26 16:01:30,060 epoch 5 - iter 9/95 - loss 0.09160459 - time (sec): 1.87 - samples/sec: 1834.63 - lr: 0.000033 - momentum: 0.000000
2024-03-26 16:01:31,476 epoch 5 - iter 18/95 - loss 0.08544367 - time (sec): 3.29 - samples/sec: 1902.36 - lr: 0.000032 - momentum: 0.000000
2024-03-26 16:01:32,833 epoch 5 - iter 27/95 - loss 0.10039035 - time (sec): 4.65 - samples/sec: 1943.76 - lr: 0.000032 - momentum: 0.000000
2024-03-26 16:01:34,716 epoch 5 - iter 36/95 - loss 0.09956787 - time (sec): 6.53 - samples/sec: 1879.63 - lr: 0.000031 - momentum: 0.000000
2024-03-26 16:01:36,923 epoch 5 - iter 45/95 - loss 0.09508086 - time (sec): 8.74 - samples/sec: 1863.49 - lr: 0.000031 - momentum: 0.000000
2024-03-26 16:01:39,355 epoch 5 - iter 54/95 - loss 0.09118724 - time (sec): 11.17 - samples/sec: 1818.57 - lr: 0.000030 - momentum: 0.000000
2024-03-26 16:01:41,013 epoch 5 - iter 63/95 - loss 0.08797836 - time (sec): 12.83 - samples/sec: 1811.31 - lr: 0.000030 - momentum: 0.000000
2024-03-26 16:01:42,784 epoch 5 - iter 72/95 - loss 0.08582414 - time (sec): 14.60 - samples/sec: 1811.74 - lr: 0.000029 - momentum: 0.000000
2024-03-26 16:01:44,986 epoch 5 - iter 81/95 - loss 0.08720160 - time (sec): 16.80 - samples/sec: 1796.16 - lr: 0.000029 - momentum: 0.000000
2024-03-26 16:01:46,370 epoch 5 - iter 90/95 - loss 0.08919404 - time (sec): 18.18 - samples/sec: 1812.62 - lr: 0.000028 - momentum: 0.000000
2024-03-26 16:01:47,136 ----------------------------------------------------------------------------------------------------
2024-03-26 16:01:47,136 EPOCH 5 done: loss 0.0873 - lr: 0.000028
2024-03-26 16:01:48,034 DEV : loss 0.16974209249019623 - f1-score (micro avg) 0.9198
2024-03-26 16:01:48,035 saving best model
2024-03-26 16:01:48,479 ----------------------------------------------------------------------------------------------------
2024-03-26 16:01:50,417 epoch 6 - iter 9/95 - loss 0.05813380 - time (sec): 1.94 - samples/sec: 1801.23 - lr: 0.000027 - momentum: 0.000000
2024-03-26 16:01:51,968 epoch 6 - iter 18/95 - loss 0.05561430 - time (sec): 3.49 - samples/sec: 1823.72 - lr: 0.000027 - momentum: 0.000000
2024-03-26 16:01:53,872 epoch 6 - iter 27/95 - loss 0.05571319 - time (sec): 5.39 - samples/sec: 1834.08 - lr: 0.000026 - momentum: 0.000000
2024-03-26 16:01:55,443 epoch 6 - iter 36/95 - loss 0.05560469 - time (sec): 6.96 - samples/sec: 1831.74 - lr: 0.000026 - momentum: 0.000000
2024-03-26 16:01:56,891 epoch 6 - iter 45/95 - loss 0.05722084 - time (sec): 8.41 - samples/sec: 1869.84 - lr: 0.000025 - momentum: 0.000000
2024-03-26 16:01:58,341 epoch 6 - iter 54/95 - loss 0.05800404 - time (sec): 9.86 - samples/sec: 1867.99 - lr: 0.000025 - momentum: 0.000000
2024-03-26 16:01:59,615 epoch 6 - iter 63/95 - loss 0.05975631 - time (sec): 11.13 - samples/sec: 1930.92 - lr: 0.000024 - momentum: 0.000000
2024-03-26 16:02:01,857 epoch 6 - iter 72/95 - loss 0.06417836 - time (sec): 13.38 - samples/sec: 1898.32 - lr: 0.000024 - momentum: 0.000000
2024-03-26 16:02:03,435 epoch 6 - iter 81/95 - loss 0.06173608 - time (sec): 14.95 - samples/sec: 1916.22 - lr: 0.000023 - momentum: 0.000000
2024-03-26 16:02:05,134 epoch 6 - iter 90/95 - loss 0.06379710 - time (sec): 16.65 - samples/sec: 1935.46 - lr: 0.000023 - momentum: 0.000000
2024-03-26 16:02:06,397 ----------------------------------------------------------------------------------------------------
2024-03-26 16:02:06,397 EPOCH 6 done: loss 0.0636 - lr: 0.000023
2024-03-26 16:02:07,293 DEV : loss 0.1660703867673874 - f1-score (micro avg) 0.936
2024-03-26 16:02:07,294 saving best model
2024-03-26 16:02:07,738 ----------------------------------------------------------------------------------------------------
2024-03-26 16:02:09,621 epoch 7 - iter 9/95 - loss 0.04045283 - time (sec): 1.88 - samples/sec: 1688.44 - lr: 0.000022 - momentum: 0.000000
2024-03-26 16:02:11,655 epoch 7 - iter 18/95 - loss 0.02841548 - time (sec): 3.91 - samples/sec: 1674.65 - lr: 0.000021 - momentum: 0.000000
2024-03-26 16:02:13,183 epoch 7 - iter 27/95 - loss 0.02627311 - time (sec): 5.44 - samples/sec: 1797.34 - lr: 0.000021 - momentum: 0.000000
2024-03-26 16:02:15,125 epoch 7 - iter 36/95 - loss 0.02722641 - time (sec): 7.38 - samples/sec: 1784.42 - lr: 0.000020 - momentum: 0.000000
2024-03-26 16:02:17,497 epoch 7 - iter 45/95 - loss 0.03259541 - time (sec): 9.76 - samples/sec: 1777.38 - lr: 0.000020 - momentum: 0.000000
2024-03-26 16:02:19,014 epoch 7 - iter 54/95 - loss 0.03491364 - time (sec): 11.27 - samples/sec: 1784.27 - lr: 0.000019 - momentum: 0.000000
2024-03-26 16:02:21,194 epoch 7 - iter 63/95 - loss 0.04056925 - time (sec): 13.45 - samples/sec: 1790.05 - lr: 0.000019 - momentum: 0.000000
2024-03-26 16:02:22,983 epoch 7 - iter 72/95 - loss 0.04684255 - time (sec): 15.24 - samples/sec: 1795.92 - lr: 0.000018 - momentum: 0.000000
2024-03-26 16:02:24,403 epoch 7 - iter 81/95 - loss 0.04395515 - time (sec): 16.66 - samples/sec: 1807.79 - lr: 0.000018 - momentum: 0.000000
2024-03-26 16:02:26,375 epoch 7 - iter 90/95 - loss 0.04605325 - time (sec): 18.63 - samples/sec: 1787.55 - lr: 0.000017 - momentum: 0.000000
2024-03-26 16:02:26,860 ----------------------------------------------------------------------------------------------------
2024-03-26 16:02:26,860 EPOCH 7 done: loss 0.0469 - lr: 0.000017
2024-03-26 16:02:27,749 DEV : loss 0.18657518923282623 - f1-score (micro avg) 0.9354
2024-03-26 16:02:27,750 ----------------------------------------------------------------------------------------------------
2024-03-26 16:02:29,622 epoch 8 - iter 9/95 - loss 0.02945220 - time (sec): 1.87 - samples/sec: 1713.38 - lr: 0.000016 - momentum: 0.000000
2024-03-26 16:02:32,102 epoch 8 - iter 18/95 - loss 0.02065644 - time (sec): 4.35 - samples/sec: 1702.09 - lr: 0.000016 - momentum: 0.000000
2024-03-26 16:02:33,863 epoch 8 - iter 27/95 - loss 0.02067654 - time (sec): 6.11 - samples/sec: 1740.79 - lr: 0.000015 - momentum: 0.000000
2024-03-26 16:02:35,488 epoch 8 - iter 36/95 - loss 0.02424997 - time (sec): 7.74 - samples/sec: 1713.45 - lr: 0.000015 - momentum: 0.000000
2024-03-26 16:02:36,997 epoch 8 - iter 45/95 - loss 0.02289045 - time (sec): 9.25 - samples/sec: 1746.67 - lr: 0.000014 - momentum: 0.000000
2024-03-26 16:02:38,662 epoch 8 - iter 54/95 - loss 0.02432025 - time (sec): 10.91 - samples/sec: 1767.82 - lr: 0.000014 - momentum: 0.000000
2024-03-26 16:02:40,854 epoch 8 - iter 63/95 - loss 0.03266905 - time (sec): 13.10 - samples/sec: 1766.14 - lr: 0.000013 - momentum: 0.000000
2024-03-26 16:02:43,099 epoch 8 - iter 72/95 - loss 0.03310354 - time (sec): 15.35 - samples/sec: 1748.47 - lr: 0.000013 - momentum: 0.000000
2024-03-26 16:02:44,752 epoch 8 - iter 81/95 - loss 0.03687701 - time (sec): 17.00 - samples/sec: 1752.00 - lr: 0.000012 - momentum: 0.000000
2024-03-26 16:02:46,046 epoch 8 - iter 90/95 - loss 0.03730740 - time (sec): 18.30 - samples/sec: 1794.50 - lr: 0.000012 - momentum: 0.000000
2024-03-26 16:02:46,945 ----------------------------------------------------------------------------------------------------
2024-03-26 16:02:46,945 EPOCH 8 done: loss 0.0362 - lr: 0.000012
2024-03-26 16:02:47,878 DEV : loss 0.18976689875125885 - f1-score (micro avg) 0.9344
2024-03-26 16:02:47,880 ----------------------------------------------------------------------------------------------------
2024-03-26 16:02:49,859 epoch 9 - iter 9/95 - loss 0.01635759 - time (sec): 1.98 - samples/sec: 1782.87 - lr: 0.000011 - momentum: 0.000000
2024-03-26 16:02:51,582 epoch 9 - iter 18/95 - loss 0.02304365 - time (sec): 3.70 - samples/sec: 1808.51 - lr: 0.000010 - momentum: 0.000000
2024-03-26 16:02:53,458 epoch 9 - iter 27/95 - loss 0.02506509 - time (sec): 5.58 - samples/sec: 1832.02 - lr: 0.000010 - momentum: 0.000000
2024-03-26 16:02:55,313 epoch 9 - iter 36/95 - loss 0.02234485 - time (sec): 7.43 - samples/sec: 1827.49 - lr: 0.000009 - momentum: 0.000000
2024-03-26 16:02:57,570 epoch 9 - iter 45/95 - loss 0.02061792 - time (sec): 9.69 - samples/sec: 1749.48 - lr: 0.000009 - momentum: 0.000000
2024-03-26 16:02:59,495 epoch 9 - iter 54/95 - loss 0.02488325 - time (sec): 11.61 - samples/sec: 1738.19 - lr: 0.000008 - momentum: 0.000000
2024-03-26 16:03:01,384 epoch 9 - iter 63/95 - loss 0.02441510 - time (sec): 13.50 - samples/sec: 1748.69 - lr: 0.000008 - momentum: 0.000000
2024-03-26 16:03:03,255 epoch 9 - iter 72/95 - loss 0.02401047 - time (sec): 15.37 - samples/sec: 1752.87 - lr: 0.000007 - momentum: 0.000000
2024-03-26 16:03:04,501 epoch 9 - iter 81/95 - loss 0.02516190 - time (sec): 16.62 - samples/sec: 1776.36 - lr: 0.000007 - momentum: 0.000000
2024-03-26 16:03:05,902 epoch 9 - iter 90/95 - loss 0.02891991 - time (sec): 18.02 - samples/sec: 1798.48 - lr: 0.000006 - momentum: 0.000000
2024-03-26 16:03:06,844 ----------------------------------------------------------------------------------------------------
2024-03-26 16:03:06,844 EPOCH 9 done: loss 0.0284 - lr: 0.000006
2024-03-26 16:03:07,796 DEV : loss 0.2133364975452423 - f1-score (micro avg) 0.9383
2024-03-26 16:03:07,797 saving best model
2024-03-26 16:03:08,256 ----------------------------------------------------------------------------------------------------
2024-03-26 16:03:10,399 epoch 10 - iter 9/95 - loss 0.00852392 - time (sec): 2.14 - samples/sec: 1780.84 - lr: 0.000005 - momentum: 0.000000
2024-03-26 16:03:11,661 epoch 10 - iter 18/95 - loss 0.00945942 - time (sec): 3.40 - samples/sec: 1902.77 - lr: 0.000005 - momentum: 0.000000
2024-03-26 16:03:12,973 epoch 10 - iter 27/95 - loss 0.03452703 - time (sec): 4.71 - samples/sec: 2011.60 - lr: 0.000004 - momentum: 0.000000
2024-03-26 16:03:14,328 epoch 10 - iter 36/95 - loss 0.02966500 - time (sec): 6.07 - samples/sec: 2032.32 - lr: 0.000004 - momentum: 0.000000
2024-03-26 16:03:16,198 epoch 10 - iter 45/95 - loss 0.02385615 - time (sec): 7.94 - samples/sec: 2005.69 - lr: 0.000003 - momentum: 0.000000
2024-03-26 16:03:17,776 epoch 10 - iter 54/95 - loss 0.02232956 - time (sec): 9.52 - samples/sec: 1991.74 - lr: 0.000003 - momentum: 0.000000
2024-03-26 16:03:20,299 epoch 10 - iter 63/95 - loss 0.02272554 - time (sec): 12.04 - samples/sec: 1909.44 - lr: 0.000002 - momentum: 0.000000
2024-03-26 16:03:21,585 epoch 10 - iter 72/95 - loss 0.02126761 - time (sec): 13.33 - samples/sec: 1913.48 - lr: 0.000002 - momentum: 0.000000
2024-03-26 16:03:23,927 epoch 10 - iter 81/95 - loss 0.02015630 - time (sec): 15.67 - samples/sec: 1859.08 - lr: 0.000001 - momentum: 0.000000
2024-03-26 16:03:26,158 epoch 10 - iter 90/95 - loss 0.02166406 - time (sec): 17.90 - samples/sec: 1837.32 - lr: 0.000001 - momentum: 0.000000
2024-03-26 16:03:27,216 ----------------------------------------------------------------------------------------------------
2024-03-26 16:03:27,216 EPOCH 10 done: loss 0.0211 - lr: 0.000001
2024-03-26 16:03:28,123 DEV : loss 0.21451087296009064 - f1-score (micro avg) 0.9331
2024-03-26 16:03:28,392 ----------------------------------------------------------------------------------------------------
2024-03-26 16:03:28,392 Loading model from best epoch ...
2024-03-26 16:03:29,252 SequenceTagger predicts: Dictionary with 17 tags: O, S-Unternehmen, B-Unternehmen, E-Unternehmen, I-Unternehmen, S-Auslagerung, B-Auslagerung, E-Auslagerung, I-Auslagerung, S-Ort, B-Ort, E-Ort, I-Ort, S-Software, B-Software, E-Software, I-Software
2024-03-26 16:03:30,117
Results:
- F-score (micro) 0.907
- F-score (macro) 0.9177
- Accuracy 0.8345
By class:
             precision    recall  f1-score   support

 Unternehmen    0.8520    0.8872    0.8692       266
 Auslagerung    0.8958    0.9317    0.9134       249
         Ort    0.9565    0.9851    0.9706       134

   micro avg    0.8902    0.9245    0.9070       649
   macro avg    0.9014    0.9347    0.9177       649
weighted avg    0.8904    0.9245    0.9071       649
2024-03-26 16:03:30,117 ----------------------------------------------------------------------------------------------------
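The best-model.pt selected above can be loaded and applied roughly as follows (a sketch; the example sentence is invented):

# Sketch only: load the selected checkpoint and tag a sentence.
from flair.data import Sentence
from flair.models import SequenceTagger

tagger = SequenceTagger.load(
    "flair-co-funer-german_dbmdz_bert_base-bs8-e10-lr5e-05-3/best-model.pt"
)

sentence = Sentence("Die Bank lagert den Betrieb ihres Rechenzentrums an die Beispiel GmbH in Frankfurt aus.")
tagger.predict(sentence)

for label in sentence.get_labels("ner"):
    print(label)  # predicted spans: Unternehmen / Auslagerung / Ort / Software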