2024-03-26 11:38:10,623 ----------------------------------------------------------------------------------------------------
2024-03-26 11:38:10,623 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(30001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2024-03-26 11:38:10,623 ----------------------------------------------------------------------------------------------------
2024-03-26 11:38:10,623 Corpus: 758 train + 94 dev + 96 test sentences
2024-03-26 11:38:10,624 ----------------------------------------------------------------------------------------------------
2024-03-26 11:38:10,624 Train: 758 sentences
2024-03-26 11:38:10,624 (train_with_dev=False, train_with_test=False)
2024-03-26 11:38:10,624 ----------------------------------------------------------------------------------------------------
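[Editor's note] The module dump above describes a Flair SequenceTagger that fine-tunes a 768-dim German BERT base model directly: no RNN, no CRF, just LockedDropout, a Linear(768, 17) projection onto the tag space, and a cross-entropy loss. A minimal sketch of how an equivalent tagger could be assembled in Flair follows; the checkpoint name ("bert-base-german-cased"), the corpus folder/file names, and the "ner" label type are assumptions, as the log does not state them.

    # Sketch only: rebuilds a tagger equivalent to the architecture logged above.
    # Checkpoint name and corpus paths are assumptions, not taken from the log.
    from flair.datasets import ColumnCorpus
    from flair.embeddings import TransformerWordEmbeddings
    from flair.models import SequenceTagger

    # CoNLL-style column corpus (this run used 758 train / 94 dev / 96 test sentences)
    corpus = ColumnCorpus(
        data_folder="data/co-funer",            # hypothetical path
        column_format={0: "text", 1: "ner"},
        train_file="train.txt", dev_file="dev.txt", test_file="test.txt",
    )
    label_dictionary = corpus.make_label_dictionary(label_type="ner")

    # Fine-tunable transformer word embeddings (BERT base, hidden size 768)
    embeddings = TransformerWordEmbeddings(
        model="bert-base-german-cased",         # assumed German BERT base checkpoint
        layers="-1",
        subtoken_pooling="first",
        fine_tune=True,
    )

    # No RNN, no CRF, no reprojection -> LockedDropout + Linear(768, 17) + CrossEntropyLoss
    tagger = SequenceTagger(
        hidden_size=256,                        # unused when use_rnn=False
        embeddings=embeddings,
        tag_dictionary=label_dictionary,
        tag_type="ner",
        use_rnn=False,
        use_crf=False,
        reproject_embeddings=False,
    )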
2024-03-26 11:38:10,624 Training Params:
2024-03-26 11:38:10,624 - learning_rate: "3e-05"
2024-03-26 11:38:10,624 - mini_batch_size: "8"
2024-03-26 11:38:10,624 - max_epochs: "10"
2024-03-26 11:38:10,624 - shuffle: "True"
2024-03-26 11:38:10,624 ----------------------------------------------------------------------------------------------------
2024-03-26 11:38:10,624 Plugins:
2024-03-26 11:38:10,624 - TensorboardLogger
2024-03-26 11:38:10,624 - LinearScheduler | warmup_fraction: '0.1'
2024-03-26 11:38:10,624 ----------------------------------------------------------------------------------------------------
2024-03-26 11:38:10,624 Final evaluation on model from best epoch (best-model.pt)
2024-03-26 11:38:10,624 - metric: "('micro avg', 'f1-score')"
2024-03-26 11:38:10,624 ----------------------------------------------------------------------------------------------------
2024-03-26 11:38:10,624 Computation:
2024-03-26 11:38:10,624 - compute on device: cuda:0
2024-03-26 11:38:10,624 - embedding storage: none
2024-03-26 11:38:10,624 ----------------------------------------------------------------------------------------------------
2024-03-26 11:38:10,624 Model training base path: "flair-co-funer-german_bert_base-bs8-e10-lr3e-05-3"
2024-03-26 11:38:10,624 ----------------------------------------------------------------------------------------------------
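[Editor's note] The configuration above (lr 3e-05, mini-batch size 8, 10 epochs, linear schedule with 10% warmup, best model selected by dev micro F1, TensorBoard logging) matches Flair's fine-tuning routine. A hedged sketch of the corresponding trainer call, reusing the corpus and tagger from the previous sketch; the exact TensorBoard plugin wiring varies across Flair versions and is an assumption here.

    # Sketch of a fine-tuning call that would produce a configuration like the one logged.
    from flair.trainers import ModelTrainer
    from flair.trainers.plugins import TensorboardLogger  # assumption: plugin path in recent Flair releases

    trainer = ModelTrainer(tagger, corpus)
    trainer.fine_tune(
        "flair-co-funer-german_bert_base-bs8-e10-lr3e-05-3",
        learning_rate=3e-05,
        mini_batch_size=8,
        max_epochs=10,
        shuffle=True,
        warmup_fraction=0.1,                               # LinearScheduler warmup, as logged above
        main_evaluation_metric=("micro avg", "f1-score"),  # best-model selection criterion
        plugins=[TensorboardLogger()],                     # assumption: TensorBoard attached via the plugin API
    )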
2024-03-26 11:38:10,624 ----------------------------------------------------------------------------------------------------
2024-03-26 11:38:10,624 Logging anything other than scalars to TensorBoard is currently not supported.
2024-03-26 11:38:12,056 epoch 1 - iter 9/95 - loss 3.00468469 - time (sec): 1.43 - samples/sec: 2227.30 - lr: 0.000003 - momentum: 0.000000
2024-03-26 11:38:13,987 epoch 1 - iter 18/95 - loss 2.92881161 - time (sec): 3.36 - samples/sec: 1877.95 - lr: 0.000005 - momentum: 0.000000
2024-03-26 11:38:15,931 epoch 1 - iter 27/95 - loss 2.76773785 - time (sec): 5.31 - samples/sec: 1861.61 - lr: 0.000008 - momentum: 0.000000
2024-03-26 11:38:17,338 epoch 1 - iter 36/95 - loss 2.60379851 - time (sec): 6.71 - samples/sec: 1884.54 - lr: 0.000011 - momentum: 0.000000
2024-03-26 11:38:19,280 epoch 1 - iter 45/95 - loss 2.47709620 - time (sec): 8.66 - samples/sec: 1874.16 - lr: 0.000014 - momentum: 0.000000
2024-03-26 11:38:20,693 epoch 1 - iter 54/95 - loss 2.35481628 - time (sec): 10.07 - samples/sec: 1895.68 - lr: 0.000017 - momentum: 0.000000
2024-03-26 11:38:21,973 epoch 1 - iter 63/95 - loss 2.24785391 - time (sec): 11.35 - samples/sec: 1924.02 - lr: 0.000020 - momentum: 0.000000
2024-03-26 11:38:23,942 epoch 1 - iter 72/95 - loss 2.09969141 - time (sec): 13.32 - samples/sec: 1916.20 - lr: 0.000022 - momentum: 0.000000
2024-03-26 11:38:25,948 epoch 1 - iter 81/95 - loss 1.95569193 - time (sec): 15.32 - samples/sec: 1905.21 - lr: 0.000025 - momentum: 0.000000
2024-03-26 11:38:27,481 epoch 1 - iter 90/95 - loss 1.84081212 - time (sec): 16.86 - samples/sec: 1927.03 - lr: 0.000028 - momentum: 0.000000
2024-03-26 11:38:28,567 ----------------------------------------------------------------------------------------------------
2024-03-26 11:38:28,567 EPOCH 1 done: loss 1.7687 - lr: 0.000028
2024-03-26 11:38:29,421 DEV : loss 0.5348557829856873 - f1-score (micro avg) 0.6468
2024-03-26 11:38:29,423 saving best model
2024-03-26 11:38:29,692 ----------------------------------------------------------------------------------------------------
2024-03-26 11:38:31,120 epoch 2 - iter 9/95 - loss 0.56636333 - time (sec): 1.43 - samples/sec: 1919.48 - lr: 0.000030 - momentum: 0.000000
2024-03-26 11:38:33,005 epoch 2 - iter 18/95 - loss 0.48310807 - time (sec): 3.31 - samples/sec: 1844.25 - lr: 0.000029 - momentum: 0.000000
2024-03-26 11:38:34,210 epoch 2 - iter 27/95 - loss 0.47598804 - time (sec): 4.52 - samples/sec: 1899.34 - lr: 0.000029 - momentum: 0.000000
2024-03-26 11:38:36,520 epoch 2 - iter 36/95 - loss 0.45006817 - time (sec): 6.83 - samples/sec: 1857.02 - lr: 0.000029 - momentum: 0.000000
2024-03-26 11:38:38,497 epoch 2 - iter 45/95 - loss 0.43584834 - time (sec): 8.80 - samples/sec: 1867.64 - lr: 0.000028 - momentum: 0.000000
2024-03-26 11:38:40,759 epoch 2 - iter 54/95 - loss 0.42240852 - time (sec): 11.07 - samples/sec: 1833.29 - lr: 0.000028 - momentum: 0.000000
2024-03-26 11:38:42,823 epoch 2 - iter 63/95 - loss 0.40358698 - time (sec): 13.13 - samples/sec: 1790.32 - lr: 0.000028 - momentum: 0.000000
2024-03-26 11:38:44,393 epoch 2 - iter 72/95 - loss 0.40170925 - time (sec): 14.70 - samples/sec: 1797.04 - lr: 0.000028 - momentum: 0.000000
2024-03-26 11:38:45,876 epoch 2 - iter 81/95 - loss 0.40641165 - time (sec): 16.18 - samples/sec: 1820.31 - lr: 0.000027 - momentum: 0.000000
2024-03-26 11:38:48,250 epoch 2 - iter 90/95 - loss 0.39195044 - time (sec): 18.56 - samples/sec: 1787.46 - lr: 0.000027 - momentum: 0.000000
2024-03-26 11:38:48,909 ----------------------------------------------------------------------------------------------------
2024-03-26 11:38:48,909 EPOCH 2 done: loss 0.3885 - lr: 0.000027
2024-03-26 11:38:49,844 DEV : loss 0.2869340181350708 - f1-score (micro avg) 0.8096
2024-03-26 11:38:49,845 saving best model
2024-03-26 11:38:50,309 ----------------------------------------------------------------------------------------------------
2024-03-26 11:38:51,988 epoch 3 - iter 9/95 - loss 0.21703660 - time (sec): 1.68 - samples/sec: 1779.77 - lr: 0.000026 - momentum: 0.000000
2024-03-26 11:38:53,823 epoch 3 - iter 18/95 - loss 0.20441662 - time (sec): 3.51 - samples/sec: 1802.25 - lr: 0.000026 - momentum: 0.000000
2024-03-26 11:38:55,060 epoch 3 - iter 27/95 - loss 0.21475983 - time (sec): 4.75 - samples/sec: 1966.73 - lr: 0.000026 - momentum: 0.000000
2024-03-26 11:38:56,653 epoch 3 - iter 36/95 - loss 0.20563935 - time (sec): 6.34 - samples/sec: 1958.60 - lr: 0.000025 - momentum: 0.000000
2024-03-26 11:38:58,141 epoch 3 - iter 45/95 - loss 0.21041674 - time (sec): 7.83 - samples/sec: 1957.44 - lr: 0.000025 - momentum: 0.000000
2024-03-26 11:39:00,188 epoch 3 - iter 54/95 - loss 0.20554863 - time (sec): 9.88 - samples/sec: 1913.23 - lr: 0.000025 - momentum: 0.000000
2024-03-26 11:39:02,236 epoch 3 - iter 63/95 - loss 0.20061276 - time (sec): 11.93 - samples/sec: 1866.68 - lr: 0.000025 - momentum: 0.000000
2024-03-26 11:39:04,124 epoch 3 - iter 72/95 - loss 0.20369030 - time (sec): 13.81 - samples/sec: 1851.62 - lr: 0.000024 - momentum: 0.000000
2024-03-26 11:39:06,179 epoch 3 - iter 81/95 - loss 0.19527871 - time (sec): 15.87 - samples/sec: 1826.89 - lr: 0.000024 - momentum: 0.000000
2024-03-26 11:39:08,208 epoch 3 - iter 90/95 - loss 0.20474082 - time (sec): 17.90 - samples/sec: 1826.75 - lr: 0.000024 - momentum: 0.000000
2024-03-26 11:39:09,343 ----------------------------------------------------------------------------------------------------
2024-03-26 11:39:09,343 EPOCH 3 done: loss 0.2006 - lr: 0.000024
2024-03-26 11:39:10,283 DEV : loss 0.22436510026454926 - f1-score (micro avg) 0.8601
2024-03-26 11:39:10,284 saving best model
2024-03-26 11:39:10,721 ----------------------------------------------------------------------------------------------------
2024-03-26 11:39:12,045 epoch 4 - iter 9/95 - loss 0.15904831 - time (sec): 1.32 - samples/sec: 2097.94 - lr: 0.000023 - momentum: 0.000000
2024-03-26 11:39:13,966 epoch 4 - iter 18/95 - loss 0.13585153 - time (sec): 3.24 - samples/sec: 1894.27 - lr: 0.000023 - momentum: 0.000000
2024-03-26 11:39:16,032 epoch 4 - iter 27/95 - loss 0.14408454 - time (sec): 5.31 - samples/sec: 1818.35 - lr: 0.000022 - momentum: 0.000000
2024-03-26 11:39:17,579 epoch 4 - iter 36/95 - loss 0.13786917 - time (sec): 6.86 - samples/sec: 1829.31 - lr: 0.000022 - momentum: 0.000000
2024-03-26 11:39:20,120 epoch 4 - iter 45/95 - loss 0.13153200 - time (sec): 9.40 - samples/sec: 1758.29 - lr: 0.000022 - momentum: 0.000000
2024-03-26 11:39:22,024 epoch 4 - iter 54/95 - loss 0.12653534 - time (sec): 11.30 - samples/sec: 1746.17 - lr: 0.000022 - momentum: 0.000000
2024-03-26 11:39:24,049 epoch 4 - iter 63/95 - loss 0.12344140 - time (sec): 13.33 - samples/sec: 1725.65 - lr: 0.000021 - momentum: 0.000000
2024-03-26 11:39:25,973 epoch 4 - iter 72/95 - loss 0.12630286 - time (sec): 15.25 - samples/sec: 1744.48 - lr: 0.000021 - momentum: 0.000000
2024-03-26 11:39:28,035 epoch 4 - iter 81/95 - loss 0.13168865 - time (sec): 17.31 - samples/sec: 1746.30 - lr: 0.000021 - momentum: 0.000000
2024-03-26 11:39:29,038 epoch 4 - iter 90/95 - loss 0.13058662 - time (sec): 18.32 - samples/sec: 1785.53 - lr: 0.000020 - momentum: 0.000000
2024-03-26 11:39:30,083 ----------------------------------------------------------------------------------------------------
2024-03-26 11:39:30,083 EPOCH 4 done: loss 0.1294 - lr: 0.000020
2024-03-26 11:39:31,013 DEV : loss 0.19798463582992554 - f1-score (micro avg) 0.8867
2024-03-26 11:39:31,014 saving best model
2024-03-26 11:39:31,447 ----------------------------------------------------------------------------------------------------
2024-03-26 11:39:33,430 epoch 5 - iter 9/95 - loss 0.10970783 - time (sec): 1.98 - samples/sec: 1733.87 - lr: 0.000020 - momentum: 0.000000
2024-03-26 11:39:34,886 epoch 5 - iter 18/95 - loss 0.10381087 - time (sec): 3.44 - samples/sec: 1820.09 - lr: 0.000019 - momentum: 0.000000
2024-03-26 11:39:36,269 epoch 5 - iter 27/95 - loss 0.10324233 - time (sec): 4.82 - samples/sec: 1873.54 - lr: 0.000019 - momentum: 0.000000
2024-03-26 11:39:38,219 epoch 5 - iter 36/95 - loss 0.10862119 - time (sec): 6.77 - samples/sec: 1812.61 - lr: 0.000019 - momentum: 0.000000
2024-03-26 11:39:40,531 epoch 5 - iter 45/95 - loss 0.10336276 - time (sec): 9.08 - samples/sec: 1792.35 - lr: 0.000019 - momentum: 0.000000
2024-03-26 11:39:43,050 epoch 5 - iter 54/95 - loss 0.09777919 - time (sec): 11.60 - samples/sec: 1750.74 - lr: 0.000018 - momentum: 0.000000
2024-03-26 11:39:44,772 epoch 5 - iter 63/95 - loss 0.09605003 - time (sec): 13.32 - samples/sec: 1743.65 - lr: 0.000018 - momentum: 0.000000
2024-03-26 11:39:46,623 epoch 5 - iter 72/95 - loss 0.09405670 - time (sec): 15.17 - samples/sec: 1742.78 - lr: 0.000018 - momentum: 0.000000
2024-03-26 11:39:48,974 epoch 5 - iter 81/95 - loss 0.09408093 - time (sec): 17.53 - samples/sec: 1721.72 - lr: 0.000017 - momentum: 0.000000
2024-03-26 11:39:50,415 epoch 5 - iter 90/95 - loss 0.09570225 - time (sec): 18.97 - samples/sec: 1737.72 - lr: 0.000017 - momentum: 0.000000
2024-03-26 11:39:51,218 ----------------------------------------------------------------------------------------------------
2024-03-26 11:39:51,218 EPOCH 5 done: loss 0.0938 - lr: 0.000017
2024-03-26 11:39:52,179 DEV : loss 0.1974947303533554 - f1-score (micro avg) 0.9042
2024-03-26 11:39:52,180 saving best model
2024-03-26 11:39:52,617 ----------------------------------------------------------------------------------------------------
2024-03-26 11:39:54,589 epoch 6 - iter 9/95 - loss 0.07217050 - time (sec): 1.97 - samples/sec: 1769.07 - lr: 0.000016 - momentum: 0.000000
2024-03-26 11:39:56,183 epoch 6 - iter 18/95 - loss 0.07321513 - time (sec): 3.57 - samples/sec: 1783.75 - lr: 0.000016 - momentum: 0.000000
2024-03-26 11:39:58,140 epoch 6 - iter 27/95 - loss 0.07237570 - time (sec): 5.52 - samples/sec: 1790.40 - lr: 0.000016 - momentum: 0.000000
2024-03-26 11:39:59,762 epoch 6 - iter 36/95 - loss 0.07303326 - time (sec): 7.14 - samples/sec: 1784.81 - lr: 0.000016 - momentum: 0.000000
2024-03-26 11:40:01,242 epoch 6 - iter 45/95 - loss 0.07181887 - time (sec): 8.62 - samples/sec: 1823.24 - lr: 0.000015 - momentum: 0.000000
2024-03-26 11:40:02,728 epoch 6 - iter 54/95 - loss 0.06802721 - time (sec): 10.11 - samples/sec: 1821.72 - lr: 0.000015 - momentum: 0.000000
2024-03-26 11:40:04,048 epoch 6 - iter 63/95 - loss 0.06640286 - time (sec): 11.43 - samples/sec: 1880.83 - lr: 0.000015 - momentum: 0.000000
2024-03-26 11:40:06,394 epoch 6 - iter 72/95 - loss 0.07238628 - time (sec): 13.78 - samples/sec: 1843.19 - lr: 0.000014 - momentum: 0.000000
2024-03-26 11:40:08,029 epoch 6 - iter 81/95 - loss 0.06982111 - time (sec): 15.41 - samples/sec: 1859.26 - lr: 0.000014 - momentum: 0.000000
2024-03-26 11:40:09,787 epoch 6 - iter 90/95 - loss 0.07240808 - time (sec): 17.17 - samples/sec: 1877.24 - lr: 0.000014 - momentum: 0.000000
2024-03-26 11:40:11,168 ----------------------------------------------------------------------------------------------------
2024-03-26 11:40:11,168 EPOCH 6 done: loss 0.0733 - lr: 0.000014
2024-03-26 11:40:12,142 DEV : loss 0.20535056293010712 - f1-score (micro avg) 0.9062
2024-03-26 11:40:12,143 saving best model
2024-03-26 11:40:12,585 ----------------------------------------------------------------------------------------------------
2024-03-26 11:40:14,631 epoch 7 - iter 9/95 - loss 0.06137997 - time (sec): 2.05 - samples/sec: 1553.04 - lr: 0.000013 - momentum: 0.000000
2024-03-26 11:40:16,692 epoch 7 - iter 18/95 - loss 0.04583919 - time (sec): 4.11 - samples/sec: 1596.53 - lr: 0.000013 - momentum: 0.000000
2024-03-26 11:40:18,293 epoch 7 - iter 27/95 - loss 0.04102055 - time (sec): 5.71 - samples/sec: 1714.12 - lr: 0.000013 - momentum: 0.000000
2024-03-26 11:40:20,332 epoch 7 - iter 36/95 - loss 0.04230142 - time (sec): 7.75 - samples/sec: 1701.02 - lr: 0.000012 - momentum: 0.000000
2024-03-26 11:40:22,736 epoch 7 - iter 45/95 - loss 0.04534105 - time (sec): 10.15 - samples/sec: 1708.38 - lr: 0.000012 - momentum: 0.000000
2024-03-26 11:40:24,299 epoch 7 - iter 54/95 - loss 0.04383378 - time (sec): 11.71 - samples/sec: 1717.39 - lr: 0.000012 - momentum: 0.000000
2024-03-26 11:40:26,549 epoch 7 - iter 63/95 - loss 0.04715345 - time (sec): 13.96 - samples/sec: 1724.72 - lr: 0.000011 - momentum: 0.000000
2024-03-26 11:40:28,393 epoch 7 - iter 72/95 - loss 0.05266698 - time (sec): 15.81 - samples/sec: 1731.80 - lr: 0.000011 - momentum: 0.000000
2024-03-26 11:40:29,865 epoch 7 - iter 81/95 - loss 0.04998609 - time (sec): 17.28 - samples/sec: 1743.30 - lr: 0.000011 - momentum: 0.000000
2024-03-26 11:40:31,962 epoch 7 - iter 90/95 - loss 0.05320010 - time (sec): 19.38 - samples/sec: 1719.17 - lr: 0.000010 - momentum: 0.000000
2024-03-26 11:40:32,440 ----------------------------------------------------------------------------------------------------
2024-03-26 11:40:32,441 EPOCH 7 done: loss 0.0533 - lr: 0.000010
2024-03-26 11:40:33,404 DEV : loss 0.1789586842060089 - f1-score (micro avg) 0.9121
2024-03-26 11:40:33,405 saving best model
2024-03-26 11:40:33,834 ----------------------------------------------------------------------------------------------------
2024-03-26 11:40:35,732 epoch 8 - iter 9/95 - loss 0.03467893 - time (sec): 1.90 - samples/sec: 1690.13 - lr: 0.000010 - momentum: 0.000000
2024-03-26 11:40:38,308 epoch 8 - iter 18/95 - loss 0.02789460 - time (sec): 4.47 - samples/sec: 1656.17 - lr: 0.000010 - momentum: 0.000000
2024-03-26 11:40:40,169 epoch 8 - iter 27/95 - loss 0.02442571 - time (sec): 6.33 - samples/sec: 1680.13 - lr: 0.000009 - momentum: 0.000000
2024-03-26 11:40:41,782 epoch 8 - iter 36/95 - loss 0.02717416 - time (sec): 7.95 - samples/sec: 1668.45 - lr: 0.000009 - momentum: 0.000000
2024-03-26 11:40:43,339 epoch 8 - iter 45/95 - loss 0.02552008 - time (sec): 9.50 - samples/sec: 1699.36 - lr: 0.000009 - momentum: 0.000000
2024-03-26 11:40:45,064 epoch 8 - iter 54/95 - loss 0.02682554 - time (sec): 11.23 - samples/sec: 1717.91 - lr: 0.000008 - momentum: 0.000000
2024-03-26 11:40:47,330 epoch 8 - iter 63/95 - loss 0.03473284 - time (sec): 13.50 - samples/sec: 1714.85 - lr: 0.000008 - momentum: 0.000000
2024-03-26 11:40:49,691 epoch 8 - iter 72/95 - loss 0.04004587 - time (sec): 15.86 - samples/sec: 1692.57 - lr: 0.000008 - momentum: 0.000000
2024-03-26 11:40:51,432 epoch 8 - iter 81/95 - loss 0.04672366 - time (sec): 17.60 - samples/sec: 1692.79 - lr: 0.000007 - momentum: 0.000000
2024-03-26 11:40:52,756 epoch 8 - iter 90/95 - loss 0.04625643 - time (sec): 18.92 - samples/sec: 1735.25 - lr: 0.000007 - momentum: 0.000000
2024-03-26 11:40:53,691 ----------------------------------------------------------------------------------------------------
2024-03-26 11:40:53,691 EPOCH 8 done: loss 0.0449 - lr: 0.000007
2024-03-26 11:40:54,627 DEV : loss 0.20146656036376953 - f1-score (micro avg) 0.913
2024-03-26 11:40:54,628 saving best model
2024-03-26 11:40:55,073 ----------------------------------------------------------------------------------------------------
2024-03-26 11:40:57,083 epoch 9 - iter 9/95 - loss 0.01238365 - time (sec): 2.01 - samples/sec: 1755.11 - lr: 0.000007 - momentum: 0.000000
2024-03-26 11:40:58,881 epoch 9 - iter 18/95 - loss 0.02759052 - time (sec): 3.81 - samples/sec: 1758.22 - lr: 0.000006 - momentum: 0.000000
2024-03-26 11:41:00,905 epoch 9 - iter 27/95 - loss 0.02758322 - time (sec): 5.83 - samples/sec: 1752.05 - lr: 0.000006 - momentum: 0.000000
2024-03-26 11:41:02,853 epoch 9 - iter 36/95 - loss 0.02846974 - time (sec): 7.78 - samples/sec: 1746.04 - lr: 0.000006 - momentum: 0.000000
2024-03-26 11:41:05,177 epoch 9 - iter 45/95 - loss 0.02677424 - time (sec): 10.10 - samples/sec: 1677.81 - lr: 0.000005 - momentum: 0.000000
2024-03-26 11:41:07,176 epoch 9 - iter 54/95 - loss 0.03171239 - time (sec): 12.10 - samples/sec: 1668.11 - lr: 0.000005 - momentum: 0.000000
2024-03-26 11:41:09,138 epoch 9 - iter 63/95 - loss 0.03114385 - time (sec): 14.06 - samples/sec: 1678.90 - lr: 0.000005 - momentum: 0.000000
2024-03-26 11:41:11,164 epoch 9 - iter 72/95 - loss 0.03333336 - time (sec): 16.09 - samples/sec: 1674.88 - lr: 0.000004 - momentum: 0.000000
2024-03-26 11:41:12,456 epoch 9 - iter 81/95 - loss 0.03525355 - time (sec): 17.38 - samples/sec: 1698.51 - lr: 0.000004 - momentum: 0.000000
2024-03-26 11:41:13,929 epoch 9 - iter 90/95 - loss 0.03895801 - time (sec): 18.86 - samples/sec: 1718.89 - lr: 0.000004 - momentum: 0.000000
2024-03-26 11:41:14,915 ----------------------------------------------------------------------------------------------------
2024-03-26 11:41:14,915 EPOCH 9 done: loss 0.0384 - lr: 0.000004
2024-03-26 11:41:15,903 DEV : loss 0.21197910606861115 - f1-score (micro avg) 0.92
2024-03-26 11:41:15,904 saving best model
2024-03-26 11:41:16,343 ----------------------------------------------------------------------------------------------------
2024-03-26 11:41:18,547 epoch 10 - iter 9/95 - loss 0.01499061 - time (sec): 2.20 - samples/sec: 1730.24 - lr: 0.000003 - momentum: 0.000000
2024-03-26 11:41:19,860 epoch 10 - iter 18/95 - loss 0.01541849 - time (sec): 3.52 - samples/sec: 1841.02 - lr: 0.000003 - momentum: 0.000000
2024-03-26 11:41:21,230 epoch 10 - iter 27/95 - loss 0.03534289 - time (sec): 4.89 - samples/sec: 1940.76 - lr: 0.000003 - momentum: 0.000000
2024-03-26 11:41:22,652 epoch 10 - iter 36/95 - loss 0.03443000 - time (sec): 6.31 - samples/sec: 1955.29 - lr: 0.000002 - momentum: 0.000000
2024-03-26 11:41:24,582 epoch 10 - iter 45/95 - loss 0.02965463 - time (sec): 8.24 - samples/sec: 1933.09 - lr: 0.000002 - momentum: 0.000000
2024-03-26 11:41:26,272 epoch 10 - iter 54/95 - loss 0.02812516 - time (sec): 9.93 - samples/sec: 1909.42 - lr: 0.000002 - momentum: 0.000000
2024-03-26 11:41:28,862 epoch 10 - iter 63/95 - loss 0.02903030 - time (sec): 12.52 - samples/sec: 1836.54 - lr: 0.000001 - momentum: 0.000000
2024-03-26 11:41:30,168 epoch 10 - iter 72/95 - loss 0.02845424 - time (sec): 13.82 - samples/sec: 1844.54 - lr: 0.000001 - momentum: 0.000000
2024-03-26 11:41:32,604 epoch 10 - iter 81/95 - loss 0.02695920 - time (sec): 16.26 - samples/sec: 1791.40 - lr: 0.000001 - momentum: 0.000000
2024-03-26 11:41:34,895 epoch 10 - iter 90/95 - loss 0.03168106 - time (sec): 18.55 - samples/sec: 1772.73 - lr: 0.000000 - momentum: 0.000000
2024-03-26 11:41:35,996 ----------------------------------------------------------------------------------------------------
2024-03-26 11:41:35,996 EPOCH 10 done: loss 0.0322 - lr: 0.000000
2024-03-26 11:41:36,940 DEV : loss 0.2103956937789917 - f1-score (micro avg) 0.9207
2024-03-26 11:41:36,942 saving best model
2024-03-26 11:41:37,656 ----------------------------------------------------------------------------------------------------
2024-03-26 11:41:37,657 Loading model from best epoch ...
2024-03-26 11:41:38,536 SequenceTagger predicts: Dictionary with 17 tags: O, S-Unternehmen, B-Unternehmen, E-Unternehmen, I-Unternehmen, S-Auslagerung, B-Auslagerung, E-Auslagerung, I-Auslagerung, S-Ort, B-Ort, E-Ort, I-Ort, S-Software, B-Software, E-Software, I-Software
2024-03-26 11:41:39,309
Results:
- F-score (micro) 0.9048
- F-score (macro) 0.6892
- Accuracy 0.8285
By class:
              precision    recall  f1-score   support

 Unternehmen     0.8792    0.8759    0.8776       266
 Auslagerung     0.8736    0.9157    0.8941       249
         Ort     0.9779    0.9925    0.9852       134
    Software     0.0000    0.0000    0.0000         0

   micro avg     0.8946    0.9153    0.9048       649
   macro avg     0.6827    0.6960    0.6892       649
weighted avg     0.8974    0.9153    0.9061       649
2024-03-26 11:41:39,309 ----------------------------------------------------------------------------------------------------
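[Editor's note] After training, the best checkpoint (best-model.pt) under the base path above can be reloaded for prediction. A minimal usage sketch follows; the example sentence is illustrative only, and the "ner" label type is assumed.

    # Sketch: reload the best checkpoint and tag a new sentence.
    from flair.data import Sentence
    from flair.models import SequenceTagger

    tagger = SequenceTagger.load(
        "flair-co-funer-german_bert_base-bs8-e10-lr3e-05-3/best-model.pt"
    )

    # Illustrative input only; not taken from the corpus.
    sentence = Sentence("Die Verwahrung der Wertpapiere wurde an die Beispielbank AG in Frankfurt ausgelagert.")
    tagger.predict(sentence)

    # Print recognized spans with their label (Unternehmen, Auslagerung, Ort, Software) and confidence
    for span in sentence.get_spans("ner"):
        print(span.text, span.tag, span.score)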