2024-03-26 16:00:10,103 ----------------------------------------------------------------------------------------------------
2024-03-26 16:00:10,103 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(31103, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2024-03-26 16:00:10,103 ----------------------------------------------------------------------------------------------------
2024-03-26 16:00:10,103 Corpus: 758 train + 94 dev + 96 test sentences
2024-03-26 16:00:10,103 ----------------------------------------------------------------------------------------------------
2024-03-26 16:00:10,103 Train: 758 sentences
2024-03-26 16:00:10,103 (train_with_dev=False, train_with_test=False)
2024-03-26 16:00:10,103 ----------------------------------------------------------------------------------------------------
2024-03-26 16:00:10,103 Training Params:
2024-03-26 16:00:10,103 - learning_rate: "5e-05"
2024-03-26 16:00:10,103 - mini_batch_size: "8"
2024-03-26 16:00:10,103 - max_epochs: "10"
2024-03-26 16:00:10,103 - shuffle: "True"
2024-03-26 16:00:10,103 ----------------------------------------------------------------------------------------------------
2024-03-26 16:00:10,103 Plugins:
2024-03-26 16:00:10,103 - TensorboardLogger
2024-03-26 16:00:10,103 - LinearScheduler | warmup_fraction: '0.1'
2024-03-26 16:00:10,103 ----------------------------------------------------------------------------------------------------
2024-03-26 16:00:10,103 Final evaluation on model from best epoch (best-model.pt)
2024-03-26 16:00:10,103 - metric: "('micro avg', 'f1-score')"
2024-03-26 16:00:10,103 ----------------------------------------------------------------------------------------------------
2024-03-26 16:00:10,103 Computation:
2024-03-26 16:00:10,103 - compute on device: cuda:0
2024-03-26 16:00:10,103 - embedding storage: none
2024-03-26 16:00:10,103 ----------------------------------------------------------------------------------------------------
2024-03-26 16:00:10,103 Model training base path: "flair-co-funer-german_dbmdz_bert_base-bs8-e10-lr5e-05-3"
2024-03-26 16:00:10,103 ----------------------------------------------------------------------------------------------------
2024-03-26 16:00:10,103 ----------------------------------------------------------------------------------------------------
2024-03-26 16:00:10,103 Logging anything other than scalars to TensorBoard is currently not supported.
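The module summary above fully determines the model's parameter count (BERT base with vocab 31103, hidden size 768, 12 layers, intermediate size 3072, plus the 17-tag linear head). As a sanity check, not part of the original run, the printed sizes can be totalled in plain Python:

```python
# Parameter count implied by the module summary above.
H, V, P, T, I, LAYERS, TAGS = 768, 31103, 512, 2, 3072, 12, 17

def linear(n_in, n_out):
    return n_in * n_out + n_out   # weight matrix + bias vector

layer_norm = 2 * H                # gamma + beta

embeddings = V * H + P * H + T * H + layer_norm
per_layer = (3 * linear(H, H)     # query, key, value
             + linear(H, H)       # attention output dense
             + layer_norm         # attention output LayerNorm
             + linear(H, I)       # intermediate dense
             + linear(I, H)       # output dense
             + layer_norm)        # output LayerNorm
pooler = linear(H, H)
classifier = linear(H, TAGS)      # the final (linear): 768 -> 17 head

total = embeddings + LAYERS * per_layer + pooler + classifier
print(f"{total:,}")               # roughly 110M parameters
```

This counts only what the summary prints; dropout and activations carry no parameters.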
2024-03-26 16:00:11,464 epoch 1 - iter 9/95 - loss 3.38937223 - time (sec): 1.36 - samples/sec: 2344.72 - lr: 0.000004 - momentum: 0.000000
2024-03-26 16:00:13,275 epoch 1 - iter 18/95 - loss 3.23427345 - time (sec): 3.17 - samples/sec: 1991.11 - lr: 0.000009 - momentum: 0.000000
2024-03-26 16:00:15,187 epoch 1 - iter 27/95 - loss 2.95063236 - time (sec): 5.08 - samples/sec: 1943.51 - lr: 0.000014 - momentum: 0.000000
2024-03-26 16:00:16,548 epoch 1 - iter 36/95 - loss 2.68191715 - time (sec): 6.44 - samples/sec: 1963.25 - lr: 0.000018 - momentum: 0.000000
2024-03-26 16:00:18,440 epoch 1 - iter 45/95 - loss 2.45788459 - time (sec): 8.34 - samples/sec: 1945.86 - lr: 0.000023 - momentum: 0.000000
2024-03-26 16:00:19,794 epoch 1 - iter 54/95 - loss 2.29457257 - time (sec): 9.69 - samples/sec: 1969.63 - lr: 0.000028 - momentum: 0.000000
2024-03-26 16:00:21,041 epoch 1 - iter 63/95 - loss 2.14298789 - time (sec): 10.94 - samples/sec: 1996.55 - lr: 0.000033 - momentum: 0.000000
2024-03-26 16:00:22,974 epoch 1 - iter 72/95 - loss 1.95332289 - time (sec): 12.87 - samples/sec: 1982.77 - lr: 0.000037 - momentum: 0.000000
2024-03-26 16:00:24,934 epoch 1 - iter 81/95 - loss 1.78824283 - time (sec): 14.83 - samples/sec: 1968.68 - lr: 0.000042 - momentum: 0.000000
2024-03-26 16:00:26,454 epoch 1 - iter 90/95 - loss 1.66300792 - time (sec): 16.35 - samples/sec: 1986.71 - lr: 0.000047 - momentum: 0.000000
2024-03-26 16:00:27,491 ----------------------------------------------------------------------------------------------------
2024-03-26 16:00:27,491 EPOCH 1 done: loss 1.5900 - lr: 0.000047
2024-03-26 16:00:28,295 DEV : loss 0.44570106267929077 - f1-score (micro avg) 0.687
2024-03-26 16:00:28,296 saving best model
2024-03-26 16:00:28,557 ----------------------------------------------------------------------------------------------------
2024-03-26 16:00:29,915 epoch 2 - iter 9/95 - loss 0.52755606 - time (sec): 1.36 - samples/sec: 2018.97 - lr: 0.000050 - momentum: 0.000000
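The lr column shows the LinearScheduler with warmup_fraction 0.1 at work: with 95 batches per epoch over 10 epochs (950 steps), the first ~95 steps ramp linearly toward the peak 5e-05, after which the rate decays linearly toward zero. A minimal sketch of that schedule, assuming exact linear interpolation between those endpoints (the function name is illustrative, not Flair's API, and logged values may differ in the last rounded digit):

```python
PEAK_LR = 5e-5
STEPS_PER_EPOCH, EPOCHS = 95, 10
TOTAL = STEPS_PER_EPOCH * EPOCHS        # 950 optimizer steps
WARMUP = int(0.1 * TOTAL)               # warmup_fraction 0.1 -> 95 steps

def lr_at(step: int) -> float:
    """Linear warmup to PEAK_LR, then linear decay to zero."""
    if step < WARMUP:
        return PEAK_LR * step / WARMUP
    return PEAK_LR * (TOTAL - step) / (TOTAL - WARMUP)

# Epoch 1 ends near the peak (lr ~ 0.000047 at iter 90/95);
# epoch 10 ends near zero, as the log below shows.
```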
2024-03-26 16:00:31,744 epoch 2 - iter 18/95 - loss 0.39877224 - time (sec): 3.19 - samples/sec: 1917.26 - lr: 0.000049 - momentum: 0.000000
2024-03-26 16:00:32,910 epoch 2 - iter 27/95 - loss 0.39086480 - time (sec): 4.35 - samples/sec: 1971.50 - lr: 0.000048 - momentum: 0.000000
2024-03-26 16:00:35,150 epoch 2 - iter 36/95 - loss 0.36615692 - time (sec): 6.59 - samples/sec: 1923.29 - lr: 0.000048 - momentum: 0.000000
2024-03-26 16:00:37,076 epoch 2 - iter 45/95 - loss 0.36188341 - time (sec): 8.52 - samples/sec: 1930.42 - lr: 0.000047 - momentum: 0.000000
2024-03-26 16:00:39,227 epoch 2 - iter 54/95 - loss 0.34938497 - time (sec): 10.67 - samples/sec: 1901.63 - lr: 0.000047 - momentum: 0.000000
2024-03-26 16:00:41,225 epoch 2 - iter 63/95 - loss 0.33873827 - time (sec): 12.67 - samples/sec: 1855.75 - lr: 0.000046 - momentum: 0.000000
2024-03-26 16:00:42,730 epoch 2 - iter 72/95 - loss 0.34109897 - time (sec): 14.17 - samples/sec: 1864.02 - lr: 0.000046 - momentum: 0.000000
2024-03-26 16:00:44,172 epoch 2 - iter 81/95 - loss 0.34624828 - time (sec): 15.61 - samples/sec: 1886.66 - lr: 0.000045 - momentum: 0.000000
2024-03-26 16:00:46,384 epoch 2 - iter 90/95 - loss 0.33363333 - time (sec): 17.83 - samples/sec: 1860.80 - lr: 0.000045 - momentum: 0.000000
2024-03-26 16:00:47,022 ----------------------------------------------------------------------------------------------------
2024-03-26 16:00:47,022 EPOCH 2 done: loss 0.3302 - lr: 0.000045
2024-03-26 16:00:47,911 DEV : loss 0.24601905047893524 - f1-score (micro avg) 0.8349
2024-03-26 16:00:47,912 saving best model
2024-03-26 16:00:48,351 ----------------------------------------------------------------------------------------------------
2024-03-26 16:00:49,982 epoch 3 - iter 9/95 - loss 0.17646890 - time (sec): 1.63 - samples/sec: 1833.52 - lr: 0.000044 - momentum: 0.000000
2024-03-26 16:00:51,766 epoch 3 - iter 18/95 - loss 0.16766463 - time (sec): 3.41 - samples/sec: 1855.01 - lr: 0.000043 - momentum: 0.000000
2024-03-26 16:00:52,956 epoch 3 - iter 27/95 - loss 0.18728041 - time (sec): 4.60 - samples/sec: 2030.12 - lr: 0.000043 - momentum: 0.000000
2024-03-26 16:00:54,515 epoch 3 - iter 36/95 - loss 0.18521714 - time (sec): 6.16 - samples/sec: 2016.24 - lr: 0.000042 - momentum: 0.000000
2024-03-26 16:00:55,916 epoch 3 - iter 45/95 - loss 0.19008186 - time (sec): 7.56 - samples/sec: 2026.74 - lr: 0.000042 - momentum: 0.000000
2024-03-26 16:00:57,902 epoch 3 - iter 54/95 - loss 0.18491639 - time (sec): 9.55 - samples/sec: 1979.28 - lr: 0.000041 - momentum: 0.000000
2024-03-26 16:00:59,891 epoch 3 - iter 63/95 - loss 0.18437887 - time (sec): 11.54 - samples/sec: 1929.51 - lr: 0.000041 - momentum: 0.000000
2024-03-26 16:01:01,728 epoch 3 - iter 72/95 - loss 0.18892877 - time (sec): 13.37 - samples/sec: 1912.41 - lr: 0.000040 - momentum: 0.000000
2024-03-26 16:01:03,733 epoch 3 - iter 81/95 - loss 0.18190291 - time (sec): 15.38 - samples/sec: 1885.10 - lr: 0.000040 - momentum: 0.000000
2024-03-26 16:01:05,682 epoch 3 - iter 90/95 - loss 0.19017568 - time (sec): 17.33 - samples/sec: 1886.81 - lr: 0.000039 - momentum: 0.000000
2024-03-26 16:01:06,770 ----------------------------------------------------------------------------------------------------
2024-03-26 16:01:06,770 EPOCH 3 done: loss 0.1854 - lr: 0.000039
2024-03-26 16:01:07,662 DEV : loss 0.17745639383792877 - f1-score (micro avg) 0.8735
2024-03-26 16:01:07,663 saving best model
2024-03-26 16:01:08,092 ----------------------------------------------------------------------------------------------------
2024-03-26 16:01:09,387 epoch 4 - iter 9/95 - loss 0.12389405 - time (sec): 1.29 - samples/sec: 2146.08 - lr: 0.000039 - momentum: 0.000000
2024-03-26 16:01:11,232 epoch 4 - iter 18/95 - loss 0.12111894 - time (sec): 3.14 - samples/sec: 1958.29 - lr: 0.000038 - momentum: 0.000000
2024-03-26 16:01:13,234 epoch 4 - iter 27/95 - loss 0.11487951 - time (sec): 5.14 - samples/sec: 1878.49 - lr: 0.000037 - momentum: 0.000000
2024-03-26 16:01:14,716 epoch 4 - iter 36/95 - loss 0.11435733 - time (sec): 6.62 - samples/sec: 1894.36 - lr: 0.000037 - momentum: 0.000000
2024-03-26 16:01:17,154 epoch 4 - iter 45/95 - loss 0.11185218 - time (sec): 9.06 - samples/sec: 1823.85 - lr: 0.000036 - momentum: 0.000000
2024-03-26 16:01:19,006 epoch 4 - iter 54/95 - loss 0.11138127 - time (sec): 10.91 - samples/sec: 1808.45 - lr: 0.000036 - momentum: 0.000000
2024-03-26 16:01:20,943 epoch 4 - iter 63/95 - loss 0.10945874 - time (sec): 12.85 - samples/sec: 1789.84 - lr: 0.000035 - momentum: 0.000000
2024-03-26 16:01:22,821 epoch 4 - iter 72/95 - loss 0.11053287 - time (sec): 14.73 - samples/sec: 1806.53 - lr: 0.000035 - momentum: 0.000000
2024-03-26 16:01:24,846 epoch 4 - iter 81/95 - loss 0.11929813 - time (sec): 16.75 - samples/sec: 1804.79 - lr: 0.000034 - momentum: 0.000000
2024-03-26 16:01:25,824 epoch 4 - iter 90/95 - loss 0.11963430 - time (sec): 17.73 - samples/sec: 1844.53 - lr: 0.000034 - momentum: 0.000000
2024-03-26 16:01:26,849 ----------------------------------------------------------------------------------------------------
2024-03-26 16:01:26,849 EPOCH 4 done: loss 0.1195 - lr: 0.000034
2024-03-26 16:01:27,742 DEV : loss 0.17124532163143158 - f1-score (micro avg) 0.9038
2024-03-26 16:01:27,743 saving best model
2024-03-26 16:01:28,185 ----------------------------------------------------------------------------------------------------
2024-03-26 16:01:30,060 epoch 5 - iter 9/95 - loss 0.09160459 - time (sec): 1.87 - samples/sec: 1834.63 - lr: 0.000033 - momentum: 0.000000
2024-03-26 16:01:31,476 epoch 5 - iter 18/95 - loss 0.08544367 - time (sec): 3.29 - samples/sec: 1902.36 - lr: 0.000032 - momentum: 0.000000
2024-03-26 16:01:32,833 epoch 5 - iter 27/95 - loss 0.10039035 - time (sec): 4.65 - samples/sec: 1943.76 - lr: 0.000032 - momentum: 0.000000
2024-03-26 16:01:34,716 epoch 5 - iter 36/95 - loss 0.09956787 - time (sec): 6.53 - samples/sec: 1879.63 - lr: 0.000031 - momentum: 0.000000
2024-03-26 16:01:36,923 epoch 5 - iter 45/95 - loss 0.09508086 - time (sec): 8.74 - samples/sec: 1863.49 - lr: 0.000031 - momentum: 0.000000
2024-03-26 16:01:39,355 epoch 5 - iter 54/95 - loss 0.09118724 - time (sec): 11.17 - samples/sec: 1818.57 - lr: 0.000030 - momentum: 0.000000
2024-03-26 16:01:41,013 epoch 5 - iter 63/95 - loss 0.08797836 - time (sec): 12.83 - samples/sec: 1811.31 - lr: 0.000030 - momentum: 0.000000
2024-03-26 16:01:42,784 epoch 5 - iter 72/95 - loss 0.08582414 - time (sec): 14.60 - samples/sec: 1811.74 - lr: 0.000029 - momentum: 0.000000
2024-03-26 16:01:44,986 epoch 5 - iter 81/95 - loss 0.08720160 - time (sec): 16.80 - samples/sec: 1796.16 - lr: 0.000029 - momentum: 0.000000
2024-03-26 16:01:46,370 epoch 5 - iter 90/95 - loss 0.08919404 - time (sec): 18.18 - samples/sec: 1812.62 - lr: 0.000028 - momentum: 0.000000
2024-03-26 16:01:47,136 ----------------------------------------------------------------------------------------------------
2024-03-26 16:01:47,136 EPOCH 5 done: loss 0.0873 - lr: 0.000028
2024-03-26 16:01:48,034 DEV : loss 0.16974209249019623 - f1-score (micro avg) 0.9198
2024-03-26 16:01:48,035 saving best model
2024-03-26 16:01:48,479 ----------------------------------------------------------------------------------------------------
2024-03-26 16:01:50,417 epoch 6 - iter 9/95 - loss 0.05813380 - time (sec): 1.94 - samples/sec: 1801.23 - lr: 0.000027 - momentum: 0.000000
2024-03-26 16:01:51,968 epoch 6 - iter 18/95 - loss 0.05561430 - time (sec): 3.49 - samples/sec: 1823.72 - lr: 0.000027 - momentum: 0.000000
2024-03-26 16:01:53,872 epoch 6 - iter 27/95 - loss 0.05571319 - time (sec): 5.39 - samples/sec: 1834.08 - lr: 0.000026 - momentum: 0.000000
2024-03-26 16:01:55,443 epoch 6 - iter 36/95 - loss 0.05560469 - time (sec): 6.96 - samples/sec: 1831.74 - lr: 0.000026 - momentum: 0.000000
2024-03-26 16:01:56,891 epoch 6 - iter 45/95 - loss 0.05722084 - time (sec): 8.41 - samples/sec: 1869.84 - lr: 0.000025 - momentum: 0.000000
2024-03-26 16:01:58,341 epoch 6 - iter 54/95 - loss 0.05800404 - time (sec): 9.86 - samples/sec: 1867.99 - lr: 0.000025 - momentum: 0.000000
2024-03-26 16:01:59,615 epoch 6 - iter 63/95 - loss 0.05975631 - time (sec): 11.13 - samples/sec: 1930.92 - lr: 0.000024 - momentum: 0.000000
2024-03-26 16:02:01,857 epoch 6 - iter 72/95 - loss 0.06417836 - time (sec): 13.38 - samples/sec: 1898.32 - lr: 0.000024 - momentum: 0.000000
2024-03-26 16:02:03,435 epoch 6 - iter 81/95 - loss 0.06173608 - time (sec): 14.95 - samples/sec: 1916.22 - lr: 0.000023 - momentum: 0.000000
2024-03-26 16:02:05,134 epoch 6 - iter 90/95 - loss 0.06379710 - time (sec): 16.65 - samples/sec: 1935.46 - lr: 0.000023 - momentum: 0.000000
2024-03-26 16:02:06,397 ----------------------------------------------------------------------------------------------------
2024-03-26 16:02:06,397 EPOCH 6 done: loss 0.0636 - lr: 0.000023
2024-03-26 16:02:07,293 DEV : loss 0.1660703867673874 - f1-score (micro avg) 0.936
2024-03-26 16:02:07,294 saving best model
2024-03-26 16:02:07,738 ----------------------------------------------------------------------------------------------------
2024-03-26 16:02:09,621 epoch 7 - iter 9/95 - loss 0.04045283 - time (sec): 1.88 - samples/sec: 1688.44 - lr: 0.000022 - momentum: 0.000000
2024-03-26 16:02:11,655 epoch 7 - iter 18/95 - loss 0.02841548 - time (sec): 3.91 - samples/sec: 1674.65 - lr: 0.000021 - momentum: 0.000000
2024-03-26 16:02:13,183 epoch 7 - iter 27/95 - loss 0.02627311 - time (sec): 5.44 - samples/sec: 1797.34 - lr: 0.000021 - momentum: 0.000000
2024-03-26 16:02:15,125 epoch 7 - iter 36/95 - loss 0.02722641 - time (sec): 7.38 - samples/sec: 1784.42 - lr: 0.000020 - momentum: 0.000000
2024-03-26 16:02:17,497 epoch 7 - iter 45/95 - loss 0.03259541 - time (sec): 9.76 - samples/sec: 1777.38 - lr: 0.000020 - momentum: 0.000000
2024-03-26 16:02:19,014 epoch 7 - iter 54/95 - loss 0.03491364 - time (sec): 11.27 - samples/sec: 1784.27 - lr: 0.000019 - momentum: 0.000000
2024-03-26 16:02:21,194 epoch 7 - iter 63/95 - loss 0.04056925 - time (sec): 13.45 - samples/sec: 1790.05 - lr: 0.000019 - momentum: 0.000000
2024-03-26 16:02:22,983 epoch 7 - iter 72/95 - loss 0.04684255 - time (sec): 15.24 - samples/sec: 1795.92 - lr: 0.000018 - momentum: 0.000000
2024-03-26 16:02:24,403 epoch 7 - iter 81/95 - loss 0.04395515 - time (sec): 16.66 - samples/sec: 1807.79 - lr: 0.000018 - momentum: 0.000000
2024-03-26 16:02:26,375 epoch 7 - iter 90/95 - loss 0.04605325 - time (sec): 18.63 - samples/sec: 1787.55 - lr: 0.000017 - momentum: 0.000000
2024-03-26 16:02:26,860 ----------------------------------------------------------------------------------------------------
2024-03-26 16:02:26,860 EPOCH 7 done: loss 0.0469 - lr: 0.000017
2024-03-26 16:02:27,749 DEV : loss 0.18657518923282623 - f1-score (micro avg) 0.9354
2024-03-26 16:02:27,750 ----------------------------------------------------------------------------------------------------
2024-03-26 16:02:29,622 epoch 8 - iter 9/95 - loss 0.02945220 - time (sec): 1.87 - samples/sec: 1713.38 - lr: 0.000016 - momentum: 0.000000
2024-03-26 16:02:32,102 epoch 8 - iter 18/95 - loss 0.02065644 - time (sec): 4.35 - samples/sec: 1702.09 - lr: 0.000016 - momentum: 0.000000
2024-03-26 16:02:33,863 epoch 8 - iter 27/95 - loss 0.02067654 - time (sec): 6.11 - samples/sec: 1740.79 - lr: 0.000015 - momentum: 0.000000
2024-03-26 16:02:35,488 epoch 8 - iter 36/95 - loss 0.02424997 - time (sec): 7.74 - samples/sec: 1713.45 - lr: 0.000015 - momentum: 0.000000
2024-03-26 16:02:36,997 epoch 8 - iter 45/95 - loss 0.02289045 - time (sec): 9.25 - samples/sec: 1746.67 - lr: 0.000014 - momentum: 0.000000
2024-03-26 16:02:38,662 epoch 8 - iter 54/95 - loss 0.02432025 - time (sec): 10.91 - samples/sec: 1767.82 - lr: 0.000014 - momentum: 0.000000
2024-03-26 16:02:40,854 epoch 8 - iter 63/95 - loss 0.03266905 - time (sec): 13.10 - samples/sec: 1766.14 - lr: 0.000013 - momentum: 0.000000
2024-03-26 16:02:43,099 epoch 8 - iter 72/95 - loss 0.03310354 - time (sec): 15.35 - samples/sec: 1748.47 - lr: 0.000013 - momentum: 0.000000
2024-03-26 16:02:44,752 epoch 8 - iter 81/95 - loss 0.03687701 - time (sec): 17.00 - samples/sec: 1752.00 - lr: 0.000012 - momentum: 0.000000
2024-03-26 16:02:46,046 epoch 8 - iter 90/95 - loss 0.03730740 - time (sec): 18.30 - samples/sec: 1794.50 - lr: 0.000012 - momentum: 0.000000
2024-03-26 16:02:46,945 ----------------------------------------------------------------------------------------------------
2024-03-26 16:02:46,945 EPOCH 8 done: loss 0.0362 - lr: 0.000012
2024-03-26 16:02:47,878 DEV : loss 0.18976689875125885 - f1-score (micro avg) 0.9344
2024-03-26 16:02:47,880 ----------------------------------------------------------------------------------------------------
2024-03-26 16:02:49,859 epoch 9 - iter 9/95 - loss 0.01635759 - time (sec): 1.98 - samples/sec: 1782.87 - lr: 0.000011 - momentum: 0.000000
2024-03-26 16:02:51,582 epoch 9 - iter 18/95 - loss 0.02304365 - time (sec): 3.70 - samples/sec: 1808.51 - lr: 0.000010 - momentum: 0.000000
2024-03-26 16:02:53,458 epoch 9 - iter 27/95 - loss 0.02506509 - time (sec): 5.58 - samples/sec: 1832.02 - lr: 0.000010 - momentum: 0.000000
2024-03-26 16:02:55,313 epoch 9 - iter 36/95 - loss 0.02234485 - time (sec): 7.43 - samples/sec: 1827.49 - lr: 0.000009 - momentum: 0.000000
2024-03-26 16:02:57,570 epoch 9 - iter 45/95 - loss 0.02061792 - time (sec): 9.69 - samples/sec: 1749.48 - lr: 0.000009 - momentum: 0.000000
2024-03-26 16:02:59,495 epoch 9 - iter 54/95 - loss 0.02488325 - time (sec): 11.61 - samples/sec: 1738.19 - lr: 0.000008 - momentum: 0.000000
2024-03-26 16:03:01,384 epoch 9 - iter 63/95 - loss 0.02441510 - time (sec): 13.50 - samples/sec: 1748.69 - lr: 0.000008 - momentum: 0.000000
2024-03-26 16:03:03,255 epoch 9 - iter 72/95 - loss 0.02401047 - time (sec): 15.37 - samples/sec: 1752.87 - lr: 0.000007 - momentum: 0.000000
2024-03-26 16:03:04,501 epoch 9 - iter 81/95 - loss 0.02516190 - time (sec): 16.62 - samples/sec: 1776.36 - lr: 0.000007 - momentum: 0.000000
2024-03-26 16:03:05,902 epoch 9 - iter 90/95 - loss 0.02891991 - time (sec): 18.02 - samples/sec: 1798.48 - lr: 0.000006 - momentum: 0.000000
2024-03-26 16:03:06,844 ----------------------------------------------------------------------------------------------------
2024-03-26 16:03:06,844 EPOCH 9 done: loss 0.0284 - lr: 0.000006
2024-03-26 16:03:07,796 DEV : loss 0.2133364975452423 - f1-score (micro avg) 0.9383
2024-03-26 16:03:07,797 saving best model
2024-03-26 16:03:08,256 ----------------------------------------------------------------------------------------------------
2024-03-26 16:03:10,399 epoch 10 - iter 9/95 - loss 0.00852392 - time (sec): 2.14 - samples/sec: 1780.84 - lr: 0.000005 - momentum: 0.000000
2024-03-26 16:03:11,661 epoch 10 - iter 18/95 - loss 0.00945942 - time (sec): 3.40 - samples/sec: 1902.77 - lr: 0.000005 - momentum: 0.000000
2024-03-26 16:03:12,973 epoch 10 - iter 27/95 - loss 0.03452703 - time (sec): 4.71 - samples/sec: 2011.60 - lr: 0.000004 - momentum: 0.000000
2024-03-26 16:03:14,328 epoch 10 - iter 36/95 - loss 0.02966500 - time (sec): 6.07 - samples/sec: 2032.32 - lr: 0.000004 - momentum: 0.000000
2024-03-26 16:03:16,198 epoch 10 - iter 45/95 - loss 0.02385615 - time (sec): 7.94 - samples/sec: 2005.69 - lr: 0.000003 - momentum: 0.000000
2024-03-26 16:03:17,776 epoch 10 - iter 54/95 - loss 0.02232956 - time (sec): 9.52 - samples/sec: 1991.74 - lr: 0.000003 - momentum: 0.000000
2024-03-26 16:03:20,299 epoch 10 - iter 63/95 - loss 0.02272554 - time (sec): 12.04 - samples/sec: 1909.44 - lr: 0.000002 - momentum: 0.000000
2024-03-26 16:03:21,585 epoch 10 - iter 72/95 - loss 0.02126761 - time (sec): 13.33 - samples/sec: 1913.48 - lr: 0.000002 - momentum: 0.000000
2024-03-26 16:03:23,927 epoch 10 - iter 81/95 - loss 0.02015630 - time (sec): 15.67 - samples/sec: 1859.08 - lr: 0.000001 - momentum: 0.000000
2024-03-26 16:03:26,158 epoch 10 - iter 90/95 - loss 0.02166406 - time (sec): 17.90 - samples/sec: 1837.32 - lr: 0.000001 - momentum: 0.000000
2024-03-26 16:03:27,216 ----------------------------------------------------------------------------------------------------
2024-03-26 16:03:27,216 EPOCH 10 done: loss 0.0211 - lr: 0.000001
2024-03-26 16:03:28,123 DEV : loss 0.21451087296009064 - f1-score (micro avg) 0.9331
2024-03-26 16:03:28,392 ----------------------------------------------------------------------------------------------------
2024-03-26 16:03:28,392 Loading model from best epoch ...
2024-03-26 16:03:29,252 SequenceTagger predicts: Dictionary with 17 tags: O, S-Unternehmen, B-Unternehmen, E-Unternehmen, I-Unternehmen, S-Auslagerung, B-Auslagerung, E-Auslagerung, I-Auslagerung, S-Ort, B-Ort, E-Ort, I-Ort, S-Software, B-Software, E-Software, I-Software
2024-03-26 16:03:30,117 Results:
- F-score (micro) 0.907
- F-score (macro) 0.9177
- Accuracy 0.8345

By class:
              precision    recall  f1-score   support

 Unternehmen     0.8520    0.8872    0.8692       266
 Auslagerung     0.8958    0.9317    0.9134       249
         Ort     0.9565    0.9851    0.9706       134

   micro avg     0.8902    0.9245    0.9070       649
   macro avg     0.9014    0.9347    0.9177       649
weighted avg     0.8904    0.9245    0.9071       649

2024-03-26 16:03:30,117 ----------------------------------------------------------------------------------------------------
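The aggregate rows of the final table follow directly from the per-class rows, which makes for a quick consistency check in plain Python (using only the numbers reported above): micro F1 is the harmonic mean of the micro precision and recall, macro F1 is the unweighted mean of the class F1s, and weighted F1 is the support-weighted mean.

```python
# Per-class (precision, recall, f1, support) from the final table above.
classes = {
    "Unternehmen": (0.8520, 0.8872, 0.8692, 266),
    "Auslagerung": (0.8958, 0.9317, 0.9134, 249),
    "Ort":         (0.9565, 0.9851, 0.9706, 134),
}

# Micro F1: harmonic mean of the reported micro precision/recall.
micro_p, micro_r = 0.8902, 0.9245
micro_f1 = 2 * micro_p * micro_r / (micro_p + micro_r)

# Macro F1: unweighted mean of class F1s.
macro_f1 = sum(f1 for _, _, f1, _ in classes.values()) / len(classes)

# Weighted F1: class F1s weighted by support (266 + 249 + 134 = 649).
total_support = sum(s for *_, s in classes.values())
weighted_f1 = sum(f1 * s for _, _, f1, s in classes.values()) / total_support

print(round(micro_f1, 4), round(macro_f1, 4), round(weighted_f1, 4))
# -> 0.907 0.9177 0.9071, matching the micro/macro/weighted avg rows
```

Note that Software, although present in the tag dictionary, has no row in the table, so it contributes nothing to these averages.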