2023-10-13 13:13:20,057 ----------------------------------------------------------------------------------------------------
2023-10-13 13:13:20,058 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=21, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-13 13:13:20,058 ----------------------------------------------------------------------------------------------------
2023-10-13 13:13:20,058 MultiCorpus: 3575 train + 1235 dev + 1266 test sentences
 - NER_HIPE_2022 Corpus: 3575 train + 1235 dev + 1266 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/hipe2020/de/with_doc_seperator
2023-10-13 13:13:20,058 ----------------------------------------------------------------------------------------------------
2023-10-13 13:13:20,058 Train: 3575 sentences
2023-10-13 13:13:20,058 (train_with_dev=False, train_with_test=False)
2023-10-13 13:13:20,058 ----------------------------------------------------------------------------------------------------
2023-10-13 13:13:20,058 Training Params:
2023-10-13 13:13:20,058 - learning_rate: "3e-05"
2023-10-13 13:13:20,058 - mini_batch_size: "8"
2023-10-13 13:13:20,058 - max_epochs: "10"
2023-10-13 13:13:20,058 - shuffle: "True"
2023-10-13 13:13:20,058 ----------------------------------------------------------------------------------------------------
2023-10-13 13:13:20,058 Plugins:
2023-10-13 13:13:20,058 - LinearScheduler | warmup_fraction: '0.1'
2023-10-13 13:13:20,058 ----------------------------------------------------------------------------------------------------
2023-10-13 13:13:20,058 Final evaluation on model from best epoch (best-model.pt)
2023-10-13 13:13:20,058 - metric: "('micro avg', 'f1-score')"
2023-10-13 13:13:20,058 ----------------------------------------------------------------------------------------------------
2023-10-13 13:13:20,058 Computation:
2023-10-13 13:13:20,058 - compute on device: cuda:0
2023-10-13 13:13:20,058 - embedding storage: none
2023-10-13 13:13:20,058 ----------------------------------------------------------------------------------------------------
2023-10-13 13:13:20,058 Model training base path: "hmbench-hipe2020/de-dbmdz/bert-base-historic-multilingual-cased-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-4"
2023-10-13 13:13:20,058 ----------------------------------------------------------------------------------------------------
2023-10-13 13:13:20,058 ----------------------------------------------------------------------------------------------------
2023-10-13 13:13:22,914 epoch 1 - iter 44/447 - loss 3.11777082 - time (sec): 2.85 - samples/sec: 3068.12 - lr: 0.000003 - momentum: 0.000000
2023-10-13 13:13:25,858 epoch 1 - iter 88/447 - loss 2.36193790 - time (sec): 5.80 - samples/sec: 3064.20 - lr: 0.000006 - momentum: 0.000000
2023-10-13 13:13:28,576 epoch 1 - iter 132/447 - loss 1.79092667 - time (sec): 8.52 - samples/sec: 3020.50 - lr: 0.000009 - momentum: 0.000000
2023-10-13 13:13:31,655 epoch 1 - iter 176/447 - loss 1.44848537 - time (sec): 11.60 - samples/sec: 2976.51 - lr: 0.000012 - momentum: 0.000000
2023-10-13 13:13:34,436 epoch 1 - iter 220/447 - loss 1.23228332 - time (sec): 14.38 - samples/sec: 2970.99 - lr: 0.000015 - momentum: 0.000000
2023-10-13 13:13:37,240 epoch 1 - iter 264/447 - loss 1.08886007 - time (sec): 17.18 - samples/sec: 2964.39 - lr: 0.000018 - momentum: 0.000000
2023-10-13 13:13:39,997 epoch 1 - iter 308/447 - loss 0.98213248 - time (sec): 19.94 - samples/sec: 2970.69 - lr: 0.000021 - momentum: 0.000000
2023-10-13 13:13:42,747 epoch 1 - iter 352/447 - loss 0.89561881 - time (sec): 22.69 - samples/sec: 2980.97 - lr: 0.000024 - momentum: 0.000000
2023-10-13 13:13:45,432 epoch 1 - iter 396/447 - loss 0.82242688 - time (sec): 25.37 - samples/sec: 2981.41 - lr: 0.000027 - momentum: 0.000000
2023-10-13 13:13:48,627 epoch 1 - iter 440/447 - loss 0.75951686 - time (sec): 28.57 - samples/sec: 2983.72 - lr: 0.000029 - momentum: 0.000000
2023-10-13 13:13:49,039 ----------------------------------------------------------------------------------------------------
2023-10-13 13:13:49,040 EPOCH 1 done: loss 0.7518 - lr: 0.000029
2023-10-13 13:13:53,953 DEV : loss 0.18360073864459991 - f1-score (micro avg) 0.6411
2023-10-13 13:13:53,978 saving best model
2023-10-13 13:13:54,320 ----------------------------------------------------------------------------------------------------
2023-10-13 13:13:57,283 epoch 2 - iter 44/447 - loss 0.21325634 - time (sec): 2.96 - samples/sec: 3023.79 - lr: 0.000030 - momentum: 0.000000
2023-10-13 13:14:00,394 epoch 2 - iter 88/447 - loss 0.19763757 - time (sec): 6.07 - samples/sec: 3047.06 - lr: 0.000029 - momentum: 0.000000
2023-10-13 13:14:02,962 epoch 2 - iter 132/447 - loss 0.18755004 - time (sec): 8.64 - samples/sec: 3031.14 - lr: 0.000029 - momentum: 0.000000
2023-10-13 13:14:05,602 epoch 2 - iter 176/447 - loss 0.19041313 - time (sec): 11.28 - samples/sec: 3055.96 - lr: 0.000029 - momentum: 0.000000
2023-10-13 13:14:08,514 epoch 2 - iter 220/447 - loss 0.18445879 - time (sec): 14.19 - samples/sec: 3035.46 - lr: 0.000028 - momentum: 0.000000
2023-10-13 13:14:11,208 epoch 2 - iter 264/447 - loss 0.17645831 - time (sec): 16.89 - samples/sec: 3063.78 - lr: 0.000028 - momentum: 0.000000
2023-10-13 13:14:13,825 epoch 2 - iter 308/447 - loss 0.17200425 - time (sec): 19.50 - samples/sec: 3061.76 - lr: 0.000028 - momentum: 0.000000
2023-10-13 13:14:16,387 epoch 2 - iter 352/447 - loss 0.16971408 - time (sec): 22.07 - samples/sec: 3070.68 - lr: 0.000027 - momentum: 0.000000
2023-10-13 13:14:19,560 epoch 2 - iter 396/447 - loss 0.16473002 - time (sec): 25.24 - samples/sec: 3044.12 - lr: 0.000027 - momentum: 0.000000
2023-10-13 13:14:22,291 epoch 2 - iter 440/447 - loss 0.16259189 - time (sec): 27.97 - samples/sec: 3047.79 - lr: 0.000027 - momentum: 0.000000
2023-10-13 13:14:22,795 ----------------------------------------------------------------------------------------------------
2023-10-13 13:14:22,796 EPOCH 2 done: loss 0.1618 - lr: 0.000027
2023-10-13 13:14:31,384 DEV : loss 0.1280011683702469 - f1-score (micro avg) 0.6793
2023-10-13 13:14:31,411 saving best model
2023-10-13 13:14:31,868 ----------------------------------------------------------------------------------------------------
2023-10-13 13:14:34,764 epoch 3 - iter 44/447 - loss 0.10970391 - time (sec): 2.89 - samples/sec: 2957.74 - lr: 0.000026 - momentum: 0.000000
2023-10-13 13:14:38,002 epoch 3 - iter 88/447 - loss 0.10095129 - time (sec): 6.13 - samples/sec: 2911.29 - lr: 0.000026 - momentum: 0.000000
2023-10-13 13:14:40,901 epoch 3 - iter 132/447 - loss 0.09159053 - time (sec): 9.03 - samples/sec: 2902.88 - lr: 0.000026 - momentum: 0.000000
2023-10-13 13:14:43,698 epoch 3 - iter 176/447 - loss 0.09005402 - time (sec): 11.83 - samples/sec: 2913.87 - lr: 0.000025 - momentum: 0.000000
2023-10-13 13:14:46,381 epoch 3 - iter 220/447 - loss 0.08834012 - time (sec): 14.51 - samples/sec: 2898.73 - lr: 0.000025 - momentum: 0.000000
2023-10-13 13:14:49,199 epoch 3 - iter 264/447 - loss 0.08867495 - time (sec): 17.33 - samples/sec: 2917.60 - lr: 0.000025 - momentum: 0.000000
2023-10-13 13:14:51,891 epoch 3 - iter 308/447 - loss 0.08828525 - time (sec): 20.02 - samples/sec: 2931.90 - lr: 0.000024 - momentum: 0.000000
2023-10-13 13:14:54,732 epoch 3 - iter 352/447 - loss 0.08402043 - time (sec): 22.86 - samples/sec: 2953.93 - lr: 0.000024 - momentum: 0.000000
2023-10-13 13:14:57,347 epoch 3 - iter 396/447 - loss 0.08793302 - time (sec): 25.48 - samples/sec: 2976.67 - lr: 0.000024 - momentum: 0.000000
2023-10-13 13:15:00,431 epoch 3 - iter 440/447 - loss 0.08822390 - time (sec): 28.56 - samples/sec: 2987.02 - lr: 0.000023 - momentum: 0.000000
2023-10-13 13:15:00,838 ----------------------------------------------------------------------------------------------------
2023-10-13 13:15:00,839 EPOCH 3 done: loss 0.0882 - lr: 0.000023
2023-10-13 13:15:09,537 DEV : loss 0.1200539767742157 - f1-score (micro avg) 0.7483
2023-10-13 13:15:09,567 saving best model
2023-10-13 13:15:09,992 ----------------------------------------------------------------------------------------------------
2023-10-13 13:15:12,750 epoch 4 - iter 44/447 - loss 0.04601992 - time (sec): 2.76 - samples/sec: 3061.98 - lr: 0.000023 - momentum: 0.000000
2023-10-13 13:15:15,345 epoch 4 - iter 88/447 - loss 0.05132868 - time (sec): 5.35 - samples/sec: 3066.86 - lr: 0.000023 - momentum: 0.000000
2023-10-13 13:15:18,079 epoch 4 - iter 132/447 - loss 0.04850121 - time (sec): 8.09 - samples/sec: 3088.42 - lr: 0.000022 - momentum: 0.000000
2023-10-13 13:15:20,701 epoch 4 - iter 176/447 - loss 0.04502986 - time (sec): 10.71 - samples/sec: 3112.42 - lr: 0.000022 - momentum: 0.000000
2023-10-13 13:15:24,117 epoch 4 - iter 220/447 - loss 0.04746775 - time (sec): 14.12 - samples/sec: 3070.89 - lr: 0.000022 - momentum: 0.000000
2023-10-13 13:15:26,903 epoch 4 - iter 264/447 - loss 0.04791562 - time (sec): 16.91 - samples/sec: 3071.67 - lr: 0.000021 - momentum: 0.000000
2023-10-13 13:15:29,531 epoch 4 - iter 308/447 - loss 0.04886274 - time (sec): 19.54 - samples/sec: 3060.31 - lr: 0.000021 - momentum: 0.000000
2023-10-13 13:15:32,213 epoch 4 - iter 352/447 - loss 0.04909487 - time (sec): 22.22 - samples/sec: 3052.34 - lr: 0.000021 - momentum: 0.000000
2023-10-13 13:15:35,496 epoch 4 - iter 396/447 - loss 0.04995671 - time (sec): 25.50 - samples/sec: 3031.17 - lr: 0.000020 - momentum: 0.000000
2023-10-13 13:15:38,187 epoch 4 - iter 440/447 - loss 0.05051059 - time (sec): 28.19 - samples/sec: 3023.06 - lr: 0.000020 - momentum: 0.000000
2023-10-13 13:15:38,616 ----------------------------------------------------------------------------------------------------
2023-10-13 13:15:38,617 EPOCH 4 done: loss 0.0506 - lr: 0.000020
2023-10-13 13:15:47,029 DEV : loss 0.14254607260227203 - f1-score (micro avg) 0.7467
2023-10-13 13:15:47,056 ----------------------------------------------------------------------------------------------------
2023-10-13 13:15:49,999 epoch 5 - iter 44/447 - loss 0.03966301 - time (sec): 2.94 - samples/sec: 3049.04 - lr: 0.000020 - momentum: 0.000000
2023-10-13 13:15:52,832 epoch 5 - iter 88/447 - loss 0.03669848 - time (sec): 5.78 - samples/sec: 2990.27 - lr: 0.000019 - momentum: 0.000000
2023-10-13 13:15:55,761 epoch 5 - iter 132/447 - loss 0.03533218 - time (sec): 8.70 - samples/sec: 3000.27 - lr: 0.000019 - momentum: 0.000000
2023-10-13 13:15:58,591 epoch 5 - iter 176/447 - loss 0.03429375 - time (sec): 11.53 - samples/sec: 3006.74 - lr: 0.000019 - momentum: 0.000000
2023-10-13 13:16:01,229 epoch 5 - iter 220/447 - loss 0.03499214 - time (sec): 14.17 - samples/sec: 3009.08 - lr: 0.000018 - momentum: 0.000000
2023-10-13 13:16:04,099 epoch 5 - iter 264/447 - loss 0.03588282 - time (sec): 17.04 - samples/sec: 3007.92 - lr: 0.000018 - momentum: 0.000000
2023-10-13 13:16:07,274 epoch 5 - iter 308/447 - loss 0.03556606 - time (sec): 20.22 - samples/sec: 2996.33 - lr: 0.000018 - momentum: 0.000000
2023-10-13 13:16:09,873 epoch 5 - iter 352/447 - loss 0.03896903 - time (sec): 22.82 - samples/sec: 3006.18 - lr: 0.000017 - momentum: 0.000000
2023-10-13 13:16:12,685 epoch 5 - iter 396/447 - loss 0.03718731 - time (sec): 25.63 - samples/sec: 2995.43 - lr: 0.000017 - momentum: 0.000000
2023-10-13 13:16:15,538 epoch 5 - iter 440/447 - loss 0.03556750 - time (sec): 28.48 - samples/sec: 2997.63 - lr: 0.000017 - momentum: 0.000000
2023-10-13 13:16:15,921 ----------------------------------------------------------------------------------------------------
2023-10-13 13:16:15,922 EPOCH 5 done: loss 0.0352 - lr: 0.000017
2023-10-13 13:16:24,461 DEV : loss 0.16648352146148682 - f1-score (micro avg) 0.7495
2023-10-13 13:16:24,487 saving best model
2023-10-13 13:16:24,910 ----------------------------------------------------------------------------------------------------
2023-10-13 13:16:27,776 epoch 6 - iter 44/447 - loss 0.01297918 - time (sec): 2.86 - samples/sec: 3007.92 - lr: 0.000016 - momentum: 0.000000
2023-10-13 13:16:30,760 epoch 6 - iter 88/447 - loss 0.01521816 - time (sec): 5.85 - samples/sec: 3013.80 - lr: 0.000016 - momentum: 0.000000
2023-10-13 13:16:33,450 epoch 6 - iter 132/447 - loss 0.01783763 - time (sec): 8.53 - samples/sec: 3042.85 - lr: 0.000016 - momentum: 0.000000
2023-10-13 13:16:36,642 epoch 6 - iter 176/447 - loss 0.01724718 - time (sec): 11.73 - samples/sec: 3055.21 - lr: 0.000015 - momentum: 0.000000
2023-10-13 13:16:39,440 epoch 6 - iter 220/447 - loss 0.01881627 - time (sec): 14.53 - samples/sec: 2985.20 - lr: 0.000015 - momentum: 0.000000
2023-10-13 13:16:42,158 epoch 6 - iter 264/447 - loss 0.01817880 - time (sec): 17.24 - samples/sec: 2985.53 - lr: 0.000015 - momentum: 0.000000
2023-10-13 13:16:45,022 epoch 6 - iter 308/447 - loss 0.01947892 - time (sec): 20.11 - samples/sec: 2982.50 - lr: 0.000014 - momentum: 0.000000
2023-10-13 13:16:47,824 epoch 6 - iter 352/447 - loss 0.02012745 - time (sec): 22.91 - samples/sec: 2969.62 - lr: 0.000014 - momentum: 0.000000
2023-10-13 13:16:50,622 epoch 6 - iter 396/447 - loss 0.02016141 - time (sec): 25.71 - samples/sec: 2992.05 - lr: 0.000014 - momentum: 0.000000
2023-10-13 13:16:53,327 epoch 6 - iter 440/447 - loss 0.02126196 - time (sec): 28.41 - samples/sec: 3002.50 - lr: 0.000013 - momentum: 0.000000
2023-10-13 13:16:53,729 ----------------------------------------------------------------------------------------------------
2023-10-13 13:16:53,730 EPOCH 6 done: loss 0.0212 - lr: 0.000013
2023-10-13 13:17:02,414 DEV : loss 0.173013374209404 - f1-score (micro avg) 0.7741
2023-10-13 13:17:02,440 saving best model
2023-10-13 13:17:02,868 ----------------------------------------------------------------------------------------------------
2023-10-13 13:17:06,304 epoch 7 - iter 44/447 - loss 0.02481076 - time (sec): 3.43 - samples/sec: 2891.86 - lr: 0.000013 - momentum: 0.000000
2023-10-13 13:17:09,074 epoch 7 - iter 88/447 - loss 0.01608324 - time (sec): 6.20 - samples/sec: 2875.39 - lr: 0.000013 - momentum: 0.000000
2023-10-13 13:17:12,040 epoch 7 - iter 132/447 - loss 0.01410154 - time (sec): 9.17 - samples/sec: 2899.58 - lr: 0.000012 - momentum: 0.000000
2023-10-13 13:17:14,929 epoch 7 - iter 176/447 - loss 0.01489213 - time (sec): 12.06 - samples/sec: 2938.16 - lr: 0.000012 - momentum: 0.000000
2023-10-13 13:17:17,754 epoch 7 - iter 220/447 - loss 0.01604270 - time (sec): 14.88 - samples/sec: 2952.10 - lr: 0.000012 - momentum: 0.000000
2023-10-13 13:17:20,446 epoch 7 - iter 264/447 - loss 0.01618986 - time (sec): 17.58 - samples/sec: 2938.06 - lr: 0.000011 - momentum: 0.000000
2023-10-13 13:17:23,179 epoch 7 - iter 308/447 - loss 0.01644546 - time (sec): 20.31 - samples/sec: 2961.42 - lr: 0.000011 - momentum: 0.000000
2023-10-13 13:17:25,924 epoch 7 - iter 352/447 - loss 0.01537412 - time (sec): 23.05 - samples/sec: 2965.75 - lr: 0.000011 - momentum: 0.000000
2023-10-13 13:17:28,546 epoch 7 - iter 396/447 - loss 0.01579262 - time (sec): 25.68 - samples/sec: 2974.59 - lr: 0.000010 - momentum: 0.000000
2023-10-13 13:17:31,395 epoch 7 - iter 440/447 - loss 0.01544474 - time (sec): 28.53 - samples/sec: 2995.90 - lr: 0.000010 - momentum: 0.000000
2023-10-13 13:17:31,781 ----------------------------------------------------------------------------------------------------
2023-10-13 13:17:31,782 EPOCH 7 done: loss 0.0155 - lr: 0.000010
2023-10-13 13:17:39,921 DEV : loss 0.1985795646905899 - f1-score (micro avg) 0.783
2023-10-13 13:17:39,950 saving best model
2023-10-13 13:17:40,398 ----------------------------------------------------------------------------------------------------
2023-10-13 13:17:43,232 epoch 8 - iter 44/447 - loss 0.00957408 - time (sec): 2.83 - samples/sec: 3033.73 - lr: 0.000010 - momentum: 0.000000
2023-10-13 13:17:46,091 epoch 8 - iter 88/447 - loss 0.00922003 - time (sec): 5.69 - samples/sec: 3010.78 - lr: 0.000009 - momentum: 0.000000
2023-10-13 13:17:48,808 epoch 8 - iter 132/447 - loss 0.01038901 - time (sec): 8.41 - samples/sec: 3015.07 - lr: 0.000009 - momentum: 0.000000
2023-10-13 13:17:51,553 epoch 8 - iter 176/447 - loss 0.01041656 - time (sec): 11.15 - samples/sec: 3004.15 - lr: 0.000009 - momentum: 0.000000
2023-10-13 13:17:54,368 epoch 8 - iter 220/447 - loss 0.00978835 - time (sec): 13.97 - samples/sec: 2991.67 - lr: 0.000008 - momentum: 0.000000
2023-10-13 13:17:57,374 epoch 8 - iter 264/447 - loss 0.00983571 - time (sec): 16.97 - samples/sec: 2957.65 - lr: 0.000008 - momentum: 0.000000
2023-10-13 13:18:00,184 epoch 8 - iter 308/447 - loss 0.00935734 - time (sec): 19.78 - samples/sec: 2954.35 - lr: 0.000008 - momentum: 0.000000
2023-10-13 13:18:03,374 epoch 8 - iter 352/447 - loss 0.01020653 - time (sec): 22.97 - samples/sec: 2943.90 - lr: 0.000007 - momentum: 0.000000
2023-10-13 13:18:06,439 epoch 8 - iter 396/447 - loss 0.01083975 - time (sec): 26.04 - samples/sec: 2945.85 - lr: 0.000007 - momentum: 0.000000
2023-10-13 13:18:09,151 epoch 8 - iter 440/447 - loss 0.01083258 - time (sec): 28.75 - samples/sec: 2958.89 - lr: 0.000007 - momentum: 0.000000
2023-10-13 13:18:09,642 ----------------------------------------------------------------------------------------------------
2023-10-13 13:18:09,643 EPOCH 8 done: loss 0.0108 - lr: 0.000007
2023-10-13 13:18:17,739 DEV : loss 0.2084827721118927 - f1-score (micro avg) 0.7768
2023-10-13 13:18:17,767 ----------------------------------------------------------------------------------------------------
2023-10-13 13:18:20,478 epoch 9 - iter 44/447 - loss 0.00612637 - time (sec): 2.71 - samples/sec: 3061.27 - lr: 0.000006 - momentum: 0.000000
2023-10-13 13:18:23,449 epoch 9 - iter 88/447 - loss 0.00545493 - time (sec): 5.68 - samples/sec: 2988.83 - lr: 0.000006 - momentum: 0.000000
2023-10-13 13:18:26,013 epoch 9 - iter 132/447 - loss 0.00810978 - time (sec): 8.24 - samples/sec: 3022.80 - lr: 0.000006 - momentum: 0.000000
2023-10-13 13:18:28,792 epoch 9 - iter 176/447 - loss 0.00943692 - time (sec): 11.02 - samples/sec: 3017.39 - lr: 0.000005 - momentum: 0.000000
2023-10-13 13:18:31,737 epoch 9 - iter 220/447 - loss 0.00827507 - time (sec): 13.97 - samples/sec: 3011.35 - lr: 0.000005 - momentum: 0.000000
2023-10-13 13:18:34,594 epoch 9 - iter 264/447 - loss 0.00745688 - time (sec): 16.83 - samples/sec: 2998.27 - lr: 0.000005 - momentum: 0.000000
2023-10-13 13:18:37,258 epoch 9 - iter 308/447 - loss 0.00768552 - time (sec): 19.49 - samples/sec: 3026.07 - lr: 0.000004 - momentum: 0.000000
2023-10-13 13:18:41,084 epoch 9 - iter 352/447 - loss 0.00760993 - time (sec): 23.32 - samples/sec: 2963.42 - lr: 0.000004 - momentum: 0.000000
2023-10-13 13:18:43,947 epoch 9 - iter 396/447 - loss 0.00728736 - time (sec): 26.18 - samples/sec: 2954.39 - lr: 0.000004 - momentum: 0.000000
2023-10-13 13:18:46,795 epoch 9 - iter 440/447 - loss 0.00694997 - time (sec): 29.03 - samples/sec: 2944.30 - lr: 0.000003 - momentum: 0.000000
2023-10-13 13:18:47,199 ----------------------------------------------------------------------------------------------------
2023-10-13 13:18:47,199 EPOCH 9 done: loss 0.0070 - lr: 0.000003
2023-10-13 13:18:55,583 DEV : loss 0.2126864343881607 - f1-score (micro avg) 0.7776
2023-10-13 13:18:55,611 ----------------------------------------------------------------------------------------------------
2023-10-13 13:18:58,894 epoch 10 - iter 44/447 - loss 0.00635944 - time (sec): 3.28 - samples/sec: 3012.37 - lr: 0.000003 - momentum: 0.000000
2023-10-13 13:19:02,008 epoch 10 - iter 88/447 - loss 0.00527678 - time (sec): 6.40 - samples/sec: 2899.98 - lr: 0.000003 - momentum: 0.000000
2023-10-13 13:19:04,861 epoch 10 - iter 132/447 - loss 0.00649806 - time (sec): 9.25 - samples/sec: 2899.02 - lr: 0.000002 - momentum: 0.000000
2023-10-13 13:19:07,541 epoch 10 - iter 176/447 - loss 0.00621582 - time (sec): 11.93 - samples/sec: 2918.26 - lr: 0.000002 - momentum: 0.000000
2023-10-13 13:19:10,370 epoch 10 - iter 220/447 - loss 0.00571437 - time (sec): 14.76 - samples/sec: 2934.41 - lr: 0.000002 - momentum: 0.000000
2023-10-13 13:19:13,065 epoch 10 - iter 264/447 - loss 0.00591055 - time (sec): 17.45 - samples/sec: 2939.79 - lr: 0.000001 - momentum: 0.000000
2023-10-13 13:19:15,840 epoch 10 - iter 308/447 - loss 0.00566945 - time (sec): 20.23 - samples/sec: 2941.50 - lr: 0.000001 - momentum: 0.000000
2023-10-13 13:19:18,771 epoch 10 - iter 352/447 - loss 0.00536572 - time (sec): 23.16 - samples/sec: 2940.96 - lr: 0.000001 - momentum: 0.000000
2023-10-13 13:19:21,445 epoch 10 - iter 396/447 - loss 0.00529827 - time (sec): 25.83 - samples/sec: 2959.09 - lr: 0.000000 - momentum: 0.000000
2023-10-13 13:19:24,347 epoch 10 - iter 440/447 - loss 0.00512907 - time (sec): 28.73 - samples/sec: 2976.79 - lr: 0.000000 - momentum: 0.000000
2023-10-13 13:19:24,753 ----------------------------------------------------------------------------------------------------
2023-10-13 13:19:24,753 EPOCH 10 done: loss 0.0051 - lr: 0.000000
2023-10-13 13:19:33,426 DEV : loss 0.2154415100812912 - f1-score (micro avg) 0.7754
2023-10-13 13:19:33,796 ----------------------------------------------------------------------------------------------------
2023-10-13 13:19:33,798 Loading model from best epoch ...
2023-10-13 13:19:35,452 SequenceTagger predicts: Dictionary with 21 tags: O, S-loc, B-loc, E-loc, I-loc, S-pers, B-pers, E-pers, I-pers, S-org, B-org, E-org, I-org, S-prod, B-prod, E-prod, I-prod, S-time, B-time, E-time, I-time
2023-10-13 13:19:40,515 
Results:
- F-score (micro) 0.7437
- F-score (macro) 0.6536
- Accuracy 0.6094

By class:
              precision    recall  f1-score   support

         loc     0.8596    0.8322    0.8457       596
        pers     0.6605    0.7538    0.7041       333
         org     0.5310    0.4545    0.4898       132
        prod     0.5957    0.4242    0.4956        66
        time     0.7115    0.7551    0.7327        49

   micro avg     0.7459    0.7415    0.7437      1176
   macro avg     0.6717    0.6440    0.6536      1176
weighted avg     0.7454    0.7415    0.7413      1176

2023-10-13 13:19:40,515 ----------------------------------------------------------------------------------------------------