2023-10-16 18:33:55,768 ----------------------------------------------------------------------------------------------------
2023-10-16 18:33:55,769 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-16 18:33:55,769 ----------------------------------------------------------------------------------------------------
2023-10-16 18:33:55,770 MultiCorpus: 1166 train + 165 dev + 415 test sentences
 - NER_HIPE_2022 Corpus: 1166 train + 165 dev + 415 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fi/with_doc_seperator
2023-10-16 18:33:55,770 ----------------------------------------------------------------------------------------------------
2023-10-16 18:33:55,770 Train: 1166 sentences
2023-10-16 18:33:55,770 (train_with_dev=False, train_with_test=False)
2023-10-16 18:33:55,770 ----------------------------------------------------------------------------------------------------
2023-10-16 18:33:55,770 Training Params:
2023-10-16 18:33:55,770 - learning_rate: "3e-05"
2023-10-16 18:33:55,770 - mini_batch_size: "4"
2023-10-16 18:33:55,770 - max_epochs: "10"
2023-10-16 18:33:55,770 - shuffle: "True"
2023-10-16 18:33:55,770 ----------------------------------------------------------------------------------------------------
2023-10-16 18:33:55,770 Plugins:
2023-10-16 18:33:55,770 - LinearScheduler | warmup_fraction: '0.1'
2023-10-16 18:33:55,770 ----------------------------------------------------------------------------------------------------
2023-10-16 18:33:55,770 Final evaluation on model from best epoch (best-model.pt)
2023-10-16 18:33:55,770 - metric: "('micro avg', 'f1-score')"
2023-10-16 18:33:55,770 ----------------------------------------------------------------------------------------------------
2023-10-16 18:33:55,770 Computation:
2023-10-16 18:33:55,770 - compute on device: cuda:0
2023-10-16 18:33:55,770 - embedding storage: none
2023-10-16 18:33:55,770 ----------------------------------------------------------------------------------------------------
2023-10-16 18:33:55,770 Model training base path: "hmbench-newseye/fi-dbmdz/bert-base-historic-multilingual-cased-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-4"
2023-10-16 18:33:55,770 ----------------------------------------------------------------------------------------------------
2023-10-16 18:33:55,770 ----------------------------------------------------------------------------------------------------
2023-10-16 18:33:57,396 epoch 1 - iter 29/292 - loss 2.97811890 - time (sec): 1.62 - samples/sec: 2779.14 - lr: 0.000003 - momentum: 0.000000
2023-10-16 18:33:58,872 epoch 1 - iter 58/292 - loss 2.54035907 - time (sec): 3.10 - samples/sec: 2688.94 - lr: 0.000006 - momentum: 0.000000
2023-10-16 18:34:00,580 epoch 1 - iter 87/292 - loss 1.89046488 - time (sec): 4.81 - samples/sec: 2703.25 - lr: 0.000009 - momentum: 0.000000
2023-10-16 18:34:02,394 epoch 1 - iter 116/292 - loss 1.55026944 - time (sec): 6.62 - samples/sec: 2706.38 - lr: 0.000012 - momentum: 0.000000
2023-10-16 18:34:03,943 epoch 1 - iter 145/292 - loss 1.38005833 - time (sec): 8.17 - samples/sec: 2669.90 - lr: 0.000015 - momentum: 0.000000
2023-10-16 18:34:05,716 epoch 1 - iter 174/292 - loss 1.22687622 - time (sec): 9.94 - samples/sec: 2705.66 - lr: 0.000018 - momentum: 0.000000
2023-10-16 18:34:07,274 epoch 1 - iter 203/292 - loss 1.10559269 - time (sec): 11.50 - samples/sec: 2693.95 - lr: 0.000021 - momentum: 0.000000
2023-10-16 18:34:08,924 epoch 1 - iter 232/292 - loss 0.99171208 - time (sec): 13.15 - samples/sec: 2726.30 - lr: 0.000024 - momentum: 0.000000
2023-10-16 18:34:10,507 epoch 1 - iter 261/292 - loss 0.91903781 - time (sec): 14.74 - samples/sec: 2735.41 - lr: 0.000027 - momentum: 0.000000
2023-10-16 18:34:11,988 epoch 1 - iter 290/292 - loss 0.86127798 - time (sec): 16.22 - samples/sec: 2731.09 - lr: 0.000030 - momentum: 0.000000
2023-10-16 18:34:12,084 ----------------------------------------------------------------------------------------------------
2023-10-16 18:34:12,084 EPOCH 1 done: loss 0.8594 - lr: 0.000030
2023-10-16 18:34:13,262 DEV : loss 0.2028430998325348 - f1-score (micro avg) 0.4537
2023-10-16 18:34:13,267 saving best model
2023-10-16 18:34:13,617 ----------------------------------------------------------------------------------------------------
2023-10-16 18:34:15,292 epoch 2 - iter 29/292 - loss 0.26936582 - time (sec): 1.67 - samples/sec: 2624.97 - lr: 0.000030 - momentum: 0.000000
2023-10-16 18:34:17,194 epoch 2 - iter 58/292 - loss 0.27722925 - time (sec): 3.57 - samples/sec: 2779.08 - lr: 0.000029 - momentum: 0.000000
2023-10-16 18:34:18,959 epoch 2 - iter 87/292 - loss 0.26691371 - time (sec): 5.34 - samples/sec: 2671.42 - lr: 0.000029 - momentum: 0.000000
2023-10-16 18:34:20,698 epoch 2 - iter 116/292 - loss 0.24954932 - time (sec): 7.08 - samples/sec: 2675.17 - lr: 0.000029 - momentum: 0.000000
2023-10-16 18:34:22,186 epoch 2 - iter 145/292 - loss 0.23560297 - time (sec): 8.57 - samples/sec: 2672.15 - lr: 0.000028 - momentum: 0.000000
2023-10-16 18:34:23,808 epoch 2 - iter 174/292 - loss 0.22950433 - time (sec): 10.19 - samples/sec: 2698.00 - lr: 0.000028 - momentum: 0.000000
2023-10-16 18:34:25,265 epoch 2 - iter 203/292 - loss 0.22290494 - time (sec): 11.65 - samples/sec: 2692.17 - lr: 0.000028 - momentum: 0.000000
2023-10-16 18:34:26,822 epoch 2 - iter 232/292 - loss 0.21399971 - time (sec): 13.20 - samples/sec: 2687.84 - lr: 0.000027 - momentum: 0.000000
2023-10-16 18:34:28,522 epoch 2 - iter 261/292 - loss 0.20546782 - time (sec): 14.90 - samples/sec: 2677.06 - lr: 0.000027 - momentum: 0.000000
2023-10-16 18:34:30,088 epoch 2 - iter 290/292 - loss 0.19986350 - time (sec): 16.47 - samples/sec: 2672.05 - lr: 0.000027 - momentum: 0.000000
2023-10-16 18:34:30,201 ----------------------------------------------------------------------------------------------------
2023-10-16 18:34:30,201 EPOCH 2 done: loss 0.1981 - lr: 0.000027
2023-10-16 18:34:31,473 DEV : loss 0.15322378277778625 - f1-score (micro avg) 0.6482
2023-10-16 18:34:31,480 saving best model
2023-10-16 18:34:31,964 ----------------------------------------------------------------------------------------------------
2023-10-16 18:34:33,761 epoch 3 - iter 29/292 - loss 0.09096690 - time (sec): 1.79 - samples/sec: 2759.83 - lr: 0.000026 - momentum: 0.000000
2023-10-16 18:34:35,666 epoch 3 - iter 58/292 - loss 0.10310537 - time (sec): 3.70 - samples/sec: 2685.77 - lr: 0.000026 - momentum: 0.000000
2023-10-16 18:34:37,169 epoch 3 - iter 87/292 - loss 0.10661866 - time (sec): 5.20 - samples/sec: 2637.12 - lr: 0.000026 - momentum: 0.000000
2023-10-16 18:34:38,616 epoch 3 - iter 116/292 - loss 0.10383362 - time (sec): 6.65 - samples/sec: 2580.48 - lr: 0.000025 - momentum: 0.000000
2023-10-16 18:34:40,190 epoch 3 - iter 145/292 - loss 0.10510943 - time (sec): 8.22 - samples/sec: 2618.23 - lr: 0.000025 - momentum: 0.000000
2023-10-16 18:34:41,861 epoch 3 - iter 174/292 - loss 0.10016538 - time (sec): 9.89 - samples/sec: 2678.26 - lr: 0.000025 - momentum: 0.000000
2023-10-16 18:34:43,725 epoch 3 - iter 203/292 - loss 0.10786459 - time (sec): 11.76 - samples/sec: 2736.37 - lr: 0.000024 - momentum: 0.000000
2023-10-16 18:34:45,256 epoch 3 - iter 232/292 - loss 0.10663759 - time (sec): 13.29 - samples/sec: 2714.67 - lr: 0.000024 - momentum: 0.000000
2023-10-16 18:34:46,820 epoch 3 - iter 261/292 - loss 0.10807253 - time (sec): 14.85 - samples/sec: 2700.38 - lr: 0.000024 - momentum: 0.000000
2023-10-16 18:34:48,664 epoch 3 - iter 290/292 - loss 0.11099861 - time (sec): 16.70 - samples/sec: 2639.15 - lr: 0.000023 - momentum: 0.000000
2023-10-16 18:34:48,770 ----------------------------------------------------------------------------------------------------
2023-10-16 18:34:48,770 EPOCH 3 done: loss 0.1104 - lr: 0.000023
2023-10-16 18:34:50,024 DEV : loss 0.12740793824195862 - f1-score (micro avg) 0.6806
2023-10-16 18:34:50,031 saving best model
2023-10-16 18:34:50,504 ----------------------------------------------------------------------------------------------------
2023-10-16 18:34:52,267 epoch 4 - iter 29/292 - loss 0.07240283 - time (sec): 1.76 - samples/sec: 2936.93 - lr: 0.000023 - momentum: 0.000000
2023-10-16 18:34:53,979 epoch 4 - iter 58/292 - loss 0.08368012 - time (sec): 3.47 - samples/sec: 2761.94 - lr: 0.000023 - momentum: 0.000000
2023-10-16 18:34:55,684 epoch 4 - iter 87/292 - loss 0.07070759 - time (sec): 5.18 - samples/sec: 2782.71 - lr: 0.000022 - momentum: 0.000000
2023-10-16 18:34:57,090 epoch 4 - iter 116/292 - loss 0.07117669 - time (sec): 6.58 - samples/sec: 2705.55 - lr: 0.000022 - momentum: 0.000000
2023-10-16 18:34:58,812 epoch 4 - iter 145/292 - loss 0.07127745 - time (sec): 8.30 - samples/sec: 2722.52 - lr: 0.000022 - momentum: 0.000000
2023-10-16 18:35:00,544 epoch 4 - iter 174/292 - loss 0.07173224 - time (sec): 10.04 - samples/sec: 2682.01 - lr: 0.000021 - momentum: 0.000000
2023-10-16 18:35:02,092 epoch 4 - iter 203/292 - loss 0.07056668 - time (sec): 11.58 - samples/sec: 2652.69 - lr: 0.000021 - momentum: 0.000000
2023-10-16 18:35:03,990 epoch 4 - iter 232/292 - loss 0.06753009 - time (sec): 13.48 - samples/sec: 2688.22 - lr: 0.000021 - momentum: 0.000000
2023-10-16 18:35:05,532 epoch 4 - iter 261/292 - loss 0.07357699 - time (sec): 15.02 - samples/sec: 2677.96 - lr: 0.000020 - momentum: 0.000000
2023-10-16 18:35:07,100 epoch 4 - iter 290/292 - loss 0.07369514 - time (sec): 16.59 - samples/sec: 2664.83 - lr: 0.000020 - momentum: 0.000000
2023-10-16 18:35:07,196 ----------------------------------------------------------------------------------------------------
2023-10-16 18:35:07,196 EPOCH 4 done: loss 0.0738 - lr: 0.000020
2023-10-16 18:35:08,426 DEV : loss 0.11727064847946167 - f1-score (micro avg) 0.7393
2023-10-16 18:35:08,430 saving best model
2023-10-16 18:35:08,915 ----------------------------------------------------------------------------------------------------
2023-10-16 18:35:10,622 epoch 5 - iter 29/292 - loss 0.04380722 - time (sec): 1.70 - samples/sec: 2602.82 - lr: 0.000020 - momentum: 0.000000
2023-10-16 18:35:12,303 epoch 5 - iter 58/292 - loss 0.04146231 - time (sec): 3.38 - samples/sec: 2577.03 - lr: 0.000019 - momentum: 0.000000
2023-10-16 18:35:14,043 epoch 5 - iter 87/292 - loss 0.03652196 - time (sec): 5.12 - samples/sec: 2582.15 - lr: 0.000019 - momentum: 0.000000
2023-10-16 18:35:15,797 epoch 5 - iter 116/292 - loss 0.04326091 - time (sec): 6.88 - samples/sec: 2693.95 - lr: 0.000019 - momentum: 0.000000
2023-10-16 18:35:17,516 epoch 5 - iter 145/292 - loss 0.04237026 - time (sec): 8.60 - samples/sec: 2733.27 - lr: 0.000018 - momentum: 0.000000
2023-10-16 18:35:19,160 epoch 5 - iter 174/292 - loss 0.04347168 - time (sec): 10.24 - samples/sec: 2718.21 - lr: 0.000018 - momentum: 0.000000
2023-10-16 18:35:20,829 epoch 5 - iter 203/292 - loss 0.04408652 - time (sec): 11.91 - samples/sec: 2745.81 - lr: 0.000018 - momentum: 0.000000
2023-10-16 18:35:22,415 epoch 5 - iter 232/292 - loss 0.04665523 - time (sec): 13.50 - samples/sec: 2716.37 - lr: 0.000017 - momentum: 0.000000
2023-10-16 18:35:23,912 epoch 5 - iter 261/292 - loss 0.05132093 - time (sec): 14.99 - samples/sec: 2698.13 - lr: 0.000017 - momentum: 0.000000
2023-10-16 18:35:25,420 epoch 5 - iter 290/292 - loss 0.05212658 - time (sec): 16.50 - samples/sec: 2676.48 - lr: 0.000017 - momentum: 0.000000
2023-10-16 18:35:25,519 ----------------------------------------------------------------------------------------------------
2023-10-16 18:35:25,519 EPOCH 5 done: loss 0.0521 - lr: 0.000017
2023-10-16 18:35:26,837 DEV : loss 0.12913569808006287 - f1-score (micro avg) 0.7346
2023-10-16 18:35:26,842 ----------------------------------------------------------------------------------------------------
2023-10-16 18:35:28,457 epoch 6 - iter 29/292 - loss 0.04619303 - time (sec): 1.61 - samples/sec: 2734.91 - lr: 0.000016 - momentum: 0.000000
2023-10-16 18:35:30,170 epoch 6 - iter 58/292 - loss 0.04616009 - time (sec): 3.33 - samples/sec: 2791.21 - lr: 0.000016 - momentum: 0.000000
2023-10-16 18:35:31,746 epoch 6 - iter 87/292 - loss 0.03824377 - time (sec): 4.90 - samples/sec: 2769.71 - lr: 0.000016 - momentum: 0.000000
2023-10-16 18:35:33,366 epoch 6 - iter 116/292 - loss 0.04127051 - time (sec): 6.52 - samples/sec: 2722.40 - lr: 0.000015 - momentum: 0.000000
2023-10-16 18:35:34,997 epoch 6 - iter 145/292 - loss 0.03927787 - time (sec): 8.15 - samples/sec: 2714.87 - lr: 0.000015 - momentum: 0.000000
2023-10-16 18:35:36,590 epoch 6 - iter 174/292 - loss 0.03842080 - time (sec): 9.75 - samples/sec: 2673.25 - lr: 0.000015 - momentum: 0.000000
2023-10-16 18:35:38,235 epoch 6 - iter 203/292 - loss 0.03737113 - time (sec): 11.39 - samples/sec: 2695.94 - lr: 0.000014 - momentum: 0.000000
2023-10-16 18:35:39,896 epoch 6 - iter 232/292 - loss 0.03596931 - time (sec): 13.05 - samples/sec: 2718.61 - lr: 0.000014 - momentum: 0.000000
2023-10-16 18:35:41,551 epoch 6 - iter 261/292 - loss 0.03747061 - time (sec): 14.71 - samples/sec: 2714.82 - lr: 0.000014 - momentum: 0.000000
2023-10-16 18:35:43,293 epoch 6 - iter 290/292 - loss 0.03976872 - time (sec): 16.45 - samples/sec: 2690.23 - lr: 0.000013 - momentum: 0.000000
2023-10-16 18:35:43,391 ----------------------------------------------------------------------------------------------------
2023-10-16 18:35:43,391 EPOCH 6 done: loss 0.0396 - lr: 0.000013
2023-10-16 18:35:44,694 DEV : loss 0.1291522979736328 - f1-score (micro avg) 0.7653
2023-10-16 18:35:44,699 saving best model
2023-10-16 18:35:45,224 ----------------------------------------------------------------------------------------------------
2023-10-16 18:35:46,851 epoch 7 - iter 29/292 - loss 0.02619680 - time (sec): 1.63 - samples/sec: 2695.84 - lr: 0.000013 - momentum: 0.000000
2023-10-16 18:35:48,443 epoch 7 - iter 58/292 - loss 0.01856380 - time (sec): 3.22 - samples/sec: 2652.47 - lr: 0.000013 - momentum: 0.000000
2023-10-16 18:35:50,103 epoch 7 - iter 87/292 - loss 0.02119680 - time (sec): 4.88 - samples/sec: 2700.85 - lr: 0.000012 - momentum: 0.000000
2023-10-16 18:35:51,851 epoch 7 - iter 116/292 - loss 0.02361280 - time (sec): 6.63 - samples/sec: 2714.12 - lr: 0.000012 - momentum: 0.000000
2023-10-16 18:35:53,415 epoch 7 - iter 145/292 - loss 0.02219740 - time (sec): 8.19 - samples/sec: 2677.13 - lr: 0.000012 - momentum: 0.000000
2023-10-16 18:35:55,367 epoch 7 - iter 174/292 - loss 0.02704237 - time (sec): 10.14 - samples/sec: 2695.94 - lr: 0.000011 - momentum: 0.000000
2023-10-16 18:35:56,885 epoch 7 - iter 203/292 - loss 0.02612913 - time (sec): 11.66 - samples/sec: 2689.74 - lr: 0.000011 - momentum: 0.000000
2023-10-16 18:35:58,653 epoch 7 - iter 232/292 - loss 0.03100196 - time (sec): 13.43 - samples/sec: 2661.40 - lr: 0.000011 - momentum: 0.000000
2023-10-16 18:36:00,339 epoch 7 - iter 261/292 - loss 0.03288401 - time (sec): 15.11 - samples/sec: 2659.04 - lr: 0.000010 - momentum: 0.000000
2023-10-16 18:36:01,908 epoch 7 - iter 290/292 - loss 0.03146715 - time (sec): 16.68 - samples/sec: 2649.58 - lr: 0.000010 - momentum: 0.000000
2023-10-16 18:36:02,007 ----------------------------------------------------------------------------------------------------
2023-10-16 18:36:02,007 EPOCH 7 done: loss 0.0313 - lr: 0.000010
2023-10-16 18:36:03,679 DEV : loss 0.17999307811260223 - f1-score (micro avg) 0.7227
2023-10-16 18:36:03,684 ----------------------------------------------------------------------------------------------------
2023-10-16 18:36:05,287 epoch 8 - iter 29/292 - loss 0.01237339 - time (sec): 1.60 - samples/sec: 2758.23 - lr: 0.000010 - momentum: 0.000000
2023-10-16 18:36:07,003 epoch 8 - iter 58/292 - loss 0.01348390 - time (sec): 3.32 - samples/sec: 2791.09 - lr: 0.000009 - momentum: 0.000000
2023-10-16 18:36:08,872 epoch 8 - iter 87/292 - loss 0.02301764 - time (sec): 5.19 - samples/sec: 2752.84 - lr: 0.000009 - momentum: 0.000000
2023-10-16 18:36:10,361 epoch 8 - iter 116/292 - loss 0.02333317 - time (sec): 6.68 - samples/sec: 2652.61 - lr: 0.000009 - momentum: 0.000000
2023-10-16 18:36:12,248 epoch 8 - iter 145/292 - loss 0.02259095 - time (sec): 8.56 - samples/sec: 2710.38 - lr: 0.000008 - momentum: 0.000000
2023-10-16 18:36:13,794 epoch 8 - iter 174/292 - loss 0.02842056 - time (sec): 10.11 - samples/sec: 2707.82 - lr: 0.000008 - momentum: 0.000000
2023-10-16 18:36:15,438 epoch 8 - iter 203/292 - loss 0.02616189 - time (sec): 11.75 - samples/sec: 2681.02 - lr: 0.000008 - momentum: 0.000000
2023-10-16 18:36:17,108 epoch 8 - iter 232/292 - loss 0.02767555 - time (sec): 13.42 - samples/sec: 2691.88 - lr: 0.000007 - momentum: 0.000000
2023-10-16 18:36:18,772 epoch 8 - iter 261/292 - loss 0.02718671 - time (sec): 15.09 - samples/sec: 2694.67 - lr: 0.000007 - momentum: 0.000000
2023-10-16 18:36:20,190 epoch 8 - iter 290/292 - loss 0.02656034 - time (sec): 16.50 - samples/sec: 2681.20 - lr: 0.000007 - momentum: 0.000000
2023-10-16 18:36:20,275 ----------------------------------------------------------------------------------------------------
2023-10-16 18:36:20,275 EPOCH 8 done: loss 0.0265 - lr: 0.000007
2023-10-16 18:36:21,583 DEV : loss 0.16463765501976013 - f1-score (micro avg) 0.7474
2023-10-16 18:36:21,590 ----------------------------------------------------------------------------------------------------
2023-10-16 18:36:23,262 epoch 9 - iter 29/292 - loss 0.05507861 - time (sec): 1.67 - samples/sec: 2855.76 - lr: 0.000006 - momentum: 0.000000
2023-10-16 18:36:25,078 epoch 9 - iter 58/292 - loss 0.04162172 - time (sec): 3.49 - samples/sec: 2763.36 - lr: 0.000006 - momentum: 0.000000
2023-10-16 18:36:26,669 epoch 9 - iter 87/292 - loss 0.03075030 - time (sec): 5.08 - samples/sec: 2655.80 - lr: 0.000006 - momentum: 0.000000
2023-10-16 18:36:28,264 epoch 9 - iter 116/292 - loss 0.02666806 - time (sec): 6.67 - samples/sec: 2666.52 - lr: 0.000005 - momentum: 0.000000
2023-10-16 18:36:29,796 epoch 9 - iter 145/292 - loss 0.02435338 - time (sec): 8.20 - samples/sec: 2638.11 - lr: 0.000005 - momentum: 0.000000
2023-10-16 18:36:31,386 epoch 9 - iter 174/292 - loss 0.02409412 - time (sec): 9.79 - samples/sec: 2636.49 - lr: 0.000005 - momentum: 0.000000
2023-10-16 18:36:33,037 epoch 9 - iter 203/292 - loss 0.02235971 - time (sec): 11.45 - samples/sec: 2679.09 - lr: 0.000004 - momentum: 0.000000
2023-10-16 18:36:34,503 epoch 9 - iter 232/292 - loss 0.02212491 - time (sec): 12.91 - samples/sec: 2645.26 - lr: 0.000004 - momentum: 0.000000
2023-10-16 18:36:36,254 epoch 9 - iter 261/292 - loss 0.02037368 - time (sec): 14.66 - samples/sec: 2640.49 - lr: 0.000004 - momentum: 0.000000
2023-10-16 18:36:38,158 epoch 9 - iter 290/292 - loss 0.02139770 - time (sec): 16.57 - samples/sec: 2656.24 - lr: 0.000003 - momentum: 0.000000
2023-10-16 18:36:38,299 ----------------------------------------------------------------------------------------------------
2023-10-16 18:36:38,300 EPOCH 9 done: loss 0.0212 - lr: 0.000003
2023-10-16 18:36:39,571 DEV : loss 0.16017624735832214 - f1-score (micro avg) 0.7468
2023-10-16 18:36:39,576 ----------------------------------------------------------------------------------------------------
2023-10-16 18:36:41,110 epoch 10 - iter 29/292 - loss 0.02369239 - time (sec): 1.53 - samples/sec: 2527.67 - lr: 0.000003 - momentum: 0.000000
2023-10-16 18:36:42,656 epoch 10 - iter 58/292 - loss 0.01948950 - time (sec): 3.08 - samples/sec: 2656.11 - lr: 0.000003 - momentum: 0.000000
2023-10-16 18:36:44,376 epoch 10 - iter 87/292 - loss 0.01562727 - time (sec): 4.80 - samples/sec: 2696.13 - lr: 0.000002 - momentum: 0.000000
2023-10-16 18:36:46,289 epoch 10 - iter 116/292 - loss 0.01696371 - time (sec): 6.71 - samples/sec: 2709.47 - lr: 0.000002 - momentum: 0.000000
2023-10-16 18:36:47,832 epoch 10 - iter 145/292 - loss 0.01566586 - time (sec): 8.25 - samples/sec: 2697.78 - lr: 0.000002 - momentum: 0.000000
2023-10-16 18:36:49,603 epoch 10 - iter 174/292 - loss 0.01677469 - time (sec): 10.03 - samples/sec: 2705.42 - lr: 0.000001 - momentum: 0.000000
2023-10-16 18:36:51,097 epoch 10 - iter 203/292 - loss 0.01572230 - time (sec): 11.52 - samples/sec: 2698.95 - lr: 0.000001 - momentum: 0.000000
2023-10-16 18:36:52,729 epoch 10 - iter 232/292 - loss 0.01515656 - time (sec): 13.15 - samples/sec: 2696.30 - lr: 0.000001 - momentum: 0.000000
2023-10-16 18:36:54,455 epoch 10 - iter 261/292 - loss 0.01773537 - time (sec): 14.88 - samples/sec: 2725.51 - lr: 0.000000 - momentum: 0.000000
2023-10-16 18:36:55,964 epoch 10 - iter 290/292 - loss 0.01657736 - time (sec): 16.39 - samples/sec: 2704.03 - lr: 0.000000 - momentum: 0.000000
2023-10-16 18:36:56,045 ----------------------------------------------------------------------------------------------------
2023-10-16 18:36:56,045 EPOCH 10 done: loss 0.0165 - lr: 0.000000
2023-10-16 18:36:57,349 DEV : loss 0.16469089686870575 - f1-score (micro avg) 0.7484
2023-10-16 18:36:57,753 ----------------------------------------------------------------------------------------------------
2023-10-16 18:36:57,754 Loading model from best epoch ...
2023-10-16 18:36:59,284 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
2023-10-16 18:37:01,644 
Results:
- F-score (micro) 0.7497
- F-score (macro) 0.6428
- Accuracy 0.6187

By class:
              precision    recall  f1-score   support

         PER     0.8348    0.8276    0.8312       348
         LOC     0.6289    0.8506    0.7231       261
         ORG     0.4054    0.2885    0.3371        52
   HumanProd     0.6071    0.7727    0.6800        22

   micro avg     0.7104    0.7936    0.7497       683
   macro avg     0.6191    0.6848    0.6428       683
weighted avg     0.7161    0.7936    0.7474       683

2023-10-16 18:37:01,645 ----------------------------------------------------------------------------------------------------
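The lr column in the log follows the LinearScheduler plugin (warmup_fraction '0.1'): it ramps from 0 to the 3e-05 peak over the first tenth of all 2920 steps (292 iterations x 10 epochs), then decays linearly to 0. A minimal sketch of that shape, assuming a hypothetical helper `linear_lr` (not a Flair API):

```python
def linear_lr(step: int, total_steps: int, peak_lr: float,
              warmup_fraction: float = 0.1) -> float:
    """Linear warmup to peak_lr, then linear decay to zero."""
    warmup_steps = int(total_steps * warmup_fraction)
    if step < warmup_steps:
        return peak_lr * step / warmup_steps          # warmup ramp
    # decay from peak at end of warmup down to 0 at the final step
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)

# With 2920 total steps and peak 3e-05, warmup ends at step 292, which
# matches the logged lr of 0.000030 at the last iteration of epoch 1.
```

This also explains the logged lr of 0.000003 at step 29 (one tenth of the way through warmup) and 0.000000 at the final step.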
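The micro-averaged row of the final table pools true positives and predicted spans over all four classes before computing precision, recall, and F1 (unlike the macro row, which averages per-class scores). This can be cross-checked from the table itself: per-class TP counts are recovered as round(support * recall), and predicted-span counts as TP / precision. A sketch of that arithmetic (the per-class numbers are copied from the log; the reconstruction of TP counts by rounding is my assumption):

```python
per_class = {
    # label: (precision, recall, support) as reported in the log
    "PER":       (0.8348, 0.8276, 348),
    "LOC":       (0.6289, 0.8506, 261),
    "ORG":       (0.4054, 0.2885, 52),
    "HumanProd": (0.6071, 0.7727, 22),
}

tp = sum(round(r * s) for p, r, s in per_class.values())
predicted = sum(round(round(r * s) / p) for p, r, s in per_class.values())
support = sum(s for _, _, s in per_class.values())  # 683, as reported

micro_p = tp / predicted
micro_r = tp / support
micro_f1 = 2 * tp / (predicted + support)
```

Rounded to four decimals this reproduces the reported micro avg row (0.7104 / 0.7936 / 0.7497), so the table is internally consistent.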
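The 17-tag dictionary the loaded tagger predicts is a BIOES tagging of the four entity types: S- marks a single-token entity, B-/I-/E- the beginning, inside, and end of a multi-token one, and O non-entity tokens. A minimal decoder from such a tag sequence to entity spans, assuming a hypothetical helper `bioes_to_spans` (not Flair's own decoding, which it handles internally):

```python
def bioes_to_spans(tags):
    """Turn a BIOES tag sequence into (start, end, label) spans, inclusive."""
    spans, start = [], None
    for i, tag in enumerate(tags):
        prefix, _, label = tag.partition("-")  # "O" yields prefix "O", label ""
        if prefix == "S":
            spans.append((i, i, label))        # single-token entity
        elif prefix == "B":
            start = i                          # open a multi-token entity
        elif prefix == "E" and start is not None:
            spans.append((start, i, label))    # close it at the end tag
            start = None
    return spans
```

For example, `["S-LOC", "O", "B-PER", "I-PER", "E-PER"]` decodes to one single-token LOC span and one three-token PER span.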