2023-10-25 16:26:03,207 ----------------------------------------------------------------------------------------------------
2023-10-25 16:26:03,208 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(64001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-25 16:26:03,208 ----------------------------------------------------------------------------------------------------
2023-10-25 16:26:03,208 MultiCorpus: 7142 train + 698 dev + 2570 test sentences
 - NER_HIPE_2022 Corpus: 7142 train + 698 dev + 2570 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fr/with_doc_seperator
2023-10-25 16:26:03,208 ----------------------------------------------------------------------------------------------------
2023-10-25 16:26:03,208 Train: 7142 sentences
2023-10-25 16:26:03,208 (train_with_dev=False, train_with_test=False)
2023-10-25 16:26:03,208 ----------------------------------------------------------------------------------------------------
2023-10-25 16:26:03,208 Training Params:
2023-10-25 16:26:03,208 - learning_rate: "3e-05"
2023-10-25 16:26:03,208 - mini_batch_size: "4"
2023-10-25 16:26:03,208 - max_epochs: "10"
2023-10-25 16:26:03,208 - shuffle: "True"
2023-10-25 16:26:03,208 ----------------------------------------------------------------------------------------------------
2023-10-25 16:26:03,208 Plugins:
2023-10-25 16:26:03,208 - TensorboardLogger
2023-10-25 16:26:03,208 - LinearScheduler | warmup_fraction: '0.1'
2023-10-25 16:26:03,208 ----------------------------------------------------------------------------------------------------
2023-10-25 16:26:03,208 Final evaluation on model from best epoch (best-model.pt)
2023-10-25 16:26:03,209 - metric: "('micro avg', 'f1-score')"
2023-10-25 16:26:03,209 ----------------------------------------------------------------------------------------------------
2023-10-25 16:26:03,209 Computation:
2023-10-25 16:26:03,209 - compute on device: cuda:0
2023-10-25 16:26:03,209 - embedding storage: none
2023-10-25 16:26:03,209 ----------------------------------------------------------------------------------------------------
2023-10-25 16:26:03,209 Model training base path: "hmbench-newseye/fr-dbmdz/bert-base-historic-multilingual-64k-td-cased-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-3"
2023-10-25 16:26:03,209 ----------------------------------------------------------------------------------------------------
2023-10-25 16:26:03,209 ----------------------------------------------------------------------------------------------------
2023-10-25 16:26:03,209 Logging anything other than scalars to
TensorBoard is currently not supported.
2023-10-25 16:26:12,778 epoch 1 - iter 178/1786 - loss 1.72389527 - time (sec): 9.57 - samples/sec: 2499.38 - lr: 0.000003 - momentum: 0.000000
2023-10-25 16:26:22,214 epoch 1 - iter 356/1786 - loss 1.11489437 - time (sec): 19.00 - samples/sec: 2559.80 - lr: 0.000006 - momentum: 0.000000
2023-10-25 16:26:31,700 epoch 1 - iter 534/1786 - loss 0.85501815 - time (sec): 28.49 - samples/sec: 2565.38 - lr: 0.000009 - momentum: 0.000000
2023-10-25 16:26:41,250 epoch 1 - iter 712/1786 - loss 0.69063084 - time (sec): 38.04 - samples/sec: 2617.48 - lr: 0.000012 - momentum: 0.000000
2023-10-25 16:26:50,769 epoch 1 - iter 890/1786 - loss 0.59262743 - time (sec): 47.56 - samples/sec: 2593.45 - lr: 0.000015 - momentum: 0.000000
2023-10-25 16:27:00,269 epoch 1 - iter 1068/1786 - loss 0.52253362 - time (sec): 57.06 - samples/sec: 2598.06 - lr: 0.000018 - momentum: 0.000000
2023-10-25 16:27:09,548 epoch 1 - iter 1246/1786 - loss 0.46852649 - time (sec): 66.34 - samples/sec: 2629.99 - lr: 0.000021 - momentum: 0.000000
2023-10-25 16:27:18,901 epoch 1 - iter 1424/1786 - loss 0.42908150 - time (sec): 75.69 - samples/sec: 2630.06 - lr: 0.000024 - momentum: 0.000000
2023-10-25 16:27:28,243 epoch 1 - iter 1602/1786 - loss 0.39816297 - time (sec): 85.03 - samples/sec: 2627.16 - lr: 0.000027 - momentum: 0.000000
2023-10-25 16:27:37,690 epoch 1 - iter 1780/1786 - loss 0.37429285 - time (sec): 94.48 - samples/sec: 2626.94 - lr: 0.000030 - momentum: 0.000000
2023-10-25 16:27:37,985 ----------------------------------------------------------------------------------------------------
2023-10-25 16:27:37,985 EPOCH 1 done: loss 0.3737 - lr: 0.000030
2023-10-25 16:27:42,018 DEV : loss 0.11665168404579163 - f1-score (micro avg) 0.7065
2023-10-25 16:27:42,041 saving best model
2023-10-25 16:27:42,503 ----------------------------------------------------------------------------------------------------
2023-10-25 16:27:51,329 epoch 2 - iter 178/1786 - loss
0.10461983 - time (sec): 8.82 - samples/sec: 2853.63 - lr: 0.000030 - momentum: 0.000000
2023-10-25 16:28:00,430 epoch 2 - iter 356/1786 - loss 0.11400338 - time (sec): 17.93 - samples/sec: 2630.89 - lr: 0.000029 - momentum: 0.000000
2023-10-25 16:28:09,657 epoch 2 - iter 534/1786 - loss 0.11338586 - time (sec): 27.15 - samples/sec: 2665.75 - lr: 0.000029 - momentum: 0.000000
2023-10-25 16:28:19,268 epoch 2 - iter 712/1786 - loss 0.11187233 - time (sec): 36.76 - samples/sec: 2721.39 - lr: 0.000029 - momentum: 0.000000
2023-10-25 16:28:28,609 epoch 2 - iter 890/1786 - loss 0.11112092 - time (sec): 46.10 - samples/sec: 2702.62 - lr: 0.000028 - momentum: 0.000000
2023-10-25 16:28:37,715 epoch 2 - iter 1068/1786 - loss 0.11480554 - time (sec): 55.21 - samples/sec: 2693.47 - lr: 0.000028 - momentum: 0.000000
2023-10-25 16:28:47,250 epoch 2 - iter 1246/1786 - loss 0.11362112 - time (sec): 64.75 - samples/sec: 2683.58 - lr: 0.000028 - momentum: 0.000000
2023-10-25 16:28:56,729 epoch 2 - iter 1424/1786 - loss 0.11345575 - time (sec): 74.22 - samples/sec: 2663.61 - lr: 0.000027 - momentum: 0.000000
2023-10-25 16:29:06,404 epoch 2 - iter 1602/1786 - loss 0.11331499 - time (sec): 83.90 - samples/sec: 2661.42 - lr: 0.000027 - momentum: 0.000000
2023-10-25 16:29:15,745 epoch 2 - iter 1780/1786 - loss 0.11389365 - time (sec): 93.24 - samples/sec: 2659.65 - lr: 0.000027 - momentum: 0.000000
2023-10-25 16:29:16,043 ----------------------------------------------------------------------------------------------------
2023-10-25 16:29:16,043 EPOCH 2 done: loss 0.1141 - lr: 0.000027
2023-10-25 16:29:20,360 DEV : loss 0.15264025330543518 - f1-score (micro avg) 0.7495
2023-10-25 16:29:20,382 saving best model
2023-10-25 16:29:21,051 ----------------------------------------------------------------------------------------------------
2023-10-25 16:29:30,883 epoch 3 - iter 178/1786 - loss 0.07629665 - time (sec): 9.83 - samples/sec: 2419.01 - lr: 0.000026 - momentum: 0.000000
2023-10-25
16:29:40,271 epoch 3 - iter 356/1786 - loss 0.06863531 - time (sec): 19.22 - samples/sec: 2509.35 - lr: 0.000026 - momentum: 0.000000
2023-10-25 16:29:49,295 epoch 3 - iter 534/1786 - loss 0.07758255 - time (sec): 28.24 - samples/sec: 2575.16 - lr: 0.000026 - momentum: 0.000000
2023-10-25 16:29:58,201 epoch 3 - iter 712/1786 - loss 0.07961424 - time (sec): 37.15 - samples/sec: 2586.45 - lr: 0.000025 - momentum: 0.000000
2023-10-25 16:30:07,236 epoch 3 - iter 890/1786 - loss 0.08178772 - time (sec): 46.18 - samples/sec: 2596.65 - lr: 0.000025 - momentum: 0.000000
2023-10-25 16:30:16,358 epoch 3 - iter 1068/1786 - loss 0.08404606 - time (sec): 55.30 - samples/sec: 2628.21 - lr: 0.000025 - momentum: 0.000000
2023-10-25 16:30:25,496 epoch 3 - iter 1246/1786 - loss 0.08188804 - time (sec): 64.44 - samples/sec: 2647.28 - lr: 0.000024 - momentum: 0.000000
2023-10-25 16:30:35,432 epoch 3 - iter 1424/1786 - loss 0.08055218 - time (sec): 74.38 - samples/sec: 2653.16 - lr: 0.000024 - momentum: 0.000000
2023-10-25 16:30:44,086 epoch 3 - iter 1602/1786 - loss 0.07964001 - time (sec): 83.03 - samples/sec: 2683.32 - lr: 0.000024 - momentum: 0.000000
2023-10-25 16:30:52,968 epoch 3 - iter 1780/1786 - loss 0.07759944 - time (sec): 91.91 - samples/sec: 2697.45 - lr: 0.000023 - momentum: 0.000000
2023-10-25 16:30:53,255 ----------------------------------------------------------------------------------------------------
2023-10-25 16:30:53,255 EPOCH 3 done: loss 0.0776 - lr: 0.000023
2023-10-25 16:30:57,163 DEV : loss 0.13464441895484924 - f1-score (micro avg) 0.7748
2023-10-25 16:30:57,184 saving best model
2023-10-25 16:30:57,839 ----------------------------------------------------------------------------------------------------
2023-10-25 16:31:06,592 epoch 4 - iter 178/1786 - loss 0.04440627 - time (sec): 8.75 - samples/sec: 2715.40 - lr: 0.000023 - momentum: 0.000000
2023-10-25 16:31:15,474 epoch 4 - iter 356/1786 - loss 0.05450741 - time (sec): 17.63 - samples/sec: 2733.26 - lr:
0.000023 - momentum: 0.000000
2023-10-25 16:31:24,257 epoch 4 - iter 534/1786 - loss 0.05779653 - time (sec): 26.41 - samples/sec: 2703.26 - lr: 0.000022 - momentum: 0.000000
2023-10-25 16:31:33,177 epoch 4 - iter 712/1786 - loss 0.05478759 - time (sec): 35.33 - samples/sec: 2760.99 - lr: 0.000022 - momentum: 0.000000
2023-10-25 16:31:41,855 epoch 4 - iter 890/1786 - loss 0.05436695 - time (sec): 44.01 - samples/sec: 2764.41 - lr: 0.000022 - momentum: 0.000000
2023-10-25 16:31:50,747 epoch 4 - iter 1068/1786 - loss 0.05371101 - time (sec): 52.90 - samples/sec: 2780.40 - lr: 0.000021 - momentum: 0.000000
2023-10-25 16:31:59,923 epoch 4 - iter 1246/1786 - loss 0.05398691 - time (sec): 62.08 - samples/sec: 2776.06 - lr: 0.000021 - momentum: 0.000000
2023-10-25 16:32:09,269 epoch 4 - iter 1424/1786 - loss 0.05442093 - time (sec): 71.43 - samples/sec: 2779.56 - lr: 0.000021 - momentum: 0.000000
2023-10-25 16:32:18,063 epoch 4 - iter 1602/1786 - loss 0.05445676 - time (sec): 80.22 - samples/sec: 2774.76 - lr: 0.000020 - momentum: 0.000000
2023-10-25 16:32:27,167 epoch 4 - iter 1780/1786 - loss 0.05567026 - time (sec): 89.32 - samples/sec: 2778.43 - lr: 0.000020 - momentum: 0.000000
2023-10-25 16:32:27,482 ----------------------------------------------------------------------------------------------------
2023-10-25 16:32:27,482 EPOCH 4 done: loss 0.0557 - lr: 0.000020
2023-10-25 16:32:32,322 DEV : loss 0.1592852622270584 - f1-score (micro avg) 0.7807
2023-10-25 16:32:32,358 saving best model
2023-10-25 16:32:33,008 ----------------------------------------------------------------------------------------------------
2023-10-25 16:32:42,076 epoch 5 - iter 178/1786 - loss 0.03505898 - time (sec): 9.07 - samples/sec: 2675.18 - lr: 0.000020 - momentum: 0.000000
2023-10-25 16:32:51,082 epoch 5 - iter 356/1786 - loss 0.03624807 - time (sec): 18.07 - samples/sec: 2659.88 - lr: 0.000019 - momentum: 0.000000
2023-10-25 16:33:00,526 epoch 5 - iter 534/1786 - loss 0.03535087 - time
(sec): 27.52 - samples/sec: 2648.84 - lr: 0.000019 - momentum: 0.000000
2023-10-25 16:33:10,149 epoch 5 - iter 712/1786 - loss 0.03748850 - time (sec): 37.14 - samples/sec: 2613.01 - lr: 0.000019 - momentum: 0.000000
2023-10-25 16:33:18,985 epoch 5 - iter 890/1786 - loss 0.03909736 - time (sec): 45.97 - samples/sec: 2621.81 - lr: 0.000018 - momentum: 0.000000
2023-10-25 16:33:27,925 epoch 5 - iter 1068/1786 - loss 0.03966780 - time (sec): 54.91 - samples/sec: 2632.78 - lr: 0.000018 - momentum: 0.000000
2023-10-25 16:33:37,092 epoch 5 - iter 1246/1786 - loss 0.03878930 - time (sec): 64.08 - samples/sec: 2662.85 - lr: 0.000018 - momentum: 0.000000
2023-10-25 16:33:45,813 epoch 5 - iter 1424/1786 - loss 0.03882227 - time (sec): 72.80 - samples/sec: 2682.22 - lr: 0.000017 - momentum: 0.000000
2023-10-25 16:33:55,003 epoch 5 - iter 1602/1786 - loss 0.03840917 - time (sec): 81.99 - samples/sec: 2715.84 - lr: 0.000017 - momentum: 0.000000
2023-10-25 16:34:03,749 epoch 5 - iter 1780/1786 - loss 0.03930397 - time (sec): 90.74 - samples/sec: 2734.76 - lr: 0.000017 - momentum: 0.000000
2023-10-25 16:34:04,047 ----------------------------------------------------------------------------------------------------
2023-10-25 16:34:04,047 EPOCH 5 done: loss 0.0395 - lr: 0.000017
2023-10-25 16:34:07,972 DEV : loss 0.18587026000022888 - f1-score (micro avg) 0.762
2023-10-25 16:34:07,993 ----------------------------------------------------------------------------------------------------
2023-10-25 16:34:17,461 epoch 6 - iter 178/1786 - loss 0.02228299 - time (sec): 9.46 - samples/sec: 2513.38 - lr: 0.000016 - momentum: 0.000000
2023-10-25 16:34:26,900 epoch 6 - iter 356/1786 - loss 0.02707967 - time (sec): 18.90 - samples/sec: 2582.19 - lr: 0.000016 - momentum: 0.000000
2023-10-25 16:34:36,267 epoch 6 - iter 534/1786 - loss 0.02969054 - time (sec): 28.27 - samples/sec: 2599.89 - lr: 0.000016 - momentum: 0.000000
2023-10-25 16:34:45,851 epoch 6 - iter 712/1786 - loss 0.02850474 - time
(sec): 37.86 - samples/sec: 2596.22 - lr: 0.000015 - momentum: 0.000000
2023-10-25 16:34:55,626 epoch 6 - iter 890/1786 - loss 0.02952048 - time (sec): 47.63 - samples/sec: 2593.98 - lr: 0.000015 - momentum: 0.000000
2023-10-25 16:35:05,298 epoch 6 - iter 1068/1786 - loss 0.02959778 - time (sec): 57.30 - samples/sec: 2579.41 - lr: 0.000015 - momentum: 0.000000
2023-10-25 16:35:14,998 epoch 6 - iter 1246/1786 - loss 0.02950384 - time (sec): 67.00 - samples/sec: 2578.56 - lr: 0.000014 - momentum: 0.000000
2023-10-25 16:35:24,826 epoch 6 - iter 1424/1786 - loss 0.02899248 - time (sec): 76.83 - samples/sec: 2587.24 - lr: 0.000014 - momentum: 0.000000
2023-10-25 16:35:34,276 epoch 6 - iter 1602/1786 - loss 0.02952807 - time (sec): 86.28 - samples/sec: 2584.11 - lr: 0.000014 - momentum: 0.000000
2023-10-25 16:35:43,735 epoch 6 - iter 1780/1786 - loss 0.02944428 - time (sec): 95.74 - samples/sec: 2590.73 - lr: 0.000013 - momentum: 0.000000
2023-10-25 16:35:44,048 ----------------------------------------------------------------------------------------------------
2023-10-25 16:35:44,049 EPOCH 6 done: loss 0.0295 - lr: 0.000013
2023-10-25 16:35:49,183 DEV : loss 0.18982860445976257 - f1-score (micro avg) 0.7865
2023-10-25 16:35:49,208 saving best model
2023-10-25 16:35:49,872 ----------------------------------------------------------------------------------------------------
2023-10-25 16:35:59,426 epoch 7 - iter 178/1786 - loss 0.02267264 - time (sec): 9.55 - samples/sec: 2718.81 - lr: 0.000013 - momentum: 0.000000
2023-10-25 16:36:08,982 epoch 7 - iter 356/1786 - loss 0.02071867 - time (sec): 19.11 - samples/sec: 2692.77 - lr: 0.000013 - momentum: 0.000000
2023-10-25 16:36:18,619 epoch 7 - iter 534/1786 - loss 0.02250539 - time (sec): 28.74 - samples/sec: 2676.34 - lr: 0.000012 - momentum: 0.000000
2023-10-25 16:36:28,445 epoch 7 - iter 712/1786 - loss 0.02189393 - time (sec): 38.57 - samples/sec: 2645.65 - lr: 0.000012 - momentum: 0.000000
2023-10-25 16:36:37,705 epoch 7
- iter 890/1786 - loss 0.02316080 - time (sec): 47.83 - samples/sec: 2631.90 - lr: 0.000012 - momentum: 0.000000
2023-10-25 16:36:46,646 epoch 7 - iter 1068/1786 - loss 0.02272730 - time (sec): 56.77 - samples/sec: 2667.04 - lr: 0.000011 - momentum: 0.000000
2023-10-25 16:36:55,632 epoch 7 - iter 1246/1786 - loss 0.02207885 - time (sec): 65.76 - samples/sec: 2665.92 - lr: 0.000011 - momentum: 0.000000
2023-10-25 16:37:05,058 epoch 7 - iter 1424/1786 - loss 0.02206815 - time (sec): 75.18 - samples/sec: 2654.28 - lr: 0.000011 - momentum: 0.000000
2023-10-25 16:37:14,475 epoch 7 - iter 1602/1786 - loss 0.02192510 - time (sec): 84.60 - samples/sec: 2655.92 - lr: 0.000010 - momentum: 0.000000
2023-10-25 16:37:23,960 epoch 7 - iter 1780/1786 - loss 0.02204271 - time (sec): 94.09 - samples/sec: 2636.41 - lr: 0.000010 - momentum: 0.000000
2023-10-25 16:37:24,276 ----------------------------------------------------------------------------------------------------
2023-10-25 16:37:24,276 EPOCH 7 done: loss 0.0220 - lr: 0.000010
2023-10-25 16:37:29,189 DEV : loss 0.19655928015708923 - f1-score (micro avg) 0.7919
2023-10-25 16:37:29,211 saving best model
2023-10-25 16:37:29,862 ----------------------------------------------------------------------------------------------------
2023-10-25 16:37:39,385 epoch 8 - iter 178/1786 - loss 0.01876484 - time (sec): 9.52 - samples/sec: 2613.60 - lr: 0.000010 - momentum: 0.000000
2023-10-25 16:37:48,866 epoch 8 - iter 356/1786 - loss 0.01519318 - time (sec): 19.00 - samples/sec: 2526.15 - lr: 0.000009 - momentum: 0.000000
2023-10-25 16:37:58,302 epoch 8 - iter 534/1786 - loss 0.01373478 - time (sec): 28.44 - samples/sec: 2626.74 - lr: 0.000009 - momentum: 0.000000
2023-10-25 16:38:07,293 epoch 8 - iter 712/1786 - loss 0.01454254 - time (sec): 37.43 - samples/sec: 2645.54 - lr: 0.000009 - momentum: 0.000000
2023-10-25 16:38:16,013 epoch 8 - iter 890/1786 - loss 0.01484691 - time (sec): 46.15 - samples/sec: 2685.45 - lr: 0.000008 - momentum:
0.000000
2023-10-25 16:38:24,785 epoch 8 - iter 1068/1786 - loss 0.01499282 - time (sec): 54.92 - samples/sec: 2733.34 - lr: 0.000008 - momentum: 0.000000
2023-10-25 16:38:33,747 epoch 8 - iter 1246/1786 - loss 0.01424428 - time (sec): 63.88 - samples/sec: 2734.25 - lr: 0.000008 - momentum: 0.000000
2023-10-25 16:38:42,790 epoch 8 - iter 1424/1786 - loss 0.01406695 - time (sec): 72.93 - samples/sec: 2738.06 - lr: 0.000007 - momentum: 0.000000
2023-10-25 16:38:51,780 epoch 8 - iter 1602/1786 - loss 0.01437150 - time (sec): 81.92 - samples/sec: 2723.87 - lr: 0.000007 - momentum: 0.000000
2023-10-25 16:39:00,844 epoch 8 - iter 1780/1786 - loss 0.01533311 - time (sec): 90.98 - samples/sec: 2725.94 - lr: 0.000007 - momentum: 0.000000
2023-10-25 16:39:01,145 ----------------------------------------------------------------------------------------------------
2023-10-25 16:39:01,145 EPOCH 8 done: loss 0.0153 - lr: 0.000007
2023-10-25 16:39:05,268 DEV : loss 0.21181654930114746 - f1-score (micro avg) 0.7842
2023-10-25 16:39:05,288 ----------------------------------------------------------------------------------------------------
2023-10-25 16:39:14,500 epoch 9 - iter 178/1786 - loss 0.00816641 - time (sec): 9.21 - samples/sec: 2907.50 - lr: 0.000006 - momentum: 0.000000
2023-10-25 16:39:23,973 epoch 9 - iter 356/1786 - loss 0.01013131 - time (sec): 18.68 - samples/sec: 2780.63 - lr: 0.000006 - momentum: 0.000000
2023-10-25 16:39:33,578 epoch 9 - iter 534/1786 - loss 0.00954521 - time (sec): 28.29 - samples/sec: 2756.96 - lr: 0.000006 - momentum: 0.000000
2023-10-25 16:39:43,144 epoch 9 - iter 712/1786 - loss 0.00969192 - time (sec): 37.85 - samples/sec: 2674.70 - lr: 0.000005 - momentum: 0.000000
2023-10-25 16:39:52,805 epoch 9 - iter 890/1786 - loss 0.01062753 - time (sec): 47.51 - samples/sec: 2602.39 - lr: 0.000005 - momentum: 0.000000
2023-10-25 16:40:01,825 epoch 9 - iter 1068/1786 - loss 0.01065964 - time (sec): 56.53 - samples/sec: 2596.08 - lr: 0.000005 - momentum:
0.000000
2023-10-25 16:40:10,834 epoch 9 - iter 1246/1786 - loss 0.01011395 - time (sec): 65.54 - samples/sec: 2627.85 - lr: 0.000004 - momentum: 0.000000
2023-10-25 16:40:19,865 epoch 9 - iter 1424/1786 - loss 0.01040764 - time (sec): 74.58 - samples/sec: 2643.30 - lr: 0.000004 - momentum: 0.000000
2023-10-25 16:40:28,886 epoch 9 - iter 1602/1786 - loss 0.01055140 - time (sec): 83.60 - samples/sec: 2648.51 - lr: 0.000004 - momentum: 0.000000
2023-10-25 16:40:37,881 epoch 9 - iter 1780/1786 - loss 0.01019071 - time (sec): 92.59 - samples/sec: 2679.20 - lr: 0.000003 - momentum: 0.000000
2023-10-25 16:40:38,162 ----------------------------------------------------------------------------------------------------
2023-10-25 16:40:38,163 EPOCH 9 done: loss 0.0102 - lr: 0.000003
2023-10-25 16:40:43,043 DEV : loss 0.23426829278469086 - f1-score (micro avg) 0.7848
2023-10-25 16:40:43,066 ----------------------------------------------------------------------------------------------------
2023-10-25 16:40:52,584 epoch 10 - iter 178/1786 - loss 0.00771099 - time (sec): 9.52 - samples/sec: 2746.65 - lr: 0.000003 - momentum: 0.000000
2023-10-25 16:41:02,050 epoch 10 - iter 356/1786 - loss 0.00805677 - time (sec): 18.98 - samples/sec: 2646.83 - lr: 0.000003 - momentum: 0.000000
2023-10-25 16:41:11,538 epoch 10 - iter 534/1786 - loss 0.00817679 - time (sec): 28.47 - samples/sec: 2656.92 - lr: 0.000002 - momentum: 0.000000
2023-10-25 16:41:20,720 epoch 10 - iter 712/1786 - loss 0.00878410 - time (sec): 37.65 - samples/sec: 2640.29 - lr: 0.000002 - momentum: 0.000000
2023-10-25 16:41:29,664 epoch 10 - iter 890/1786 - loss 0.00887421 - time (sec): 46.60 - samples/sec: 2652.81 - lr: 0.000002 - momentum: 0.000000
2023-10-25 16:41:38,852 epoch 10 - iter 1068/1786 - loss 0.00789479 - time (sec): 55.78 - samples/sec: 2679.34 - lr: 0.000001 - momentum: 0.000000
2023-10-25 16:41:47,525 epoch 10 - iter 1246/1786 - loss 0.00784114 - time (sec): 64.46 - samples/sec: 2718.01 - lr: 0.000001 -
momentum: 0.000000
2023-10-25 16:41:56,178 epoch 10 - iter 1424/1786 - loss 0.00742785 - time (sec): 73.11 - samples/sec: 2736.80 - lr: 0.000001 - momentum: 0.000000
2023-10-25 16:42:04,870 epoch 10 - iter 1602/1786 - loss 0.00792667 - time (sec): 81.80 - samples/sec: 2734.65 - lr: 0.000000 - momentum: 0.000000
2023-10-25 16:42:13,881 epoch 10 - iter 1780/1786 - loss 0.00755996 - time (sec): 90.81 - samples/sec: 2733.58 - lr: 0.000000 - momentum: 0.000000
2023-10-25 16:42:14,170 ----------------------------------------------------------------------------------------------------
2023-10-25 16:42:14,171 EPOCH 10 done: loss 0.0075 - lr: 0.000000
2023-10-25 16:42:18,903 DEV : loss 0.23706591129302979 - f1-score (micro avg) 0.7914
2023-10-25 16:42:19,357 ----------------------------------------------------------------------------------------------------
2023-10-25 16:42:19,358 Loading model from best epoch ...
2023-10-25 16:42:21,248 SequenceTagger predicts: Dictionary with 17 tags: O, S-PER, B-PER, E-PER, I-PER, S-LOC, B-LOC, E-LOC, I-LOC, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
2023-10-25 16:42:34,839 
Results:
- F-score (micro) 0.6924
- F-score (macro) 0.62
- Accuracy 0.5472

By class:
              precision    recall  f1-score   support

         LOC     0.7289    0.6703    0.6984      1095
         PER     0.7611    0.7806    0.7707      1012
         ORG     0.4414    0.5910    0.5054       357
   HumanProd     0.3966    0.6970    0.5055        33

   micro avg     0.6811    0.7040    0.6924      2497
   macro avg     0.5820    0.6847    0.6200      2497
weighted avg     0.6964    0.7040    0.6976      2497

2023-10-25 16:42:34,839 ----------------------------------------------------------------------------------------------------