2023-10-25 15:46:38,696 ----------------------------------------------------------------------------------------------------
2023-10-25 15:46:38,697 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(64001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-25 15:46:38,697 ----------------------------------------------------------------------------------------------------
2023-10-25 15:46:38,697 MultiCorpus: 7142 train + 698 dev + 2570 test sentences
 - NER_HIPE_2022 Corpus: 7142 train + 698 dev + 2570 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fr/with_doc_seperator
2023-10-25 15:46:38,697 ----------------------------------------------------------------------------------------------------
2023-10-25 15:46:38,697 Train:  7142 sentences
2023-10-25 15:46:38,697         (train_with_dev=False, train_with_test=False)
2023-10-25 15:46:38,697 ----------------------------------------------------------------------------------------------------
2023-10-25 15:46:38,697 Training Params:
2023-10-25 15:46:38,697  - learning_rate: "5e-05"
2023-10-25 15:46:38,697  - mini_batch_size: "4"
2023-10-25 15:46:38,697  - max_epochs: "10"
2023-10-25 15:46:38,697  - shuffle: "True"
2023-10-25 15:46:38,697 ----------------------------------------------------------------------------------------------------
2023-10-25 15:46:38,697 Plugins:
2023-10-25 15:46:38,697  - TensorboardLogger
2023-10-25 15:46:38,697  - LinearScheduler | warmup_fraction: '0.1'
2023-10-25 15:46:38,697 ----------------------------------------------------------------------------------------------------
2023-10-25 15:46:38,697 Final evaluation on model from best epoch (best-model.pt)
2023-10-25 15:46:38,697  - metric: "('micro avg', 'f1-score')"
2023-10-25 15:46:38,698 ----------------------------------------------------------------------------------------------------
2023-10-25 15:46:38,698 Computation:
2023-10-25 15:46:38,698  - compute on device: cuda:0
2023-10-25 15:46:38,698  - embedding storage: none
2023-10-25 15:46:38,698 ----------------------------------------------------------------------------------------------------
2023-10-25 15:46:38,698 Model training base path: "hmbench-newseye/fr-dbmdz/bert-base-historic-multilingual-64k-td-cased-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-2"
2023-10-25 15:46:38,698 ----------------------------------------------------------------------------------------------------
2023-10-25 15:46:38,698 ----------------------------------------------------------------------------------------------------
2023-10-25 15:46:38,698 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-25 15:46:48,318 epoch 1 - iter 178/1786 - loss 1.71122294 - time (sec): 9.62 - samples/sec: 2650.79 - lr: 0.000005 - momentum: 0.000000
2023-10-25 15:46:57,968 epoch 1 - iter 356/1786 - loss 1.10034904 - time (sec): 19.27 - samples/sec: 2534.06 - lr: 0.000010 - momentum: 0.000000
2023-10-25 15:47:07,424 epoch 1 - iter 534/1786 - loss 0.83110797 - time (sec): 28.73 - samples/sec: 2515.44 - lr: 0.000015 - momentum: 0.000000
2023-10-25 15:47:16,715 epoch 1 - iter 712/1786 - loss 0.67436102 - time (sec): 38.02 - samples/sec: 2546.45 - lr: 0.000020 - momentum: 0.000000
2023-10-25 15:47:26,198 epoch 1 - iter 890/1786 - loss 0.57044616 - time (sec): 47.50 - samples/sec: 2567.57 - lr: 0.000025 - momentum: 0.000000
2023-10-25 15:47:35,361 epoch 1 - iter 1068/1786 - loss 0.49625403 - time (sec): 56.66 - samples/sec: 2617.57 - lr: 0.000030 - momentum: 0.000000
2023-10-25 15:47:44,672 epoch 1 - iter 1246/1786 - loss 0.44959498 - time (sec): 65.97 - samples/sec: 2624.58 - lr: 0.000035 - momentum: 0.000000
2023-10-25 15:47:54,339 epoch 1 - iter 1424/1786 - loss 0.41355462 - time (sec): 75.64 - samples/sec: 2617.35 - lr: 0.000040 - momentum: 0.000000
2023-10-25 15:48:03,738 epoch 1 - iter 1602/1786 - loss 0.38486574 - time (sec): 85.04 - samples/sec: 2618.41 - lr: 0.000045 - momentum: 0.000000
2023-10-25 15:48:12,801 epoch 1 - iter 1780/1786 - loss 0.36139254 - time (sec): 94.10 - samples/sec: 2636.46 - lr: 0.000050 - momentum: 0.000000
2023-10-25 15:48:13,083 ----------------------------------------------------------------------------------------------------
2023-10-25 15:48:13,083 EPOCH 1 done: loss 0.3608 - lr: 0.000050
2023-10-25 15:48:17,092 DEV : loss 0.13503123819828033 - f1-score (micro avg) 0.7202
2023-10-25 15:48:17,116 saving best model
2023-10-25 15:48:17,612 ----------------------------------------------------------------------------------------------------
2023-10-25 15:48:27,486 epoch 2 - iter 178/1786 - loss 0.12407203 - time (sec): 9.87 - samples/sec: 2599.84 - lr: 0.000049 - momentum: 0.000000
2023-10-25 15:48:37,042 epoch 2 - iter 356/1786 - loss 0.12736955 - time (sec): 19.43 - samples/sec: 2426.93 - lr: 0.000049 - momentum: 0.000000
2023-10-25 15:48:46,727 epoch 2 - iter 534/1786 - loss 0.12458317 - time (sec): 29.11 - samples/sec: 2535.20 - lr: 0.000048 - momentum: 0.000000
2023-10-25 15:48:56,148 epoch 2 - iter 712/1786 - loss 0.12184214 - time (sec): 38.53 - samples/sec: 2555.59 - lr: 0.000048 - momentum: 0.000000
2023-10-25 15:49:05,444 epoch 2 - iter 890/1786 - loss 0.12055895 - time (sec): 47.83 - samples/sec: 2593.23 - lr: 0.000047 - momentum: 0.000000
2023-10-25 15:49:14,963 epoch 2 - iter 1068/1786 - loss 0.12103444 - time (sec): 57.35 - samples/sec: 2594.01 - lr: 0.000047 - momentum: 0.000000
2023-10-25 15:49:24,474 epoch 2 - iter 1246/1786 - loss 0.11982038 - time (sec): 66.86 - samples/sec: 2615.69 - lr: 0.000046 - momentum: 0.000000
2023-10-25 15:49:33,793 epoch 2 - iter 1424/1786 - loss 0.11974673 - time (sec): 76.18 - samples/sec: 2589.45 - lr: 0.000046 - momentum: 0.000000
2023-10-25 15:49:43,364 epoch 2 - iter 1602/1786 - loss 0.11956401 - time (sec): 85.75 - samples/sec: 2596.32 - lr: 0.000045 - momentum: 0.000000
2023-10-25 15:49:53,170 epoch 2 - iter 1780/1786 - loss 0.11947260 - time (sec): 95.56 - samples/sec: 2592.81 - lr: 0.000044 - momentum: 0.000000
2023-10-25 15:49:53,498 ----------------------------------------------------------------------------------------------------
2023-10-25 15:49:53,498 EPOCH 2 done: loss 0.1196 - lr: 0.000044
2023-10-25 15:49:57,539 DEV : loss 0.11700031161308289 - f1-score (micro avg) 0.7623
2023-10-25 15:49:57,560 saving best model
2023-10-25 15:49:58,199 ----------------------------------------------------------------------------------------------------
2023-10-25 15:50:07,682 epoch 3 - iter 178/1786 - loss 0.07945384 - time (sec): 9.48 - samples/sec: 2683.68 - lr: 0.000044 - momentum: 0.000000
2023-10-25 15:50:17,051 epoch 3 - iter 356/1786 - loss 0.07503847 - time (sec): 18.85 - samples/sec: 2596.72 - lr: 0.000043 - momentum: 0.000000
2023-10-25 15:50:26,519 epoch 3 - iter 534/1786 - loss 0.07795492 - time (sec): 28.32 - samples/sec: 2648.33 - lr: 0.000043 - momentum: 0.000000
2023-10-25 15:50:35,768 epoch 3 - iter 712/1786 - loss 0.08107118 - time (sec): 37.57 - samples/sec: 2654.03 - lr: 0.000042 - momentum: 0.000000
2023-10-25 15:50:44,402 epoch 3 - iter 890/1786 - loss 0.08084430 - time (sec): 46.20 - samples/sec: 2662.85 - lr: 0.000042 - momentum: 0.000000
2023-10-25 15:50:53,144 epoch 3 - iter 1068/1786 - loss 0.08209011 - time (sec): 54.94 - samples/sec: 2688.32 - lr: 0.000041 - momentum: 0.000000
2023-10-25 15:51:02,410 epoch 3 - iter 1246/1786 - loss 0.08279690 - time (sec): 64.21 - samples/sec: 2707.24 - lr: 0.000041 - momentum: 0.000000
2023-10-25 15:51:11,472 epoch 3 - iter 1424/1786 - loss 0.08293172 - time (sec): 73.27 - samples/sec: 2725.81 - lr: 0.000040 - momentum: 0.000000
2023-10-25 15:51:20,625 epoch 3 - iter 1602/1786 - loss 0.08232577 - time (sec): 82.42 - samples/sec: 2731.48 - lr: 0.000039 - momentum: 0.000000
2023-10-25 15:51:29,623 epoch 3 - iter 1780/1786 - loss 0.08244948 - time (sec): 91.42 - samples/sec: 2714.74 - lr: 0.000039 - momentum: 0.000000
2023-10-25 15:51:29,917 ----------------------------------------------------------------------------------------------------
2023-10-25 15:51:29,917 EPOCH 3 done: loss 0.0825 - lr: 0.000039
2023-10-25 15:51:34,711 DEV : loss 0.1378549337387085 - f1-score (micro avg) 0.755
2023-10-25 15:51:34,733 ----------------------------------------------------------------------------------------------------
2023-10-25 15:51:43,650 epoch 4 - iter 178/1786 - loss 0.05427530 - time (sec): 8.91 - samples/sec: 2798.78 - lr: 0.000038 - momentum: 0.000000
2023-10-25 15:51:52,829 epoch 4 - iter 356/1786 - loss 0.06260664 - time (sec): 18.09 - samples/sec: 2759.29 - lr: 0.000038 - momentum: 0.000000
2023-10-25 15:52:01,468 epoch 4 - iter 534/1786 - loss 0.06187774 - time (sec): 26.73 - samples/sec: 2746.84 - lr: 0.000037 - momentum: 0.000000
2023-10-25 15:52:10,600 epoch 4 - iter 712/1786 - loss 0.06195480 - time (sec): 35.87 - samples/sec: 2765.81 - lr: 0.000037 - momentum: 0.000000
2023-10-25 15:52:19,483 epoch 4 - iter 890/1786 - loss 0.06149034 - time (sec): 44.75 - samples/sec: 2776.77 - lr: 0.000036 - momentum: 0.000000
2023-10-25 15:52:28,516 epoch 4 - iter 1068/1786 - loss 0.06169674 - time (sec): 53.78 - samples/sec: 2806.92 - lr: 0.000036 - momentum: 0.000000
2023-10-25 15:52:37,349 epoch 4 - iter 1246/1786 - loss 0.06445329 - time (sec): 62.61 - samples/sec: 2795.13 - lr: 0.000035 - momentum: 0.000000
2023-10-25 15:52:46,539 epoch 4 - iter 1424/1786 - loss 0.06499429 - time (sec): 71.80 - samples/sec: 2758.74 - lr: 0.000034 - momentum: 0.000000
2023-10-25 15:52:55,747 epoch 4 - iter 1602/1786 - loss 0.06306222 - time (sec): 81.01 - samples/sec: 2771.19 - lr: 0.000034 - momentum: 0.000000
2023-10-25 15:53:04,431 epoch 4 - iter 1780/1786 - loss 0.06282893 - time (sec): 89.70 - samples/sec: 2767.02 - lr: 0.000033 - momentum: 0.000000
2023-10-25 15:53:04,706 ----------------------------------------------------------------------------------------------------
2023-10-25 15:53:04,707 EPOCH 4 done: loss 0.0630 - lr: 0.000033
2023-10-25 15:53:09,579 DEV : loss 0.18551814556121826 - f1-score (micro avg) 0.7519
2023-10-25 15:53:09,603 ----------------------------------------------------------------------------------------------------
2023-10-25 15:53:18,813 epoch 5 - iter 178/1786 - loss 0.04430229 - time (sec): 9.21 - samples/sec: 2578.40 - lr: 0.000033 - momentum: 0.000000
2023-10-25 15:53:28,613 epoch 5 - iter 356/1786 - loss 0.04196870 - time (sec): 19.01 - samples/sec: 2602.50 - lr: 0.000032 - momentum: 0.000000
2023-10-25 15:53:38,464 epoch 5 - iter 534/1786 - loss 0.04425551 - time (sec): 28.86 - samples/sec: 2605.34 - lr: 0.000032 - momentum: 0.000000
2023-10-25 15:53:47,762 epoch 5 - iter 712/1786 - loss 0.04299756 - time (sec): 38.16 - samples/sec: 2606.77 - lr: 0.000031 - momentum: 0.000000
2023-10-25 15:53:56,812 epoch 5 - iter 890/1786 - loss 0.04398595 - time (sec): 47.21 - samples/sec: 2619.00 - lr: 0.000031 - momentum: 0.000000
2023-10-25 15:54:05,975 epoch 5 - iter 1068/1786 - loss 0.04465443 - time (sec): 56.37 - samples/sec: 2659.82 - lr: 0.000030 - momentum: 0.000000
2023-10-25 15:54:14,748 epoch 5 - iter 1246/1786 - loss 0.04413175 - time (sec): 65.14 - samples/sec: 2672.35 - lr: 0.000029 - momentum: 0.000000
2023-10-25 15:54:23,729 epoch 5 - iter 1424/1786 - loss 0.04310012 - time (sec): 74.12 - samples/sec: 2673.94 - lr: 0.000029 - momentum: 0.000000
2023-10-25 15:54:32,836 epoch 5 - iter 1602/1786 - loss 0.04362101 - time (sec): 83.23 - samples/sec: 2677.74 - lr: 0.000028 - momentum: 0.000000
2023-10-25 15:54:42,313 epoch 5 - iter 1780/1786 - loss 0.04499036 - time (sec): 92.71 - samples/sec: 2675.55 - lr: 0.000028 - momentum: 0.000000
2023-10-25 15:54:42,605 ----------------------------------------------------------------------------------------------------
2023-10-25 15:54:42,605 EPOCH 5 done: loss 0.0449 - lr: 0.000028
2023-10-25 15:54:46,479 DEV : loss 0.18028688430786133 - f1-score (micro avg) 0.8033
2023-10-25 15:54:46,503 saving best model
2023-10-25 15:54:47,160 ----------------------------------------------------------------------------------------------------
2023-10-25 15:54:56,791 epoch 6 - iter 178/1786 - loss 0.02633064 - time (sec): 9.63 - samples/sec: 2614.15 - lr: 0.000027 - momentum: 0.000000
2023-10-25 15:55:06,138 epoch 6 - iter 356/1786 - loss 0.02540360 - time (sec): 18.97 - samples/sec: 2572.31 - lr: 0.000027 - momentum: 0.000000
2023-10-25 15:55:15,646 epoch 6 - iter 534/1786 - loss 0.02731547 - time (sec): 28.48 - samples/sec: 2601.99 - lr: 0.000026 - momentum: 0.000000
2023-10-25 15:55:25,066 epoch 6 - iter 712/1786 - loss 0.03161683 - time (sec): 37.90 - samples/sec: 2616.15 - lr: 0.000026 - momentum: 0.000000
2023-10-25 15:55:34,736 epoch 6 - iter 890/1786 - loss 0.03258884 - time (sec): 47.57 - samples/sec: 2632.34 - lr: 0.000025 - momentum: 0.000000
2023-10-25 15:55:44,052 epoch 6 - iter 1068/1786 - loss 0.03444115 - time (sec): 56.89 - samples/sec: 2608.85 - lr: 0.000024 - momentum: 0.000000
2023-10-25 15:55:53,428 epoch 6 - iter 1246/1786 - loss 0.03455081 - time (sec): 66.27 - samples/sec: 2624.54 - lr: 0.000024 - momentum: 0.000000
2023-10-25 15:56:02,885 epoch 6 - iter 1424/1786 - loss 0.03507010 - time (sec): 75.72 - samples/sec: 2631.78 - lr: 0.000023 - momentum: 0.000000
2023-10-25 15:56:11,781 epoch 6 - iter 1602/1786 - loss 0.03617707 - time (sec): 84.62 - samples/sec: 2646.56 - lr: 0.000023 - momentum: 0.000000
2023-10-25 15:56:20,674 epoch 6 - iter 1780/1786 - loss 0.03554486 - time (sec): 93.51 - samples/sec: 2654.32 - lr: 0.000022 - momentum: 0.000000
2023-10-25 15:56:20,971 ----------------------------------------------------------------------------------------------------
2023-10-25 15:56:20,971 EPOCH 6 done: loss 0.0357 - lr: 0.000022
2023-10-25 15:56:25,780 DEV : loss 0.1820164918899536 - f1-score (micro avg) 0.7943
2023-10-25 15:56:25,801 ----------------------------------------------------------------------------------------------------
2023-10-25 15:56:35,262 epoch 7 - iter 178/1786 - loss 0.02389501 - time (sec): 9.46 - samples/sec: 2817.92 - lr: 0.000022 - momentum: 0.000000
2023-10-25 15:56:44,521 epoch 7 - iter 356/1786 - loss 0.03026582 - time (sec): 18.72 - samples/sec: 2707.42 - lr: 0.000021 - momentum: 0.000000
2023-10-25 15:56:53,894 epoch 7 - iter 534/1786 - loss 0.03128907 - time (sec): 28.09 - samples/sec: 2670.28 - lr: 0.000021 - momentum: 0.000000
2023-10-25 15:57:03,184 epoch 7 - iter 712/1786 - loss 0.03038936 - time (sec): 37.38 - samples/sec: 2681.89 - lr: 0.000020 - momentum: 0.000000
2023-10-25 15:57:12,520 epoch 7 - iter 890/1786 - loss 0.02884850 - time (sec): 46.72 - samples/sec: 2668.10 - lr: 0.000019 - momentum: 0.000000
2023-10-25 15:57:21,892 epoch 7 - iter 1068/1786 - loss 0.02871245 - time (sec): 56.09 - samples/sec: 2650.46 - lr: 0.000019 - momentum: 0.000000
2023-10-25 15:57:31,104 epoch 7 - iter 1246/1786 - loss 0.02822760 - time (sec): 65.30 - samples/sec: 2629.02 - lr: 0.000018 - momentum: 0.000000
2023-10-25 15:57:40,635 epoch 7 - iter 1424/1786 - loss 0.02843002 - time (sec): 74.83 - samples/sec: 2645.70 - lr: 0.000018 - momentum: 0.000000
2023-10-25 15:57:49,434 epoch 7 - iter 1602/1786 - loss 0.02811501 - time (sec): 83.63 - samples/sec: 2659.82 - lr: 0.000017 - momentum: 0.000000
2023-10-25 15:57:58,178 epoch 7 - iter 1780/1786 - loss 0.02826981 - time (sec): 92.38 - samples/sec: 2685.53 - lr: 0.000017 - momentum: 0.000000
2023-10-25 15:57:58,469 ----------------------------------------------------------------------------------------------------
2023-10-25 15:57:58,470 EPOCH 7 done: loss 0.0282 - lr: 0.000017
2023-10-25 15:58:03,620 DEV : loss 0.2033814787864685 - f1-score (micro avg) 0.7832
2023-10-25 15:58:03,643 ----------------------------------------------------------------------------------------------------
2023-10-25 15:58:13,174 epoch 8 - iter 178/1786 - loss 0.02698386 - time (sec): 9.53 - samples/sec: 2513.72 - lr: 0.000016 - momentum: 0.000000
2023-10-25 15:58:23,042 epoch 8 - iter 356/1786 - loss 0.02189818 - time (sec): 19.40 - samples/sec: 2504.44 - lr: 0.000016 - momentum: 0.000000
2023-10-25 15:58:32,872 epoch 8 - iter 534/1786 - loss 0.02241369 - time (sec): 29.23 - samples/sec: 2531.08 - lr: 0.000015 - momentum: 0.000000
2023-10-25 15:58:42,433 epoch 8 - iter 712/1786 - loss 0.02213136 - time (sec): 38.79 - samples/sec: 2517.54 - lr: 0.000014 - momentum: 0.000000
2023-10-25 15:58:52,151 epoch 8 - iter 890/1786 - loss 0.02156072 - time (sec): 48.51 - samples/sec: 2522.71 - lr: 0.000014 - momentum: 0.000000
2023-10-25 15:59:01,830 epoch 8 - iter 1068/1786 - loss 0.02028695 - time (sec): 58.18 - samples/sec: 2539.64 - lr: 0.000013 - momentum: 0.000000
2023-10-25 15:59:11,492 epoch 8 - iter 1246/1786 - loss 0.01971787 - time (sec): 67.85 - samples/sec: 2567.47 - lr: 0.000013 - momentum: 0.000000
2023-10-25 15:59:21,218 epoch 8 - iter 1424/1786 - loss 0.01980862 - time (sec): 77.57 - samples/sec: 2547.94 - lr: 0.000012 - momentum: 0.000000
2023-10-25 15:59:30,611 epoch 8 - iter 1602/1786 - loss 0.02006421 - time (sec): 86.97 - samples/sec: 2557.81 - lr: 0.000012 - momentum: 0.000000
2023-10-25 15:59:39,890 epoch 8 - iter 1780/1786 - loss 0.01964747 - time (sec): 96.25 - samples/sec: 2577.09 - lr: 0.000011 - momentum: 0.000000
2023-10-25 15:59:40,217 ----------------------------------------------------------------------------------------------------
2023-10-25 15:59:40,217 EPOCH 8 done: loss 0.0197 - lr: 0.000011
2023-10-25 15:59:44,159 DEV : loss 0.21475903689861298 - f1-score (micro avg) 0.79
2023-10-25 15:59:44,183 ----------------------------------------------------------------------------------------------------
2023-10-25 15:59:53,943 epoch 9 - iter 178/1786 - loss 0.00854585 - time (sec): 9.76 - samples/sec: 2529.14 - lr: 0.000011 - momentum: 0.000000
2023-10-25 16:00:03,415 epoch 9 - iter 356/1786 - loss 0.00730957 - time (sec): 19.23 - samples/sec: 2546.52 - lr: 0.000010 - momentum: 0.000000
2023-10-25 16:00:12,902 epoch 9 - iter 534/1786 - loss 0.00824375 - time (sec): 28.72 - samples/sec: 2565.61 - lr: 0.000009 - momentum: 0.000000
2023-10-25 16:00:22,790 epoch 9 - iter 712/1786 - loss 0.01124988 - time (sec): 38.61 - samples/sec: 2607.10 - lr: 0.000009 - momentum: 0.000000
2023-10-25 16:00:32,365 epoch 9 - iter 890/1786 - loss 0.01242606 - time (sec): 48.18 - samples/sec: 2611.93 - lr: 0.000008 - momentum: 0.000000
2023-10-25 16:00:41,872 epoch 9 - iter 1068/1786 - loss 0.01230465 - time (sec): 57.69 - samples/sec: 2624.15 - lr: 0.000008 - momentum: 0.000000
2023-10-25 16:00:51,095 epoch 9 - iter 1246/1786 - loss 0.01297617 - time (sec): 66.91 - samples/sec: 2616.87 - lr: 0.000007 - momentum: 0.000000
2023-10-25 16:01:00,313 epoch 9 - iter 1424/1786 - loss 0.01268858 - time (sec): 76.13 - samples/sec: 2634.73 - lr: 0.000007 - momentum: 0.000000
2023-10-25 16:01:09,907 epoch 9 - iter 1602/1786 - loss 0.01258302 - time (sec): 85.72 - samples/sec: 2619.87 - lr: 0.000006 - momentum: 0.000000
2023-10-25 16:01:19,662 epoch 9 - iter 1780/1786 - loss 0.01218671 - time (sec): 95.48 - samples/sec: 2598.21 - lr: 0.000006 - momentum: 0.000000
2023-10-25 16:01:19,987 ----------------------------------------------------------------------------------------------------
2023-10-25 16:01:19,987 EPOCH 9 done: loss 0.0122 - lr: 0.000006
2023-10-25 16:01:25,365 DEV : loss 0.21478621661663055 - f1-score (micro avg) 0.7938
2023-10-25 16:01:25,386 ----------------------------------------------------------------------------------------------------
2023-10-25 16:01:34,658 epoch 10 - iter 178/1786 - loss 0.01114122 - time (sec): 9.27 - samples/sec: 2577.39 - lr: 0.000005 - momentum: 0.000000
2023-10-25 16:01:44,026 epoch 10 - iter 356/1786 - loss 0.00936457 - time (sec): 18.64 - samples/sec: 2663.26 - lr: 0.000004 - momentum: 0.000000
2023-10-25 16:01:53,778 epoch 10 - iter 534/1786 - loss 0.00753507 - time (sec): 28.39 - samples/sec: 2632.67 - lr: 0.000004 - momentum: 0.000000
2023-10-25 16:02:02,735 epoch 10 - iter 712/1786 - loss 0.00724194 - time (sec): 37.35 - samples/sec: 2675.25 - lr: 0.000003 - momentum: 0.000000
2023-10-25 16:02:12,223 epoch 10 - iter 890/1786 - loss 0.00681051 - time (sec): 46.84 - samples/sec: 2606.92 - lr: 0.000003 - momentum: 0.000000
2023-10-25 16:02:21,833 epoch 10 - iter 1068/1786 - loss 0.00691939 - time (sec): 56.45 - samples/sec: 2622.33 - lr: 0.000002 - momentum: 0.000000
2023-10-25 16:02:31,013 epoch 10 - iter 1246/1786 - loss 0.00730625 - time (sec): 65.63 - samples/sec: 2633.25 - lr: 0.000002 - momentum: 0.000000
2023-10-25 16:02:40,785 epoch 10 - iter 1424/1786 - loss 0.00715163 - time (sec): 75.40 - samples/sec: 2622.41 - lr: 0.000001 - momentum: 0.000000
2023-10-25 16:02:50,482 epoch 10 - iter 1602/1786 - loss 0.00713533 - time (sec): 85.09 - samples/sec: 2603.48 - lr: 0.000001 - momentum: 0.000000
2023-10-25 16:03:00,096 epoch 10 - iter 1780/1786 - loss 0.00759595 - time (sec): 94.71 - samples/sec: 2619.46 - lr: 0.000000 - momentum: 0.000000
2023-10-25 16:03:00,416 ----------------------------------------------------------------------------------------------------
2023-10-25 16:03:00,416 EPOCH 10 done: loss 0.0076 - lr: 0.000000
2023-10-25 16:03:04,801 DEV : loss 0.2166852504014969 - f1-score (micro avg) 0.7957
2023-10-25 16:03:06,067 ----------------------------------------------------------------------------------------------------
2023-10-25 16:03:06,068 Loading model from best epoch ...
2023-10-25 16:03:07,869 SequenceTagger predicts: Dictionary with 17 tags: O, S-PER, B-PER, E-PER, I-PER, S-LOC, B-LOC, E-LOC, I-LOC, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
2023-10-25 16:03:20,794 Results:
- F-score (micro) 0.6746
- F-score (macro) 0.5843
- Accuracy 0.521

By class:
              precision    recall  f1-score   support

         LOC     0.6880    0.6484    0.6676      1095
         PER     0.7798    0.7559    0.7677      1012
         ORG     0.4706    0.4706    0.4706       357
   HumanProd     0.3188    0.6667    0.4314        33

   micro avg     0.6827    0.6668    0.6746      2497
   macro avg     0.5643    0.6354    0.5843      2497
weighted avg     0.6892    0.6668    0.6769      2497

2023-10-25 16:03:20,794 ----------------------------------------------------------------------------------------------------
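The learning-rate column and the final scores above can be cross-checked with a short script. Everything in it is inferred from the log itself (LinearScheduler with warmup_fraction 0.1, 1786 iterations per epoch over 10 epochs, peak learning rate 5e-05, and the per-class test table); it is a sanity-check sketch, not part of the original training code, and `lr_at` is a hypothetical helper reproducing the schedule, not a Flair API.

```python
# Sanity checks against the training log above. All constants are read off the log.
PEAK_LR = 5e-5
ITERS_PER_EPOCH = 1786
TOTAL_STEPS = 10 * ITERS_PER_EPOCH          # 10 epochs
WARMUP_STEPS = int(0.1 * TOTAL_STEPS)       # warmup_fraction: '0.1' -> 1786 steps

def lr_at(step: int) -> float:
    """Linear warmup to PEAK_LR over WARMUP_STEPS, then linear decay to 0."""
    if step <= WARMUP_STEPS:
        return PEAK_LR * step / WARMUP_STEPS
    return PEAK_LR * (TOTAL_STEPS - step) / (TOTAL_STEPS - WARMUP_STEPS)

# Matches the logged lr values (rounded to 6 decimals, as in the log):
assert round(lr_at(178), 6) == 0.000005                      # epoch 1, iter 178
assert round(lr_at(1780), 6) == 0.000050                     # epoch 1, iter 1780
assert round(lr_at(ITERS_PER_EPOCH + 178), 6) == 0.000049    # epoch 2, iter 178

# Per-class test results from the final table: (precision, recall, support).
per_class = {
    "LOC": (0.6880, 0.6484, 1095),
    "PER": (0.7798, 0.7559, 1012),
    "ORG": (0.4706, 0.4706, 357),
    "HumanProd": (0.3188, 0.6667, 33),
}

# Recover integer counts: true positives = recall * support,
# predicted spans = true positives / precision.
tp = sum(round(r * n) for p, r, n in per_class.values())
pred = sum(round(round(r * n) / p) for p, r, n in per_class.values())
gold = sum(n for _, _, n in per_class.values())              # 2497

micro_p, micro_r = tp / pred, tp / gold
micro_f1 = 2 * micro_p * micro_r / (micro_p + micro_r)
macro_f1 = sum(2 * p * r / (p + r) for p, r, _ in per_class.values()) / len(per_class)

print(round(micro_f1, 4), round(macro_f1, 4))  # 0.6746 0.5843
```

Micro averaging pools true positives across classes before computing precision/recall, so the frequent LOC and PER classes dominate the headline 0.6746, while the macro average (0.5843) weights the weak ORG and HumanProd classes equally.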