2023-10-13 15:42:35,866 ----------------------------------------------------------------------------------------------------
2023-10-13 15:42:35,867 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=21, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-13 15:42:35,867 ----------------------------------------------------------------------------------------------------
2023-10-13 15:42:35,867 MultiCorpus: 5901 train + 1287 dev + 1505 test sentences
 - NER_HIPE_2022 Corpus: 5901 train + 1287 dev + 1505 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/hipe2020/fr/with_doc_seperator
2023-10-13 15:42:35,867 ----------------------------------------------------------------------------------------------------
2023-10-13 15:42:35,867 Train: 5901 sentences
2023-10-13 15:42:35,867 (train_with_dev=False, train_with_test=False)
2023-10-13 15:42:35,867 ----------------------------------------------------------------------------------------------------
2023-10-13 15:42:35,867 Training Params:
2023-10-13 15:42:35,867 - learning_rate: "5e-05"
2023-10-13 15:42:35,867 - mini_batch_size: "4"
2023-10-13 15:42:35,868 - max_epochs: "10"
2023-10-13 15:42:35,868 - shuffle: "True"
2023-10-13 15:42:35,868 ----------------------------------------------------------------------------------------------------
2023-10-13 15:42:35,868 Plugins:
2023-10-13 15:42:35,868 - LinearScheduler | warmup_fraction: '0.1'
2023-10-13 15:42:35,868 ----------------------------------------------------------------------------------------------------
2023-10-13 15:42:35,868 Final evaluation on model from best epoch (best-model.pt)
2023-10-13 15:42:35,868 - metric: "('micro avg', 'f1-score')"
2023-10-13 15:42:35,868 ----------------------------------------------------------------------------------------------------
2023-10-13 15:42:35,868 Computation:
2023-10-13 15:42:35,868 - compute on device: cuda:0
2023-10-13 15:42:35,868 - embedding storage: none
2023-10-13 15:42:35,868 ----------------------------------------------------------------------------------------------------
2023-10-13 15:42:35,868 Model training base path: "hmbench-hipe2020/fr-dbmdz/bert-base-historic-multilingual-cased-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-1"
2023-10-13 15:42:35,868 ----------------------------------------------------------------------------------------------------
2023-10-13 15:42:35,868 ----------------------------------------------------------------------------------------------------
2023-10-13 15:42:42,852 epoch 1 - iter 147/1476 - loss 2.30028384 - time (sec): 6.98 - samples/sec: 
2417.10 - lr: 0.000005 - momentum: 0.000000 2023-10-13 15:42:49,682 epoch 1 - iter 294/1476 - loss 1.45013254 - time (sec): 13.81 - samples/sec: 2399.46 - lr: 0.000010 - momentum: 0.000000 2023-10-13 15:42:57,042 epoch 1 - iter 441/1476 - loss 1.06928097 - time (sec): 21.17 - samples/sec: 2472.20 - lr: 0.000015 - momentum: 0.000000 2023-10-13 15:43:03,830 epoch 1 - iter 588/1476 - loss 0.89478293 - time (sec): 27.96 - samples/sec: 2410.19 - lr: 0.000020 - momentum: 0.000000 2023-10-13 15:43:10,679 epoch 1 - iter 735/1476 - loss 0.77703571 - time (sec): 34.81 - samples/sec: 2401.38 - lr: 0.000025 - momentum: 0.000000 2023-10-13 15:43:17,429 epoch 1 - iter 882/1476 - loss 0.69040845 - time (sec): 41.56 - samples/sec: 2383.42 - lr: 0.000030 - momentum: 0.000000 2023-10-13 15:43:24,222 epoch 1 - iter 1029/1476 - loss 0.62755418 - time (sec): 48.35 - samples/sec: 2364.76 - lr: 0.000035 - momentum: 0.000000 2023-10-13 15:43:30,951 epoch 1 - iter 1176/1476 - loss 0.57577336 - time (sec): 55.08 - samples/sec: 2354.56 - lr: 0.000040 - momentum: 0.000000 2023-10-13 15:43:38,279 epoch 1 - iter 1323/1476 - loss 0.52275812 - time (sec): 62.41 - samples/sec: 2389.60 - lr: 0.000045 - momentum: 0.000000 2023-10-13 15:43:45,281 epoch 1 - iter 1470/1476 - loss 0.48888637 - time (sec): 69.41 - samples/sec: 2388.74 - lr: 0.000050 - momentum: 0.000000 2023-10-13 15:43:45,558 ---------------------------------------------------------------------------------------------------- 2023-10-13 15:43:45,558 EPOCH 1 done: loss 0.4876 - lr: 0.000050 2023-10-13 15:43:51,735 DEV : loss 0.13950972259044647 - f1-score (micro avg) 0.6846 2023-10-13 15:43:51,763 saving best model 2023-10-13 15:43:52,181 ---------------------------------------------------------------------------------------------------- 2023-10-13 15:43:59,160 epoch 2 - iter 147/1476 - loss 0.15312363 - time (sec): 6.98 - samples/sec: 2436.22 - lr: 0.000049 - momentum: 0.000000 2023-10-13 15:44:05,796 epoch 2 - iter 294/1476 - loss 
0.14515788 - time (sec): 13.61 - samples/sec: 2295.82 - lr: 0.000049 - momentum: 0.000000 2023-10-13 15:44:12,599 epoch 2 - iter 441/1476 - loss 0.14926136 - time (sec): 20.42 - samples/sec: 2289.42 - lr: 0.000048 - momentum: 0.000000 2023-10-13 15:44:19,464 epoch 2 - iter 588/1476 - loss 0.14665798 - time (sec): 27.28 - samples/sec: 2311.85 - lr: 0.000048 - momentum: 0.000000 2023-10-13 15:44:26,111 epoch 2 - iter 735/1476 - loss 0.14371952 - time (sec): 33.93 - samples/sec: 2309.91 - lr: 0.000047 - momentum: 0.000000 2023-10-13 15:44:34,075 epoch 2 - iter 882/1476 - loss 0.14402450 - time (sec): 41.89 - samples/sec: 2392.34 - lr: 0.000047 - momentum: 0.000000 2023-10-13 15:44:41,076 epoch 2 - iter 1029/1476 - loss 0.14095397 - time (sec): 48.89 - samples/sec: 2394.34 - lr: 0.000046 - momentum: 0.000000 2023-10-13 15:44:47,968 epoch 2 - iter 1176/1476 - loss 0.14022068 - time (sec): 55.79 - samples/sec: 2387.41 - lr: 0.000046 - momentum: 0.000000 2023-10-13 15:44:54,873 epoch 2 - iter 1323/1476 - loss 0.13964413 - time (sec): 62.69 - samples/sec: 2391.33 - lr: 0.000045 - momentum: 0.000000 2023-10-13 15:45:01,710 epoch 2 - iter 1470/1476 - loss 0.13667497 - time (sec): 69.53 - samples/sec: 2385.92 - lr: 0.000044 - momentum: 0.000000 2023-10-13 15:45:01,973 ---------------------------------------------------------------------------------------------------- 2023-10-13 15:45:01,974 EPOCH 2 done: loss 0.1366 - lr: 0.000044 2023-10-13 15:45:13,154 DEV : loss 0.14815327525138855 - f1-score (micro avg) 0.783 2023-10-13 15:45:13,184 saving best model 2023-10-13 15:45:13,775 ---------------------------------------------------------------------------------------------------- 2023-10-13 15:45:20,684 epoch 3 - iter 147/1476 - loss 0.08626971 - time (sec): 6.90 - samples/sec: 2220.34 - lr: 0.000044 - momentum: 0.000000 2023-10-13 15:45:27,452 epoch 3 - iter 294/1476 - loss 0.08510510 - time (sec): 13.67 - samples/sec: 2295.99 - lr: 0.000043 - momentum: 0.000000 2023-10-13 
15:45:34,356 epoch 3 - iter 441/1476 - loss 0.09017739 - time (sec): 20.58 - samples/sec: 2360.38 - lr: 0.000043 - momentum: 0.000000 2023-10-13 15:45:41,425 epoch 3 - iter 588/1476 - loss 0.09251949 - time (sec): 27.64 - samples/sec: 2381.15 - lr: 0.000042 - momentum: 0.000000 2023-10-13 15:45:48,404 epoch 3 - iter 735/1476 - loss 0.09530762 - time (sec): 34.62 - samples/sec: 2408.54 - lr: 0.000042 - momentum: 0.000000 2023-10-13 15:45:54,862 epoch 3 - iter 882/1476 - loss 0.09482173 - time (sec): 41.08 - samples/sec: 2397.41 - lr: 0.000041 - momentum: 0.000000 2023-10-13 15:46:01,516 epoch 3 - iter 1029/1476 - loss 0.09283997 - time (sec): 47.74 - samples/sec: 2417.28 - lr: 0.000041 - momentum: 0.000000 2023-10-13 15:46:08,514 epoch 3 - iter 1176/1476 - loss 0.09432800 - time (sec): 54.73 - samples/sec: 2414.61 - lr: 0.000040 - momentum: 0.000000 2023-10-13 15:46:15,114 epoch 3 - iter 1323/1476 - loss 0.09378249 - time (sec): 61.33 - samples/sec: 2424.67 - lr: 0.000039 - momentum: 0.000000 2023-10-13 15:46:22,293 epoch 3 - iter 1470/1476 - loss 0.09168940 - time (sec): 68.51 - samples/sec: 2422.02 - lr: 0.000039 - momentum: 0.000000 2023-10-13 15:46:22,555 ---------------------------------------------------------------------------------------------------- 2023-10-13 15:46:22,556 EPOCH 3 done: loss 0.0916 - lr: 0.000039 2023-10-13 15:46:33,729 DEV : loss 0.16625124216079712 - f1-score (micro avg) 0.7842 2023-10-13 15:46:33,759 saving best model 2023-10-13 15:46:34,290 ---------------------------------------------------------------------------------------------------- 2023-10-13 15:46:40,939 epoch 4 - iter 147/1476 - loss 0.05685033 - time (sec): 6.65 - samples/sec: 2384.08 - lr: 0.000038 - momentum: 0.000000 2023-10-13 15:46:48,069 epoch 4 - iter 294/1476 - loss 0.06479773 - time (sec): 13.78 - samples/sec: 2436.98 - lr: 0.000038 - momentum: 0.000000 2023-10-13 15:46:55,648 epoch 4 - iter 441/1476 - loss 0.06082232 - time (sec): 21.35 - samples/sec: 2464.05 - lr: 
0.000037 - momentum: 0.000000 2023-10-13 15:47:02,562 epoch 4 - iter 588/1476 - loss 0.06505374 - time (sec): 28.27 - samples/sec: 2398.45 - lr: 0.000037 - momentum: 0.000000 2023-10-13 15:47:09,655 epoch 4 - iter 735/1476 - loss 0.06490679 - time (sec): 35.36 - samples/sec: 2358.79 - lr: 0.000036 - momentum: 0.000000 2023-10-13 15:47:16,482 epoch 4 - iter 882/1476 - loss 0.06350270 - time (sec): 42.19 - samples/sec: 2323.62 - lr: 0.000036 - momentum: 0.000000 2023-10-13 15:47:23,844 epoch 4 - iter 1029/1476 - loss 0.06399013 - time (sec): 49.55 - samples/sec: 2342.65 - lr: 0.000035 - momentum: 0.000000 2023-10-13 15:47:30,580 epoch 4 - iter 1176/1476 - loss 0.06430831 - time (sec): 56.29 - samples/sec: 2330.72 - lr: 0.000034 - momentum: 0.000000 2023-10-13 15:47:37,629 epoch 4 - iter 1323/1476 - loss 0.06572771 - time (sec): 63.33 - samples/sec: 2354.52 - lr: 0.000034 - momentum: 0.000000 2023-10-13 15:47:44,586 epoch 4 - iter 1470/1476 - loss 0.06583572 - time (sec): 70.29 - samples/sec: 2359.14 - lr: 0.000033 - momentum: 0.000000 2023-10-13 15:47:44,847 ---------------------------------------------------------------------------------------------------- 2023-10-13 15:47:44,848 EPOCH 4 done: loss 0.0658 - lr: 0.000033 2023-10-13 15:47:56,082 DEV : loss 0.17630523443222046 - f1-score (micro avg) 0.8153 2023-10-13 15:47:56,112 saving best model 2023-10-13 15:47:56,711 ---------------------------------------------------------------------------------------------------- 2023-10-13 15:48:03,253 epoch 5 - iter 147/1476 - loss 0.04485170 - time (sec): 6.54 - samples/sec: 2355.16 - lr: 0.000033 - momentum: 0.000000 2023-10-13 15:48:09,999 epoch 5 - iter 294/1476 - loss 0.03902816 - time (sec): 13.28 - samples/sec: 2369.02 - lr: 0.000032 - momentum: 0.000000 2023-10-13 15:48:17,002 epoch 5 - iter 441/1476 - loss 0.04504258 - time (sec): 20.29 - samples/sec: 2406.17 - lr: 0.000032 - momentum: 0.000000 2023-10-13 15:48:23,884 epoch 5 - iter 588/1476 - loss 0.04251002 - time 
(sec): 27.17 - samples/sec: 2378.67 - lr: 0.000031 - momentum: 0.000000 2023-10-13 15:48:31,087 epoch 5 - iter 735/1476 - loss 0.04220286 - time (sec): 34.37 - samples/sec: 2391.39 - lr: 0.000031 - momentum: 0.000000 2023-10-13 15:48:38,154 epoch 5 - iter 882/1476 - loss 0.04528667 - time (sec): 41.44 - samples/sec: 2388.13 - lr: 0.000030 - momentum: 0.000000 2023-10-13 15:48:45,300 epoch 5 - iter 1029/1476 - loss 0.04637233 - time (sec): 48.59 - samples/sec: 2356.76 - lr: 0.000029 - momentum: 0.000000 2023-10-13 15:48:52,720 epoch 5 - iter 1176/1476 - loss 0.04767054 - time (sec): 56.01 - samples/sec: 2372.28 - lr: 0.000029 - momentum: 0.000000 2023-10-13 15:48:59,761 epoch 5 - iter 1323/1476 - loss 0.04742459 - time (sec): 63.05 - samples/sec: 2367.60 - lr: 0.000028 - momentum: 0.000000 2023-10-13 15:49:06,688 epoch 5 - iter 1470/1476 - loss 0.04806456 - time (sec): 69.97 - samples/sec: 2371.06 - lr: 0.000028 - momentum: 0.000000 2023-10-13 15:49:06,946 ---------------------------------------------------------------------------------------------------- 2023-10-13 15:49:06,947 EPOCH 5 done: loss 0.0484 - lr: 0.000028 2023-10-13 15:49:18,117 DEV : loss 0.18819278478622437 - f1-score (micro avg) 0.7999 2023-10-13 15:49:18,147 ---------------------------------------------------------------------------------------------------- 2023-10-13 15:49:25,047 epoch 6 - iter 147/1476 - loss 0.03631651 - time (sec): 6.90 - samples/sec: 2179.78 - lr: 0.000027 - momentum: 0.000000 2023-10-13 15:49:31,878 epoch 6 - iter 294/1476 - loss 0.03341746 - time (sec): 13.73 - samples/sec: 2233.74 - lr: 0.000027 - momentum: 0.000000 2023-10-13 15:49:39,202 epoch 6 - iter 441/1476 - loss 0.03106010 - time (sec): 21.05 - samples/sec: 2342.54 - lr: 0.000026 - momentum: 0.000000 2023-10-13 15:49:46,275 epoch 6 - iter 588/1476 - loss 0.03516578 - time (sec): 28.13 - samples/sec: 2336.61 - lr: 0.000026 - momentum: 0.000000 2023-10-13 15:49:53,144 epoch 6 - iter 735/1476 - loss 0.03483987 - time 
(sec): 35.00 - samples/sec: 2348.33 - lr: 0.000025 - momentum: 0.000000 2023-10-13 15:50:00,155 epoch 6 - iter 882/1476 - loss 0.03313512 - time (sec): 42.01 - samples/sec: 2372.38 - lr: 0.000024 - momentum: 0.000000 2023-10-13 15:50:06,939 epoch 6 - iter 1029/1476 - loss 0.03286769 - time (sec): 48.79 - samples/sec: 2353.00 - lr: 0.000024 - momentum: 0.000000 2023-10-13 15:50:13,772 epoch 6 - iter 1176/1476 - loss 0.03254763 - time (sec): 55.62 - samples/sec: 2356.11 - lr: 0.000023 - momentum: 0.000000 2023-10-13 15:50:21,043 epoch 6 - iter 1323/1476 - loss 0.03345823 - time (sec): 62.89 - samples/sec: 2387.16 - lr: 0.000023 - momentum: 0.000000 2023-10-13 15:50:27,847 epoch 6 - iter 1470/1476 - loss 0.03287175 - time (sec): 69.70 - samples/sec: 2380.16 - lr: 0.000022 - momentum: 0.000000 2023-10-13 15:50:28,118 ---------------------------------------------------------------------------------------------------- 2023-10-13 15:50:28,119 EPOCH 6 done: loss 0.0328 - lr: 0.000022 2023-10-13 15:50:39,277 DEV : loss 0.20484893023967743 - f1-score (micro avg) 0.8029 2023-10-13 15:50:39,307 ---------------------------------------------------------------------------------------------------- 2023-10-13 15:50:46,072 epoch 7 - iter 147/1476 - loss 0.01775270 - time (sec): 6.76 - samples/sec: 2267.31 - lr: 0.000022 - momentum: 0.000000 2023-10-13 15:50:53,821 epoch 7 - iter 294/1476 - loss 0.02118488 - time (sec): 14.51 - samples/sec: 2336.41 - lr: 0.000021 - momentum: 0.000000 2023-10-13 15:51:00,455 epoch 7 - iter 441/1476 - loss 0.02237123 - time (sec): 21.15 - samples/sec: 2323.11 - lr: 0.000021 - momentum: 0.000000 2023-10-13 15:51:07,535 epoch 7 - iter 588/1476 - loss 0.02143611 - time (sec): 28.23 - samples/sec: 2322.19 - lr: 0.000020 - momentum: 0.000000 2023-10-13 15:51:14,428 epoch 7 - iter 735/1476 - loss 0.02320718 - time (sec): 35.12 - samples/sec: 2341.16 - lr: 0.000019 - momentum: 0.000000 2023-10-13 15:51:21,513 epoch 7 - iter 882/1476 - loss 0.02422687 - time 
(sec): 42.21 - samples/sec: 2384.20 - lr: 0.000019 - momentum: 0.000000 2023-10-13 15:51:28,587 epoch 7 - iter 1029/1476 - loss 0.02343024 - time (sec): 49.28 - samples/sec: 2402.37 - lr: 0.000018 - momentum: 0.000000 2023-10-13 15:51:35,674 epoch 7 - iter 1176/1476 - loss 0.02306815 - time (sec): 56.37 - samples/sec: 2392.98 - lr: 0.000018 - momentum: 0.000000 2023-10-13 15:51:42,416 epoch 7 - iter 1323/1476 - loss 0.02322261 - time (sec): 63.11 - samples/sec: 2374.09 - lr: 0.000017 - momentum: 0.000000 2023-10-13 15:51:49,345 epoch 7 - iter 1470/1476 - loss 0.02302954 - time (sec): 70.04 - samples/sec: 2368.47 - lr: 0.000017 - momentum: 0.000000 2023-10-13 15:51:49,612 ---------------------------------------------------------------------------------------------------- 2023-10-13 15:51:49,612 EPOCH 7 done: loss 0.0231 - lr: 0.000017 2023-10-13 15:52:00,780 DEV : loss 0.2176404744386673 - f1-score (micro avg) 0.8104 2023-10-13 15:52:00,810 ---------------------------------------------------------------------------------------------------- 2023-10-13 15:52:07,878 epoch 8 - iter 147/1476 - loss 0.01220283 - time (sec): 7.07 - samples/sec: 2310.70 - lr: 0.000016 - momentum: 0.000000 2023-10-13 15:52:14,683 epoch 8 - iter 294/1476 - loss 0.01477755 - time (sec): 13.87 - samples/sec: 2316.37 - lr: 0.000016 - momentum: 0.000000 2023-10-13 15:52:21,717 epoch 8 - iter 441/1476 - loss 0.01557002 - time (sec): 20.91 - samples/sec: 2387.69 - lr: 0.000015 - momentum: 0.000000 2023-10-13 15:52:28,637 epoch 8 - iter 588/1476 - loss 0.01728477 - time (sec): 27.83 - samples/sec: 2368.74 - lr: 0.000014 - momentum: 0.000000 2023-10-13 15:52:35,625 epoch 8 - iter 735/1476 - loss 0.01772118 - time (sec): 34.81 - samples/sec: 2350.82 - lr: 0.000014 - momentum: 0.000000 2023-10-13 15:52:42,709 epoch 8 - iter 882/1476 - loss 0.01840414 - time (sec): 41.90 - samples/sec: 2332.60 - lr: 0.000013 - momentum: 0.000000 2023-10-13 15:52:49,619 epoch 8 - iter 1029/1476 - loss 0.01749539 - time 
(sec): 48.81 - samples/sec: 2327.72 - lr: 0.000013 - momentum: 0.000000 2023-10-13 15:52:57,024 epoch 8 - iter 1176/1476 - loss 0.01800568 - time (sec): 56.21 - samples/sec: 2344.29 - lr: 0.000012 - momentum: 0.000000 2023-10-13 15:53:03,895 epoch 8 - iter 1323/1476 - loss 0.01708246 - time (sec): 63.08 - samples/sec: 2351.96 - lr: 0.000012 - momentum: 0.000000 2023-10-13 15:53:10,882 epoch 8 - iter 1470/1476 - loss 0.01733501 - time (sec): 70.07 - samples/sec: 2368.38 - lr: 0.000011 - momentum: 0.000000 2023-10-13 15:53:11,151 ---------------------------------------------------------------------------------------------------- 2023-10-13 15:53:11,151 EPOCH 8 done: loss 0.0173 - lr: 0.000011 2023-10-13 15:53:22,272 DEV : loss 0.20916695892810822 - f1-score (micro avg) 0.8181 2023-10-13 15:53:22,301 saving best model 2023-10-13 15:53:22,822 ---------------------------------------------------------------------------------------------------- 2023-10-13 15:53:29,893 epoch 9 - iter 147/1476 - loss 0.01635506 - time (sec): 7.07 - samples/sec: 2461.72 - lr: 0.000011 - momentum: 0.000000 2023-10-13 15:53:36,767 epoch 9 - iter 294/1476 - loss 0.01655047 - time (sec): 13.94 - samples/sec: 2426.64 - lr: 0.000010 - momentum: 0.000000 2023-10-13 15:53:43,754 epoch 9 - iter 441/1476 - loss 0.01517177 - time (sec): 20.93 - samples/sec: 2367.10 - lr: 0.000009 - momentum: 0.000000 2023-10-13 15:53:50,701 epoch 9 - iter 588/1476 - loss 0.01363129 - time (sec): 27.88 - samples/sec: 2364.14 - lr: 0.000009 - momentum: 0.000000 2023-10-13 15:53:57,583 epoch 9 - iter 735/1476 - loss 0.01340167 - time (sec): 34.76 - samples/sec: 2360.92 - lr: 0.000008 - momentum: 0.000000 2023-10-13 15:54:04,370 epoch 9 - iter 882/1476 - loss 0.01292839 - time (sec): 41.54 - samples/sec: 2347.88 - lr: 0.000008 - momentum: 0.000000 2023-10-13 15:54:11,324 epoch 9 - iter 1029/1476 - loss 0.01181704 - time (sec): 48.50 - samples/sec: 2370.28 - lr: 0.000007 - momentum: 0.000000 2023-10-13 15:54:18,449 epoch 9 
- iter 1176/1476 - loss 0.01179880 - time (sec): 55.62 - samples/sec: 2378.75 - lr: 0.000007 - momentum: 0.000000 2023-10-13 15:54:25,296 epoch 9 - iter 1323/1476 - loss 0.01152101 - time (sec): 62.47 - samples/sec: 2384.12 - lr: 0.000006 - momentum: 0.000000 2023-10-13 15:54:32,279 epoch 9 - iter 1470/1476 - loss 0.01152718 - time (sec): 69.45 - samples/sec: 2389.38 - lr: 0.000006 - momentum: 0.000000 2023-10-13 15:54:32,541 ---------------------------------------------------------------------------------------------------- 2023-10-13 15:54:32,541 EPOCH 9 done: loss 0.0115 - lr: 0.000006 2023-10-13 15:54:43,751 DEV : loss 0.22271640598773956 - f1-score (micro avg) 0.8152 2023-10-13 15:54:43,780 ---------------------------------------------------------------------------------------------------- 2023-10-13 15:54:50,626 epoch 10 - iter 147/1476 - loss 0.01146130 - time (sec): 6.84 - samples/sec: 2359.28 - lr: 0.000005 - momentum: 0.000000 2023-10-13 15:54:58,158 epoch 10 - iter 294/1476 - loss 0.00823611 - time (sec): 14.38 - samples/sec: 2480.56 - lr: 0.000004 - momentum: 0.000000 2023-10-13 15:55:05,151 epoch 10 - iter 441/1476 - loss 0.00776873 - time (sec): 21.37 - samples/sec: 2419.29 - lr: 0.000004 - momentum: 0.000000 2023-10-13 15:55:12,220 epoch 10 - iter 588/1476 - loss 0.00660613 - time (sec): 28.44 - samples/sec: 2368.38 - lr: 0.000003 - momentum: 0.000000 2023-10-13 15:55:18,963 epoch 10 - iter 735/1476 - loss 0.00628384 - time (sec): 35.18 - samples/sec: 2351.18 - lr: 0.000003 - momentum: 0.000000 2023-10-13 15:55:25,778 epoch 10 - iter 882/1476 - loss 0.00654294 - time (sec): 42.00 - samples/sec: 2333.14 - lr: 0.000002 - momentum: 0.000000 2023-10-13 15:55:33,071 epoch 10 - iter 1029/1476 - loss 0.00618452 - time (sec): 49.29 - samples/sec: 2340.88 - lr: 0.000002 - momentum: 0.000000 2023-10-13 15:55:40,251 epoch 10 - iter 1176/1476 - loss 0.00649196 - time (sec): 56.47 - samples/sec: 2335.05 - lr: 0.000001 - momentum: 0.000000 2023-10-13 15:55:47,188 
epoch 10 - iter 1323/1476 - loss 0.00610162 - time (sec): 63.41 - samples/sec: 2332.22 - lr: 0.000001 - momentum: 0.000000
2023-10-13 15:55:54,348 epoch 10 - iter 1470/1476 - loss 0.00601101 - time (sec): 70.57 - samples/sec: 2353.06 - lr: 0.000000 - momentum: 0.000000
2023-10-13 15:55:54,607 ----------------------------------------------------------------------------------------------------
2023-10-13 15:55:54,607 EPOCH 10 done: loss 0.0060 - lr: 0.000000
2023-10-13 15:56:05,754 DEV : loss 0.22833691537380219 - f1-score (micro avg) 0.8171
2023-10-13 15:56:06,208 ----------------------------------------------------------------------------------------------------
2023-10-13 15:56:06,209 Loading model from best epoch ...
2023-10-13 15:56:07,748 SequenceTagger predicts: Dictionary with 21 tags: O, S-loc, B-loc, E-loc, I-loc, S-pers, B-pers, E-pers, I-pers, S-org, B-org, E-org, I-org, S-time, B-time, E-time, I-time, S-prod, B-prod, E-prod, I-prod
2023-10-13 15:56:13,701 
Results:
- F-score (micro) 0.7761
- F-score (macro) 0.6771
- Accuracy 0.6563

By class:
              precision    recall  f1-score   support

         loc     0.8328    0.8590    0.8457       858
        pers     0.7347    0.7840    0.7586       537
         org     0.5094    0.6136    0.5567       132
        time     0.5397    0.6296    0.5812        54
        prod     0.6852    0.6066    0.6435        61

   micro avg     0.7555    0.7978    0.7761      1642
   macro avg     0.6604    0.6986    0.6771      1642
weighted avg     0.7596    0.7978    0.7777      1642

2023-10-13 15:56:13,701 ----------------------------------------------------------------------------------------------------
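
Note: the summary scores in the final results table can be cross-checked against the per-class rows. The sketch below is not part of the training run and does not use Flair; it recomputes the macro, weighted, and micro F1 from the rounded values printed in the log, assuming the usual definitions (macro = unweighted mean of class F1, weighted = support-weighted mean, micro F1 = harmonic mean of the micro-averaged precision and recall). The names `BY_CLASS`, `f1_from_pr`, and `summarize` are illustrative only.

```python
# Per-class (precision, recall, f1, support), copied from the "By class" table.
BY_CLASS = {
    "loc":  (0.8328, 0.8590, 0.8457, 858),
    "pers": (0.7347, 0.7840, 0.7586, 537),
    "org":  (0.5094, 0.6136, 0.5567, 132),
    "time": (0.5397, 0.6296, 0.5812, 54),
    "prod": (0.6852, 0.6066, 0.6435, 61),
}

# Micro-averaged precision and recall, copied from the "micro avg" row.
MICRO_PRECISION, MICRO_RECALL = 0.7555, 0.7978


def f1_from_pr(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


def summarize(by_class: dict) -> dict:
    """Recompute the summary rows from the per-class rows."""
    total_support = sum(s for _, _, _, s in by_class.values())
    macro_f1 = sum(f for _, _, f, _ in by_class.values()) / len(by_class)
    weighted_f1 = sum(f * s for _, _, f, s in by_class.values()) / total_support
    return {
        "support": total_support,
        "macro_f1": macro_f1,
        "weighted_f1": weighted_f1,
        "micro_f1": f1_from_pr(MICRO_PRECISION, MICRO_RECALL),
    }


if __name__ == "__main__":
    for name, value in summarize(BY_CLASS).items():
        print(f"{name}: {value:.4f}" if isinstance(value, float) else f"{name}: {value}")
```

Because the inputs are already rounded to four decimals, the recomputed values can differ from the logged ones in the last digit.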