2023-10-13 18:47:26,353 ----------------------------------------------------------------------------------------------------
2023-10-13 18:47:26,354 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=21, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-13 18:47:26,354 ----------------------------------------------------------------------------------------------------
2023-10-13 18:47:26,354 MultiCorpus: 5901 train + 1287 dev + 1505 test sentences
 - NER_HIPE_2022 Corpus: 5901 train + 1287 dev + 1505 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/hipe2020/fr/with_doc_seperator
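Note on the model summary above: the final classification head is `Linear(in_features=768, out_features=21, bias=True)`, which matches a BIOES tagset over the five entity types this corpus uses (loc, pers, org, time, prod — listed in full at the end of the log). A minimal sketch of how that count arises, in plain Python with no Flair dependency (the helper name `bioes_tags` is ours, not Flair's):

```python
# Sketch: size of a BIOES tagset for the 5 entity types in this log.
# 4 positional prefixes (S, B, E, I) per type, plus the "O" (outside) tag.
ENTITY_TYPES = ["loc", "pers", "org", "time", "prod"]
PREFIXES = ["S", "B", "E", "I"]  # order as printed by the tag dictionary below

def bioes_tags(types):
    """Enumerate BIOES tags: 'O' plus one prefix-type tag per combination."""
    tags = ["O"]
    for t in types:
        tags.extend(f"{p}-{t}" for p in PREFIXES)
    return tags

print(len(bioes_tags(ENTITY_TYPES)))  # → 21, matching out_features=21
```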
2023-10-13 18:47:26,354 ----------------------------------------------------------------------------------------------------
2023-10-13 18:47:26,354 Train:  5901 sentences
2023-10-13 18:47:26,354         (train_with_dev=False, train_with_test=False)
2023-10-13 18:47:26,354 ----------------------------------------------------------------------------------------------------
2023-10-13 18:47:26,354 Training Params:
2023-10-13 18:47:26,354  - learning_rate: "3e-05"
2023-10-13 18:47:26,354  - mini_batch_size: "4"
2023-10-13 18:47:26,354  - max_epochs: "10"
2023-10-13 18:47:26,354  - shuffle: "True"
2023-10-13 18:47:26,354 ----------------------------------------------------------------------------------------------------
2023-10-13 18:47:26,354 Plugins:
2023-10-13 18:47:26,355  - LinearScheduler | warmup_fraction: '0.1'
2023-10-13 18:47:26,355 ----------------------------------------------------------------------------------------------------
2023-10-13 18:47:26,355 Final evaluation on model from best epoch (best-model.pt)
2023-10-13 18:47:26,355  - metric: "('micro avg', 'f1-score')"
2023-10-13 18:47:26,355 ----------------------------------------------------------------------------------------------------
2023-10-13 18:47:26,355 Computation:
2023-10-13 18:47:26,355  - compute on device: cuda:0
2023-10-13 18:47:26,355  - embedding storage: none
2023-10-13 18:47:26,355 ----------------------------------------------------------------------------------------------------
2023-10-13 18:47:26,355 Model training base path: "hmbench-hipe2020/fr-dbmdz/bert-base-historic-multilingual-cased-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-5"
2023-10-13 18:47:26,355 ----------------------------------------------------------------------------------------------------
2023-10-13 18:47:26,355 ----------------------------------------------------------------------------------------------------
2023-10-13 18:47:33,216 epoch 1 - iter 147/1476 - loss 2.47203040 - time (sec): 6.86 - samples/sec: 2369.63 - lr: 0.000003 - momentum: 0.000000
2023-10-13 18:47:40,117 epoch 1 - iter 294/1476 - loss 1.53224631 - time (sec): 13.76 - samples/sec: 2378.60 - lr: 0.000006 - momentum: 0.000000
2023-10-13 18:47:46,944 epoch 1 - iter 441/1476 - loss 1.16457223 - time (sec): 20.59 - samples/sec: 2370.35 - lr: 0.000009 - momentum: 0.000000
2023-10-13 18:47:53,832 epoch 1 - iter 588/1476 - loss 0.95570083 - time (sec): 27.48 - samples/sec: 2364.52 - lr: 0.000012 - momentum: 0.000000
2023-10-13 18:48:01,003 epoch 1 - iter 735/1476 - loss 0.82886372 - time (sec): 34.65 - samples/sec: 2368.14 - lr: 0.000015 - momentum: 0.000000
2023-10-13 18:48:07,676 epoch 1 - iter 882/1476 - loss 0.73918529 - time (sec): 41.32 - samples/sec: 2345.23 - lr: 0.000018 - momentum: 0.000000
2023-10-13 18:48:14,768 epoch 1 - iter 1029/1476 - loss 0.66297514 - time (sec): 48.41 - samples/sec: 2363.63 - lr: 0.000021 - momentum: 0.000000
2023-10-13 18:48:21,818 epoch 1 - iter 1176/1476 - loss 0.60069633 - time (sec): 55.46 - samples/sec: 2382.71 - lr: 0.000024 - momentum: 0.000000
2023-10-13 18:48:28,594 epoch 1 - iter 1323/1476 - loss 0.55590049 - time (sec): 62.24 - samples/sec: 2387.70 - lr: 0.000027 - momentum: 0.000000
2023-10-13 18:48:35,652 epoch 1 - iter 1470/1476 - loss 0.51835277 - time (sec): 69.30 - samples/sec: 2393.10 - lr: 0.000030 - momentum: 0.000000
2023-10-13 18:48:35,906 ----------------------------------------------------------------------------------------------------
2023-10-13 18:48:35,906 EPOCH 1 done: loss 0.5173 - lr: 0.000030
2023-10-13 18:48:42,033 DEV : loss 0.1586601436138153 - f1-score (micro avg)  0.6953
2023-10-13 18:48:42,061 saving best model
2023-10-13 18:48:42,528 ----------------------------------------------------------------------------------------------------
2023-10-13 18:48:49,482 epoch 2 - iter 147/1476 - loss 0.13516442 - time (sec): 6.95 - samples/sec: 2412.34 - lr: 0.000030 - momentum: 0.000000
2023-10-13 18:48:56,332 epoch 2 - iter 294/1476 - loss 0.13649252 - time (sec): 13.80 - samples/sec: 2404.93 - lr: 0.000029 - momentum: 0.000000
2023-10-13 18:49:03,501 epoch 2 - iter 441/1476 - loss 0.13545397 - time (sec): 20.97 - samples/sec: 2383.51 - lr: 0.000029 - momentum: 0.000000
2023-10-13 18:49:10,326 epoch 2 - iter 588/1476 - loss 0.13055225 - time (sec): 27.80 - samples/sec: 2354.64 - lr: 0.000029 - momentum: 0.000000
2023-10-13 18:49:17,578 epoch 2 - iter 735/1476 - loss 0.12511794 - time (sec): 35.05 - samples/sec: 2390.45 - lr: 0.000028 - momentum: 0.000000
2023-10-13 18:49:25,271 epoch 2 - iter 882/1476 - loss 0.12920863 - time (sec): 42.74 - samples/sec: 2446.74 - lr: 0.000028 - momentum: 0.000000
2023-10-13 18:49:31,880 epoch 2 - iter 1029/1476 - loss 0.12753320 - time (sec): 49.35 - samples/sec: 2427.35 - lr: 0.000028 - momentum: 0.000000
2023-10-13 18:49:38,827 epoch 2 - iter 1176/1476 - loss 0.12657849 - time (sec): 56.30 - samples/sec: 2430.44 - lr: 0.000027 - momentum: 0.000000
2023-10-13 18:49:45,338 epoch 2 - iter 1323/1476 - loss 0.12667487 - time (sec): 62.81 - samples/sec: 2406.92 - lr: 0.000027 - momentum: 0.000000
2023-10-13 18:49:52,047 epoch 2 - iter 1470/1476 - loss 0.12663562 - time (sec): 69.52 - samples/sec: 2388.22 - lr: 0.000027 - momentum: 0.000000
2023-10-13 18:49:52,313 ----------------------------------------------------------------------------------------------------
2023-10-13 18:49:52,313 EPOCH 2 done: loss 0.1266 - lr: 0.000027
2023-10-13 18:50:03,449 DEV : loss 0.13426746428012848 - f1-score (micro avg)  0.7842
2023-10-13 18:50:03,480 saving best model
2023-10-13 18:50:04,015 ----------------------------------------------------------------------------------------------------
2023-10-13 18:50:11,260 epoch 3 - iter 147/1476 - loss 0.06400595 - time (sec): 7.24 - samples/sec: 2559.29 - lr: 0.000026 - momentum: 0.000000
2023-10-13 18:50:18,213 epoch 3 - iter 294/1476 - loss 0.06628312 - time (sec): 14.20 - samples/sec: 2483.12 - lr: 0.000026 - momentum: 0.000000
2023-10-13 18:50:25,140 epoch 3 - iter 441/1476 - loss 0.07296170 - time (sec): 21.12 - samples/sec: 2463.79 - lr: 0.000026 - momentum: 0.000000
2023-10-13 18:50:32,364 epoch 3 - iter 588/1476 - loss 0.08168640 - time (sec): 28.35 - samples/sec: 2478.57 - lr: 0.000025 - momentum: 0.000000
2023-10-13 18:50:39,055 epoch 3 - iter 735/1476 - loss 0.08289348 - time (sec): 35.04 - samples/sec: 2447.79 - lr: 0.000025 - momentum: 0.000000
2023-10-13 18:50:45,924 epoch 3 - iter 882/1476 - loss 0.08210391 - time (sec): 41.91 - samples/sec: 2430.46 - lr: 0.000025 - momentum: 0.000000
2023-10-13 18:50:52,737 epoch 3 - iter 1029/1476 - loss 0.08197948 - time (sec): 48.72 - samples/sec: 2410.53 - lr: 0.000024 - momentum: 0.000000
2023-10-13 18:50:59,573 epoch 3 - iter 1176/1476 - loss 0.08119702 - time (sec): 55.56 - samples/sec: 2406.42 - lr: 0.000024 - momentum: 0.000000
2023-10-13 18:51:06,452 epoch 3 - iter 1323/1476 - loss 0.08162410 - time (sec): 62.44 - samples/sec: 2401.71 - lr: 0.000024 - momentum: 0.000000
2023-10-13 18:51:13,132 epoch 3 - iter 1470/1476 - loss 0.08267840 - time (sec): 69.12 - samples/sec: 2400.10 - lr: 0.000023 - momentum: 0.000000
2023-10-13 18:51:13,388 ----------------------------------------------------------------------------------------------------
2023-10-13 18:51:13,389 EPOCH 3 done: loss 0.0825 - lr: 0.000023
2023-10-13 18:51:24,588 DEV : loss 0.15468844771385193 - f1-score (micro avg)  0.7969
2023-10-13 18:51:24,619 saving best model
2023-10-13 18:51:25,121 ----------------------------------------------------------------------------------------------------
2023-10-13 18:51:31,983 epoch 4 - iter 147/1476 - loss 0.06136683 - time (sec): 6.86 - samples/sec: 2352.06 - lr: 0.000023 - momentum: 0.000000
2023-10-13 18:51:39,343 epoch 4 - iter 294/1476 - loss 0.06998431 - time (sec): 14.22 - samples/sec: 2509.18 - lr: 0.000023 - momentum: 0.000000
2023-10-13 18:51:46,299 epoch 4 - iter 441/1476 - loss 0.06564666 - time (sec): 21.17 - samples/sec: 2446.49 - lr: 0.000022 - momentum: 0.000000
2023-10-13 18:51:52,964 epoch 4 - iter 588/1476 - loss 0.06527905 - time (sec): 27.84 - samples/sec: 2380.86 - lr: 0.000022 - momentum: 0.000000
2023-10-13 18:51:59,887 epoch 4 - iter 735/1476 - loss 0.06299237 - time (sec): 34.76 - samples/sec: 2400.68 - lr: 0.000022 - momentum: 0.000000
2023-10-13 18:52:06,636 epoch 4 - iter 882/1476 - loss 0.06239020 - time (sec): 41.51 - samples/sec: 2398.39 - lr: 0.000021 - momentum: 0.000000
2023-10-13 18:52:13,344 epoch 4 - iter 1029/1476 - loss 0.06088924 - time (sec): 48.22 - samples/sec: 2377.28 - lr: 0.000021 - momentum: 0.000000
2023-10-13 18:52:20,126 epoch 4 - iter 1176/1476 - loss 0.05807939 - time (sec): 55.00 - samples/sec: 2374.78 - lr: 0.000021 - momentum: 0.000000
2023-10-13 18:52:27,021 epoch 4 - iter 1323/1476 - loss 0.05765390 - time (sec): 61.89 - samples/sec: 2371.42 - lr: 0.000020 - momentum: 0.000000
2023-10-13 18:52:34,328 epoch 4 - iter 1470/1476 - loss 0.05616269 - time (sec): 69.20 - samples/sec: 2396.99 - lr: 0.000020 - momentum: 0.000000
2023-10-13 18:52:34,590 ----------------------------------------------------------------------------------------------------
2023-10-13 18:52:34,590 EPOCH 4 done: loss 0.0563 - lr: 0.000020
2023-10-13 18:52:45,814 DEV : loss 0.17033860087394714 - f1-score (micro avg)  0.8106
2023-10-13 18:52:45,844 saving best model
2023-10-13 18:52:46,323 ----------------------------------------------------------------------------------------------------
2023-10-13 18:52:53,267 epoch 5 - iter 147/1476 - loss 0.04144502 - time (sec): 6.94 - samples/sec: 2423.87 - lr: 0.000020 - momentum: 0.000000
2023-10-13 18:52:59,900 epoch 5 - iter 294/1476 - loss 0.04606724 - time (sec): 13.57 - samples/sec: 2328.10 - lr: 0.000019 - momentum: 0.000000
2023-10-13 18:53:06,912 epoch 5 - iter 441/1476 - loss 0.03973065 - time (sec): 20.58 - samples/sec: 2344.48 - lr: 0.000019 - momentum: 0.000000
2023-10-13 18:53:13,730 epoch 5 - iter 588/1476 - loss 0.03622589 - time (sec): 27.40 - samples/sec: 2366.89 - lr: 0.000019 - momentum: 0.000000
2023-10-13 18:53:20,750 epoch 5 - iter 735/1476 - loss 0.03811454 - time (sec): 34.42 - samples/sec: 2385.48 - lr: 0.000018 - momentum: 0.000000
2023-10-13 18:53:27,625 epoch 5 - iter 882/1476 - loss 0.03708256 - time (sec): 41.30 - samples/sec: 2394.12 - lr: 0.000018 - momentum: 0.000000
2023-10-13 18:53:34,708 epoch 5 - iter 1029/1476 - loss 0.03955991 - time (sec): 48.38 - samples/sec: 2398.84 - lr: 0.000018 - momentum: 0.000000
2023-10-13 18:53:41,663 epoch 5 - iter 1176/1476 - loss 0.04041590 - time (sec): 55.34 - samples/sec: 2397.38 - lr: 0.000017 - momentum: 0.000000
2023-10-13 18:53:48,587 epoch 5 - iter 1323/1476 - loss 0.04037167 - time (sec): 62.26 - samples/sec: 2404.17 - lr: 0.000017 - momentum: 0.000000
2023-10-13 18:53:55,407 epoch 5 - iter 1470/1476 - loss 0.04036140 - time (sec): 69.08 - samples/sec: 2400.21 - lr: 0.000017 - momentum: 0.000000
2023-10-13 18:53:55,675 ----------------------------------------------------------------------------------------------------
2023-10-13 18:53:55,675 EPOCH 5 done: loss 0.0402 - lr: 0.000017
2023-10-13 18:54:06,901 DEV : loss 0.1789896935224533 - f1-score (micro avg)  0.8262
2023-10-13 18:54:06,932 saving best model
2023-10-13 18:54:07,408 ----------------------------------------------------------------------------------------------------
2023-10-13 18:54:14,802 epoch 6 - iter 147/1476 - loss 0.02954124 - time (sec): 7.39 - samples/sec: 2309.03 - lr: 0.000016 - momentum: 0.000000
2023-10-13 18:54:21,552 epoch 6 - iter 294/1476 - loss 0.02823268 - time (sec): 14.14 - samples/sec: 2296.57 - lr: 0.000016 - momentum: 0.000000
2023-10-13 18:54:28,596 epoch 6 - iter 441/1476 - loss 0.02553327 - time (sec): 21.18 - samples/sec: 2307.96 - lr: 0.000016 - momentum: 0.000000
2023-10-13 18:54:35,572 epoch 6 - iter 588/1476 - loss 0.02831420 - time (sec): 28.16 - samples/sec: 2319.15 - lr: 0.000015 - momentum: 0.000000
2023-10-13 18:54:42,364 epoch 6 - iter 735/1476 - loss 0.02702751 - time (sec): 34.95 - samples/sec: 2309.41 - lr: 0.000015 - momentum: 0.000000
2023-10-13 18:54:49,162 epoch 6 - iter 882/1476 - loss 0.02657347 - time (sec): 41.75 - samples/sec: 2301.97 - lr: 0.000015 - momentum: 0.000000
2023-10-13 18:54:56,165 epoch 6 - iter 1029/1476 - loss 0.02729635 - time (sec): 48.75 - samples/sec: 2333.31 - lr: 0.000014 - momentum: 0.000000
2023-10-13 18:55:03,119 epoch 6 - iter 1176/1476 - loss 0.02915968 - time (sec): 55.71 - samples/sec: 2337.65 - lr: 0.000014 - momentum: 0.000000
2023-10-13 18:55:10,018 epoch 6 - iter 1323/1476 - loss 0.02965009 - time (sec): 62.60 - samples/sec: 2342.88 - lr: 0.000014 - momentum: 0.000000
2023-10-13 18:55:17,017 epoch 6 - iter 1470/1476 - loss 0.02919942 - time (sec): 69.60 - samples/sec: 2372.02 - lr: 0.000013 - momentum: 0.000000
2023-10-13 18:55:17,440 ----------------------------------------------------------------------------------------------------
2023-10-13 18:55:17,440 EPOCH 6 done: loss 0.0290 - lr: 0.000013
2023-10-13 18:55:28,623 DEV : loss 0.2175937294960022 - f1-score (micro avg)  0.8074
2023-10-13 18:55:28,651 ----------------------------------------------------------------------------------------------------
2023-10-13 18:55:35,576 epoch 7 - iter 147/1476 - loss 0.03078954 - time (sec): 6.92 - samples/sec: 2523.71 - lr: 0.000013 - momentum: 0.000000
2023-10-13 18:55:42,711 epoch 7 - iter 294/1476 - loss 0.02613132 - time (sec): 14.06 - samples/sec: 2560.65 - lr: 0.000013 - momentum: 0.000000
2023-10-13 18:55:49,569 epoch 7 - iter 441/1476 - loss 0.02307156 - time (sec): 20.92 - samples/sec: 2556.92 - lr: 0.000012 - momentum: 0.000000
2023-10-13 18:55:56,397 epoch 7 - iter 588/1476 - loss 0.02372509 - time (sec): 27.74 - samples/sec: 2577.17 - lr: 0.000012 - momentum: 0.000000
2023-10-13 18:56:02,657 epoch 7 - iter 735/1476 - loss 0.02290695 - time (sec): 34.00 - samples/sec: 2544.01 - lr: 0.000012 - momentum: 0.000000
2023-10-13 18:56:09,563 epoch 7 - iter 882/1476 - loss 0.02287812 - time (sec): 40.91 - samples/sec: 2529.13 - lr: 0.000011 - momentum: 0.000000
2023-10-13 18:56:16,218 epoch 7 - iter 1029/1476 - loss 0.02273386 - time (sec): 47.57 - samples/sec: 2497.38 - lr: 0.000011 - momentum: 0.000000
2023-10-13 18:56:22,941 epoch 7 - iter 1176/1476 - loss 0.02186207 - time (sec): 54.29 - samples/sec: 2476.20 - lr: 0.000011 - momentum: 0.000000
2023-10-13 18:56:29,677 epoch 7 - iter 1323/1476 - loss 0.02249581 - time (sec): 61.02 - samples/sec: 2460.89 - lr: 0.000010 - momentum: 0.000000
2023-10-13 18:56:36,428 epoch 7 - iter 1470/1476 - loss 0.02161244 - time (sec): 67.78 - samples/sec: 2447.38 - lr: 0.000010 - momentum: 0.000000
2023-10-13 18:56:36,692 ----------------------------------------------------------------------------------------------------
2023-10-13 18:56:36,692 EPOCH 7 done: loss 0.0216 - lr: 0.000010
2023-10-13 18:56:47,948 DEV : loss 0.20365940034389496 - f1-score (micro avg)  0.8297
2023-10-13 18:56:47,979 saving best model
2023-10-13 18:56:48,508 ----------------------------------------------------------------------------------------------------
2023-10-13 18:56:55,502 epoch 8 - iter 147/1476 - loss 0.01368682 - time (sec): 6.99 - samples/sec: 2531.95 - lr: 0.000010 - momentum: 0.000000
2023-10-13 18:57:02,204 epoch 8 - iter 294/1476 - loss 0.01246331 - time (sec): 13.69 - samples/sec: 2414.93 - lr: 0.000009 - momentum: 0.000000
2023-10-13 18:57:09,400 epoch 8 - iter 441/1476 - loss 0.01614129 - time (sec): 20.89 - samples/sec: 2483.02 - lr: 0.000009 - momentum: 0.000000
2023-10-13 18:57:16,317 epoch 8 - iter 588/1476 - loss 0.01524708 - time (sec): 27.81 - samples/sec: 2425.87 - lr: 0.000009 - momentum: 0.000000
2023-10-13 18:57:22,822 epoch 8 - iter 735/1476 - loss 0.01581981 - time (sec): 34.31 - samples/sec: 2384.81 - lr: 0.000008 - momentum: 0.000000
2023-10-13 18:57:29,986 epoch 8 - iter 882/1476 - loss 0.01639056 - time (sec): 41.48 - samples/sec: 2406.11 - lr: 0.000008 - momentum: 0.000000
2023-10-13 18:57:36,747 epoch 8 - iter 1029/1476 - loss 0.01574671 - time (sec): 48.24 - samples/sec: 2404.17 - lr: 0.000008 - momentum: 0.000000
2023-10-13 18:57:43,644 epoch 8 - iter 1176/1476 - loss 0.01490286 - time (sec): 55.13 - samples/sec: 2390.65 - lr: 0.000007 - momentum: 0.000000
2023-10-13 18:57:50,534 epoch 8 - iter 1323/1476 - loss 0.01468391 - time (sec): 62.02 - samples/sec: 2389.87 - lr: 0.000007 - momentum: 0.000000
2023-10-13 18:57:57,551 epoch 8 - iter 1470/1476 - loss 0.01407014 - time (sec): 69.04 - samples/sec: 2402.22 - lr: 0.000007 - momentum: 0.000000
2023-10-13 18:57:57,816 ----------------------------------------------------------------------------------------------------
2023-10-13 18:57:57,816 EPOCH 8 done: loss 0.0140 - lr: 0.000007
2023-10-13 18:58:09,064 DEV : loss 0.20413334667682648 - f1-score (micro avg)  0.8364
2023-10-13 18:58:09,094 saving best model
2023-10-13 18:58:09,586 ----------------------------------------------------------------------------------------------------
2023-10-13 18:58:16,496 epoch 9 - iter 147/1476 - loss 0.01422764 - time (sec): 6.91 - samples/sec: 2327.73 - lr: 0.000006 - momentum: 0.000000
2023-10-13 18:58:23,345 epoch 9 - iter 294/1476 - loss 0.01591919 - time (sec): 13.76 - samples/sec: 2349.67 - lr: 0.000006 - momentum: 0.000000
2023-10-13 18:58:29,983 epoch 9 - iter 441/1476 - loss 0.01220372 - time (sec): 20.40 - samples/sec: 2307.67 - lr: 0.000006 - momentum: 0.000000
2023-10-13 18:58:37,103 epoch 9 - iter 588/1476 - loss 0.01151508 - time (sec): 27.52 - samples/sec: 2333.85 - lr: 0.000005 - momentum: 0.000000
2023-10-13 18:58:44,024 epoch 9 - iter 735/1476 - loss 0.01111149 - time (sec): 34.44 - samples/sec: 2316.75 - lr: 0.000005 - momentum: 0.000000
2023-10-13 18:58:51,025 epoch 9 - iter 882/1476 - loss 0.01047146 - time (sec): 41.44 - samples/sec: 2316.50 - lr: 0.000005 - momentum: 0.000000
2023-10-13 18:58:58,130 epoch 9 - iter 1029/1476 - loss 0.00997977 - time (sec): 48.54 - samples/sec: 2345.47 - lr: 0.000004 - momentum: 0.000000
2023-10-13 18:59:05,454 epoch 9 - iter 1176/1476 - loss 0.01061230 - time (sec): 55.87 - samples/sec: 2367.08 - lr: 0.000004 - momentum: 0.000000
2023-10-13 18:59:12,178 epoch 9 - iter 1323/1476 - loss 0.01020875 - time (sec): 62.59 - samples/sec: 2357.94 - lr: 0.000004 - momentum: 0.000000
2023-10-13 18:59:19,220 epoch 9 - iter 1470/1476 - loss 0.01015093 - time (sec): 69.63 - samples/sec: 2370.02 - lr: 0.000003 - momentum: 0.000000
2023-10-13 18:59:19,651 ----------------------------------------------------------------------------------------------------
2023-10-13 18:59:19,651 EPOCH 9 done: loss 0.0101 - lr: 0.000003
2023-10-13 18:59:30,827 DEV : loss 0.21342383325099945 - f1-score (micro avg)  0.832
2023-10-13 18:59:30,857 ----------------------------------------------------------------------------------------------------
2023-10-13 18:59:37,684 epoch 10 - iter 147/1476 - loss 0.00481638 - time (sec): 6.83 - samples/sec: 2248.25 - lr: 0.000003 - momentum: 0.000000
2023-10-13 18:59:44,816 epoch 10 - iter 294/1476 - loss 0.00647829 - time (sec): 13.96 - samples/sec: 2359.27 - lr: 0.000003 - momentum: 0.000000
2023-10-13 18:59:51,824 epoch 10 - iter 441/1476 - loss 0.00596617 - time (sec): 20.97 - samples/sec: 2362.46 - lr: 0.000002 - momentum: 0.000000
2023-10-13 18:59:58,867 epoch 10 - iter 588/1476 - loss 0.00519237 - time (sec): 28.01 - samples/sec: 2381.30 - lr: 0.000002 - momentum: 0.000000
2023-10-13 19:00:06,044 epoch 10 - iter 735/1476 - loss 0.00546008 - time (sec): 35.19 - samples/sec: 2406.51 - lr: 0.000002 - momentum: 0.000000
2023-10-13 19:00:12,744 epoch 10 - iter 882/1476 - loss 0.00545956 - time (sec): 41.89 - samples/sec: 2395.55 - lr: 0.000001 - momentum: 0.000000
2023-10-13 19:00:19,404 epoch 10 - iter 1029/1476 - loss 0.00758172 - time (sec): 48.55 - samples/sec: 2385.54 - lr: 0.000001 - momentum: 0.000000
2023-10-13 19:00:26,452 epoch 10 - iter 1176/1476 - loss 0.00734000 - time (sec): 55.59 - samples/sec: 2379.50 - lr: 0.000001 - momentum: 0.000000
2023-10-13 19:00:33,555 epoch 10 - iter 1323/1476 - loss 0.00755594 - time (sec): 62.70 - samples/sec: 2397.75 - lr: 0.000000 - momentum: 0.000000
2023-10-13 19:00:40,328 epoch 10 - iter 1470/1476 - loss 0.00764843 - time (sec): 69.47 - samples/sec: 2386.23 - lr: 0.000000 - momentum: 0.000000
2023-10-13 19:00:40,604 ----------------------------------------------------------------------------------------------------
2023-10-13 19:00:40,605 EPOCH 10 done: loss 0.0076 - lr: 0.000000
2023-10-13 19:00:51,848 DEV : loss 0.21848614513874054 - f1-score (micro avg)  0.8312
2023-10-13 19:00:52,254 ----------------------------------------------------------------------------------------------------
2023-10-13 19:00:52,255 Loading model from best epoch ...
2023-10-13 19:00:53,719 SequenceTagger predicts: Dictionary with 21 tags: O, S-loc, B-loc, E-loc, I-loc, S-pers, B-pers, E-pers, I-pers, S-org, B-org, E-org, I-org, S-time, B-time, E-time, I-time, S-prod, B-prod, E-prod, I-prod
2023-10-13 19:01:00,089 Results:
- F-score (micro) 0.805
- F-score (macro) 0.704
- Accuracy 0.6957

By class:
              precision    recall  f1-score   support

         loc     0.8842    0.8718    0.8779       858
        pers     0.7330    0.8231    0.7754       537
         org     0.6308    0.6212    0.6260       132
        time     0.5075    0.6296    0.5620        54
        prod     0.7451    0.6230    0.6786        61

   micro avg     0.7920    0.8185    0.8050      1642
   macro avg     0.7001    0.7137    0.7040      1642
weighted avg     0.7968    0.8185    0.8064      1642

2023-10-13 19:01:00,089 ----------------------------------------------------------------------------------------------------
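The per-class rows of the final report relate precision and recall through the harmonic mean, F1 = 2PR/(P + R); micro avg aggregates over all 1642 support spans, while macro avg is the unweighted mean over the five classes. A small sanity check in plain Python (values copied from the table; small last-digit differences are expected because the log rounds P and R before printing):

```python
# Recompute f1-score from the printed precision/recall pairs.
def f1(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

rows = {
    "loc":       (0.8842, 0.8718),   # table f1: 0.8779
    "micro avg": (0.7920, 0.8185),   # table f1: 0.8050
}
for name, (p, r) in rows.items():
    print(f"{name}: {f1(p, r):.4f}")
```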