2023-10-13 18:47:26,353 ----------------------------------------------------------------------------------------------------
2023-10-13 18:47:26,354 Model: "SequenceTagger(
(embeddings): TransformerWordEmbeddings(
(model): BertModel(
(embeddings): BertEmbeddings(
(word_embeddings): Embedding(32001, 768)
(position_embeddings): Embedding(512, 768)
(token_type_embeddings): Embedding(2, 768)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(encoder): BertEncoder(
(layer): ModuleList(
(0-11): 12 x BertLayer(
(attention): BertAttention(
(self): BertSelfAttention(
(query): Linear(in_features=768, out_features=768, bias=True)
(key): Linear(in_features=768, out_features=768, bias=True)
(value): Linear(in_features=768, out_features=768, bias=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(output): BertSelfOutput(
(dense): Linear(in_features=768, out_features=768, bias=True)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
(intermediate): BertIntermediate(
(dense): Linear(in_features=768, out_features=3072, bias=True)
(intermediate_act_fn): GELUActivation()
)
(output): BertOutput(
(dense): Linear(in_features=3072, out_features=768, bias=True)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
)
)
(pooler): BertPooler(
(dense): Linear(in_features=768, out_features=768, bias=True)
(activation): Tanh()
)
)
)
(locked_dropout): LockedDropout(p=0.5)
(linear): Linear(in_features=768, out_features=21, bias=True)
(loss_function): CrossEntropyLoss()
)"
2023-10-13 18:47:26,354 ----------------------------------------------------------------------------------------------------
2023-10-13 18:47:26,354 MultiCorpus: 5901 train + 1287 dev + 1505 test sentences
- NER_HIPE_2022 Corpus: 5901 train + 1287 dev + 1505 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/hipe2020/fr/with_doc_seperator
2023-10-13 18:47:26,354 ----------------------------------------------------------------------------------------------------
2023-10-13 18:47:26,354 Train: 5901 sentences
2023-10-13 18:47:26,354 (train_with_dev=False, train_with_test=False)
2023-10-13 18:47:26,354 ----------------------------------------------------------------------------------------------------
2023-10-13 18:47:26,354 Training Params:
2023-10-13 18:47:26,354 - learning_rate: "3e-05"
2023-10-13 18:47:26,354 - mini_batch_size: "4"
2023-10-13 18:47:26,354 - max_epochs: "10"
2023-10-13 18:47:26,354 - shuffle: "True"
2023-10-13 18:47:26,354 ----------------------------------------------------------------------------------------------------
2023-10-13 18:47:26,354 Plugins:
2023-10-13 18:47:26,355 - LinearScheduler | warmup_fraction: '0.1'
2023-10-13 18:47:26,355 ----------------------------------------------------------------------------------------------------
2023-10-13 18:47:26,355 Final evaluation on model from best epoch (best-model.pt)
2023-10-13 18:47:26,355 - metric: "('micro avg', 'f1-score')"
2023-10-13 18:47:26,355 ----------------------------------------------------------------------------------------------------
2023-10-13 18:47:26,355 Computation:
2023-10-13 18:47:26,355 - compute on device: cuda:0
2023-10-13 18:47:26,355 - embedding storage: none
2023-10-13 18:47:26,355 ----------------------------------------------------------------------------------------------------
2023-10-13 18:47:26,355 Model training base path: "hmbench-hipe2020/fr-dbmdz/bert-base-historic-multilingual-cased-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-5"
2023-10-13 18:47:26,355 ----------------------------------------------------------------------------------------------------
2023-10-13 18:47:26,355 ----------------------------------------------------------------------------------------------------
2023-10-13 18:47:33,216 epoch 1 - iter 147/1476 - loss 2.47203040 - time (sec): 6.86 - samples/sec: 2369.63 - lr: 0.000003 - momentum: 0.000000
2023-10-13 18:47:40,117 epoch 1 - iter 294/1476 - loss 1.53224631 - time (sec): 13.76 - samples/sec: 2378.60 - lr: 0.000006 - momentum: 0.000000
2023-10-13 18:47:46,944 epoch 1 - iter 441/1476 - loss 1.16457223 - time (sec): 20.59 - samples/sec: 2370.35 - lr: 0.000009 - momentum: 0.000000
2023-10-13 18:47:53,832 epoch 1 - iter 588/1476 - loss 0.95570083 - time (sec): 27.48 - samples/sec: 2364.52 - lr: 0.000012 - momentum: 0.000000
2023-10-13 18:48:01,003 epoch 1 - iter 735/1476 - loss 0.82886372 - time (sec): 34.65 - samples/sec: 2368.14 - lr: 0.000015 - momentum: 0.000000
2023-10-13 18:48:07,676 epoch 1 - iter 882/1476 - loss 0.73918529 - time (sec): 41.32 - samples/sec: 2345.23 - lr: 0.000018 - momentum: 0.000000
2023-10-13 18:48:14,768 epoch 1 - iter 1029/1476 - loss 0.66297514 - time (sec): 48.41 - samples/sec: 2363.63 - lr: 0.000021 - momentum: 0.000000
2023-10-13 18:48:21,818 epoch 1 - iter 1176/1476 - loss 0.60069633 - time (sec): 55.46 - samples/sec: 2382.71 - lr: 0.000024 - momentum: 0.000000
2023-10-13 18:48:28,594 epoch 1 - iter 1323/1476 - loss 0.55590049 - time (sec): 62.24 - samples/sec: 2387.70 - lr: 0.000027 - momentum: 0.000000
2023-10-13 18:48:35,652 epoch 1 - iter 1470/1476 - loss 0.51835277 - time (sec): 69.30 - samples/sec: 2393.10 - lr: 0.000030 - momentum: 0.000000
2023-10-13 18:48:35,906 ----------------------------------------------------------------------------------------------------
2023-10-13 18:48:35,906 EPOCH 1 done: loss 0.5173 - lr: 0.000030
2023-10-13 18:48:42,033 DEV : loss 0.1586601436138153 - f1-score (micro avg) 0.6953
2023-10-13 18:48:42,061 saving best model
2023-10-13 18:48:42,528 ----------------------------------------------------------------------------------------------------
2023-10-13 18:48:49,482 epoch 2 - iter 147/1476 - loss 0.13516442 - time (sec): 6.95 - samples/sec: 2412.34 - lr: 0.000030 - momentum: 0.000000
2023-10-13 18:48:56,332 epoch 2 - iter 294/1476 - loss 0.13649252 - time (sec): 13.80 - samples/sec: 2404.93 - lr: 0.000029 - momentum: 0.000000
2023-10-13 18:49:03,501 epoch 2 - iter 441/1476 - loss 0.13545397 - time (sec): 20.97 - samples/sec: 2383.51 - lr: 0.000029 - momentum: 0.000000
2023-10-13 18:49:10,326 epoch 2 - iter 588/1476 - loss 0.13055225 - time (sec): 27.80 - samples/sec: 2354.64 - lr: 0.000029 - momentum: 0.000000
2023-10-13 18:49:17,578 epoch 2 - iter 735/1476 - loss 0.12511794 - time (sec): 35.05 - samples/sec: 2390.45 - lr: 0.000028 - momentum: 0.000000
2023-10-13 18:49:25,271 epoch 2 - iter 882/1476 - loss 0.12920863 - time (sec): 42.74 - samples/sec: 2446.74 - lr: 0.000028 - momentum: 0.000000
2023-10-13 18:49:31,880 epoch 2 - iter 1029/1476 - loss 0.12753320 - time (sec): 49.35 - samples/sec: 2427.35 - lr: 0.000028 - momentum: 0.000000
2023-10-13 18:49:38,827 epoch 2 - iter 1176/1476 - loss 0.12657849 - time (sec): 56.30 - samples/sec: 2430.44 - lr: 0.000027 - momentum: 0.000000
2023-10-13 18:49:45,338 epoch 2 - iter 1323/1476 - loss 0.12667487 - time (sec): 62.81 - samples/sec: 2406.92 - lr: 0.000027 - momentum: 0.000000
2023-10-13 18:49:52,047 epoch 2 - iter 1470/1476 - loss 0.12663562 - time (sec): 69.52 - samples/sec: 2388.22 - lr: 0.000027 - momentum: 0.000000
2023-10-13 18:49:52,313 ----------------------------------------------------------------------------------------------------
2023-10-13 18:49:52,313 EPOCH 2 done: loss 0.1266 - lr: 0.000027
2023-10-13 18:50:03,449 DEV : loss 0.13426746428012848 - f1-score (micro avg) 0.7842
2023-10-13 18:50:03,480 saving best model
2023-10-13 18:50:04,015 ----------------------------------------------------------------------------------------------------
2023-10-13 18:50:11,260 epoch 3 - iter 147/1476 - loss 0.06400595 - time (sec): 7.24 - samples/sec: 2559.29 - lr: 0.000026 - momentum: 0.000000
2023-10-13 18:50:18,213 epoch 3 - iter 294/1476 - loss 0.06628312 - time (sec): 14.20 - samples/sec: 2483.12 - lr: 0.000026 - momentum: 0.000000
2023-10-13 18:50:25,140 epoch 3 - iter 441/1476 - loss 0.07296170 - time (sec): 21.12 - samples/sec: 2463.79 - lr: 0.000026 - momentum: 0.000000
2023-10-13 18:50:32,364 epoch 3 - iter 588/1476 - loss 0.08168640 - time (sec): 28.35 - samples/sec: 2478.57 - lr: 0.000025 - momentum: 0.000000
2023-10-13 18:50:39,055 epoch 3 - iter 735/1476 - loss 0.08289348 - time (sec): 35.04 - samples/sec: 2447.79 - lr: 0.000025 - momentum: 0.000000
2023-10-13 18:50:45,924 epoch 3 - iter 882/1476 - loss 0.08210391 - time (sec): 41.91 - samples/sec: 2430.46 - lr: 0.000025 - momentum: 0.000000
2023-10-13 18:50:52,737 epoch 3 - iter 1029/1476 - loss 0.08197948 - time (sec): 48.72 - samples/sec: 2410.53 - lr: 0.000024 - momentum: 0.000000
2023-10-13 18:50:59,573 epoch 3 - iter 1176/1476 - loss 0.08119702 - time (sec): 55.56 - samples/sec: 2406.42 - lr: 0.000024 - momentum: 0.000000
2023-10-13 18:51:06,452 epoch 3 - iter 1323/1476 - loss 0.08162410 - time (sec): 62.44 - samples/sec: 2401.71 - lr: 0.000024 - momentum: 0.000000
2023-10-13 18:51:13,132 epoch 3 - iter 1470/1476 - loss 0.08267840 - time (sec): 69.12 - samples/sec: 2400.10 - lr: 0.000023 - momentum: 0.000000
2023-10-13 18:51:13,388 ----------------------------------------------------------------------------------------------------
2023-10-13 18:51:13,389 EPOCH 3 done: loss 0.0825 - lr: 0.000023
2023-10-13 18:51:24,588 DEV : loss 0.15468844771385193 - f1-score (micro avg) 0.7969
2023-10-13 18:51:24,619 saving best model
2023-10-13 18:51:25,121 ----------------------------------------------------------------------------------------------------
2023-10-13 18:51:31,983 epoch 4 - iter 147/1476 - loss 0.06136683 - time (sec): 6.86 - samples/sec: 2352.06 - lr: 0.000023 - momentum: 0.000000
2023-10-13 18:51:39,343 epoch 4 - iter 294/1476 - loss 0.06998431 - time (sec): 14.22 - samples/sec: 2509.18 - lr: 0.000023 - momentum: 0.000000
2023-10-13 18:51:46,299 epoch 4 - iter 441/1476 - loss 0.06564666 - time (sec): 21.17 - samples/sec: 2446.49 - lr: 0.000022 - momentum: 0.000000
2023-10-13 18:51:52,964 epoch 4 - iter 588/1476 - loss 0.06527905 - time (sec): 27.84 - samples/sec: 2380.86 - lr: 0.000022 - momentum: 0.000000
2023-10-13 18:51:59,887 epoch 4 - iter 735/1476 - loss 0.06299237 - time (sec): 34.76 - samples/sec: 2400.68 - lr: 0.000022 - momentum: 0.000000
2023-10-13 18:52:06,636 epoch 4 - iter 882/1476 - loss 0.06239020 - time (sec): 41.51 - samples/sec: 2398.39 - lr: 0.000021 - momentum: 0.000000
2023-10-13 18:52:13,344 epoch 4 - iter 1029/1476 - loss 0.06088924 - time (sec): 48.22 - samples/sec: 2377.28 - lr: 0.000021 - momentum: 0.000000
2023-10-13 18:52:20,126 epoch 4 - iter 1176/1476 - loss 0.05807939 - time (sec): 55.00 - samples/sec: 2374.78 - lr: 0.000021 - momentum: 0.000000
2023-10-13 18:52:27,021 epoch 4 - iter 1323/1476 - loss 0.05765390 - time (sec): 61.89 - samples/sec: 2371.42 - lr: 0.000020 - momentum: 0.000000
2023-10-13 18:52:34,328 epoch 4 - iter 1470/1476 - loss 0.05616269 - time (sec): 69.20 - samples/sec: 2396.99 - lr: 0.000020 - momentum: 0.000000
2023-10-13 18:52:34,590 ----------------------------------------------------------------------------------------------------
2023-10-13 18:52:34,590 EPOCH 4 done: loss 0.0563 - lr: 0.000020
2023-10-13 18:52:45,814 DEV : loss 0.17033860087394714 - f1-score (micro avg) 0.8106
2023-10-13 18:52:45,844 saving best model
2023-10-13 18:52:46,323 ----------------------------------------------------------------------------------------------------
2023-10-13 18:52:53,267 epoch 5 - iter 147/1476 - loss 0.04144502 - time (sec): 6.94 - samples/sec: 2423.87 - lr: 0.000020 - momentum: 0.000000
2023-10-13 18:52:59,900 epoch 5 - iter 294/1476 - loss 0.04606724 - time (sec): 13.57 - samples/sec: 2328.10 - lr: 0.000019 - momentum: 0.000000
2023-10-13 18:53:06,912 epoch 5 - iter 441/1476 - loss 0.03973065 - time (sec): 20.58 - samples/sec: 2344.48 - lr: 0.000019 - momentum: 0.000000
2023-10-13 18:53:13,730 epoch 5 - iter 588/1476 - loss 0.03622589 - time (sec): 27.40 - samples/sec: 2366.89 - lr: 0.000019 - momentum: 0.000000
2023-10-13 18:53:20,750 epoch 5 - iter 735/1476 - loss 0.03811454 - time (sec): 34.42 - samples/sec: 2385.48 - lr: 0.000018 - momentum: 0.000000
2023-10-13 18:53:27,625 epoch 5 - iter 882/1476 - loss 0.03708256 - time (sec): 41.30 - samples/sec: 2394.12 - lr: 0.000018 - momentum: 0.000000
2023-10-13 18:53:34,708 epoch 5 - iter 1029/1476 - loss 0.03955991 - time (sec): 48.38 - samples/sec: 2398.84 - lr: 0.000018 - momentum: 0.000000
2023-10-13 18:53:41,663 epoch 5 - iter 1176/1476 - loss 0.04041590 - time (sec): 55.34 - samples/sec: 2397.38 - lr: 0.000017 - momentum: 0.000000
2023-10-13 18:53:48,587 epoch 5 - iter 1323/1476 - loss 0.04037167 - time (sec): 62.26 - samples/sec: 2404.17 - lr: 0.000017 - momentum: 0.000000
2023-10-13 18:53:55,407 epoch 5 - iter 1470/1476 - loss 0.04036140 - time (sec): 69.08 - samples/sec: 2400.21 - lr: 0.000017 - momentum: 0.000000
2023-10-13 18:53:55,675 ----------------------------------------------------------------------------------------------------
2023-10-13 18:53:55,675 EPOCH 5 done: loss 0.0402 - lr: 0.000017
2023-10-13 18:54:06,901 DEV : loss 0.1789896935224533 - f1-score (micro avg) 0.8262
2023-10-13 18:54:06,932 saving best model
2023-10-13 18:54:07,408 ----------------------------------------------------------------------------------------------------
2023-10-13 18:54:14,802 epoch 6 - iter 147/1476 - loss 0.02954124 - time (sec): 7.39 - samples/sec: 2309.03 - lr: 0.000016 - momentum: 0.000000
2023-10-13 18:54:21,552 epoch 6 - iter 294/1476 - loss 0.02823268 - time (sec): 14.14 - samples/sec: 2296.57 - lr: 0.000016 - momentum: 0.000000
2023-10-13 18:54:28,596 epoch 6 - iter 441/1476 - loss 0.02553327 - time (sec): 21.18 - samples/sec: 2307.96 - lr: 0.000016 - momentum: 0.000000
2023-10-13 18:54:35,572 epoch 6 - iter 588/1476 - loss 0.02831420 - time (sec): 28.16 - samples/sec: 2319.15 - lr: 0.000015 - momentum: 0.000000
2023-10-13 18:54:42,364 epoch 6 - iter 735/1476 - loss 0.02702751 - time (sec): 34.95 - samples/sec: 2309.41 - lr: 0.000015 - momentum: 0.000000
2023-10-13 18:54:49,162 epoch 6 - iter 882/1476 - loss 0.02657347 - time (sec): 41.75 - samples/sec: 2301.97 - lr: 0.000015 - momentum: 0.000000
2023-10-13 18:54:56,165 epoch 6 - iter 1029/1476 - loss 0.02729635 - time (sec): 48.75 - samples/sec: 2333.31 - lr: 0.000014 - momentum: 0.000000
2023-10-13 18:55:03,119 epoch 6 - iter 1176/1476 - loss 0.02915968 - time (sec): 55.71 - samples/sec: 2337.65 - lr: 0.000014 - momentum: 0.000000
2023-10-13 18:55:10,018 epoch 6 - iter 1323/1476 - loss 0.02965009 - time (sec): 62.60 - samples/sec: 2342.88 - lr: 0.000014 - momentum: 0.000000
2023-10-13 18:55:17,017 epoch 6 - iter 1470/1476 - loss 0.02919942 - time (sec): 69.60 - samples/sec: 2372.02 - lr: 0.000013 - momentum: 0.000000
2023-10-13 18:55:17,440 ----------------------------------------------------------------------------------------------------
2023-10-13 18:55:17,440 EPOCH 6 done: loss 0.0290 - lr: 0.000013
2023-10-13 18:55:28,623 DEV : loss 0.2175937294960022 - f1-score (micro avg) 0.8074
2023-10-13 18:55:28,651 ----------------------------------------------------------------------------------------------------
2023-10-13 18:55:35,576 epoch 7 - iter 147/1476 - loss 0.03078954 - time (sec): 6.92 - samples/sec: 2523.71 - lr: 0.000013 - momentum: 0.000000
2023-10-13 18:55:42,711 epoch 7 - iter 294/1476 - loss 0.02613132 - time (sec): 14.06 - samples/sec: 2560.65 - lr: 0.000013 - momentum: 0.000000
2023-10-13 18:55:49,569 epoch 7 - iter 441/1476 - loss 0.02307156 - time (sec): 20.92 - samples/sec: 2556.92 - lr: 0.000012 - momentum: 0.000000
2023-10-13 18:55:56,397 epoch 7 - iter 588/1476 - loss 0.02372509 - time (sec): 27.74 - samples/sec: 2577.17 - lr: 0.000012 - momentum: 0.000000
2023-10-13 18:56:02,657 epoch 7 - iter 735/1476 - loss 0.02290695 - time (sec): 34.00 - samples/sec: 2544.01 - lr: 0.000012 - momentum: 0.000000
2023-10-13 18:56:09,563 epoch 7 - iter 882/1476 - loss 0.02287812 - time (sec): 40.91 - samples/sec: 2529.13 - lr: 0.000011 - momentum: 0.000000
2023-10-13 18:56:16,218 epoch 7 - iter 1029/1476 - loss 0.02273386 - time (sec): 47.57 - samples/sec: 2497.38 - lr: 0.000011 - momentum: 0.000000
2023-10-13 18:56:22,941 epoch 7 - iter 1176/1476 - loss 0.02186207 - time (sec): 54.29 - samples/sec: 2476.20 - lr: 0.000011 - momentum: 0.000000
2023-10-13 18:56:29,677 epoch 7 - iter 1323/1476 - loss 0.02249581 - time (sec): 61.02 - samples/sec: 2460.89 - lr: 0.000010 - momentum: 0.000000
2023-10-13 18:56:36,428 epoch 7 - iter 1470/1476 - loss 0.02161244 - time (sec): 67.78 - samples/sec: 2447.38 - lr: 0.000010 - momentum: 0.000000
2023-10-13 18:56:36,692 ----------------------------------------------------------------------------------------------------
2023-10-13 18:56:36,692 EPOCH 7 done: loss 0.0216 - lr: 0.000010
2023-10-13 18:56:47,948 DEV : loss 0.20365940034389496 - f1-score (micro avg) 0.8297
2023-10-13 18:56:47,979 saving best model
2023-10-13 18:56:48,508 ----------------------------------------------------------------------------------------------------
2023-10-13 18:56:55,502 epoch 8 - iter 147/1476 - loss 0.01368682 - time (sec): 6.99 - samples/sec: 2531.95 - lr: 0.000010 - momentum: 0.000000
2023-10-13 18:57:02,204 epoch 8 - iter 294/1476 - loss 0.01246331 - time (sec): 13.69 - samples/sec: 2414.93 - lr: 0.000009 - momentum: 0.000000
2023-10-13 18:57:09,400 epoch 8 - iter 441/1476 - loss 0.01614129 - time (sec): 20.89 - samples/sec: 2483.02 - lr: 0.000009 - momentum: 0.000000
2023-10-13 18:57:16,317 epoch 8 - iter 588/1476 - loss 0.01524708 - time (sec): 27.81 - samples/sec: 2425.87 - lr: 0.000009 - momentum: 0.000000
2023-10-13 18:57:22,822 epoch 8 - iter 735/1476 - loss 0.01581981 - time (sec): 34.31 - samples/sec: 2384.81 - lr: 0.000008 - momentum: 0.000000
2023-10-13 18:57:29,986 epoch 8 - iter 882/1476 - loss 0.01639056 - time (sec): 41.48 - samples/sec: 2406.11 - lr: 0.000008 - momentum: 0.000000
2023-10-13 18:57:36,747 epoch 8 - iter 1029/1476 - loss 0.01574671 - time (sec): 48.24 - samples/sec: 2404.17 - lr: 0.000008 - momentum: 0.000000
2023-10-13 18:57:43,644 epoch 8 - iter 1176/1476 - loss 0.01490286 - time (sec): 55.13 - samples/sec: 2390.65 - lr: 0.000007 - momentum: 0.000000
2023-10-13 18:57:50,534 epoch 8 - iter 1323/1476 - loss 0.01468391 - time (sec): 62.02 - samples/sec: 2389.87 - lr: 0.000007 - momentum: 0.000000
2023-10-13 18:57:57,551 epoch 8 - iter 1470/1476 - loss 0.01407014 - time (sec): 69.04 - samples/sec: 2402.22 - lr: 0.000007 - momentum: 0.000000
2023-10-13 18:57:57,816 ----------------------------------------------------------------------------------------------------
2023-10-13 18:57:57,816 EPOCH 8 done: loss 0.0140 - lr: 0.000007
2023-10-13 18:58:09,064 DEV : loss 0.20413334667682648 - f1-score (micro avg) 0.8364
2023-10-13 18:58:09,094 saving best model
2023-10-13 18:58:09,586 ----------------------------------------------------------------------------------------------------
2023-10-13 18:58:16,496 epoch 9 - iter 147/1476 - loss 0.01422764 - time (sec): 6.91 - samples/sec: 2327.73 - lr: 0.000006 - momentum: 0.000000
2023-10-13 18:58:23,345 epoch 9 - iter 294/1476 - loss 0.01591919 - time (sec): 13.76 - samples/sec: 2349.67 - lr: 0.000006 - momentum: 0.000000
2023-10-13 18:58:29,983 epoch 9 - iter 441/1476 - loss 0.01220372 - time (sec): 20.40 - samples/sec: 2307.67 - lr: 0.000006 - momentum: 0.000000
2023-10-13 18:58:37,103 epoch 9 - iter 588/1476 - loss 0.01151508 - time (sec): 27.52 - samples/sec: 2333.85 - lr: 0.000005 - momentum: 0.000000
2023-10-13 18:58:44,024 epoch 9 - iter 735/1476 - loss 0.01111149 - time (sec): 34.44 - samples/sec: 2316.75 - lr: 0.000005 - momentum: 0.000000
2023-10-13 18:58:51,025 epoch 9 - iter 882/1476 - loss 0.01047146 - time (sec): 41.44 - samples/sec: 2316.50 - lr: 0.000005 - momentum: 0.000000
2023-10-13 18:58:58,130 epoch 9 - iter 1029/1476 - loss 0.00997977 - time (sec): 48.54 - samples/sec: 2345.47 - lr: 0.000004 - momentum: 0.000000
2023-10-13 18:59:05,454 epoch 9 - iter 1176/1476 - loss 0.01061230 - time (sec): 55.87 - samples/sec: 2367.08 - lr: 0.000004 - momentum: 0.000000
2023-10-13 18:59:12,178 epoch 9 - iter 1323/1476 - loss 0.01020875 - time (sec): 62.59 - samples/sec: 2357.94 - lr: 0.000004 - momentum: 0.000000
2023-10-13 18:59:19,220 epoch 9 - iter 1470/1476 - loss 0.01015093 - time (sec): 69.63 - samples/sec: 2370.02 - lr: 0.000003 - momentum: 0.000000
2023-10-13 18:59:19,651 ----------------------------------------------------------------------------------------------------
2023-10-13 18:59:19,651 EPOCH 9 done: loss 0.0101 - lr: 0.000003
2023-10-13 18:59:30,827 DEV : loss 0.21342383325099945 - f1-score (micro avg) 0.832
2023-10-13 18:59:30,857 ----------------------------------------------------------------------------------------------------
2023-10-13 18:59:37,684 epoch 10 - iter 147/1476 - loss 0.00481638 - time (sec): 6.83 - samples/sec: 2248.25 - lr: 0.000003 - momentum: 0.000000
2023-10-13 18:59:44,816 epoch 10 - iter 294/1476 - loss 0.00647829 - time (sec): 13.96 - samples/sec: 2359.27 - lr: 0.000003 - momentum: 0.000000
2023-10-13 18:59:51,824 epoch 10 - iter 441/1476 - loss 0.00596617 - time (sec): 20.97 - samples/sec: 2362.46 - lr: 0.000002 - momentum: 0.000000
2023-10-13 18:59:58,867 epoch 10 - iter 588/1476 - loss 0.00519237 - time (sec): 28.01 - samples/sec: 2381.30 - lr: 0.000002 - momentum: 0.000000
2023-10-13 19:00:06,044 epoch 10 - iter 735/1476 - loss 0.00546008 - time (sec): 35.19 - samples/sec: 2406.51 - lr: 0.000002 - momentum: 0.000000
2023-10-13 19:00:12,744 epoch 10 - iter 882/1476 - loss 0.00545956 - time (sec): 41.89 - samples/sec: 2395.55 - lr: 0.000001 - momentum: 0.000000
2023-10-13 19:00:19,404 epoch 10 - iter 1029/1476 - loss 0.00758172 - time (sec): 48.55 - samples/sec: 2385.54 - lr: 0.000001 - momentum: 0.000000
2023-10-13 19:00:26,452 epoch 10 - iter 1176/1476 - loss 0.00734000 - time (sec): 55.59 - samples/sec: 2379.50 - lr: 0.000001 - momentum: 0.000000
2023-10-13 19:00:33,555 epoch 10 - iter 1323/1476 - loss 0.00755594 - time (sec): 62.70 - samples/sec: 2397.75 - lr: 0.000000 - momentum: 0.000000
2023-10-13 19:00:40,328 epoch 10 - iter 1470/1476 - loss 0.00764843 - time (sec): 69.47 - samples/sec: 2386.23 - lr: 0.000000 - momentum: 0.000000
2023-10-13 19:00:40,604 ----------------------------------------------------------------------------------------------------
2023-10-13 19:00:40,605 EPOCH 10 done: loss 0.0076 - lr: 0.000000
2023-10-13 19:00:51,848 DEV : loss 0.21848614513874054 - f1-score (micro avg) 0.8312
2023-10-13 19:00:52,254 ----------------------------------------------------------------------------------------------------
2023-10-13 19:00:52,255 Loading model from best epoch ...
2023-10-13 19:00:53,719 SequenceTagger predicts: Dictionary with 21 tags: O, S-loc, B-loc, E-loc, I-loc, S-pers, B-pers, E-pers, I-pers, S-org, B-org, E-org, I-org, S-time, B-time, E-time, I-time, S-prod, B-prod, E-prod, I-prod
2023-10-13 19:01:00,089
Results:
- F-score (micro) 0.805
- F-score (macro) 0.704
- Accuracy 0.6957
By class:
              precision    recall  f1-score   support

         loc     0.8842    0.8718    0.8779       858
        pers     0.7330    0.8231    0.7754       537
         org     0.6308    0.6212    0.6260       132
        time     0.5075    0.6296    0.5620        54
        prod     0.7451    0.6230    0.6786        61

   micro avg     0.7920    0.8185    0.8050      1642
   macro avg     0.7001    0.7137    0.7040      1642
weighted avg     0.7968    0.8185    0.8064      1642
2023-10-13 19:01:00,089 ----------------------------------------------------------------------------------------------------
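
Note: the summary figures above can be cross-checked from the per-class table and the training parameters. The sketch below is not part of the original log. It assumes that micro F1 is the harmonic mean of the micro-averaged precision and recall, that macro F1 is the unweighted mean of the per-class F1 scores, and that the LinearScheduler warms up linearly over the first 10% of all steps and then decays linearly to zero. The function names `f1` and `linear_lr` are hypothetical helpers for illustration, not Flair API.

```python
# Sanity checks against the values printed in the log (illustrative sketch).

def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def linear_lr(step: int, base_lr: float = 3e-05, steps_per_epoch: int = 1476,
              epochs: int = 10, warmup_fraction: float = 0.1) -> float:
    """Assumed schedule: linear warmup to base_lr, then linear decay to zero."""
    total = steps_per_epoch * epochs
    warmup = int(total * warmup_fraction)
    if step < warmup:
        return base_lr * step / warmup
    return base_lr * max(0, total - step) / (total - warmup)


# Per-class F1 scores copied from the "By class" table.
per_class_f1 = {"loc": 0.8779, "pers": 0.7754, "org": 0.6260,
                "time": 0.5620, "prod": 0.6786}

micro_f1 = f1(0.7920, 0.8185)  # micro avg precision and recall from the table
macro_f1 = sum(per_class_f1.values()) / len(per_class_f1)

# Learning rate at epoch 1, iter 147 and at epoch 2, iter 1470.
lr_early = linear_lr(147)
lr_epoch2_end = linear_lr(1476 + 1470)

print(f"micro F1 {micro_f1:.4f}, macro F1 {macro_f1:.4f}")
print(f"lr {lr_early:.6f} (epoch 1, iter 147), {lr_epoch2_end:.6f} (epoch 2, iter 1470)")
```

Under these assumptions the recomputed values agree, after rounding, with the micro and macro F-scores and with the `lr` column reported in the log.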