2023-10-13 15:42:35,866 ----------------------------------------------------------------------------------------------------
2023-10-13 15:42:35,867 Model: "SequenceTagger(
(embeddings): TransformerWordEmbeddings(
(model): BertModel(
(embeddings): BertEmbeddings(
(word_embeddings): Embedding(32001, 768)
(position_embeddings): Embedding(512, 768)
(token_type_embeddings): Embedding(2, 768)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(encoder): BertEncoder(
(layer): ModuleList(
(0-11): 12 x BertLayer(
(attention): BertAttention(
(self): BertSelfAttention(
(query): Linear(in_features=768, out_features=768, bias=True)
(key): Linear(in_features=768, out_features=768, bias=True)
(value): Linear(in_features=768, out_features=768, bias=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(output): BertSelfOutput(
(dense): Linear(in_features=768, out_features=768, bias=True)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
(intermediate): BertIntermediate(
(dense): Linear(in_features=768, out_features=3072, bias=True)
(intermediate_act_fn): GELUActivation()
)
(output): BertOutput(
(dense): Linear(in_features=3072, out_features=768, bias=True)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
)
)
(pooler): BertPooler(
(dense): Linear(in_features=768, out_features=768, bias=True)
(activation): Tanh()
)
)
)
(locked_dropout): LockedDropout(p=0.5)
(linear): Linear(in_features=768, out_features=21, bias=True)
(loss_function): CrossEntropyLoss()
)"
2023-10-13 15:42:35,867 ----------------------------------------------------------------------------------------------------
2023-10-13 15:42:35,867 MultiCorpus: 5901 train + 1287 dev + 1505 test sentences
- NER_HIPE_2022 Corpus: 5901 train + 1287 dev + 1505 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/hipe2020/fr/with_doc_seperator
2023-10-13 15:42:35,867 ----------------------------------------------------------------------------------------------------
2023-10-13 15:42:35,867 Train: 5901 sentences
2023-10-13 15:42:35,867 (train_with_dev=False, train_with_test=False)
2023-10-13 15:42:35,867 ----------------------------------------------------------------------------------------------------
2023-10-13 15:42:35,867 Training Params:
2023-10-13 15:42:35,867 - learning_rate: "5e-05"
2023-10-13 15:42:35,867 - mini_batch_size: "4"
2023-10-13 15:42:35,868 - max_epochs: "10"
2023-10-13 15:42:35,868 - shuffle: "True"
2023-10-13 15:42:35,868 ----------------------------------------------------------------------------------------------------
2023-10-13 15:42:35,868 Plugins:
2023-10-13 15:42:35,868 - LinearScheduler | warmup_fraction: '0.1'
2023-10-13 15:42:35,868 ----------------------------------------------------------------------------------------------------
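The lr column in the iteration lines of this log is consistent with a linear warmup over the first 10% of all optimizer steps, followed by a linear decay to zero. The following is a minimal sketch under that assumption, not Flair's own implementation; the helper name `linear_schedule_lr` is hypothetical, and the step counts are derived from the logged 5901 training sentences, mini_batch_size 4 and max_epochs 10.

```python
import math


def linear_schedule_lr(step, peak_lr, total_steps, warmup_fraction):
    """Learning rate after `step` optimizer steps (hypothetical helper).

    Assumes linear warmup from 0 to peak_lr, then linear decay back to 0.
    """
    warmup_steps = round(total_steps * warmup_fraction)
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


# Values taken from the log: 5901 sentences, batch size 4, 10 epochs.
steps_per_epoch = math.ceil(5901 / 4)
total_steps = steps_per_epoch * 10

# Epoch 1, iter 147 and epoch 2, iter 1470 (global step 1476 + 1470).
for step in (147, steps_per_epoch + 1470):
    lr = linear_schedule_lr(step, 5e-5, total_steps, 0.1)
    print(f"step {step}: lr {lr:.6f}")
```

Under these assumptions the two printed values format to 0.000005 and 0.000044, which agrees with the lr logged at epoch 1, iter 147 and at epoch 2, iter 1470.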
2023-10-13 15:42:35,868 Final evaluation on model from best epoch (best-model.pt)
2023-10-13 15:42:35,868 - metric: "('micro avg', 'f1-score')"
2023-10-13 15:42:35,868 ----------------------------------------------------------------------------------------------------
2023-10-13 15:42:35,868 Computation:
2023-10-13 15:42:35,868 - compute on device: cuda:0
2023-10-13 15:42:35,868 - embedding storage: none
2023-10-13 15:42:35,868 ----------------------------------------------------------------------------------------------------
2023-10-13 15:42:35,868 Model training base path: "hmbench-hipe2020/fr-dbmdz/bert-base-historic-multilingual-cased-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-1"
2023-10-13 15:42:35,868 ----------------------------------------------------------------------------------------------------
2023-10-13 15:42:35,868 ----------------------------------------------------------------------------------------------------
2023-10-13 15:42:42,852 epoch 1 - iter 147/1476 - loss 2.30028384 - time (sec): 6.98 - samples/sec: 2417.10 - lr: 0.000005 - momentum: 0.000000
2023-10-13 15:42:49,682 epoch 1 - iter 294/1476 - loss 1.45013254 - time (sec): 13.81 - samples/sec: 2399.46 - lr: 0.000010 - momentum: 0.000000
2023-10-13 15:42:57,042 epoch 1 - iter 441/1476 - loss 1.06928097 - time (sec): 21.17 - samples/sec: 2472.20 - lr: 0.000015 - momentum: 0.000000
2023-10-13 15:43:03,830 epoch 1 - iter 588/1476 - loss 0.89478293 - time (sec): 27.96 - samples/sec: 2410.19 - lr: 0.000020 - momentum: 0.000000
2023-10-13 15:43:10,679 epoch 1 - iter 735/1476 - loss 0.77703571 - time (sec): 34.81 - samples/sec: 2401.38 - lr: 0.000025 - momentum: 0.000000
2023-10-13 15:43:17,429 epoch 1 - iter 882/1476 - loss 0.69040845 - time (sec): 41.56 - samples/sec: 2383.42 - lr: 0.000030 - momentum: 0.000000
2023-10-13 15:43:24,222 epoch 1 - iter 1029/1476 - loss 0.62755418 - time (sec): 48.35 - samples/sec: 2364.76 - lr: 0.000035 - momentum: 0.000000
2023-10-13 15:43:30,951 epoch 1 - iter 1176/1476 - loss 0.57577336 - time (sec): 55.08 - samples/sec: 2354.56 - lr: 0.000040 - momentum: 0.000000
2023-10-13 15:43:38,279 epoch 1 - iter 1323/1476 - loss 0.52275812 - time (sec): 62.41 - samples/sec: 2389.60 - lr: 0.000045 - momentum: 0.000000
2023-10-13 15:43:45,281 epoch 1 - iter 1470/1476 - loss 0.48888637 - time (sec): 69.41 - samples/sec: 2388.74 - lr: 0.000050 - momentum: 0.000000
2023-10-13 15:43:45,558 ----------------------------------------------------------------------------------------------------
2023-10-13 15:43:45,558 EPOCH 1 done: loss 0.4876 - lr: 0.000050
2023-10-13 15:43:51,735 DEV : loss 0.13950972259044647 - f1-score (micro avg) 0.6846
2023-10-13 15:43:51,763 saving best model
2023-10-13 15:43:52,181 ----------------------------------------------------------------------------------------------------
2023-10-13 15:43:59,160 epoch 2 - iter 147/1476 - loss 0.15312363 - time (sec): 6.98 - samples/sec: 2436.22 - lr: 0.000049 - momentum: 0.000000
2023-10-13 15:44:05,796 epoch 2 - iter 294/1476 - loss 0.14515788 - time (sec): 13.61 - samples/sec: 2295.82 - lr: 0.000049 - momentum: 0.000000
2023-10-13 15:44:12,599 epoch 2 - iter 441/1476 - loss 0.14926136 - time (sec): 20.42 - samples/sec: 2289.42 - lr: 0.000048 - momentum: 0.000000
2023-10-13 15:44:19,464 epoch 2 - iter 588/1476 - loss 0.14665798 - time (sec): 27.28 - samples/sec: 2311.85 - lr: 0.000048 - momentum: 0.000000
2023-10-13 15:44:26,111 epoch 2 - iter 735/1476 - loss 0.14371952 - time (sec): 33.93 - samples/sec: 2309.91 - lr: 0.000047 - momentum: 0.000000
2023-10-13 15:44:34,075 epoch 2 - iter 882/1476 - loss 0.14402450 - time (sec): 41.89 - samples/sec: 2392.34 - lr: 0.000047 - momentum: 0.000000
2023-10-13 15:44:41,076 epoch 2 - iter 1029/1476 - loss 0.14095397 - time (sec): 48.89 - samples/sec: 2394.34 - lr: 0.000046 - momentum: 0.000000
2023-10-13 15:44:47,968 epoch 2 - iter 1176/1476 - loss 0.14022068 - time (sec): 55.79 - samples/sec: 2387.41 - lr: 0.000046 - momentum: 0.000000
2023-10-13 15:44:54,873 epoch 2 - iter 1323/1476 - loss 0.13964413 - time (sec): 62.69 - samples/sec: 2391.33 - lr: 0.000045 - momentum: 0.000000
2023-10-13 15:45:01,710 epoch 2 - iter 1470/1476 - loss 0.13667497 - time (sec): 69.53 - samples/sec: 2385.92 - lr: 0.000044 - momentum: 0.000000
2023-10-13 15:45:01,973 ----------------------------------------------------------------------------------------------------
2023-10-13 15:45:01,974 EPOCH 2 done: loss 0.1366 - lr: 0.000044
2023-10-13 15:45:13,154 DEV : loss 0.14815327525138855 - f1-score (micro avg) 0.783
2023-10-13 15:45:13,184 saving best model
2023-10-13 15:45:13,775 ----------------------------------------------------------------------------------------------------
2023-10-13 15:45:20,684 epoch 3 - iter 147/1476 - loss 0.08626971 - time (sec): 6.90 - samples/sec: 2220.34 - lr: 0.000044 - momentum: 0.000000
2023-10-13 15:45:27,452 epoch 3 - iter 294/1476 - loss 0.08510510 - time (sec): 13.67 - samples/sec: 2295.99 - lr: 0.000043 - momentum: 0.000000
2023-10-13 15:45:34,356 epoch 3 - iter 441/1476 - loss 0.09017739 - time (sec): 20.58 - samples/sec: 2360.38 - lr: 0.000043 - momentum: 0.000000
2023-10-13 15:45:41,425 epoch 3 - iter 588/1476 - loss 0.09251949 - time (sec): 27.64 - samples/sec: 2381.15 - lr: 0.000042 - momentum: 0.000000
2023-10-13 15:45:48,404 epoch 3 - iter 735/1476 - loss 0.09530762 - time (sec): 34.62 - samples/sec: 2408.54 - lr: 0.000042 - momentum: 0.000000
2023-10-13 15:45:54,862 epoch 3 - iter 882/1476 - loss 0.09482173 - time (sec): 41.08 - samples/sec: 2397.41 - lr: 0.000041 - momentum: 0.000000
2023-10-13 15:46:01,516 epoch 3 - iter 1029/1476 - loss 0.09283997 - time (sec): 47.74 - samples/sec: 2417.28 - lr: 0.000041 - momentum: 0.000000
2023-10-13 15:46:08,514 epoch 3 - iter 1176/1476 - loss 0.09432800 - time (sec): 54.73 - samples/sec: 2414.61 - lr: 0.000040 - momentum: 0.000000
2023-10-13 15:46:15,114 epoch 3 - iter 1323/1476 - loss 0.09378249 - time (sec): 61.33 - samples/sec: 2424.67 - lr: 0.000039 - momentum: 0.000000
2023-10-13 15:46:22,293 epoch 3 - iter 1470/1476 - loss 0.09168940 - time (sec): 68.51 - samples/sec: 2422.02 - lr: 0.000039 - momentum: 0.000000
2023-10-13 15:46:22,555 ----------------------------------------------------------------------------------------------------
2023-10-13 15:46:22,556 EPOCH 3 done: loss 0.0916 - lr: 0.000039
2023-10-13 15:46:33,729 DEV : loss 0.16625124216079712 - f1-score (micro avg) 0.7842
2023-10-13 15:46:33,759 saving best model
2023-10-13 15:46:34,290 ----------------------------------------------------------------------------------------------------
2023-10-13 15:46:40,939 epoch 4 - iter 147/1476 - loss 0.05685033 - time (sec): 6.65 - samples/sec: 2384.08 - lr: 0.000038 - momentum: 0.000000
2023-10-13 15:46:48,069 epoch 4 - iter 294/1476 - loss 0.06479773 - time (sec): 13.78 - samples/sec: 2436.98 - lr: 0.000038 - momentum: 0.000000
2023-10-13 15:46:55,648 epoch 4 - iter 441/1476 - loss 0.06082232 - time (sec): 21.35 - samples/sec: 2464.05 - lr: 0.000037 - momentum: 0.000000
2023-10-13 15:47:02,562 epoch 4 - iter 588/1476 - loss 0.06505374 - time (sec): 28.27 - samples/sec: 2398.45 - lr: 0.000037 - momentum: 0.000000
2023-10-13 15:47:09,655 epoch 4 - iter 735/1476 - loss 0.06490679 - time (sec): 35.36 - samples/sec: 2358.79 - lr: 0.000036 - momentum: 0.000000
2023-10-13 15:47:16,482 epoch 4 - iter 882/1476 - loss 0.06350270 - time (sec): 42.19 - samples/sec: 2323.62 - lr: 0.000036 - momentum: 0.000000
2023-10-13 15:47:23,844 epoch 4 - iter 1029/1476 - loss 0.06399013 - time (sec): 49.55 - samples/sec: 2342.65 - lr: 0.000035 - momentum: 0.000000
2023-10-13 15:47:30,580 epoch 4 - iter 1176/1476 - loss 0.06430831 - time (sec): 56.29 - samples/sec: 2330.72 - lr: 0.000034 - momentum: 0.000000
2023-10-13 15:47:37,629 epoch 4 - iter 1323/1476 - loss 0.06572771 - time (sec): 63.33 - samples/sec: 2354.52 - lr: 0.000034 - momentum: 0.000000
2023-10-13 15:47:44,586 epoch 4 - iter 1470/1476 - loss 0.06583572 - time (sec): 70.29 - samples/sec: 2359.14 - lr: 0.000033 - momentum: 0.000000
2023-10-13 15:47:44,847 ----------------------------------------------------------------------------------------------------
2023-10-13 15:47:44,848 EPOCH 4 done: loss 0.0658 - lr: 0.000033
2023-10-13 15:47:56,082 DEV : loss 0.17630523443222046 - f1-score (micro avg) 0.8153
2023-10-13 15:47:56,112 saving best model
2023-10-13 15:47:56,711 ----------------------------------------------------------------------------------------------------
2023-10-13 15:48:03,253 epoch 5 - iter 147/1476 - loss 0.04485170 - time (sec): 6.54 - samples/sec: 2355.16 - lr: 0.000033 - momentum: 0.000000
2023-10-13 15:48:09,999 epoch 5 - iter 294/1476 - loss 0.03902816 - time (sec): 13.28 - samples/sec: 2369.02 - lr: 0.000032 - momentum: 0.000000
2023-10-13 15:48:17,002 epoch 5 - iter 441/1476 - loss 0.04504258 - time (sec): 20.29 - samples/sec: 2406.17 - lr: 0.000032 - momentum: 0.000000
2023-10-13 15:48:23,884 epoch 5 - iter 588/1476 - loss 0.04251002 - time (sec): 27.17 - samples/sec: 2378.67 - lr: 0.000031 - momentum: 0.000000
2023-10-13 15:48:31,087 epoch 5 - iter 735/1476 - loss 0.04220286 - time (sec): 34.37 - samples/sec: 2391.39 - lr: 0.000031 - momentum: 0.000000
2023-10-13 15:48:38,154 epoch 5 - iter 882/1476 - loss 0.04528667 - time (sec): 41.44 - samples/sec: 2388.13 - lr: 0.000030 - momentum: 0.000000
2023-10-13 15:48:45,300 epoch 5 - iter 1029/1476 - loss 0.04637233 - time (sec): 48.59 - samples/sec: 2356.76 - lr: 0.000029 - momentum: 0.000000
2023-10-13 15:48:52,720 epoch 5 - iter 1176/1476 - loss 0.04767054 - time (sec): 56.01 - samples/sec: 2372.28 - lr: 0.000029 - momentum: 0.000000
2023-10-13 15:48:59,761 epoch 5 - iter 1323/1476 - loss 0.04742459 - time (sec): 63.05 - samples/sec: 2367.60 - lr: 0.000028 - momentum: 0.000000
2023-10-13 15:49:06,688 epoch 5 - iter 1470/1476 - loss 0.04806456 - time (sec): 69.97 - samples/sec: 2371.06 - lr: 0.000028 - momentum: 0.000000
2023-10-13 15:49:06,946 ----------------------------------------------------------------------------------------------------
2023-10-13 15:49:06,947 EPOCH 5 done: loss 0.0484 - lr: 0.000028
2023-10-13 15:49:18,117 DEV : loss 0.18819278478622437 - f1-score (micro avg) 0.7999
2023-10-13 15:49:18,147 ----------------------------------------------------------------------------------------------------
2023-10-13 15:49:25,047 epoch 6 - iter 147/1476 - loss 0.03631651 - time (sec): 6.90 - samples/sec: 2179.78 - lr: 0.000027 - momentum: 0.000000
2023-10-13 15:49:31,878 epoch 6 - iter 294/1476 - loss 0.03341746 - time (sec): 13.73 - samples/sec: 2233.74 - lr: 0.000027 - momentum: 0.000000
2023-10-13 15:49:39,202 epoch 6 - iter 441/1476 - loss 0.03106010 - time (sec): 21.05 - samples/sec: 2342.54 - lr: 0.000026 - momentum: 0.000000
2023-10-13 15:49:46,275 epoch 6 - iter 588/1476 - loss 0.03516578 - time (sec): 28.13 - samples/sec: 2336.61 - lr: 0.000026 - momentum: 0.000000
2023-10-13 15:49:53,144 epoch 6 - iter 735/1476 - loss 0.03483987 - time (sec): 35.00 - samples/sec: 2348.33 - lr: 0.000025 - momentum: 0.000000
2023-10-13 15:50:00,155 epoch 6 - iter 882/1476 - loss 0.03313512 - time (sec): 42.01 - samples/sec: 2372.38 - lr: 0.000024 - momentum: 0.000000
2023-10-13 15:50:06,939 epoch 6 - iter 1029/1476 - loss 0.03286769 - time (sec): 48.79 - samples/sec: 2353.00 - lr: 0.000024 - momentum: 0.000000
2023-10-13 15:50:13,772 epoch 6 - iter 1176/1476 - loss 0.03254763 - time (sec): 55.62 - samples/sec: 2356.11 - lr: 0.000023 - momentum: 0.000000
2023-10-13 15:50:21,043 epoch 6 - iter 1323/1476 - loss 0.03345823 - time (sec): 62.89 - samples/sec: 2387.16 - lr: 0.000023 - momentum: 0.000000
2023-10-13 15:50:27,847 epoch 6 - iter 1470/1476 - loss 0.03287175 - time (sec): 69.70 - samples/sec: 2380.16 - lr: 0.000022 - momentum: 0.000000
2023-10-13 15:50:28,118 ----------------------------------------------------------------------------------------------------
2023-10-13 15:50:28,119 EPOCH 6 done: loss 0.0328 - lr: 0.000022
2023-10-13 15:50:39,277 DEV : loss 0.20484893023967743 - f1-score (micro avg) 0.8029
2023-10-13 15:50:39,307 ----------------------------------------------------------------------------------------------------
2023-10-13 15:50:46,072 epoch 7 - iter 147/1476 - loss 0.01775270 - time (sec): 6.76 - samples/sec: 2267.31 - lr: 0.000022 - momentum: 0.000000
2023-10-13 15:50:53,821 epoch 7 - iter 294/1476 - loss 0.02118488 - time (sec): 14.51 - samples/sec: 2336.41 - lr: 0.000021 - momentum: 0.000000
2023-10-13 15:51:00,455 epoch 7 - iter 441/1476 - loss 0.02237123 - time (sec): 21.15 - samples/sec: 2323.11 - lr: 0.000021 - momentum: 0.000000
2023-10-13 15:51:07,535 epoch 7 - iter 588/1476 - loss 0.02143611 - time (sec): 28.23 - samples/sec: 2322.19 - lr: 0.000020 - momentum: 0.000000
2023-10-13 15:51:14,428 epoch 7 - iter 735/1476 - loss 0.02320718 - time (sec): 35.12 - samples/sec: 2341.16 - lr: 0.000019 - momentum: 0.000000
2023-10-13 15:51:21,513 epoch 7 - iter 882/1476 - loss 0.02422687 - time (sec): 42.21 - samples/sec: 2384.20 - lr: 0.000019 - momentum: 0.000000
2023-10-13 15:51:28,587 epoch 7 - iter 1029/1476 - loss 0.02343024 - time (sec): 49.28 - samples/sec: 2402.37 - lr: 0.000018 - momentum: 0.000000
2023-10-13 15:51:35,674 epoch 7 - iter 1176/1476 - loss 0.02306815 - time (sec): 56.37 - samples/sec: 2392.98 - lr: 0.000018 - momentum: 0.000000
2023-10-13 15:51:42,416 epoch 7 - iter 1323/1476 - loss 0.02322261 - time (sec): 63.11 - samples/sec: 2374.09 - lr: 0.000017 - momentum: 0.000000
2023-10-13 15:51:49,345 epoch 7 - iter 1470/1476 - loss 0.02302954 - time (sec): 70.04 - samples/sec: 2368.47 - lr: 0.000017 - momentum: 0.000000
2023-10-13 15:51:49,612 ----------------------------------------------------------------------------------------------------
2023-10-13 15:51:49,612 EPOCH 7 done: loss 0.0231 - lr: 0.000017
2023-10-13 15:52:00,780 DEV : loss 0.2176404744386673 - f1-score (micro avg) 0.8104
2023-10-13 15:52:00,810 ----------------------------------------------------------------------------------------------------
2023-10-13 15:52:07,878 epoch 8 - iter 147/1476 - loss 0.01220283 - time (sec): 7.07 - samples/sec: 2310.70 - lr: 0.000016 - momentum: 0.000000
2023-10-13 15:52:14,683 epoch 8 - iter 294/1476 - loss 0.01477755 - time (sec): 13.87 - samples/sec: 2316.37 - lr: 0.000016 - momentum: 0.000000
2023-10-13 15:52:21,717 epoch 8 - iter 441/1476 - loss 0.01557002 - time (sec): 20.91 - samples/sec: 2387.69 - lr: 0.000015 - momentum: 0.000000
2023-10-13 15:52:28,637 epoch 8 - iter 588/1476 - loss 0.01728477 - time (sec): 27.83 - samples/sec: 2368.74 - lr: 0.000014 - momentum: 0.000000
2023-10-13 15:52:35,625 epoch 8 - iter 735/1476 - loss 0.01772118 - time (sec): 34.81 - samples/sec: 2350.82 - lr: 0.000014 - momentum: 0.000000
2023-10-13 15:52:42,709 epoch 8 - iter 882/1476 - loss 0.01840414 - time (sec): 41.90 - samples/sec: 2332.60 - lr: 0.000013 - momentum: 0.000000
2023-10-13 15:52:49,619 epoch 8 - iter 1029/1476 - loss 0.01749539 - time (sec): 48.81 - samples/sec: 2327.72 - lr: 0.000013 - momentum: 0.000000
2023-10-13 15:52:57,024 epoch 8 - iter 1176/1476 - loss 0.01800568 - time (sec): 56.21 - samples/sec: 2344.29 - lr: 0.000012 - momentum: 0.000000
2023-10-13 15:53:03,895 epoch 8 - iter 1323/1476 - loss 0.01708246 - time (sec): 63.08 - samples/sec: 2351.96 - lr: 0.000012 - momentum: 0.000000
2023-10-13 15:53:10,882 epoch 8 - iter 1470/1476 - loss 0.01733501 - time (sec): 70.07 - samples/sec: 2368.38 - lr: 0.000011 - momentum: 0.000000
2023-10-13 15:53:11,151 ----------------------------------------------------------------------------------------------------
2023-10-13 15:53:11,151 EPOCH 8 done: loss 0.0173 - lr: 0.000011
2023-10-13 15:53:22,272 DEV : loss 0.20916695892810822 - f1-score (micro avg) 0.8181
2023-10-13 15:53:22,301 saving best model
2023-10-13 15:53:22,822 ----------------------------------------------------------------------------------------------------
2023-10-13 15:53:29,893 epoch 9 - iter 147/1476 - loss 0.01635506 - time (sec): 7.07 - samples/sec: 2461.72 - lr: 0.000011 - momentum: 0.000000
2023-10-13 15:53:36,767 epoch 9 - iter 294/1476 - loss 0.01655047 - time (sec): 13.94 - samples/sec: 2426.64 - lr: 0.000010 - momentum: 0.000000
2023-10-13 15:53:43,754 epoch 9 - iter 441/1476 - loss 0.01517177 - time (sec): 20.93 - samples/sec: 2367.10 - lr: 0.000009 - momentum: 0.000000
2023-10-13 15:53:50,701 epoch 9 - iter 588/1476 - loss 0.01363129 - time (sec): 27.88 - samples/sec: 2364.14 - lr: 0.000009 - momentum: 0.000000
2023-10-13 15:53:57,583 epoch 9 - iter 735/1476 - loss 0.01340167 - time (sec): 34.76 - samples/sec: 2360.92 - lr: 0.000008 - momentum: 0.000000
2023-10-13 15:54:04,370 epoch 9 - iter 882/1476 - loss 0.01292839 - time (sec): 41.54 - samples/sec: 2347.88 - lr: 0.000008 - momentum: 0.000000
2023-10-13 15:54:11,324 epoch 9 - iter 1029/1476 - loss 0.01181704 - time (sec): 48.50 - samples/sec: 2370.28 - lr: 0.000007 - momentum: 0.000000
2023-10-13 15:54:18,449 epoch 9 - iter 1176/1476 - loss 0.01179880 - time (sec): 55.62 - samples/sec: 2378.75 - lr: 0.000007 - momentum: 0.000000
2023-10-13 15:54:25,296 epoch 9 - iter 1323/1476 - loss 0.01152101 - time (sec): 62.47 - samples/sec: 2384.12 - lr: 0.000006 - momentum: 0.000000
2023-10-13 15:54:32,279 epoch 9 - iter 1470/1476 - loss 0.01152718 - time (sec): 69.45 - samples/sec: 2389.38 - lr: 0.000006 - momentum: 0.000000
2023-10-13 15:54:32,541 ----------------------------------------------------------------------------------------------------
2023-10-13 15:54:32,541 EPOCH 9 done: loss 0.0115 - lr: 0.000006
2023-10-13 15:54:43,751 DEV : loss 0.22271640598773956 - f1-score (micro avg) 0.8152
2023-10-13 15:54:43,780 ----------------------------------------------------------------------------------------------------
2023-10-13 15:54:50,626 epoch 10 - iter 147/1476 - loss 0.01146130 - time (sec): 6.84 - samples/sec: 2359.28 - lr: 0.000005 - momentum: 0.000000
2023-10-13 15:54:58,158 epoch 10 - iter 294/1476 - loss 0.00823611 - time (sec): 14.38 - samples/sec: 2480.56 - lr: 0.000004 - momentum: 0.000000
2023-10-13 15:55:05,151 epoch 10 - iter 441/1476 - loss 0.00776873 - time (sec): 21.37 - samples/sec: 2419.29 - lr: 0.000004 - momentum: 0.000000
2023-10-13 15:55:12,220 epoch 10 - iter 588/1476 - loss 0.00660613 - time (sec): 28.44 - samples/sec: 2368.38 - lr: 0.000003 - momentum: 0.000000
2023-10-13 15:55:18,963 epoch 10 - iter 735/1476 - loss 0.00628384 - time (sec): 35.18 - samples/sec: 2351.18 - lr: 0.000003 - momentum: 0.000000
2023-10-13 15:55:25,778 epoch 10 - iter 882/1476 - loss 0.00654294 - time (sec): 42.00 - samples/sec: 2333.14 - lr: 0.000002 - momentum: 0.000000
2023-10-13 15:55:33,071 epoch 10 - iter 1029/1476 - loss 0.00618452 - time (sec): 49.29 - samples/sec: 2340.88 - lr: 0.000002 - momentum: 0.000000
2023-10-13 15:55:40,251 epoch 10 - iter 1176/1476 - loss 0.00649196 - time (sec): 56.47 - samples/sec: 2335.05 - lr: 0.000001 - momentum: 0.000000
2023-10-13 15:55:47,188 epoch 10 - iter 1323/1476 - loss 0.00610162 - time (sec): 63.41 - samples/sec: 2332.22 - lr: 0.000001 - momentum: 0.000000
2023-10-13 15:55:54,348 epoch 10 - iter 1470/1476 - loss 0.00601101 - time (sec): 70.57 - samples/sec: 2353.06 - lr: 0.000000 - momentum: 0.000000
2023-10-13 15:55:54,607 ----------------------------------------------------------------------------------------------------
2023-10-13 15:55:54,607 EPOCH 10 done: loss 0.0060 - lr: 0.000000
2023-10-13 15:56:05,754 DEV : loss 0.22833691537380219 - f1-score (micro avg) 0.8171
2023-10-13 15:56:06,208 ----------------------------------------------------------------------------------------------------
2023-10-13 15:56:06,209 Loading model from best epoch ...
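The "saving best model" lines above appear only after epochs whose DEV micro F1 exceeds every earlier epoch. A minimal sketch of that selection rule, assuming a strict "greater than" comparison (the exact comparison the trainer uses is not shown in this log); the function name `checkpoint_epochs` is hypothetical and the scores are copied from the DEV lines.

```python
# DEV f1-score (micro avg) per epoch, copied from the log.
dev_f1 = {
    1: 0.6846, 2: 0.783, 3: 0.7842, 4: 0.8153, 5: 0.7999,
    6: 0.8029, 7: 0.8104, 8: 0.8181, 9: 0.8152, 10: 0.8171,
}


def checkpoint_epochs(scores):
    """Return the epochs at which a new best score would be saved.

    Assumption: a checkpoint is written only on strict improvement.
    """
    best = float("-inf")
    saved = []
    for epoch in sorted(scores):
        if scores[epoch] > best:
            best = scores[epoch]
            saved.append(epoch)
    return saved


saved = checkpoint_epochs(dev_f1)
print("checkpoints written after epochs:", saved)
print("best epoch:", saved[-1])
```

With the logged scores, the rule selects epochs 1, 2, 3, 4 and 8, the same epochs that are followed by a "saving best model" line, so best-model.pt corresponds to epoch 8.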
2023-10-13 15:56:07,748 SequenceTagger predicts: Dictionary with 21 tags: O, S-loc, B-loc, E-loc, I-loc, S-pers, B-pers, E-pers, I-pers, S-org, B-org, E-org, I-org, S-time, B-time, E-time, I-time, S-prod, B-prod, E-prod, I-prod
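The 21 tags listed above follow a BIOES-style scheme: one O tag plus the S, B, E and I prefixes for each of the five entity types, which matches out_features=21 of the final linear layer in the model summary. A small sketch that rebuilds the list; the function name `build_bioes_tags` is hypothetical and the ordering simply mirrors the logged dictionary.

```python
def build_bioes_tags(entity_types):
    """Build an O + S/B/E/I tag list for the given entity types (hypothetical helper)."""
    tags = ["O"]
    for entity_type in entity_types:
        for prefix in ("S", "B", "E", "I"):
            tags.append(f"{prefix}-{entity_type}")
    return tags


# Entity types as they appear in the logged tag dictionary.
tags = build_bioes_tags(["loc", "pers", "org", "time", "prod"])
print(len(tags), tags)
```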
2023-10-13 15:56:13,701
Results:
- F-score (micro) 0.7761
- F-score (macro) 0.6771
- Accuracy 0.6563
By class:
              precision    recall  f1-score   support

         loc     0.8328    0.8590    0.8457       858
        pers     0.7347    0.7840    0.7586       537
         org     0.5094    0.6136    0.5567       132
        time     0.5397    0.6296    0.5812        54
        prod     0.6852    0.6066    0.6435        61

   micro avg     0.7555    0.7978    0.7761      1642
   macro avg     0.6604    0.6986    0.6771      1642
weighted avg     0.7596    0.7978    0.7777      1642
2023-10-13 15:56:13,701 ----------------------------------------------------------------------------------------------------
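The averaged rows of the By class table can be cross-checked from the per-class rows. A minimal sketch, assuming the usual definitions (macro = unweighted mean over classes, weighted = support-weighted mean, micro F1 = harmonic mean of micro precision and recall); the helper names are hypothetical, and because the inputs are the rounded values printed in the log, results can differ from the reported figures in the last digit.

```python
# (precision, recall, f1, support) per class, copied from the log.
report = {
    "loc": (0.8328, 0.8590, 0.8457, 858),
    "pers": (0.7347, 0.7840, 0.7586, 537),
    "org": (0.5094, 0.6136, 0.5567, 132),
    "time": (0.5397, 0.6296, 0.5812, 54),
    "prod": (0.6852, 0.6066, 0.6435, 61),
}


def macro_average(rows, column):
    """Unweighted mean of one metric column over all classes."""
    return sum(row[column] for row in rows.values()) / len(rows)


def weighted_average(rows, column):
    """Mean of one metric column weighted by class support."""
    total = sum(row[3] for row in rows.values())
    return sum(row[column] * row[3] for row in rows.values()) / total


def f1_from(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


total_support = sum(row[3] for row in report.values())
macro_f1 = macro_average(report, 2)
weighted_f1 = weighted_average(report, 2)
micro_f1 = f1_from(0.7555, 0.7978)  # micro precision and recall from the log

print(f"support={total_support} macro_f1={macro_f1:.4f} "
      f"weighted_f1={weighted_f1:.4f} micro_f1={micro_f1:.4f}")
```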