Uploaded by stefan-it with huggingface_hub (./training.log, commit 41940c3)
2023-10-25 15:46:38,696 ----------------------------------------------------------------------------------------------------
2023-10-25 15:46:38,697 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(64001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-25 15:46:38,697 ----------------------------------------------------------------------------------------------------
2023-10-25 15:46:38,697 MultiCorpus: 7142 train + 698 dev + 2570 test sentences
- NER_HIPE_2022 Corpus: 7142 train + 698 dev + 2570 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fr/with_doc_seperator
2023-10-25 15:46:38,697 ----------------------------------------------------------------------------------------------------
2023-10-25 15:46:38,697 Train: 7142 sentences
2023-10-25 15:46:38,697 (train_with_dev=False, train_with_test=False)
2023-10-25 15:46:38,697 ----------------------------------------------------------------------------------------------------
2023-10-25 15:46:38,697 Training Params:
2023-10-25 15:46:38,697 - learning_rate: "5e-05"
2023-10-25 15:46:38,697 - mini_batch_size: "4"
2023-10-25 15:46:38,697 - max_epochs: "10"
2023-10-25 15:46:38,697 - shuffle: "True"
2023-10-25 15:46:38,697 ----------------------------------------------------------------------------------------------------
2023-10-25 15:46:38,697 Plugins:
2023-10-25 15:46:38,697 - TensorboardLogger
2023-10-25 15:46:38,697 - LinearScheduler | warmup_fraction: '0.1'
2023-10-25 15:46:38,697 ----------------------------------------------------------------------------------------------------
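The LinearScheduler plugin above warms the learning rate up over the first 10% of steps and then decays it linearly to zero, which is why the lr column below climbs to 0.000050 by the end of epoch 1 (1786 of 17860 total steps) and then falls. A minimal sketch of that schedule in plain Python (function name and step convention are illustrative, not Flair's internals):

```python
def linear_schedule(step, total_steps, peak_lr=5e-05, warmup_fraction=0.1):
    """Linear warmup to peak_lr, then linear decay to zero (illustrative)."""
    warmup_steps = int(total_steps * warmup_fraction)
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)

# 10 epochs x 1786 mini-batches = 17860 steps; warmup ends after 1786 steps,
# i.e. at the end of epoch 1, where the log reports lr 0.000050.
total_steps = 10 * 1786
```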
2023-10-25 15:46:38,697 Final evaluation on model from best epoch (best-model.pt)
2023-10-25 15:46:38,697 - metric: "('micro avg', 'f1-score')"
2023-10-25 15:46:38,698 ----------------------------------------------------------------------------------------------------
2023-10-25 15:46:38,698 Computation:
2023-10-25 15:46:38,698 - compute on device: cuda:0
2023-10-25 15:46:38,698 - embedding storage: none
2023-10-25 15:46:38,698 ----------------------------------------------------------------------------------------------------
2023-10-25 15:46:38,698 Model training base path: "hmbench-newseye/fr-dbmdz/bert-base-historic-multilingual-64k-td-cased-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-2"
2023-10-25 15:46:38,698 ----------------------------------------------------------------------------------------------------
2023-10-25 15:46:38,698 ----------------------------------------------------------------------------------------------------
2023-10-25 15:46:38,698 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-25 15:46:48,318 epoch 1 - iter 178/1786 - loss 1.71122294 - time (sec): 9.62 - samples/sec: 2650.79 - lr: 0.000005 - momentum: 0.000000
2023-10-25 15:46:57,968 epoch 1 - iter 356/1786 - loss 1.10034904 - time (sec): 19.27 - samples/sec: 2534.06 - lr: 0.000010 - momentum: 0.000000
2023-10-25 15:47:07,424 epoch 1 - iter 534/1786 - loss 0.83110797 - time (sec): 28.73 - samples/sec: 2515.44 - lr: 0.000015 - momentum: 0.000000
2023-10-25 15:47:16,715 epoch 1 - iter 712/1786 - loss 0.67436102 - time (sec): 38.02 - samples/sec: 2546.45 - lr: 0.000020 - momentum: 0.000000
2023-10-25 15:47:26,198 epoch 1 - iter 890/1786 - loss 0.57044616 - time (sec): 47.50 - samples/sec: 2567.57 - lr: 0.000025 - momentum: 0.000000
2023-10-25 15:47:35,361 epoch 1 - iter 1068/1786 - loss 0.49625403 - time (sec): 56.66 - samples/sec: 2617.57 - lr: 0.000030 - momentum: 0.000000
2023-10-25 15:47:44,672 epoch 1 - iter 1246/1786 - loss 0.44959498 - time (sec): 65.97 - samples/sec: 2624.58 - lr: 0.000035 - momentum: 0.000000
2023-10-25 15:47:54,339 epoch 1 - iter 1424/1786 - loss 0.41355462 - time (sec): 75.64 - samples/sec: 2617.35 - lr: 0.000040 - momentum: 0.000000
2023-10-25 15:48:03,738 epoch 1 - iter 1602/1786 - loss 0.38486574 - time (sec): 85.04 - samples/sec: 2618.41 - lr: 0.000045 - momentum: 0.000000
2023-10-25 15:48:12,801 epoch 1 - iter 1780/1786 - loss 0.36139254 - time (sec): 94.10 - samples/sec: 2636.46 - lr: 0.000050 - momentum: 0.000000
2023-10-25 15:48:13,083 ----------------------------------------------------------------------------------------------------
2023-10-25 15:48:13,083 EPOCH 1 done: loss 0.3608 - lr: 0.000050
2023-10-25 15:48:17,092 DEV : loss 0.13503123819828033 - f1-score (micro avg) 0.7202
2023-10-25 15:48:17,116 saving best model
2023-10-25 15:48:17,612 ----------------------------------------------------------------------------------------------------
2023-10-25 15:48:27,486 epoch 2 - iter 178/1786 - loss 0.12407203 - time (sec): 9.87 - samples/sec: 2599.84 - lr: 0.000049 - momentum: 0.000000
2023-10-25 15:48:37,042 epoch 2 - iter 356/1786 - loss 0.12736955 - time (sec): 19.43 - samples/sec: 2426.93 - lr: 0.000049 - momentum: 0.000000
2023-10-25 15:48:46,727 epoch 2 - iter 534/1786 - loss 0.12458317 - time (sec): 29.11 - samples/sec: 2535.20 - lr: 0.000048 - momentum: 0.000000
2023-10-25 15:48:56,148 epoch 2 - iter 712/1786 - loss 0.12184214 - time (sec): 38.53 - samples/sec: 2555.59 - lr: 0.000048 - momentum: 0.000000
2023-10-25 15:49:05,444 epoch 2 - iter 890/1786 - loss 0.12055895 - time (sec): 47.83 - samples/sec: 2593.23 - lr: 0.000047 - momentum: 0.000000
2023-10-25 15:49:14,963 epoch 2 - iter 1068/1786 - loss 0.12103444 - time (sec): 57.35 - samples/sec: 2594.01 - lr: 0.000047 - momentum: 0.000000
2023-10-25 15:49:24,474 epoch 2 - iter 1246/1786 - loss 0.11982038 - time (sec): 66.86 - samples/sec: 2615.69 - lr: 0.000046 - momentum: 0.000000
2023-10-25 15:49:33,793 epoch 2 - iter 1424/1786 - loss 0.11974673 - time (sec): 76.18 - samples/sec: 2589.45 - lr: 0.000046 - momentum: 0.000000
2023-10-25 15:49:43,364 epoch 2 - iter 1602/1786 - loss 0.11956401 - time (sec): 85.75 - samples/sec: 2596.32 - lr: 0.000045 - momentum: 0.000000
2023-10-25 15:49:53,170 epoch 2 - iter 1780/1786 - loss 0.11947260 - time (sec): 95.56 - samples/sec: 2592.81 - lr: 0.000044 - momentum: 0.000000
2023-10-25 15:49:53,498 ----------------------------------------------------------------------------------------------------
2023-10-25 15:49:53,498 EPOCH 2 done: loss 0.1196 - lr: 0.000044
2023-10-25 15:49:57,539 DEV : loss 0.11700031161308289 - f1-score (micro avg) 0.7623
2023-10-25 15:49:57,560 saving best model
2023-10-25 15:49:58,199 ----------------------------------------------------------------------------------------------------
2023-10-25 15:50:07,682 epoch 3 - iter 178/1786 - loss 0.07945384 - time (sec): 9.48 - samples/sec: 2683.68 - lr: 0.000044 - momentum: 0.000000
2023-10-25 15:50:17,051 epoch 3 - iter 356/1786 - loss 0.07503847 - time (sec): 18.85 - samples/sec: 2596.72 - lr: 0.000043 - momentum: 0.000000
2023-10-25 15:50:26,519 epoch 3 - iter 534/1786 - loss 0.07795492 - time (sec): 28.32 - samples/sec: 2648.33 - lr: 0.000043 - momentum: 0.000000
2023-10-25 15:50:35,768 epoch 3 - iter 712/1786 - loss 0.08107118 - time (sec): 37.57 - samples/sec: 2654.03 - lr: 0.000042 - momentum: 0.000000
2023-10-25 15:50:44,402 epoch 3 - iter 890/1786 - loss 0.08084430 - time (sec): 46.20 - samples/sec: 2662.85 - lr: 0.000042 - momentum: 0.000000
2023-10-25 15:50:53,144 epoch 3 - iter 1068/1786 - loss 0.08209011 - time (sec): 54.94 - samples/sec: 2688.32 - lr: 0.000041 - momentum: 0.000000
2023-10-25 15:51:02,410 epoch 3 - iter 1246/1786 - loss 0.08279690 - time (sec): 64.21 - samples/sec: 2707.24 - lr: 0.000041 - momentum: 0.000000
2023-10-25 15:51:11,472 epoch 3 - iter 1424/1786 - loss 0.08293172 - time (sec): 73.27 - samples/sec: 2725.81 - lr: 0.000040 - momentum: 0.000000
2023-10-25 15:51:20,625 epoch 3 - iter 1602/1786 - loss 0.08232577 - time (sec): 82.42 - samples/sec: 2731.48 - lr: 0.000039 - momentum: 0.000000
2023-10-25 15:51:29,623 epoch 3 - iter 1780/1786 - loss 0.08244948 - time (sec): 91.42 - samples/sec: 2714.74 - lr: 0.000039 - momentum: 0.000000
2023-10-25 15:51:29,917 ----------------------------------------------------------------------------------------------------
2023-10-25 15:51:29,917 EPOCH 3 done: loss 0.0825 - lr: 0.000039
2023-10-25 15:51:34,711 DEV : loss 0.1378549337387085 - f1-score (micro avg) 0.755
2023-10-25 15:51:34,733 ----------------------------------------------------------------------------------------------------
2023-10-25 15:51:43,650 epoch 4 - iter 178/1786 - loss 0.05427530 - time (sec): 8.91 - samples/sec: 2798.78 - lr: 0.000038 - momentum: 0.000000
2023-10-25 15:51:52,829 epoch 4 - iter 356/1786 - loss 0.06260664 - time (sec): 18.09 - samples/sec: 2759.29 - lr: 0.000038 - momentum: 0.000000
2023-10-25 15:52:01,468 epoch 4 - iter 534/1786 - loss 0.06187774 - time (sec): 26.73 - samples/sec: 2746.84 - lr: 0.000037 - momentum: 0.000000
2023-10-25 15:52:10,600 epoch 4 - iter 712/1786 - loss 0.06195480 - time (sec): 35.87 - samples/sec: 2765.81 - lr: 0.000037 - momentum: 0.000000
2023-10-25 15:52:19,483 epoch 4 - iter 890/1786 - loss 0.06149034 - time (sec): 44.75 - samples/sec: 2776.77 - lr: 0.000036 - momentum: 0.000000
2023-10-25 15:52:28,516 epoch 4 - iter 1068/1786 - loss 0.06169674 - time (sec): 53.78 - samples/sec: 2806.92 - lr: 0.000036 - momentum: 0.000000
2023-10-25 15:52:37,349 epoch 4 - iter 1246/1786 - loss 0.06445329 - time (sec): 62.61 - samples/sec: 2795.13 - lr: 0.000035 - momentum: 0.000000
2023-10-25 15:52:46,539 epoch 4 - iter 1424/1786 - loss 0.06499429 - time (sec): 71.80 - samples/sec: 2758.74 - lr: 0.000034 - momentum: 0.000000
2023-10-25 15:52:55,747 epoch 4 - iter 1602/1786 - loss 0.06306222 - time (sec): 81.01 - samples/sec: 2771.19 - lr: 0.000034 - momentum: 0.000000
2023-10-25 15:53:04,431 epoch 4 - iter 1780/1786 - loss 0.06282893 - time (sec): 89.70 - samples/sec: 2767.02 - lr: 0.000033 - momentum: 0.000000
2023-10-25 15:53:04,706 ----------------------------------------------------------------------------------------------------
2023-10-25 15:53:04,707 EPOCH 4 done: loss 0.0630 - lr: 0.000033
2023-10-25 15:53:09,579 DEV : loss 0.18551814556121826 - f1-score (micro avg) 0.7519
2023-10-25 15:53:09,603 ----------------------------------------------------------------------------------------------------
2023-10-25 15:53:18,813 epoch 5 - iter 178/1786 - loss 0.04430229 - time (sec): 9.21 - samples/sec: 2578.40 - lr: 0.000033 - momentum: 0.000000
2023-10-25 15:53:28,613 epoch 5 - iter 356/1786 - loss 0.04196870 - time (sec): 19.01 - samples/sec: 2602.50 - lr: 0.000032 - momentum: 0.000000
2023-10-25 15:53:38,464 epoch 5 - iter 534/1786 - loss 0.04425551 - time (sec): 28.86 - samples/sec: 2605.34 - lr: 0.000032 - momentum: 0.000000
2023-10-25 15:53:47,762 epoch 5 - iter 712/1786 - loss 0.04299756 - time (sec): 38.16 - samples/sec: 2606.77 - lr: 0.000031 - momentum: 0.000000
2023-10-25 15:53:56,812 epoch 5 - iter 890/1786 - loss 0.04398595 - time (sec): 47.21 - samples/sec: 2619.00 - lr: 0.000031 - momentum: 0.000000
2023-10-25 15:54:05,975 epoch 5 - iter 1068/1786 - loss 0.04465443 - time (sec): 56.37 - samples/sec: 2659.82 - lr: 0.000030 - momentum: 0.000000
2023-10-25 15:54:14,748 epoch 5 - iter 1246/1786 - loss 0.04413175 - time (sec): 65.14 - samples/sec: 2672.35 - lr: 0.000029 - momentum: 0.000000
2023-10-25 15:54:23,729 epoch 5 - iter 1424/1786 - loss 0.04310012 - time (sec): 74.12 - samples/sec: 2673.94 - lr: 0.000029 - momentum: 0.000000
2023-10-25 15:54:32,836 epoch 5 - iter 1602/1786 - loss 0.04362101 - time (sec): 83.23 - samples/sec: 2677.74 - lr: 0.000028 - momentum: 0.000000
2023-10-25 15:54:42,313 epoch 5 - iter 1780/1786 - loss 0.04499036 - time (sec): 92.71 - samples/sec: 2675.55 - lr: 0.000028 - momentum: 0.000000
2023-10-25 15:54:42,605 ----------------------------------------------------------------------------------------------------
2023-10-25 15:54:42,605 EPOCH 5 done: loss 0.0449 - lr: 0.000028
2023-10-25 15:54:46,479 DEV : loss 0.18028688430786133 - f1-score (micro avg) 0.8033
2023-10-25 15:54:46,503 saving best model
2023-10-25 15:54:47,160 ----------------------------------------------------------------------------------------------------
2023-10-25 15:54:56,791 epoch 6 - iter 178/1786 - loss 0.02633064 - time (sec): 9.63 - samples/sec: 2614.15 - lr: 0.000027 - momentum: 0.000000
2023-10-25 15:55:06,138 epoch 6 - iter 356/1786 - loss 0.02540360 - time (sec): 18.97 - samples/sec: 2572.31 - lr: 0.000027 - momentum: 0.000000
2023-10-25 15:55:15,646 epoch 6 - iter 534/1786 - loss 0.02731547 - time (sec): 28.48 - samples/sec: 2601.99 - lr: 0.000026 - momentum: 0.000000
2023-10-25 15:55:25,066 epoch 6 - iter 712/1786 - loss 0.03161683 - time (sec): 37.90 - samples/sec: 2616.15 - lr: 0.000026 - momentum: 0.000000
2023-10-25 15:55:34,736 epoch 6 - iter 890/1786 - loss 0.03258884 - time (sec): 47.57 - samples/sec: 2632.34 - lr: 0.000025 - momentum: 0.000000
2023-10-25 15:55:44,052 epoch 6 - iter 1068/1786 - loss 0.03444115 - time (sec): 56.89 - samples/sec: 2608.85 - lr: 0.000024 - momentum: 0.000000
2023-10-25 15:55:53,428 epoch 6 - iter 1246/1786 - loss 0.03455081 - time (sec): 66.27 - samples/sec: 2624.54 - lr: 0.000024 - momentum: 0.000000
2023-10-25 15:56:02,885 epoch 6 - iter 1424/1786 - loss 0.03507010 - time (sec): 75.72 - samples/sec: 2631.78 - lr: 0.000023 - momentum: 0.000000
2023-10-25 15:56:11,781 epoch 6 - iter 1602/1786 - loss 0.03617707 - time (sec): 84.62 - samples/sec: 2646.56 - lr: 0.000023 - momentum: 0.000000
2023-10-25 15:56:20,674 epoch 6 - iter 1780/1786 - loss 0.03554486 - time (sec): 93.51 - samples/sec: 2654.32 - lr: 0.000022 - momentum: 0.000000
2023-10-25 15:56:20,971 ----------------------------------------------------------------------------------------------------
2023-10-25 15:56:20,971 EPOCH 6 done: loss 0.0357 - lr: 0.000022
2023-10-25 15:56:25,780 DEV : loss 0.1820164918899536 - f1-score (micro avg) 0.7943
2023-10-25 15:56:25,801 ----------------------------------------------------------------------------------------------------
2023-10-25 15:56:35,262 epoch 7 - iter 178/1786 - loss 0.02389501 - time (sec): 9.46 - samples/sec: 2817.92 - lr: 0.000022 - momentum: 0.000000
2023-10-25 15:56:44,521 epoch 7 - iter 356/1786 - loss 0.03026582 - time (sec): 18.72 - samples/sec: 2707.42 - lr: 0.000021 - momentum: 0.000000
2023-10-25 15:56:53,894 epoch 7 - iter 534/1786 - loss 0.03128907 - time (sec): 28.09 - samples/sec: 2670.28 - lr: 0.000021 - momentum: 0.000000
2023-10-25 15:57:03,184 epoch 7 - iter 712/1786 - loss 0.03038936 - time (sec): 37.38 - samples/sec: 2681.89 - lr: 0.000020 - momentum: 0.000000
2023-10-25 15:57:12,520 epoch 7 - iter 890/1786 - loss 0.02884850 - time (sec): 46.72 - samples/sec: 2668.10 - lr: 0.000019 - momentum: 0.000000
2023-10-25 15:57:21,892 epoch 7 - iter 1068/1786 - loss 0.02871245 - time (sec): 56.09 - samples/sec: 2650.46 - lr: 0.000019 - momentum: 0.000000
2023-10-25 15:57:31,104 epoch 7 - iter 1246/1786 - loss 0.02822760 - time (sec): 65.30 - samples/sec: 2629.02 - lr: 0.000018 - momentum: 0.000000
2023-10-25 15:57:40,635 epoch 7 - iter 1424/1786 - loss 0.02843002 - time (sec): 74.83 - samples/sec: 2645.70 - lr: 0.000018 - momentum: 0.000000
2023-10-25 15:57:49,434 epoch 7 - iter 1602/1786 - loss 0.02811501 - time (sec): 83.63 - samples/sec: 2659.82 - lr: 0.000017 - momentum: 0.000000
2023-10-25 15:57:58,178 epoch 7 - iter 1780/1786 - loss 0.02826981 - time (sec): 92.38 - samples/sec: 2685.53 - lr: 0.000017 - momentum: 0.000000
2023-10-25 15:57:58,469 ----------------------------------------------------------------------------------------------------
2023-10-25 15:57:58,470 EPOCH 7 done: loss 0.0282 - lr: 0.000017
2023-10-25 15:58:03,620 DEV : loss 0.2033814787864685 - f1-score (micro avg) 0.7832
2023-10-25 15:58:03,643 ----------------------------------------------------------------------------------------------------
2023-10-25 15:58:13,174 epoch 8 - iter 178/1786 - loss 0.02698386 - time (sec): 9.53 - samples/sec: 2513.72 - lr: 0.000016 - momentum: 0.000000
2023-10-25 15:58:23,042 epoch 8 - iter 356/1786 - loss 0.02189818 - time (sec): 19.40 - samples/sec: 2504.44 - lr: 0.000016 - momentum: 0.000000
2023-10-25 15:58:32,872 epoch 8 - iter 534/1786 - loss 0.02241369 - time (sec): 29.23 - samples/sec: 2531.08 - lr: 0.000015 - momentum: 0.000000
2023-10-25 15:58:42,433 epoch 8 - iter 712/1786 - loss 0.02213136 - time (sec): 38.79 - samples/sec: 2517.54 - lr: 0.000014 - momentum: 0.000000
2023-10-25 15:58:52,151 epoch 8 - iter 890/1786 - loss 0.02156072 - time (sec): 48.51 - samples/sec: 2522.71 - lr: 0.000014 - momentum: 0.000000
2023-10-25 15:59:01,830 epoch 8 - iter 1068/1786 - loss 0.02028695 - time (sec): 58.18 - samples/sec: 2539.64 - lr: 0.000013 - momentum: 0.000000
2023-10-25 15:59:11,492 epoch 8 - iter 1246/1786 - loss 0.01971787 - time (sec): 67.85 - samples/sec: 2567.47 - lr: 0.000013 - momentum: 0.000000
2023-10-25 15:59:21,218 epoch 8 - iter 1424/1786 - loss 0.01980862 - time (sec): 77.57 - samples/sec: 2547.94 - lr: 0.000012 - momentum: 0.000000
2023-10-25 15:59:30,611 epoch 8 - iter 1602/1786 - loss 0.02006421 - time (sec): 86.97 - samples/sec: 2557.81 - lr: 0.000012 - momentum: 0.000000
2023-10-25 15:59:39,890 epoch 8 - iter 1780/1786 - loss 0.01964747 - time (sec): 96.25 - samples/sec: 2577.09 - lr: 0.000011 - momentum: 0.000000
2023-10-25 15:59:40,217 ----------------------------------------------------------------------------------------------------
2023-10-25 15:59:40,217 EPOCH 8 done: loss 0.0197 - lr: 0.000011
2023-10-25 15:59:44,159 DEV : loss 0.21475903689861298 - f1-score (micro avg) 0.79
2023-10-25 15:59:44,183 ----------------------------------------------------------------------------------------------------
2023-10-25 15:59:53,943 epoch 9 - iter 178/1786 - loss 0.00854585 - time (sec): 9.76 - samples/sec: 2529.14 - lr: 0.000011 - momentum: 0.000000
2023-10-25 16:00:03,415 epoch 9 - iter 356/1786 - loss 0.00730957 - time (sec): 19.23 - samples/sec: 2546.52 - lr: 0.000010 - momentum: 0.000000
2023-10-25 16:00:12,902 epoch 9 - iter 534/1786 - loss 0.00824375 - time (sec): 28.72 - samples/sec: 2565.61 - lr: 0.000009 - momentum: 0.000000
2023-10-25 16:00:22,790 epoch 9 - iter 712/1786 - loss 0.01124988 - time (sec): 38.61 - samples/sec: 2607.10 - lr: 0.000009 - momentum: 0.000000
2023-10-25 16:00:32,365 epoch 9 - iter 890/1786 - loss 0.01242606 - time (sec): 48.18 - samples/sec: 2611.93 - lr: 0.000008 - momentum: 0.000000
2023-10-25 16:00:41,872 epoch 9 - iter 1068/1786 - loss 0.01230465 - time (sec): 57.69 - samples/sec: 2624.15 - lr: 0.000008 - momentum: 0.000000
2023-10-25 16:00:51,095 epoch 9 - iter 1246/1786 - loss 0.01297617 - time (sec): 66.91 - samples/sec: 2616.87 - lr: 0.000007 - momentum: 0.000000
2023-10-25 16:01:00,313 epoch 9 - iter 1424/1786 - loss 0.01268858 - time (sec): 76.13 - samples/sec: 2634.73 - lr: 0.000007 - momentum: 0.000000
2023-10-25 16:01:09,907 epoch 9 - iter 1602/1786 - loss 0.01258302 - time (sec): 85.72 - samples/sec: 2619.87 - lr: 0.000006 - momentum: 0.000000
2023-10-25 16:01:19,662 epoch 9 - iter 1780/1786 - loss 0.01218671 - time (sec): 95.48 - samples/sec: 2598.21 - lr: 0.000006 - momentum: 0.000000
2023-10-25 16:01:19,987 ----------------------------------------------------------------------------------------------------
2023-10-25 16:01:19,987 EPOCH 9 done: loss 0.0122 - lr: 0.000006
2023-10-25 16:01:25,365 DEV : loss 0.21478621661663055 - f1-score (micro avg) 0.7938
2023-10-25 16:01:25,386 ----------------------------------------------------------------------------------------------------
2023-10-25 16:01:34,658 epoch 10 - iter 178/1786 - loss 0.01114122 - time (sec): 9.27 - samples/sec: 2577.39 - lr: 0.000005 - momentum: 0.000000
2023-10-25 16:01:44,026 epoch 10 - iter 356/1786 - loss 0.00936457 - time (sec): 18.64 - samples/sec: 2663.26 - lr: 0.000004 - momentum: 0.000000
2023-10-25 16:01:53,778 epoch 10 - iter 534/1786 - loss 0.00753507 - time (sec): 28.39 - samples/sec: 2632.67 - lr: 0.000004 - momentum: 0.000000
2023-10-25 16:02:02,735 epoch 10 - iter 712/1786 - loss 0.00724194 - time (sec): 37.35 - samples/sec: 2675.25 - lr: 0.000003 - momentum: 0.000000
2023-10-25 16:02:12,223 epoch 10 - iter 890/1786 - loss 0.00681051 - time (sec): 46.84 - samples/sec: 2606.92 - lr: 0.000003 - momentum: 0.000000
2023-10-25 16:02:21,833 epoch 10 - iter 1068/1786 - loss 0.00691939 - time (sec): 56.45 - samples/sec: 2622.33 - lr: 0.000002 - momentum: 0.000000
2023-10-25 16:02:31,013 epoch 10 - iter 1246/1786 - loss 0.00730625 - time (sec): 65.63 - samples/sec: 2633.25 - lr: 0.000002 - momentum: 0.000000
2023-10-25 16:02:40,785 epoch 10 - iter 1424/1786 - loss 0.00715163 - time (sec): 75.40 - samples/sec: 2622.41 - lr: 0.000001 - momentum: 0.000000
2023-10-25 16:02:50,482 epoch 10 - iter 1602/1786 - loss 0.00713533 - time (sec): 85.09 - samples/sec: 2603.48 - lr: 0.000001 - momentum: 0.000000
2023-10-25 16:03:00,096 epoch 10 - iter 1780/1786 - loss 0.00759595 - time (sec): 94.71 - samples/sec: 2619.46 - lr: 0.000000 - momentum: 0.000000
2023-10-25 16:03:00,416 ----------------------------------------------------------------------------------------------------
2023-10-25 16:03:00,416 EPOCH 10 done: loss 0.0076 - lr: 0.000000
2023-10-25 16:03:04,801 DEV : loss 0.2166852504014969 - f1-score (micro avg) 0.7957
2023-10-25 16:03:06,067 ----------------------------------------------------------------------------------------------------
2023-10-25 16:03:06,068 Loading model from best epoch ...
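"saving best model" appears only after epochs 1, 2 and 5 because a checkpoint is kept whenever the dev score improves on the best seen so far; epoch 5 (dev micro-F1 0.8033) is therefore the checkpoint reloaded here. A sketch of that selection logic (helper name is illustrative, not Flair's API):

```python
def best_model_saves(dev_scores):
    """Return the epochs at which a new best dev score triggers a save."""
    best, saves = float("-inf"), []
    for epoch, score in enumerate(dev_scores, start=1):
        if score > best:
            best, saves = score, saves + [epoch]
    return saves

# Dev micro-F1 per epoch, copied from the log above.
dev_f1 = [0.7202, 0.7623, 0.7550, 0.7519, 0.8033,
          0.7943, 0.7832, 0.7900, 0.7938, 0.7957]
```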
2023-10-25 16:03:07,869 SequenceTagger predicts: Dictionary with 17 tags: O, S-PER, B-PER, E-PER, I-PER, S-LOC, B-LOC, E-LOC, I-LOC, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
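The 17 tags follow the BIOES scheme (Single, Begin, Inside, End per entity type, plus O). Decoding a predicted tag sequence back into entity spans can be sketched as follows (function name is illustrative, not Flair's API):

```python
def bioes_to_spans(tags):
    """Convert a BIOES tag sequence into (start, end, label) spans."""
    spans, start = [], None
    for i, tag in enumerate(tags):
        if tag == "O":
            start = None
            continue
        prefix, label = tag.split("-", 1)
        if prefix == "S":                  # single-token entity
            spans.append((i, i, label))
            start = None
        elif prefix == "B":                # entity begins
            start = i
        elif prefix == "E" and start is not None:  # entity ends
            spans.append((start, i, label))
            start = None
    return spans
```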
2023-10-25 16:03:20,794
Results:
- F-score (micro) 0.6746
- F-score (macro) 0.5843
- Accuracy 0.521
By class:
              precision    recall  f1-score   support

         LOC     0.6880    0.6484    0.6676      1095
         PER     0.7798    0.7559    0.7677      1012
         ORG     0.4706    0.4706    0.4706       357
   HumanProd     0.3188    0.6667    0.4314        33

   micro avg     0.6827    0.6668    0.6746      2497
   macro avg     0.5643    0.6354    0.5843      2497
weighted avg     0.6892    0.6668    0.6769      2497
2023-10-25 16:03:20,794 ----------------------------------------------------------------------------------------------------
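The macro avg row above is the unweighted mean of the per-class F1 scores, while weighted avg weights each class by its support; both can be re-derived from the table:

```python
# (f1-score, support) per class, copied from the test-set table above.
classes = {"LOC": (0.6676, 1095), "PER": (0.7677, 1012),
           "ORG": (0.4706, 357), "HumanProd": (0.4314, 33)}

f1s = [f1 for f1, _ in classes.values()]
macro_f1 = sum(f1s) / len(f1s)                      # unweighted mean -> 0.5843

total = sum(n for _, n in classes.values())         # 2497 test entities
weighted_f1 = sum(f1 * n for f1, n in classes.values()) / total  # -> 0.6769
```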