2023-10-25 16:26:03,207 ----------------------------------------------------------------------------------------------------
2023-10-25 16:26:03,208 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(64001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-25 16:26:03,208 ----------------------------------------------------------------------------------------------------
2023-10-25 16:26:03,208 MultiCorpus: 7142 train + 698 dev + 2570 test sentences
- NER_HIPE_2022 Corpus: 7142 train + 698 dev + 2570 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fr/with_doc_seperator
2023-10-25 16:26:03,208 ----------------------------------------------------------------------------------------------------
2023-10-25 16:26:03,208 Train: 7142 sentences
2023-10-25 16:26:03,208 (train_with_dev=False, train_with_test=False)
2023-10-25 16:26:03,208 ----------------------------------------------------------------------------------------------------
2023-10-25 16:26:03,208 Training Params:
2023-10-25 16:26:03,208 - learning_rate: "3e-05"
2023-10-25 16:26:03,208 - mini_batch_size: "4"
2023-10-25 16:26:03,208 - max_epochs: "10"
2023-10-25 16:26:03,208 - shuffle: "True"
2023-10-25 16:26:03,208 ----------------------------------------------------------------------------------------------------
2023-10-25 16:26:03,208 Plugins:
2023-10-25 16:26:03,208 - TensorboardLogger
2023-10-25 16:26:03,208 - LinearScheduler | warmup_fraction: '0.1'
2023-10-25 16:26:03,208 ----------------------------------------------------------------------------------------------------
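The LinearScheduler plugin warms the learning rate up over the first 10% of all optimizer steps and then decays it linearly to zero, which is exactly what the lr column in the per-iteration lines below shows (a ramp to 3e-05 during epoch 1, then a steady decline). A minimal sketch of that schedule; the function name `linear_schedule_lr` and this pure-Python form are illustrative assumptions, not Flair's actual implementation:

```python
def linear_schedule_lr(step, total_steps, max_lr=3e-05, warmup_fraction=0.1):
    """Linear warmup to max_lr, then linear decay to zero (a sketch)."""
    warmup_steps = int(total_steps * warmup_fraction)
    if step < warmup_steps:
        # Warmup phase: lr grows proportionally to the step count.
        return max_lr * step / max(1, warmup_steps)
    # Decay phase: lr shrinks linearly over the remaining steps.
    return max_lr * (total_steps - step) / max(1, total_steps - warmup_steps)
```

With 1786 iterations per epoch and 10 epochs, total_steps is 17860 and the 10% warmup covers epoch 1, matching the lr values logged at iter 178 of epoch 1 (≈3e-06) and at the end of epoch 2 (≈2.7e-05).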
2023-10-25 16:26:03,208 Final evaluation on model from best epoch (best-model.pt)
2023-10-25 16:26:03,209 - metric: "('micro avg', 'f1-score')"
2023-10-25 16:26:03,209 ----------------------------------------------------------------------------------------------------
2023-10-25 16:26:03,209 Computation:
2023-10-25 16:26:03,209 - compute on device: cuda:0
2023-10-25 16:26:03,209 - embedding storage: none
2023-10-25 16:26:03,209 ----------------------------------------------------------------------------------------------------
2023-10-25 16:26:03,209 Model training base path: "hmbench-newseye/fr-dbmdz/bert-base-historic-multilingual-64k-td-cased-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-3"
2023-10-25 16:26:03,209 ----------------------------------------------------------------------------------------------------
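The configuration above corresponds to a Flair fine-tuning run on the HIPE-2022 newseye French corpus with hmBERT-64k embeddings (first-subtoken pooling, last layer only, no CRF, as encoded in the base path). A config-style sketch of how such a run is typically set up; the exact constructor arguments and dataset parameters are assumptions that mirror, rather than reproduce, the original script:

```python
from flair.datasets import NER_HIPE_2022
from flair.embeddings import TransformerWordEmbeddings
from flair.models import SequenceTagger
from flair.trainers import ModelTrainer

# HIPE-2022 newseye French split (argument names assumed).
corpus = NER_HIPE_2022(dataset_name="newseye", language="fr")
label_dict = corpus.make_label_dictionary(label_type="ner")

# hmBERT 64k embeddings: last layer, first-subtoken pooling, fine-tuned.
embeddings = TransformerWordEmbeddings(
    model="dbmdz/bert-base-historic-multilingual-64k-td-cased",
    layers="-1",
    subtoken_pooling="first",
    fine_tune=True,
)

# Linear head over the embeddings; no CRF, no RNN (crfFalse in the base path).
tagger = SequenceTagger(
    hidden_size=256,
    embeddings=embeddings,
    tag_dictionary=label_dict,
    tag_type="ner",
    use_crf=False,
    use_rnn=False,
)

trainer = ModelTrainer(tagger, corpus)
trainer.fine_tune(
    "hmbench-newseye/fr-dbmdz/bert-base-historic-multilingual-64k-td-cased-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-3",
    learning_rate=3e-05,
    mini_batch_size=4,
    max_epochs=10,
)
```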
2023-10-25 16:26:03,209 ----------------------------------------------------------------------------------------------------
2023-10-25 16:26:03,209 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-25 16:26:12,778 epoch 1 - iter 178/1786 - loss 1.72389527 - time (sec): 9.57 - samples/sec: 2499.38 - lr: 0.000003 - momentum: 0.000000
2023-10-25 16:26:22,214 epoch 1 - iter 356/1786 - loss 1.11489437 - time (sec): 19.00 - samples/sec: 2559.80 - lr: 0.000006 - momentum: 0.000000
2023-10-25 16:26:31,700 epoch 1 - iter 534/1786 - loss 0.85501815 - time (sec): 28.49 - samples/sec: 2565.38 - lr: 0.000009 - momentum: 0.000000
2023-10-25 16:26:41,250 epoch 1 - iter 712/1786 - loss 0.69063084 - time (sec): 38.04 - samples/sec: 2617.48 - lr: 0.000012 - momentum: 0.000000
2023-10-25 16:26:50,769 epoch 1 - iter 890/1786 - loss 0.59262743 - time (sec): 47.56 - samples/sec: 2593.45 - lr: 0.000015 - momentum: 0.000000
2023-10-25 16:27:00,269 epoch 1 - iter 1068/1786 - loss 0.52253362 - time (sec): 57.06 - samples/sec: 2598.06 - lr: 0.000018 - momentum: 0.000000
2023-10-25 16:27:09,548 epoch 1 - iter 1246/1786 - loss 0.46852649 - time (sec): 66.34 - samples/sec: 2629.99 - lr: 0.000021 - momentum: 0.000000
2023-10-25 16:27:18,901 epoch 1 - iter 1424/1786 - loss 0.42908150 - time (sec): 75.69 - samples/sec: 2630.06 - lr: 0.000024 - momentum: 0.000000
2023-10-25 16:27:28,243 epoch 1 - iter 1602/1786 - loss 0.39816297 - time (sec): 85.03 - samples/sec: 2627.16 - lr: 0.000027 - momentum: 0.000000
2023-10-25 16:27:37,690 epoch 1 - iter 1780/1786 - loss 0.37429285 - time (sec): 94.48 - samples/sec: 2626.94 - lr: 0.000030 - momentum: 0.000000
2023-10-25 16:27:37,985 ----------------------------------------------------------------------------------------------------
2023-10-25 16:27:37,985 EPOCH 1 done: loss 0.3737 - lr: 0.000030
2023-10-25 16:27:42,018 DEV : loss 0.11665168404579163 - f1-score (micro avg) 0.7065
2023-10-25 16:27:42,041 saving best model
2023-10-25 16:27:42,503 ----------------------------------------------------------------------------------------------------
2023-10-25 16:27:51,329 epoch 2 - iter 178/1786 - loss 0.10461983 - time (sec): 8.82 - samples/sec: 2853.63 - lr: 0.000030 - momentum: 0.000000
2023-10-25 16:28:00,430 epoch 2 - iter 356/1786 - loss 0.11400338 - time (sec): 17.93 - samples/sec: 2630.89 - lr: 0.000029 - momentum: 0.000000
2023-10-25 16:28:09,657 epoch 2 - iter 534/1786 - loss 0.11338586 - time (sec): 27.15 - samples/sec: 2665.75 - lr: 0.000029 - momentum: 0.000000
2023-10-25 16:28:19,268 epoch 2 - iter 712/1786 - loss 0.11187233 - time (sec): 36.76 - samples/sec: 2721.39 - lr: 0.000029 - momentum: 0.000000
2023-10-25 16:28:28,609 epoch 2 - iter 890/1786 - loss 0.11112092 - time (sec): 46.10 - samples/sec: 2702.62 - lr: 0.000028 - momentum: 0.000000
2023-10-25 16:28:37,715 epoch 2 - iter 1068/1786 - loss 0.11480554 - time (sec): 55.21 - samples/sec: 2693.47 - lr: 0.000028 - momentum: 0.000000
2023-10-25 16:28:47,250 epoch 2 - iter 1246/1786 - loss 0.11362112 - time (sec): 64.75 - samples/sec: 2683.58 - lr: 0.000028 - momentum: 0.000000
2023-10-25 16:28:56,729 epoch 2 - iter 1424/1786 - loss 0.11345575 - time (sec): 74.22 - samples/sec: 2663.61 - lr: 0.000027 - momentum: 0.000000
2023-10-25 16:29:06,404 epoch 2 - iter 1602/1786 - loss 0.11331499 - time (sec): 83.90 - samples/sec: 2661.42 - lr: 0.000027 - momentum: 0.000000
2023-10-25 16:29:15,745 epoch 2 - iter 1780/1786 - loss 0.11389365 - time (sec): 93.24 - samples/sec: 2659.65 - lr: 0.000027 - momentum: 0.000000
2023-10-25 16:29:16,043 ----------------------------------------------------------------------------------------------------
2023-10-25 16:29:16,043 EPOCH 2 done: loss 0.1141 - lr: 0.000027
2023-10-25 16:29:20,360 DEV : loss 0.15264025330543518 - f1-score (micro avg) 0.7495
2023-10-25 16:29:20,382 saving best model
2023-10-25 16:29:21,051 ----------------------------------------------------------------------------------------------------
2023-10-25 16:29:30,883 epoch 3 - iter 178/1786 - loss 0.07629665 - time (sec): 9.83 - samples/sec: 2419.01 - lr: 0.000026 - momentum: 0.000000
2023-10-25 16:29:40,271 epoch 3 - iter 356/1786 - loss 0.06863531 - time (sec): 19.22 - samples/sec: 2509.35 - lr: 0.000026 - momentum: 0.000000
2023-10-25 16:29:49,295 epoch 3 - iter 534/1786 - loss 0.07758255 - time (sec): 28.24 - samples/sec: 2575.16 - lr: 0.000026 - momentum: 0.000000
2023-10-25 16:29:58,201 epoch 3 - iter 712/1786 - loss 0.07961424 - time (sec): 37.15 - samples/sec: 2586.45 - lr: 0.000025 - momentum: 0.000000
2023-10-25 16:30:07,236 epoch 3 - iter 890/1786 - loss 0.08178772 - time (sec): 46.18 - samples/sec: 2596.65 - lr: 0.000025 - momentum: 0.000000
2023-10-25 16:30:16,358 epoch 3 - iter 1068/1786 - loss 0.08404606 - time (sec): 55.30 - samples/sec: 2628.21 - lr: 0.000025 - momentum: 0.000000
2023-10-25 16:30:25,496 epoch 3 - iter 1246/1786 - loss 0.08188804 - time (sec): 64.44 - samples/sec: 2647.28 - lr: 0.000024 - momentum: 0.000000
2023-10-25 16:30:35,432 epoch 3 - iter 1424/1786 - loss 0.08055218 - time (sec): 74.38 - samples/sec: 2653.16 - lr: 0.000024 - momentum: 0.000000
2023-10-25 16:30:44,086 epoch 3 - iter 1602/1786 - loss 0.07964001 - time (sec): 83.03 - samples/sec: 2683.32 - lr: 0.000024 - momentum: 0.000000
2023-10-25 16:30:52,968 epoch 3 - iter 1780/1786 - loss 0.07759944 - time (sec): 91.91 - samples/sec: 2697.45 - lr: 0.000023 - momentum: 0.000000
2023-10-25 16:30:53,255 ----------------------------------------------------------------------------------------------------
2023-10-25 16:30:53,255 EPOCH 3 done: loss 0.0776 - lr: 0.000023
2023-10-25 16:30:57,163 DEV : loss 0.13464441895484924 - f1-score (micro avg) 0.7748
2023-10-25 16:30:57,184 saving best model
2023-10-25 16:30:57,839 ----------------------------------------------------------------------------------------------------
2023-10-25 16:31:06,592 epoch 4 - iter 178/1786 - loss 0.04440627 - time (sec): 8.75 - samples/sec: 2715.40 - lr: 0.000023 - momentum: 0.000000
2023-10-25 16:31:15,474 epoch 4 - iter 356/1786 - loss 0.05450741 - time (sec): 17.63 - samples/sec: 2733.26 - lr: 0.000023 - momentum: 0.000000
2023-10-25 16:31:24,257 epoch 4 - iter 534/1786 - loss 0.05779653 - time (sec): 26.41 - samples/sec: 2703.26 - lr: 0.000022 - momentum: 0.000000
2023-10-25 16:31:33,177 epoch 4 - iter 712/1786 - loss 0.05478759 - time (sec): 35.33 - samples/sec: 2760.99 - lr: 0.000022 - momentum: 0.000000
2023-10-25 16:31:41,855 epoch 4 - iter 890/1786 - loss 0.05436695 - time (sec): 44.01 - samples/sec: 2764.41 - lr: 0.000022 - momentum: 0.000000
2023-10-25 16:31:50,747 epoch 4 - iter 1068/1786 - loss 0.05371101 - time (sec): 52.90 - samples/sec: 2780.40 - lr: 0.000021 - momentum: 0.000000
2023-10-25 16:31:59,923 epoch 4 - iter 1246/1786 - loss 0.05398691 - time (sec): 62.08 - samples/sec: 2776.06 - lr: 0.000021 - momentum: 0.000000
2023-10-25 16:32:09,269 epoch 4 - iter 1424/1786 - loss 0.05442093 - time (sec): 71.43 - samples/sec: 2779.56 - lr: 0.000021 - momentum: 0.000000
2023-10-25 16:32:18,063 epoch 4 - iter 1602/1786 - loss 0.05445676 - time (sec): 80.22 - samples/sec: 2774.76 - lr: 0.000020 - momentum: 0.000000
2023-10-25 16:32:27,167 epoch 4 - iter 1780/1786 - loss 0.05567026 - time (sec): 89.32 - samples/sec: 2778.43 - lr: 0.000020 - momentum: 0.000000
2023-10-25 16:32:27,482 ----------------------------------------------------------------------------------------------------
2023-10-25 16:32:27,482 EPOCH 4 done: loss 0.0557 - lr: 0.000020
2023-10-25 16:32:32,322 DEV : loss 0.1592852622270584 - f1-score (micro avg) 0.7807
2023-10-25 16:32:32,358 saving best model
2023-10-25 16:32:33,008 ----------------------------------------------------------------------------------------------------
2023-10-25 16:32:42,076 epoch 5 - iter 178/1786 - loss 0.03505898 - time (sec): 9.07 - samples/sec: 2675.18 - lr: 0.000020 - momentum: 0.000000
2023-10-25 16:32:51,082 epoch 5 - iter 356/1786 - loss 0.03624807 - time (sec): 18.07 - samples/sec: 2659.88 - lr: 0.000019 - momentum: 0.000000
2023-10-25 16:33:00,526 epoch 5 - iter 534/1786 - loss 0.03535087 - time (sec): 27.52 - samples/sec: 2648.84 - lr: 0.000019 - momentum: 0.000000
2023-10-25 16:33:10,149 epoch 5 - iter 712/1786 - loss 0.03748850 - time (sec): 37.14 - samples/sec: 2613.01 - lr: 0.000019 - momentum: 0.000000
2023-10-25 16:33:18,985 epoch 5 - iter 890/1786 - loss 0.03909736 - time (sec): 45.97 - samples/sec: 2621.81 - lr: 0.000018 - momentum: 0.000000
2023-10-25 16:33:27,925 epoch 5 - iter 1068/1786 - loss 0.03966780 - time (sec): 54.91 - samples/sec: 2632.78 - lr: 0.000018 - momentum: 0.000000
2023-10-25 16:33:37,092 epoch 5 - iter 1246/1786 - loss 0.03878930 - time (sec): 64.08 - samples/sec: 2662.85 - lr: 0.000018 - momentum: 0.000000
2023-10-25 16:33:45,813 epoch 5 - iter 1424/1786 - loss 0.03882227 - time (sec): 72.80 - samples/sec: 2682.22 - lr: 0.000017 - momentum: 0.000000
2023-10-25 16:33:55,003 epoch 5 - iter 1602/1786 - loss 0.03840917 - time (sec): 81.99 - samples/sec: 2715.84 - lr: 0.000017 - momentum: 0.000000
2023-10-25 16:34:03,749 epoch 5 - iter 1780/1786 - loss 0.03930397 - time (sec): 90.74 - samples/sec: 2734.76 - lr: 0.000017 - momentum: 0.000000
2023-10-25 16:34:04,047 ----------------------------------------------------------------------------------------------------
2023-10-25 16:34:04,047 EPOCH 5 done: loss 0.0395 - lr: 0.000017
2023-10-25 16:34:07,972 DEV : loss 0.18587026000022888 - f1-score (micro avg) 0.762
2023-10-25 16:34:07,993 ----------------------------------------------------------------------------------------------------
2023-10-25 16:34:17,461 epoch 6 - iter 178/1786 - loss 0.02228299 - time (sec): 9.46 - samples/sec: 2513.38 - lr: 0.000016 - momentum: 0.000000
2023-10-25 16:34:26,900 epoch 6 - iter 356/1786 - loss 0.02707967 - time (sec): 18.90 - samples/sec: 2582.19 - lr: 0.000016 - momentum: 0.000000
2023-10-25 16:34:36,267 epoch 6 - iter 534/1786 - loss 0.02969054 - time (sec): 28.27 - samples/sec: 2599.89 - lr: 0.000016 - momentum: 0.000000
2023-10-25 16:34:45,851 epoch 6 - iter 712/1786 - loss 0.02850474 - time (sec): 37.86 - samples/sec: 2596.22 - lr: 0.000015 - momentum: 0.000000
2023-10-25 16:34:55,626 epoch 6 - iter 890/1786 - loss 0.02952048 - time (sec): 47.63 - samples/sec: 2593.98 - lr: 0.000015 - momentum: 0.000000
2023-10-25 16:35:05,298 epoch 6 - iter 1068/1786 - loss 0.02959778 - time (sec): 57.30 - samples/sec: 2579.41 - lr: 0.000015 - momentum: 0.000000
2023-10-25 16:35:14,998 epoch 6 - iter 1246/1786 - loss 0.02950384 - time (sec): 67.00 - samples/sec: 2578.56 - lr: 0.000014 - momentum: 0.000000
2023-10-25 16:35:24,826 epoch 6 - iter 1424/1786 - loss 0.02899248 - time (sec): 76.83 - samples/sec: 2587.24 - lr: 0.000014 - momentum: 0.000000
2023-10-25 16:35:34,276 epoch 6 - iter 1602/1786 - loss 0.02952807 - time (sec): 86.28 - samples/sec: 2584.11 - lr: 0.000014 - momentum: 0.000000
2023-10-25 16:35:43,735 epoch 6 - iter 1780/1786 - loss 0.02944428 - time (sec): 95.74 - samples/sec: 2590.73 - lr: 0.000013 - momentum: 0.000000
2023-10-25 16:35:44,048 ----------------------------------------------------------------------------------------------------
2023-10-25 16:35:44,049 EPOCH 6 done: loss 0.0295 - lr: 0.000013
2023-10-25 16:35:49,183 DEV : loss 0.18982860445976257 - f1-score (micro avg) 0.7865
2023-10-25 16:35:49,208 saving best model
2023-10-25 16:35:49,872 ----------------------------------------------------------------------------------------------------
2023-10-25 16:35:59,426 epoch 7 - iter 178/1786 - loss 0.02267264 - time (sec): 9.55 - samples/sec: 2718.81 - lr: 0.000013 - momentum: 0.000000
2023-10-25 16:36:08,982 epoch 7 - iter 356/1786 - loss 0.02071867 - time (sec): 19.11 - samples/sec: 2692.77 - lr: 0.000013 - momentum: 0.000000
2023-10-25 16:36:18,619 epoch 7 - iter 534/1786 - loss 0.02250539 - time (sec): 28.74 - samples/sec: 2676.34 - lr: 0.000012 - momentum: 0.000000
2023-10-25 16:36:28,445 epoch 7 - iter 712/1786 - loss 0.02189393 - time (sec): 38.57 - samples/sec: 2645.65 - lr: 0.000012 - momentum: 0.000000
2023-10-25 16:36:37,705 epoch 7 - iter 890/1786 - loss 0.02316080 - time (sec): 47.83 - samples/sec: 2631.90 - lr: 0.000012 - momentum: 0.000000
2023-10-25 16:36:46,646 epoch 7 - iter 1068/1786 - loss 0.02272730 - time (sec): 56.77 - samples/sec: 2667.04 - lr: 0.000011 - momentum: 0.000000
2023-10-25 16:36:55,632 epoch 7 - iter 1246/1786 - loss 0.02207885 - time (sec): 65.76 - samples/sec: 2665.92 - lr: 0.000011 - momentum: 0.000000
2023-10-25 16:37:05,058 epoch 7 - iter 1424/1786 - loss 0.02206815 - time (sec): 75.18 - samples/sec: 2654.28 - lr: 0.000011 - momentum: 0.000000
2023-10-25 16:37:14,475 epoch 7 - iter 1602/1786 - loss 0.02192510 - time (sec): 84.60 - samples/sec: 2655.92 - lr: 0.000010 - momentum: 0.000000
2023-10-25 16:37:23,960 epoch 7 - iter 1780/1786 - loss 0.02204271 - time (sec): 94.09 - samples/sec: 2636.41 - lr: 0.000010 - momentum: 0.000000
2023-10-25 16:37:24,276 ----------------------------------------------------------------------------------------------------
2023-10-25 16:37:24,276 EPOCH 7 done: loss 0.0220 - lr: 0.000010
2023-10-25 16:37:29,189 DEV : loss 0.19655928015708923 - f1-score (micro avg) 0.7919
2023-10-25 16:37:29,211 saving best model
2023-10-25 16:37:29,862 ----------------------------------------------------------------------------------------------------
2023-10-25 16:37:39,385 epoch 8 - iter 178/1786 - loss 0.01876484 - time (sec): 9.52 - samples/sec: 2613.60 - lr: 0.000010 - momentum: 0.000000
2023-10-25 16:37:48,866 epoch 8 - iter 356/1786 - loss 0.01519318 - time (sec): 19.00 - samples/sec: 2526.15 - lr: 0.000009 - momentum: 0.000000
2023-10-25 16:37:58,302 epoch 8 - iter 534/1786 - loss 0.01373478 - time (sec): 28.44 - samples/sec: 2626.74 - lr: 0.000009 - momentum: 0.000000
2023-10-25 16:38:07,293 epoch 8 - iter 712/1786 - loss 0.01454254 - time (sec): 37.43 - samples/sec: 2645.54 - lr: 0.000009 - momentum: 0.000000
2023-10-25 16:38:16,013 epoch 8 - iter 890/1786 - loss 0.01484691 - time (sec): 46.15 - samples/sec: 2685.45 - lr: 0.000008 - momentum: 0.000000
2023-10-25 16:38:24,785 epoch 8 - iter 1068/1786 - loss 0.01499282 - time (sec): 54.92 - samples/sec: 2733.34 - lr: 0.000008 - momentum: 0.000000
2023-10-25 16:38:33,747 epoch 8 - iter 1246/1786 - loss 0.01424428 - time (sec): 63.88 - samples/sec: 2734.25 - lr: 0.000008 - momentum: 0.000000
2023-10-25 16:38:42,790 epoch 8 - iter 1424/1786 - loss 0.01406695 - time (sec): 72.93 - samples/sec: 2738.06 - lr: 0.000007 - momentum: 0.000000
2023-10-25 16:38:51,780 epoch 8 - iter 1602/1786 - loss 0.01437150 - time (sec): 81.92 - samples/sec: 2723.87 - lr: 0.000007 - momentum: 0.000000
2023-10-25 16:39:00,844 epoch 8 - iter 1780/1786 - loss 0.01533311 - time (sec): 90.98 - samples/sec: 2725.94 - lr: 0.000007 - momentum: 0.000000
2023-10-25 16:39:01,145 ----------------------------------------------------------------------------------------------------
2023-10-25 16:39:01,145 EPOCH 8 done: loss 0.0153 - lr: 0.000007
2023-10-25 16:39:05,268 DEV : loss 0.21181654930114746 - f1-score (micro avg) 0.7842
2023-10-25 16:39:05,288 ----------------------------------------------------------------------------------------------------
2023-10-25 16:39:14,500 epoch 9 - iter 178/1786 - loss 0.00816641 - time (sec): 9.21 - samples/sec: 2907.50 - lr: 0.000006 - momentum: 0.000000
2023-10-25 16:39:23,973 epoch 9 - iter 356/1786 - loss 0.01013131 - time (sec): 18.68 - samples/sec: 2780.63 - lr: 0.000006 - momentum: 0.000000
2023-10-25 16:39:33,578 epoch 9 - iter 534/1786 - loss 0.00954521 - time (sec): 28.29 - samples/sec: 2756.96 - lr: 0.000006 - momentum: 0.000000
2023-10-25 16:39:43,144 epoch 9 - iter 712/1786 - loss 0.00969192 - time (sec): 37.85 - samples/sec: 2674.70 - lr: 0.000005 - momentum: 0.000000
2023-10-25 16:39:52,805 epoch 9 - iter 890/1786 - loss 0.01062753 - time (sec): 47.51 - samples/sec: 2602.39 - lr: 0.000005 - momentum: 0.000000
2023-10-25 16:40:01,825 epoch 9 - iter 1068/1786 - loss 0.01065964 - time (sec): 56.53 - samples/sec: 2596.08 - lr: 0.000005 - momentum: 0.000000
2023-10-25 16:40:10,834 epoch 9 - iter 1246/1786 - loss 0.01011395 - time (sec): 65.54 - samples/sec: 2627.85 - lr: 0.000004 - momentum: 0.000000
2023-10-25 16:40:19,865 epoch 9 - iter 1424/1786 - loss 0.01040764 - time (sec): 74.58 - samples/sec: 2643.30 - lr: 0.000004 - momentum: 0.000000
2023-10-25 16:40:28,886 epoch 9 - iter 1602/1786 - loss 0.01055140 - time (sec): 83.60 - samples/sec: 2648.51 - lr: 0.000004 - momentum: 0.000000
2023-10-25 16:40:37,881 epoch 9 - iter 1780/1786 - loss 0.01019071 - time (sec): 92.59 - samples/sec: 2679.20 - lr: 0.000003 - momentum: 0.000000
2023-10-25 16:40:38,162 ----------------------------------------------------------------------------------------------------
2023-10-25 16:40:38,163 EPOCH 9 done: loss 0.0102 - lr: 0.000003
2023-10-25 16:40:43,043 DEV : loss 0.23426829278469086 - f1-score (micro avg) 0.7848
2023-10-25 16:40:43,066 ----------------------------------------------------------------------------------------------------
2023-10-25 16:40:52,584 epoch 10 - iter 178/1786 - loss 0.00771099 - time (sec): 9.52 - samples/sec: 2746.65 - lr: 0.000003 - momentum: 0.000000
2023-10-25 16:41:02,050 epoch 10 - iter 356/1786 - loss 0.00805677 - time (sec): 18.98 - samples/sec: 2646.83 - lr: 0.000003 - momentum: 0.000000
2023-10-25 16:41:11,538 epoch 10 - iter 534/1786 - loss 0.00817679 - time (sec): 28.47 - samples/sec: 2656.92 - lr: 0.000002 - momentum: 0.000000
2023-10-25 16:41:20,720 epoch 10 - iter 712/1786 - loss 0.00878410 - time (sec): 37.65 - samples/sec: 2640.29 - lr: 0.000002 - momentum: 0.000000
2023-10-25 16:41:29,664 epoch 10 - iter 890/1786 - loss 0.00887421 - time (sec): 46.60 - samples/sec: 2652.81 - lr: 0.000002 - momentum: 0.000000
2023-10-25 16:41:38,852 epoch 10 - iter 1068/1786 - loss 0.00789479 - time (sec): 55.78 - samples/sec: 2679.34 - lr: 0.000001 - momentum: 0.000000
2023-10-25 16:41:47,525 epoch 10 - iter 1246/1786 - loss 0.00784114 - time (sec): 64.46 - samples/sec: 2718.01 - lr: 0.000001 - momentum: 0.000000
2023-10-25 16:41:56,178 epoch 10 - iter 1424/1786 - loss 0.00742785 - time (sec): 73.11 - samples/sec: 2736.80 - lr: 0.000001 - momentum: 0.000000
2023-10-25 16:42:04,870 epoch 10 - iter 1602/1786 - loss 0.00792667 - time (sec): 81.80 - samples/sec: 2734.65 - lr: 0.000000 - momentum: 0.000000
2023-10-25 16:42:13,881 epoch 10 - iter 1780/1786 - loss 0.00755996 - time (sec): 90.81 - samples/sec: 2733.58 - lr: 0.000000 - momentum: 0.000000
2023-10-25 16:42:14,170 ----------------------------------------------------------------------------------------------------
2023-10-25 16:42:14,171 EPOCH 10 done: loss 0.0075 - lr: 0.000000
2023-10-25 16:42:18,903 DEV : loss 0.23706591129302979 - f1-score (micro avg) 0.7914
2023-10-25 16:42:19,357 ----------------------------------------------------------------------------------------------------
2023-10-25 16:42:19,358 Loading model from best epoch ...
2023-10-25 16:42:21,248 SequenceTagger predicts: Dictionary with 17 tags: O, S-PER, B-PER, E-PER, I-PER, S-LOC, B-LOC, E-LOC, I-LOC, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
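The 17 tags above come from the BIOES scheme over four entity types (PER, LOC, ORG, HumanProd) plus the outside tag O: 4 types × 4 prefixes + 1 = 17. A small sketch of how such a tag set expands; the helper name `bioes_tags` is assumed for illustration:

```python
def bioes_tags(entity_types):
    """Expand entity types into a BIOES tag set: O plus S-/B-/E-/I- per type."""
    tags = ["O"]
    for etype in entity_types:
        tags.extend(f"{prefix}-{etype}" for prefix in ("S", "B", "E", "I"))
    return tags

labels = bioes_tags(["PER", "LOC", "ORG", "HumanProd"])
```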
2023-10-25 16:42:34,839
Results:
- F-score (micro) 0.6924
- F-score (macro) 0.62
- Accuracy 0.5472

By class:
              precision    recall  f1-score   support

         LOC     0.7289    0.6703    0.6984      1095
         PER     0.7611    0.7806    0.7707      1012
         ORG     0.4414    0.5910    0.5054       357
   HumanProd     0.3966    0.6970    0.5055        33

   micro avg     0.6811    0.7040    0.6924      2497
   macro avg     0.5820    0.6847    0.6200      2497
weighted avg     0.6964    0.7040    0.6976      2497
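The summary rows can be reproduced from the per-class numbers: micro F1 is the harmonic mean of the pooled precision and recall, while macro F1 is the unweighted mean of the per-class F1 scores. A quick sanity check in plain Python (not the evaluation code itself):

```python
def f1(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# Per-class f1-scores from the table above: LOC, PER, ORG, HumanProd.
per_class_f1 = [0.6984, 0.7707, 0.5054, 0.5055]

micro_f1 = f1(0.6811, 0.7040)                    # from pooled precision/recall
macro_f1 = sum(per_class_f1) / len(per_class_f1)  # unweighted mean
```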
2023-10-25 16:42:34,839 ----------------------------------------------------------------------------------------------------