2023-10-25 16:45:15,823 ----------------------------------------------------------------------------------------------------
2023-10-25 16:45:15,823 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(64001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-25 16:45:15,824 ----------------------------------------------------------------------------------------------------
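The module printout above gives enough shape information to estimate the number of trainable parameters. The following is a minimal sketch in plain Python. It assumes every `Linear` carries a bias (as printed) and every `LayerNorm` holds one scale and one shift vector (`elementwise_affine=True`). The helper names are illustrative only and are not part of Flair or PyTorch.

```python
# Parameter count derived only from the shapes printed in the model summary.
# Helper names are hypothetical; nothing here calls Flair or PyTorch.

def linear(n_in: int, n_out: int) -> int:
    """Weights plus bias of Linear(in_features=n_in, out_features=n_out, bias=True)."""
    return n_in * n_out + n_out

def layer_norm(dim: int) -> int:
    """Scale and shift vectors of a LayerNorm with elementwise_affine=True."""
    return 2 * dim

HIDDEN, FFN, LAYERS = 768, 3072, 12
VOCAB, POSITIONS, TOKEN_TYPES, TAGS = 64001, 512, 2, 17

# word, position and token-type embeddings plus the embedding LayerNorm
embeddings = (VOCAB + POSITIONS + TOKEN_TYPES) * HIDDEN + layer_norm(HIDDEN)

per_layer = (
    4 * linear(HIDDEN, HIDDEN)   # query, key, value, attention output dense
    + layer_norm(HIDDEN)         # LayerNorm in BertSelfOutput
    + linear(HIDDEN, FFN)        # intermediate dense
    + linear(FFN, HIDDEN)        # output dense
    + layer_norm(HIDDEN)         # LayerNorm in BertOutput
)
pooler = linear(HIDDEN, HIDDEN)
tag_head = linear(HIDDEN, TAGS)  # the (linear) layer of the SequenceTagger

total = embeddings + LAYERS * per_layer + pooler + tag_head
print(f"embeddings={embeddings:,} encoder={LAYERS * per_layer:,} total={total:,}")
```

Under these assumptions the total comes to roughly 135 million parameters, with the 64001-entry vocabulary accounting for a large share of the embedding block.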
2023-10-25 16:45:15,824 MultiCorpus: 20847 train + 1123 dev + 3350 test sentences
- NER_HIPE_2022 Corpus: 20847 train + 1123 dev + 3350 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/de/with_doc_seperator
2023-10-25 16:45:15,824 ----------------------------------------------------------------------------------------------------
2023-10-25 16:45:15,824 Train: 20847 sentences
2023-10-25 16:45:15,824 (train_with_dev=False, train_with_test=False)
2023-10-25 16:45:15,824 ----------------------------------------------------------------------------------------------------
2023-10-25 16:45:15,824 Training Params:
2023-10-25 16:45:15,824 - learning_rate: "5e-05"
2023-10-25 16:45:15,824 - mini_batch_size: "8"
2023-10-25 16:45:15,824 - max_epochs: "10"
2023-10-25 16:45:15,824 - shuffle: "True"
2023-10-25 16:45:15,824 ----------------------------------------------------------------------------------------------------
2023-10-25 16:45:15,824 Plugins:
2023-10-25 16:45:15,824 - TensorboardLogger
2023-10-25 16:45:15,824 - LinearScheduler | warmup_fraction: '0.1'
2023-10-25 16:45:15,824 ----------------------------------------------------------------------------------------------------
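The training parameters and the `LinearScheduler` entry above determine the `iter .../2606` counter and the `lr` column in the epoch lines that follow. The sketch below is an assumption about how a linear warmup and decay schedule behaves, written in plain Python rather than taken from Flair's implementation. The constants are copied from this log; the function name `linear_schedule_lr` is hypothetical.

```python
import math

# Values copied from the "Train" and "Training Params" sections of this log.
TRAIN_SENTENCES = 20847
MINI_BATCH_SIZE = 8
MAX_EPOCHS = 10
PEAK_LR = 5e-05
WARMUP_FRACTION = 0.1

steps_per_epoch = math.ceil(TRAIN_SENTENCES / MINI_BATCH_SIZE)  # last batch is partial
total_steps = steps_per_epoch * MAX_EPOCHS
warmup_steps = round(total_steps * WARMUP_FRACTION)

def linear_schedule_lr(step: int) -> float:
    """Learning rate after `step` optimizer steps: linear warmup, then linear decay to zero.

    Assumed behaviour, not Flair's code.
    """
    if step < warmup_steps:
        return PEAK_LR * step / warmup_steps
    remaining = total_steps - step
    return PEAK_LR * max(0.0, remaining / (total_steps - warmup_steps))

# Compare with the logged "lr:" column (six decimals).
for epoch, it in [(1, 260), (1, 2600), (2, 260), (2, 2600), (10, 2600)]:
    step = (epoch - 1) * steps_per_epoch + it
    print(f"epoch {epoch} iter {it}: lr {linear_schedule_lr(step):.6f}")
```

With a warmup fraction of 0.1 and 10 epochs, the warmup covers the first epoch, which is consistent with the learning rate peaking at 0.000050 at the end of epoch 1 and decaying afterwards.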
2023-10-25 16:45:15,824 Final evaluation on model from best epoch (best-model.pt)
2023-10-25 16:45:15,824 - metric: "('micro avg', 'f1-score')"
2023-10-25 16:45:15,824 ----------------------------------------------------------------------------------------------------
2023-10-25 16:45:15,824 Computation:
2023-10-25 16:45:15,824 - compute on device: cuda:0
2023-10-25 16:45:15,824 - embedding storage: none
2023-10-25 16:45:15,824 ----------------------------------------------------------------------------------------------------
2023-10-25 16:45:15,824 Model training base path: "hmbench-newseye/de-dbmdz/bert-base-historic-multilingual-64k-td-cased-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-4"
2023-10-25 16:45:15,824 ----------------------------------------------------------------------------------------------------
2023-10-25 16:45:15,824 ----------------------------------------------------------------------------------------------------
2023-10-25 16:45:15,824 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-25 16:45:29,924 epoch 1 - iter 260/2606 - loss 1.44327200 - time (sec): 14.10 - samples/sec: 2479.77 - lr: 0.000005 - momentum: 0.000000
2023-10-25 16:45:44,833 epoch 1 - iter 520/2606 - loss 0.87020391 - time (sec): 29.01 - samples/sec: 2516.71 - lr: 0.000010 - momentum: 0.000000
2023-10-25 16:45:58,506 epoch 1 - iter 780/2606 - loss 0.67200000 - time (sec): 42.68 - samples/sec: 2488.08 - lr: 0.000015 - momentum: 0.000000
2023-10-25 16:46:12,807 epoch 1 - iter 1040/2606 - loss 0.56002942 - time (sec): 56.98 - samples/sec: 2552.72 - lr: 0.000020 - momentum: 0.000000
2023-10-25 16:46:27,006 epoch 1 - iter 1300/2606 - loss 0.48938374 - time (sec): 71.18 - samples/sec: 2564.16 - lr: 0.000025 - momentum: 0.000000
2023-10-25 16:46:40,890 epoch 1 - iter 1560/2606 - loss 0.43900299 - time (sec): 85.06 - samples/sec: 2585.58 - lr: 0.000030 - momentum: 0.000000
2023-10-25 16:46:55,169 epoch 1 - iter 1820/2606 - loss 0.39943791 - time (sec): 99.34 - samples/sec: 2600.59 - lr: 0.000035 - momentum: 0.000000
2023-10-25 16:47:09,222 epoch 1 - iter 2080/2606 - loss 0.37314820 - time (sec): 113.40 - samples/sec: 2595.66 - lr: 0.000040 - momentum: 0.000000
2023-10-25 16:47:23,358 epoch 1 - iter 2340/2606 - loss 0.35456131 - time (sec): 127.53 - samples/sec: 2592.17 - lr: 0.000045 - momentum: 0.000000
2023-10-25 16:47:37,439 epoch 1 - iter 2600/2606 - loss 0.33854746 - time (sec): 141.61 - samples/sec: 2592.16 - lr: 0.000050 - momentum: 0.000000
2023-10-25 16:47:37,706 ----------------------------------------------------------------------------------------------------
2023-10-25 16:47:37,707 EPOCH 1 done: loss 0.3387 - lr: 0.000050
2023-10-25 16:47:41,324 DEV : loss 0.118812695145607 - f1-score (micro avg) 0.1625
2023-10-25 16:47:41,349 saving best model
2023-10-25 16:47:41,819 ----------------------------------------------------------------------------------------------------
2023-10-25 16:47:55,809 epoch 2 - iter 260/2606 - loss 0.17564576 - time (sec): 13.99 - samples/sec: 2630.19 - lr: 0.000049 - momentum: 0.000000
2023-10-25 16:48:10,903 epoch 2 - iter 520/2606 - loss 0.16339489 - time (sec): 29.08 - samples/sec: 2633.99 - lr: 0.000049 - momentum: 0.000000
2023-10-25 16:48:25,129 epoch 2 - iter 780/2606 - loss 0.16367825 - time (sec): 43.31 - samples/sec: 2648.13 - lr: 0.000048 - momentum: 0.000000
2023-10-25 16:48:39,023 epoch 2 - iter 1040/2606 - loss 0.16465426 - time (sec): 57.20 - samples/sec: 2644.39 - lr: 0.000048 - momentum: 0.000000
2023-10-25 16:48:52,698 epoch 2 - iter 1300/2606 - loss 0.16691656 - time (sec): 70.88 - samples/sec: 2618.17 - lr: 0.000047 - momentum: 0.000000
2023-10-25 16:49:07,060 epoch 2 - iter 1560/2606 - loss 0.16504912 - time (sec): 85.24 - samples/sec: 2615.67 - lr: 0.000047 - momentum: 0.000000
2023-10-25 16:49:21,199 epoch 2 - iter 1820/2606 - loss 0.16040319 - time (sec): 99.38 - samples/sec: 2625.20 - lr: 0.000046 - momentum: 0.000000
2023-10-25 16:49:35,311 epoch 2 - iter 2080/2606 - loss 0.15902007 - time (sec): 113.49 - samples/sec: 2637.30 - lr: 0.000046 - momentum: 0.000000
2023-10-25 16:49:48,475 epoch 2 - iter 2340/2606 - loss 0.15709971 - time (sec): 126.65 - samples/sec: 2628.74 - lr: 0.000045 - momentum: 0.000000
2023-10-25 16:50:02,559 epoch 2 - iter 2600/2606 - loss 0.15647373 - time (sec): 140.74 - samples/sec: 2604.39 - lr: 0.000044 - momentum: 0.000000
2023-10-25 16:50:02,869 ----------------------------------------------------------------------------------------------------
2023-10-25 16:50:02,869 EPOCH 2 done: loss 0.1566 - lr: 0.000044
2023-10-25 16:50:09,872 DEV : loss 0.1611776500940323 - f1-score (micro avg) 0.3391
2023-10-25 16:50:09,896 saving best model
2023-10-25 16:50:10,500 ----------------------------------------------------------------------------------------------------
2023-10-25 16:50:24,874 epoch 3 - iter 260/2606 - loss 0.10670473 - time (sec): 14.37 - samples/sec: 2724.91 - lr: 0.000044 - momentum: 0.000000
2023-10-25 16:50:38,583 epoch 3 - iter 520/2606 - loss 0.10898329 - time (sec): 28.08 - samples/sec: 2706.78 - lr: 0.000043 - momentum: 0.000000
2023-10-25 16:50:52,379 epoch 3 - iter 780/2606 - loss 0.11166505 - time (sec): 41.88 - samples/sec: 2643.92 - lr: 0.000043 - momentum: 0.000000
2023-10-25 16:51:06,834 epoch 3 - iter 1040/2606 - loss 0.10939807 - time (sec): 56.33 - samples/sec: 2663.11 - lr: 0.000042 - momentum: 0.000000
2023-10-25 16:51:20,398 epoch 3 - iter 1300/2606 - loss 0.10997620 - time (sec): 69.90 - samples/sec: 2648.46 - lr: 0.000042 - momentum: 0.000000
2023-10-25 16:51:34,567 epoch 3 - iter 1560/2606 - loss 0.11114702 - time (sec): 84.07 - samples/sec: 2627.29 - lr: 0.000041 - momentum: 0.000000
2023-10-25 16:51:48,566 epoch 3 - iter 1820/2606 - loss 0.10915658 - time (sec): 98.06 - samples/sec: 2619.10 - lr: 0.000041 - momentum: 0.000000
2023-10-25 16:52:02,818 epoch 3 - iter 2080/2606 - loss 0.10966065 - time (sec): 112.32 - samples/sec: 2620.09 - lr: 0.000040 - momentum: 0.000000
2023-10-25 16:52:16,817 epoch 3 - iter 2340/2606 - loss 0.11109113 - time (sec): 126.31 - samples/sec: 2612.60 - lr: 0.000039 - momentum: 0.000000
2023-10-25 16:52:30,695 epoch 3 - iter 2600/2606 - loss 0.10926086 - time (sec): 140.19 - samples/sec: 2616.49 - lr: 0.000039 - momentum: 0.000000
2023-10-25 16:52:30,986 ----------------------------------------------------------------------------------------------------
2023-10-25 16:52:30,986 EPOCH 3 done: loss 0.1093 - lr: 0.000039
2023-10-25 16:52:38,212 DEV : loss 0.23014920949935913 - f1-score (micro avg) 0.3626
2023-10-25 16:52:38,236 saving best model
2023-10-25 16:52:38,689 ----------------------------------------------------------------------------------------------------
2023-10-25 16:52:52,399 epoch 4 - iter 260/2606 - loss 0.08026601 - time (sec): 13.71 - samples/sec: 2578.59 - lr: 0.000038 - momentum: 0.000000
2023-10-25 16:53:06,053 epoch 4 - iter 520/2606 - loss 0.07866155 - time (sec): 27.36 - samples/sec: 2540.99 - lr: 0.000038 - momentum: 0.000000
2023-10-25 16:53:20,595 epoch 4 - iter 780/2606 - loss 0.07694070 - time (sec): 41.90 - samples/sec: 2599.38 - lr: 0.000037 - momentum: 0.000000
2023-10-25 16:53:34,797 epoch 4 - iter 1040/2606 - loss 0.07572775 - time (sec): 56.11 - samples/sec: 2612.43 - lr: 0.000037 - momentum: 0.000000
2023-10-25 16:53:48,644 epoch 4 - iter 1300/2606 - loss 0.07947650 - time (sec): 69.95 - samples/sec: 2613.93 - lr: 0.000036 - momentum: 0.000000
2023-10-25 16:54:02,886 epoch 4 - iter 1560/2606 - loss 0.07880117 - time (sec): 84.20 - samples/sec: 2609.19 - lr: 0.000036 - momentum: 0.000000
2023-10-25 16:54:16,960 epoch 4 - iter 1820/2606 - loss 0.07887358 - time (sec): 98.27 - samples/sec: 2611.18 - lr: 0.000035 - momentum: 0.000000
2023-10-25 16:54:31,164 epoch 4 - iter 2080/2606 - loss 0.07742549 - time (sec): 112.47 - samples/sec: 2620.31 - lr: 0.000034 - momentum: 0.000000
2023-10-25 16:54:44,982 epoch 4 - iter 2340/2606 - loss 0.07826687 - time (sec): 126.29 - samples/sec: 2623.98 - lr: 0.000034 - momentum: 0.000000
2023-10-25 16:54:58,682 epoch 4 - iter 2600/2606 - loss 0.07860251 - time (sec): 139.99 - samples/sec: 2620.12 - lr: 0.000033 - momentum: 0.000000
2023-10-25 16:54:58,958 ----------------------------------------------------------------------------------------------------
2023-10-25 16:54:58,958 EPOCH 4 done: loss 0.0788 - lr: 0.000033
2023-10-25 16:55:05,179 DEV : loss 0.24600794911384583 - f1-score (micro avg) 0.358
2023-10-25 16:55:05,203 ----------------------------------------------------------------------------------------------------
2023-10-25 16:55:19,990 epoch 5 - iter 260/2606 - loss 0.05672130 - time (sec): 14.79 - samples/sec: 2594.82 - lr: 0.000033 - momentum: 0.000000
2023-10-25 16:55:34,219 epoch 5 - iter 520/2606 - loss 0.06166018 - time (sec): 29.01 - samples/sec: 2621.33 - lr: 0.000032 - momentum: 0.000000
2023-10-25 16:55:48,173 epoch 5 - iter 780/2606 - loss 0.06054051 - time (sec): 42.97 - samples/sec: 2619.67 - lr: 0.000032 - momentum: 0.000000
2023-10-25 16:56:02,180 epoch 5 - iter 1040/2606 - loss 0.05994699 - time (sec): 56.98 - samples/sec: 2592.31 - lr: 0.000031 - momentum: 0.000000
2023-10-25 16:56:16,341 epoch 5 - iter 1300/2606 - loss 0.06032593 - time (sec): 71.14 - samples/sec: 2605.83 - lr: 0.000031 - momentum: 0.000000
2023-10-25 16:56:30,625 epoch 5 - iter 1560/2606 - loss 0.05835629 - time (sec): 85.42 - samples/sec: 2598.27 - lr: 0.000030 - momentum: 0.000000
2023-10-25 16:56:44,827 epoch 5 - iter 1820/2606 - loss 0.06037339 - time (sec): 99.62 - samples/sec: 2592.66 - lr: 0.000029 - momentum: 0.000000
2023-10-25 16:56:58,661 epoch 5 - iter 2080/2606 - loss 0.05960227 - time (sec): 113.46 - samples/sec: 2600.76 - lr: 0.000029 - momentum: 0.000000
2023-10-25 16:57:11,848 epoch 5 - iter 2340/2606 - loss 0.05877380 - time (sec): 126.64 - samples/sec: 2617.27 - lr: 0.000028 - momentum: 0.000000
2023-10-25 16:57:26,094 epoch 5 - iter 2600/2606 - loss 0.05813652 - time (sec): 140.89 - samples/sec: 2605.23 - lr: 0.000028 - momentum: 0.000000
2023-10-25 16:57:26,431 ----------------------------------------------------------------------------------------------------
2023-10-25 16:57:26,431 EPOCH 5 done: loss 0.0581 - lr: 0.000028
2023-10-25 16:57:32,774 DEV : loss 0.29553988575935364 - f1-score (micro avg) 0.4099
2023-10-25 16:57:32,799 saving best model
2023-10-25 16:57:33,294 ----------------------------------------------------------------------------------------------------
2023-10-25 16:57:48,554 epoch 6 - iter 260/2606 - loss 0.04797258 - time (sec): 15.26 - samples/sec: 2573.89 - lr: 0.000027 - momentum: 0.000000
2023-10-25 16:58:02,779 epoch 6 - iter 520/2606 - loss 0.04837650 - time (sec): 29.48 - samples/sec: 2595.28 - lr: 0.000027 - momentum: 0.000000
2023-10-25 16:58:17,011 epoch 6 - iter 780/2606 - loss 0.04578472 - time (sec): 43.71 - samples/sec: 2609.95 - lr: 0.000026 - momentum: 0.000000
2023-10-25 16:58:31,560 epoch 6 - iter 1040/2606 - loss 0.04760570 - time (sec): 58.26 - samples/sec: 2572.96 - lr: 0.000026 - momentum: 0.000000
2023-10-25 16:58:45,137 epoch 6 - iter 1300/2606 - loss 0.04892803 - time (sec): 71.84 - samples/sec: 2569.00 - lr: 0.000025 - momentum: 0.000000
2023-10-25 16:58:58,688 epoch 6 - iter 1560/2606 - loss 0.05121577 - time (sec): 85.39 - samples/sec: 2567.67 - lr: 0.000024 - momentum: 0.000000
2023-10-25 16:59:13,684 epoch 6 - iter 1820/2606 - loss 0.05294998 - time (sec): 100.39 - samples/sec: 2576.77 - lr: 0.000024 - momentum: 0.000000
2023-10-25 16:59:27,754 epoch 6 - iter 2080/2606 - loss 0.05640148 - time (sec): 114.46 - samples/sec: 2573.59 - lr: 0.000023 - momentum: 0.000000
2023-10-25 16:59:41,576 epoch 6 - iter 2340/2606 - loss 0.05593627 - time (sec): 128.28 - samples/sec: 2576.66 - lr: 0.000023 - momentum: 0.000000
2023-10-25 16:59:55,997 epoch 6 - iter 2600/2606 - loss 0.05483941 - time (sec): 142.70 - samples/sec: 2565.84 - lr: 0.000022 - momentum: 0.000000
2023-10-25 16:59:56,369 ----------------------------------------------------------------------------------------------------
2023-10-25 16:59:56,369 EPOCH 6 done: loss 0.0547 - lr: 0.000022
2023-10-25 17:00:02,624 DEV : loss 0.33623284101486206 - f1-score (micro avg) 0.3687
2023-10-25 17:00:02,649 ----------------------------------------------------------------------------------------------------
2023-10-25 17:00:16,615 epoch 7 - iter 260/2606 - loss 0.03858880 - time (sec): 13.97 - samples/sec: 2649.66 - lr: 0.000022 - momentum: 0.000000
2023-10-25 17:00:30,568 epoch 7 - iter 520/2606 - loss 0.04139851 - time (sec): 27.92 - samples/sec: 2635.80 - lr: 0.000021 - momentum: 0.000000
2023-10-25 17:00:45,290 epoch 7 - iter 780/2606 - loss 0.04241529 - time (sec): 42.64 - samples/sec: 2604.64 - lr: 0.000021 - momentum: 0.000000
2023-10-25 17:00:59,285 epoch 7 - iter 1040/2606 - loss 0.04833894 - time (sec): 56.64 - samples/sec: 2601.68 - lr: 0.000020 - momentum: 0.000000
2023-10-25 17:01:13,271 epoch 7 - iter 1300/2606 - loss 0.04856922 - time (sec): 70.62 - samples/sec: 2592.87 - lr: 0.000019 - momentum: 0.000000
2023-10-25 17:01:28,375 epoch 7 - iter 1560/2606 - loss 0.05096085 - time (sec): 85.73 - samples/sec: 2579.70 - lr: 0.000019 - momentum: 0.000000
2023-10-25 17:01:43,143 epoch 7 - iter 1820/2606 - loss 0.05958653 - time (sec): 100.49 - samples/sec: 2603.24 - lr: 0.000018 - momentum: 0.000000
2023-10-25 17:01:56,751 epoch 7 - iter 2080/2606 - loss 0.06793201 - time (sec): 114.10 - samples/sec: 2605.36 - lr: 0.000018 - momentum: 0.000000
2023-10-25 17:02:09,950 epoch 7 - iter 2340/2606 - loss 0.06878116 - time (sec): 127.30 - samples/sec: 2607.52 - lr: 0.000017 - momentum: 0.000000
2023-10-25 17:02:23,915 epoch 7 - iter 2600/2606 - loss 0.06941224 - time (sec): 141.27 - samples/sec: 2596.78 - lr: 0.000017 - momentum: 0.000000
2023-10-25 17:02:24,220 ----------------------------------------------------------------------------------------------------
2023-10-25 17:02:24,220 EPOCH 7 done: loss 0.0694 - lr: 0.000017
2023-10-25 17:02:30,444 DEV : loss 0.35152772068977356 - f1-score (micro avg) 0.3214
2023-10-25 17:02:30,469 ----------------------------------------------------------------------------------------------------
2023-10-25 17:02:44,163 epoch 8 - iter 260/2606 - loss 0.07588476 - time (sec): 13.69 - samples/sec: 2635.53 - lr: 0.000016 - momentum: 0.000000
2023-10-25 17:02:57,982 epoch 8 - iter 520/2606 - loss 0.10043230 - time (sec): 27.51 - samples/sec: 2649.59 - lr: 0.000016 - momentum: 0.000000
2023-10-25 17:03:11,756 epoch 8 - iter 780/2606 - loss 0.13203650 - time (sec): 41.29 - samples/sec: 2628.01 - lr: 0.000015 - momentum: 0.000000
2023-10-25 17:03:25,598 epoch 8 - iter 1040/2606 - loss 0.12385530 - time (sec): 55.13 - samples/sec: 2623.26 - lr: 0.000014 - momentum: 0.000000
2023-10-25 17:03:39,705 epoch 8 - iter 1300/2606 - loss 0.12813762 - time (sec): 69.23 - samples/sec: 2618.67 - lr: 0.000014 - momentum: 0.000000
2023-10-25 17:03:53,694 epoch 8 - iter 1560/2606 - loss 0.13153397 - time (sec): 83.22 - samples/sec: 2621.37 - lr: 0.000013 - momentum: 0.000000
2023-10-25 17:04:08,336 epoch 8 - iter 1820/2606 - loss 0.12905434 - time (sec): 97.87 - samples/sec: 2630.44 - lr: 0.000013 - momentum: 0.000000
2023-10-25 17:04:22,400 epoch 8 - iter 2080/2606 - loss 0.13389258 - time (sec): 111.93 - samples/sec: 2623.41 - lr: 0.000012 - momentum: 0.000000
2023-10-25 17:04:36,755 epoch 8 - iter 2340/2606 - loss 0.13480518 - time (sec): 126.28 - samples/sec: 2606.93 - lr: 0.000012 - momentum: 0.000000
2023-10-25 17:04:50,583 epoch 8 - iter 2600/2606 - loss 0.13531880 - time (sec): 140.11 - samples/sec: 2616.35 - lr: 0.000011 - momentum: 0.000000
2023-10-25 17:04:50,912 ----------------------------------------------------------------------------------------------------
2023-10-25 17:04:50,912 EPOCH 8 done: loss 0.1352 - lr: 0.000011
2023-10-25 17:04:57,141 DEV : loss 0.2633623480796814 - f1-score (micro avg) 0.2342
2023-10-25 17:04:57,166 ----------------------------------------------------------------------------------------------------
2023-10-25 17:05:11,033 epoch 9 - iter 260/2606 - loss 0.08967090 - time (sec): 13.87 - samples/sec: 2713.58 - lr: 0.000011 - momentum: 0.000000
2023-10-25 17:05:25,132 epoch 9 - iter 520/2606 - loss 0.09331506 - time (sec): 27.97 - samples/sec: 2651.52 - lr: 0.000010 - momentum: 0.000000
2023-10-25 17:05:38,813 epoch 9 - iter 780/2606 - loss 0.09092922 - time (sec): 41.65 - samples/sec: 2643.14 - lr: 0.000009 - momentum: 0.000000
2023-10-25 17:05:52,488 epoch 9 - iter 1040/2606 - loss 0.09774521 - time (sec): 55.32 - samples/sec: 2686.25 - lr: 0.000009 - momentum: 0.000000
2023-10-25 17:06:06,110 epoch 9 - iter 1300/2606 - loss 0.10814172 - time (sec): 68.94 - samples/sec: 2685.63 - lr: 0.000008 - momentum: 0.000000
2023-10-25 17:06:19,697 epoch 9 - iter 1560/2606 - loss 0.11196481 - time (sec): 82.53 - samples/sec: 2668.35 - lr: 0.000008 - momentum: 0.000000
2023-10-25 17:06:33,655 epoch 9 - iter 1820/2606 - loss 0.11081450 - time (sec): 96.49 - samples/sec: 2673.87 - lr: 0.000007 - momentum: 0.000000
2023-10-25 17:06:47,477 epoch 9 - iter 2080/2606 - loss 0.11167705 - time (sec): 110.31 - samples/sec: 2663.33 - lr: 0.000007 - momentum: 0.000000
2023-10-25 17:07:01,705 epoch 9 - iter 2340/2606 - loss 0.11066523 - time (sec): 124.54 - samples/sec: 2664.34 - lr: 0.000006 - momentum: 0.000000
2023-10-25 17:07:15,529 epoch 9 - iter 2600/2606 - loss 0.11158714 - time (sec): 138.36 - samples/sec: 2647.26 - lr: 0.000006 - momentum: 0.000000
2023-10-25 17:07:15,945 ----------------------------------------------------------------------------------------------------
2023-10-25 17:07:15,946 EPOCH 9 done: loss 0.1115 - lr: 0.000006
2023-10-25 17:07:22,880 DEV : loss 0.2682478427886963 - f1-score (micro avg) 0.2293
2023-10-25 17:07:22,913 ----------------------------------------------------------------------------------------------------
2023-10-25 17:07:37,193 epoch 10 - iter 260/2606 - loss 0.09408029 - time (sec): 14.28 - samples/sec: 2595.12 - lr: 0.000005 - momentum: 0.000000
2023-10-25 17:07:51,010 epoch 10 - iter 520/2606 - loss 0.09512229 - time (sec): 28.09 - samples/sec: 2593.26 - lr: 0.000004 - momentum: 0.000000
2023-10-25 17:08:05,607 epoch 10 - iter 780/2606 - loss 0.08964442 - time (sec): 42.69 - samples/sec: 2609.86 - lr: 0.000004 - momentum: 0.000000
2023-10-25 17:08:20,011 epoch 10 - iter 1040/2606 - loss 0.08987618 - time (sec): 57.10 - samples/sec: 2633.35 - lr: 0.000003 - momentum: 0.000000
2023-10-25 17:08:33,885 epoch 10 - iter 1300/2606 - loss 0.08788357 - time (sec): 70.97 - samples/sec: 2663.52 - lr: 0.000003 - momentum: 0.000000
2023-10-25 17:08:47,392 epoch 10 - iter 1560/2606 - loss 0.08724918 - time (sec): 84.48 - samples/sec: 2641.55 - lr: 0.000002 - momentum: 0.000000
2023-10-25 17:09:01,218 epoch 10 - iter 1820/2606 - loss 0.08700501 - time (sec): 98.30 - samples/sec: 2620.75 - lr: 0.000002 - momentum: 0.000000
2023-10-25 17:09:14,991 epoch 10 - iter 2080/2606 - loss 0.08812984 - time (sec): 112.08 - samples/sec: 2606.24 - lr: 0.000001 - momentum: 0.000000
2023-10-25 17:09:29,100 epoch 10 - iter 2340/2606 - loss 0.09074088 - time (sec): 126.18 - samples/sec: 2614.01 - lr: 0.000001 - momentum: 0.000000
2023-10-25 17:09:43,532 epoch 10 - iter 2600/2606 - loss 0.09068240 - time (sec): 140.62 - samples/sec: 2606.43 - lr: 0.000000 - momentum: 0.000000
2023-10-25 17:09:43,851 ----------------------------------------------------------------------------------------------------
2023-10-25 17:09:43,851 EPOCH 10 done: loss 0.0905 - lr: 0.000000
2023-10-25 17:09:50,726 DEV : loss 0.27786970138549805 - f1-score (micro avg) 0.2168
2023-10-25 17:09:51,221 ----------------------------------------------------------------------------------------------------
2023-10-25 17:09:51,222 Loading model from best epoch ...
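Which epoch counts as "best" can be recovered from the `DEV` lines alone. The sketch below parses them with a regular expression and picks the epoch with the highest micro-averaged dev F1. The regex and the function name `dev_scores` are illustrative assumptions about this log layout, not a Flair API. The sample text is copied from the ten `DEV` lines above.

```python
import re

# Matches lines such as:
# "... DEV : loss 0.118812695145607 - f1-score (micro avg) 0.1625"
DEV_LINE = re.compile(
    r"DEV : loss (?P<loss>[\d.]+) - f1-score \(micro avg\) (?P<f1>[\d.]+)"
)

def dev_scores(log_text: str) -> list[tuple[int, float, float]]:
    """Return (epoch, dev_loss, dev_f1) per DEV line, assuming one DEV line per epoch in order."""
    scores = []
    for match in DEV_LINE.finditer(log_text):
        scores.append((len(scores) + 1, float(match["loss"]), float(match["f1"])))
    return scores

sample = """\
2023-10-25 16:47:41,324 DEV : loss 0.118812695145607 - f1-score (micro avg) 0.1625
2023-10-25 16:50:09,872 DEV : loss 0.1611776500940323 - f1-score (micro avg) 0.3391
2023-10-25 16:52:38,212 DEV : loss 0.23014920949935913 - f1-score (micro avg) 0.3626
2023-10-25 16:55:05,179 DEV : loss 0.24600794911384583 - f1-score (micro avg) 0.358
2023-10-25 16:57:32,774 DEV : loss 0.29553988575935364 - f1-score (micro avg) 0.4099
2023-10-25 17:00:02,624 DEV : loss 0.33623284101486206 - f1-score (micro avg) 0.3687
2023-10-25 17:02:30,444 DEV : loss 0.35152772068977356 - f1-score (micro avg) 0.3214
2023-10-25 17:04:57,141 DEV : loss 0.2633623480796814 - f1-score (micro avg) 0.2342
2023-10-25 17:07:22,880 DEV : loss 0.2682478427886963 - f1-score (micro avg) 0.2293
2023-10-25 17:09:50,726 DEV : loss 0.27786970138549805 - f1-score (micro avg) 0.2168
"""

scores = dev_scores(sample)
best_epoch, best_loss, best_f1 = max(scores, key=lambda row: row[2])
print(f"best epoch: {best_epoch} (dev micro F1 {best_f1})")
```

The selected epoch is the last one followed by a `saving best model` line in the log. Dev F1 falls after that point while the training loss rises again in epochs 7 and 8.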
2023-10-25 17:09:52,830 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
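The 17 tags listed above match the `out_features=17` of the tagger's final linear layer: one `O` tag plus four span-position prefixes (`S`, `B`, `E`, `I`) for each of the four entity types. A minimal sketch that rebuilds the same list in the same order follows. The variable names are illustrative.

```python
# Entity types and prefixes as they appear in the logged tag dictionary.
ENTITY_TYPES = ["LOC", "PER", "ORG", "HumanProd"]
PREFIXES = ["S", "B", "E", "I"]  # single, begin, end, inside

# "O" marks tokens outside any entity; every type gets all four prefixes.
tags = ["O"] + [f"{prefix}-{etype}" for etype in ENTITY_TYPES for prefix in PREFIXES]

print(len(tags), tags)
```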
2023-10-25 17:10:02,533
Results:
- F-score (micro) 0.4446
- F-score (macro) 0.2829
- Accuracy 0.2912
By class:
              precision    recall  f1-score   support

         LOC     0.5264    0.5840    0.5537      1214
         PER     0.4000    0.3490    0.3728       808
         ORG     0.2194    0.1926    0.2051       353
   HumanProd     0.0000    0.0000    0.0000        15

   micro avg     0.4461    0.4431    0.4446      2390
   macro avg     0.2864    0.2814    0.2829      2390
weighted avg     0.4350    0.4431    0.4376      2390
2023-10-25 17:10:02,533 ----------------------------------------------------------------------------------------------------
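The three averaged F1 rows in the report can be checked against the per-class rows. The sketch below recomputes them from the rounded values printed above, so agreement is only expected to the four decimals shown. Micro F1 is taken as the harmonic mean of the logged micro precision and recall, since the underlying true/false positive counts are not in the log.

```python
# Per-class rows copied from the report: (precision, recall, f1, support).
report = {
    "LOC": (0.5264, 0.5840, 0.5537, 1214),
    "PER": (0.4000, 0.3490, 0.3728, 808),
    "ORG": (0.2194, 0.1926, 0.2051, 353),
    "HumanProd": (0.0000, 0.0000, 0.0000, 15),
}

total_support = sum(support for _, _, _, support in report.values())

# Macro average: unweighted mean over classes, so the 15 HumanProd spans
# count as much as the 1214 LOC spans.
macro_f1 = sum(f1 for _, _, f1, _ in report.values()) / len(report)

# Weighted average: mean over classes weighted by support.
weighted_f1 = sum(f1 * support for _, _, f1, support in report.values()) / total_support

# Micro average: harmonic mean of the logged micro precision and recall.
micro_p, micro_r = 0.4461, 0.4431
micro_f1 = 2 * micro_p * micro_r / (micro_p + micro_r)

print(f"micro {micro_f1:.4f}  macro {macro_f1:.4f}  weighted {weighted_f1:.4f}")
```

The gap between the micro and macro scores comes mostly from `HumanProd`, which has an F1 of 0 on only 15 test spans.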