2023-10-25 10:37:21,901 ----------------------------------------------------------------------------------------------------
2023-10-25 10:37:21,902 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(64001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-25 10:37:21,902 ----------------------------------------------------------------------------------------------------
2023-10-25 10:37:21,903 MultiCorpus: 6183 train + 680 dev + 2113 test sentences
- NER_HIPE_2022 Corpus: 6183 train + 680 dev + 2113 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/topres19th/en/with_doc_seperator
2023-10-25 10:37:21,903 ----------------------------------------------------------------------------------------------------
2023-10-25 10:37:21,903 Train: 6183 sentences
2023-10-25 10:37:21,903 (train_with_dev=False, train_with_test=False)
2023-10-25 10:37:21,903 ----------------------------------------------------------------------------------------------------
2023-10-25 10:37:21,903 Training Params:
2023-10-25 10:37:21,903 - learning_rate: "3e-05"
2023-10-25 10:37:21,903 - mini_batch_size: "8"
2023-10-25 10:37:21,903 - max_epochs: "10"
2023-10-25 10:37:21,903 - shuffle: "True"
2023-10-25 10:37:21,903 ----------------------------------------------------------------------------------------------------
2023-10-25 10:37:21,903 Plugins:
2023-10-25 10:37:21,903 - TensorboardLogger
2023-10-25 10:37:21,903 - LinearScheduler | warmup_fraction: '0.1'
2023-10-25 10:37:21,903 ----------------------------------------------------------------------------------------------------
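Note: the LinearScheduler plugin ramps the learning rate up over the first 10% of steps and then decays it linearly to zero, which is what the lr column in the iteration logs below traces (773 iterations/epoch × 10 epochs = 7,730 total steps, so 773 warmup steps). A minimal sketch of that schedule, assuming this step count and the configured peak lr of 3e-05 (the function name is illustrative, not a Flair API):

```python
def linear_schedule_lr(step, total_steps, peak_lr, warmup_fraction=0.1):
    """Linear warmup to peak_lr over warmup_fraction of steps, then linear decay to zero."""
    warmup_steps = int(total_steps * warmup_fraction)
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    # Linear decay over the remaining steps.
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)

total_steps = 773 * 10  # iterations per epoch x max_epochs
peak_lr = 3e-05

# Epoch 1, iter 77: still warming up (the log reports lr 0.000003 there).
print(f"{linear_schedule_lr(77, total_steps, peak_lr):.6f}")   # 0.000003
# End of epoch 1 (step 773): warmup peaks at the configured learning rate.
print(f"{linear_schedule_lr(773, total_steps, peak_lr):.6f}")  # 0.000030
```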
2023-10-25 10:37:21,903 Final evaluation on model from best epoch (best-model.pt)
2023-10-25 10:37:21,903 - metric: "('micro avg', 'f1-score')"
2023-10-25 10:37:21,903 ----------------------------------------------------------------------------------------------------
2023-10-25 10:37:21,903 Computation:
2023-10-25 10:37:21,903 - compute on device: cuda:0
2023-10-25 10:37:21,903 - embedding storage: none
2023-10-25 10:37:21,903 ----------------------------------------------------------------------------------------------------
2023-10-25 10:37:21,903 Model training base path: "hmbench-topres19th/en-dbmdz/bert-base-historic-multilingual-64k-td-cased-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-2"
2023-10-25 10:37:21,903 ----------------------------------------------------------------------------------------------------
2023-10-25 10:37:21,903 ----------------------------------------------------------------------------------------------------
2023-10-25 10:37:21,903 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-25 10:37:26,605 epoch 1 - iter 77/773 - loss 2.00342293 - time (sec): 4.70 - samples/sec: 2712.01 - lr: 0.000003 - momentum: 0.000000
2023-10-25 10:37:31,262 epoch 1 - iter 154/773 - loss 1.14055389 - time (sec): 9.36 - samples/sec: 2662.74 - lr: 0.000006 - momentum: 0.000000
2023-10-25 10:37:35,889 epoch 1 - iter 231/773 - loss 0.82625202 - time (sec): 13.98 - samples/sec: 2642.41 - lr: 0.000009 - momentum: 0.000000
2023-10-25 10:37:40,575 epoch 1 - iter 308/773 - loss 0.64493288 - time (sec): 18.67 - samples/sec: 2668.34 - lr: 0.000012 - momentum: 0.000000
2023-10-25 10:37:45,208 epoch 1 - iter 385/773 - loss 0.53761507 - time (sec): 23.30 - samples/sec: 2669.10 - lr: 0.000015 - momentum: 0.000000
2023-10-25 10:37:49,742 epoch 1 - iter 462/773 - loss 0.47184545 - time (sec): 27.84 - samples/sec: 2656.78 - lr: 0.000018 - momentum: 0.000000
2023-10-25 10:37:54,392 epoch 1 - iter 539/773 - loss 0.41778036 - time (sec): 32.49 - samples/sec: 2672.72 - lr: 0.000021 - momentum: 0.000000
2023-10-25 10:37:59,140 epoch 1 - iter 616/773 - loss 0.37526149 - time (sec): 37.24 - samples/sec: 2675.01 - lr: 0.000024 - momentum: 0.000000
2023-10-25 10:38:03,677 epoch 1 - iter 693/773 - loss 0.34438623 - time (sec): 41.77 - samples/sec: 2678.19 - lr: 0.000027 - momentum: 0.000000
2023-10-25 10:38:08,219 epoch 1 - iter 770/773 - loss 0.32016433 - time (sec): 46.31 - samples/sec: 2673.41 - lr: 0.000030 - momentum: 0.000000
2023-10-25 10:38:08,404 ----------------------------------------------------------------------------------------------------
2023-10-25 10:38:08,404 EPOCH 1 done: loss 0.3193 - lr: 0.000030
2023-10-25 10:38:11,901 DEV : loss 0.05575157329440117 - f1-score (micro avg) 0.7258
2023-10-25 10:38:11,920 saving best model
2023-10-25 10:38:12,442 ----------------------------------------------------------------------------------------------------
2023-10-25 10:38:17,190 epoch 2 - iter 77/773 - loss 0.06917782 - time (sec): 4.75 - samples/sec: 2589.73 - lr: 0.000030 - momentum: 0.000000
2023-10-25 10:38:21,845 epoch 2 - iter 154/773 - loss 0.07295557 - time (sec): 9.40 - samples/sec: 2635.98 - lr: 0.000029 - momentum: 0.000000
2023-10-25 10:38:26,432 epoch 2 - iter 231/773 - loss 0.07445178 - time (sec): 13.99 - samples/sec: 2577.67 - lr: 0.000029 - momentum: 0.000000
2023-10-25 10:38:31,123 epoch 2 - iter 308/773 - loss 0.07546608 - time (sec): 18.68 - samples/sec: 2620.82 - lr: 0.000029 - momentum: 0.000000
2023-10-25 10:38:35,870 epoch 2 - iter 385/773 - loss 0.07298737 - time (sec): 23.43 - samples/sec: 2630.85 - lr: 0.000028 - momentum: 0.000000
2023-10-25 10:38:40,555 epoch 2 - iter 462/773 - loss 0.07131937 - time (sec): 28.11 - samples/sec: 2642.70 - lr: 0.000028 - momentum: 0.000000
2023-10-25 10:38:45,260 epoch 2 - iter 539/773 - loss 0.07085615 - time (sec): 32.82 - samples/sec: 2652.90 - lr: 0.000028 - momentum: 0.000000
2023-10-25 10:38:50,000 epoch 2 - iter 616/773 - loss 0.07015877 - time (sec): 37.56 - samples/sec: 2628.79 - lr: 0.000027 - momentum: 0.000000
2023-10-25 10:38:54,537 epoch 2 - iter 693/773 - loss 0.06882789 - time (sec): 42.09 - samples/sec: 2624.41 - lr: 0.000027 - momentum: 0.000000
2023-10-25 10:38:59,395 epoch 2 - iter 770/773 - loss 0.06928794 - time (sec): 46.95 - samples/sec: 2635.97 - lr: 0.000027 - momentum: 0.000000
2023-10-25 10:38:59,585 ----------------------------------------------------------------------------------------------------
2023-10-25 10:38:59,586 EPOCH 2 done: loss 0.0691 - lr: 0.000027
2023-10-25 10:39:02,404 DEV : loss 0.049297548830509186 - f1-score (micro avg) 0.8142
2023-10-25 10:39:02,422 saving best model
2023-10-25 10:39:03,089 ----------------------------------------------------------------------------------------------------
2023-10-25 10:39:07,755 epoch 3 - iter 77/773 - loss 0.04741193 - time (sec): 4.66 - samples/sec: 2618.48 - lr: 0.000026 - momentum: 0.000000
2023-10-25 10:39:12,662 epoch 3 - iter 154/773 - loss 0.04548219 - time (sec): 9.57 - samples/sec: 2493.23 - lr: 0.000026 - momentum: 0.000000
2023-10-25 10:39:17,185 epoch 3 - iter 231/773 - loss 0.04459085 - time (sec): 14.09 - samples/sec: 2538.42 - lr: 0.000026 - momentum: 0.000000
2023-10-25 10:39:21,796 epoch 3 - iter 308/773 - loss 0.04386816 - time (sec): 18.70 - samples/sec: 2585.25 - lr: 0.000025 - momentum: 0.000000
2023-10-25 10:39:26,414 epoch 3 - iter 385/773 - loss 0.04428707 - time (sec): 23.32 - samples/sec: 2595.54 - lr: 0.000025 - momentum: 0.000000
2023-10-25 10:39:31,080 epoch 3 - iter 462/773 - loss 0.04386941 - time (sec): 27.99 - samples/sec: 2628.33 - lr: 0.000025 - momentum: 0.000000
2023-10-25 10:39:35,843 epoch 3 - iter 539/773 - loss 0.04425161 - time (sec): 32.75 - samples/sec: 2629.03 - lr: 0.000024 - momentum: 0.000000
2023-10-25 10:39:40,653 epoch 3 - iter 616/773 - loss 0.04565354 - time (sec): 37.56 - samples/sec: 2628.00 - lr: 0.000024 - momentum: 0.000000
2023-10-25 10:39:45,336 epoch 3 - iter 693/773 - loss 0.04638286 - time (sec): 42.24 - samples/sec: 2640.83 - lr: 0.000024 - momentum: 0.000000
2023-10-25 10:39:50,069 epoch 3 - iter 770/773 - loss 0.04580281 - time (sec): 46.98 - samples/sec: 2637.11 - lr: 0.000023 - momentum: 0.000000
2023-10-25 10:39:50,249 ----------------------------------------------------------------------------------------------------
2023-10-25 10:39:50,249 EPOCH 3 done: loss 0.0457 - lr: 0.000023
2023-10-25 10:39:53,011 DEV : loss 0.07478724420070648 - f1-score (micro avg) 0.7705
2023-10-25 10:39:53,029 ----------------------------------------------------------------------------------------------------
2023-10-25 10:39:57,751 epoch 4 - iter 77/773 - loss 0.02381115 - time (sec): 4.72 - samples/sec: 2644.65 - lr: 0.000023 - momentum: 0.000000
2023-10-25 10:40:02,403 epoch 4 - iter 154/773 - loss 0.02272232 - time (sec): 9.37 - samples/sec: 2696.05 - lr: 0.000023 - momentum: 0.000000
2023-10-25 10:40:07,048 epoch 4 - iter 231/773 - loss 0.02306162 - time (sec): 14.02 - samples/sec: 2694.29 - lr: 0.000022 - momentum: 0.000000
2023-10-25 10:40:11,691 epoch 4 - iter 308/773 - loss 0.02500263 - time (sec): 18.66 - samples/sec: 2695.98 - lr: 0.000022 - momentum: 0.000000
2023-10-25 10:40:16,351 epoch 4 - iter 385/773 - loss 0.02577652 - time (sec): 23.32 - samples/sec: 2669.27 - lr: 0.000022 - momentum: 0.000000
2023-10-25 10:40:21,109 epoch 4 - iter 462/773 - loss 0.02834569 - time (sec): 28.08 - samples/sec: 2640.19 - lr: 0.000021 - momentum: 0.000000
2023-10-25 10:40:26,055 epoch 4 - iter 539/773 - loss 0.02860700 - time (sec): 33.02 - samples/sec: 2629.75 - lr: 0.000021 - momentum: 0.000000
2023-10-25 10:40:30,766 epoch 4 - iter 616/773 - loss 0.02820990 - time (sec): 37.73 - samples/sec: 2639.20 - lr: 0.000021 - momentum: 0.000000
2023-10-25 10:40:35,442 epoch 4 - iter 693/773 - loss 0.02789789 - time (sec): 42.41 - samples/sec: 2648.82 - lr: 0.000020 - momentum: 0.000000
2023-10-25 10:40:39,951 epoch 4 - iter 770/773 - loss 0.02986146 - time (sec): 46.92 - samples/sec: 2638.85 - lr: 0.000020 - momentum: 0.000000
2023-10-25 10:40:40,133 ----------------------------------------------------------------------------------------------------
2023-10-25 10:40:40,133 EPOCH 4 done: loss 0.0299 - lr: 0.000020
2023-10-25 10:40:42,825 DEV : loss 0.08224356174468994 - f1-score (micro avg) 0.7658
2023-10-25 10:40:42,842 ----------------------------------------------------------------------------------------------------
2023-10-25 10:40:47,542 epoch 5 - iter 77/773 - loss 0.02345774 - time (sec): 4.70 - samples/sec: 2619.37 - lr: 0.000020 - momentum: 0.000000
2023-10-25 10:40:52,241 epoch 5 - iter 154/773 - loss 0.02122391 - time (sec): 9.40 - samples/sec: 2625.76 - lr: 0.000019 - momentum: 0.000000
2023-10-25 10:40:56,970 epoch 5 - iter 231/773 - loss 0.01984080 - time (sec): 14.13 - samples/sec: 2661.60 - lr: 0.000019 - momentum: 0.000000
2023-10-25 10:41:01,364 epoch 5 - iter 308/773 - loss 0.02263347 - time (sec): 18.52 - samples/sec: 2686.02 - lr: 0.000019 - momentum: 0.000000
2023-10-25 10:41:06,019 epoch 5 - iter 385/773 - loss 0.02194096 - time (sec): 23.18 - samples/sec: 2706.00 - lr: 0.000018 - momentum: 0.000000
2023-10-25 10:41:10,705 epoch 5 - iter 462/773 - loss 0.02179624 - time (sec): 27.86 - samples/sec: 2708.53 - lr: 0.000018 - momentum: 0.000000
2023-10-25 10:41:15,249 epoch 5 - iter 539/773 - loss 0.02057243 - time (sec): 32.41 - samples/sec: 2722.04 - lr: 0.000018 - momentum: 0.000000
2023-10-25 10:41:19,743 epoch 5 - iter 616/773 - loss 0.02059981 - time (sec): 36.90 - samples/sec: 2702.67 - lr: 0.000017 - momentum: 0.000000
2023-10-25 10:41:24,238 epoch 5 - iter 693/773 - loss 0.02048126 - time (sec): 41.39 - samples/sec: 2712.66 - lr: 0.000017 - momentum: 0.000000
2023-10-25 10:41:28,644 epoch 5 - iter 770/773 - loss 0.02099495 - time (sec): 45.80 - samples/sec: 2703.67 - lr: 0.000017 - momentum: 0.000000
2023-10-25 10:41:28,839 ----------------------------------------------------------------------------------------------------
2023-10-25 10:41:28,840 EPOCH 5 done: loss 0.0212 - lr: 0.000017
2023-10-25 10:41:31,552 DEV : loss 0.09945573657751083 - f1-score (micro avg) 0.781
2023-10-25 10:41:31,572 ----------------------------------------------------------------------------------------------------
2023-10-25 10:41:36,179 epoch 6 - iter 77/773 - loss 0.01534591 - time (sec): 4.60 - samples/sec: 2745.09 - lr: 0.000016 - momentum: 0.000000
2023-10-25 10:41:40,809 epoch 6 - iter 154/773 - loss 0.01519805 - time (sec): 9.24 - samples/sec: 2722.23 - lr: 0.000016 - momentum: 0.000000
2023-10-25 10:41:45,397 epoch 6 - iter 231/773 - loss 0.01464584 - time (sec): 13.82 - samples/sec: 2671.47 - lr: 0.000016 - momentum: 0.000000
2023-10-25 10:41:50,157 epoch 6 - iter 308/773 - loss 0.01414850 - time (sec): 18.58 - samples/sec: 2679.03 - lr: 0.000015 - momentum: 0.000000
2023-10-25 10:41:54,914 epoch 6 - iter 385/773 - loss 0.01433663 - time (sec): 23.34 - samples/sec: 2701.85 - lr: 0.000015 - momentum: 0.000000
2023-10-25 10:41:59,690 epoch 6 - iter 462/773 - loss 0.01277122 - time (sec): 28.12 - samples/sec: 2696.50 - lr: 0.000015 - momentum: 0.000000
2023-10-25 10:42:04,329 epoch 6 - iter 539/773 - loss 0.01360655 - time (sec): 32.76 - samples/sec: 2682.46 - lr: 0.000014 - momentum: 0.000000
2023-10-25 10:42:09,181 epoch 6 - iter 616/773 - loss 0.01363450 - time (sec): 37.61 - samples/sec: 2648.33 - lr: 0.000014 - momentum: 0.000000
2023-10-25 10:42:14,051 epoch 6 - iter 693/773 - loss 0.01353720 - time (sec): 42.48 - samples/sec: 2630.62 - lr: 0.000014 - momentum: 0.000000
2023-10-25 10:42:18,779 epoch 6 - iter 770/773 - loss 0.01363499 - time (sec): 47.21 - samples/sec: 2624.71 - lr: 0.000013 - momentum: 0.000000
2023-10-25 10:42:18,961 ----------------------------------------------------------------------------------------------------
2023-10-25 10:42:18,962 EPOCH 6 done: loss 0.0140 - lr: 0.000013
2023-10-25 10:42:22,522 DEV : loss 0.11278796941041946 - f1-score (micro avg) 0.7753
2023-10-25 10:42:22,540 ----------------------------------------------------------------------------------------------------
2023-10-25 10:42:27,293 epoch 7 - iter 77/773 - loss 0.00960456 - time (sec): 4.75 - samples/sec: 2680.96 - lr: 0.000013 - momentum: 0.000000
2023-10-25 10:42:32,004 epoch 7 - iter 154/773 - loss 0.00936374 - time (sec): 9.46 - samples/sec: 2626.52 - lr: 0.000013 - momentum: 0.000000
2023-10-25 10:42:36,826 epoch 7 - iter 231/773 - loss 0.00747500 - time (sec): 14.28 - samples/sec: 2713.22 - lr: 0.000012 - momentum: 0.000000
2023-10-25 10:42:41,284 epoch 7 - iter 308/773 - loss 0.00789143 - time (sec): 18.74 - samples/sec: 2647.02 - lr: 0.000012 - momentum: 0.000000
2023-10-25 10:42:45,927 epoch 7 - iter 385/773 - loss 0.00801181 - time (sec): 23.39 - samples/sec: 2644.58 - lr: 0.000012 - momentum: 0.000000
2023-10-25 10:42:50,560 epoch 7 - iter 462/773 - loss 0.00730589 - time (sec): 28.02 - samples/sec: 2658.91 - lr: 0.000011 - momentum: 0.000000
2023-10-25 10:42:55,182 epoch 7 - iter 539/773 - loss 0.00808199 - time (sec): 32.64 - samples/sec: 2631.74 - lr: 0.000011 - momentum: 0.000000
2023-10-25 10:42:59,804 epoch 7 - iter 616/773 - loss 0.00863132 - time (sec): 37.26 - samples/sec: 2626.61 - lr: 0.000011 - momentum: 0.000000
2023-10-25 10:43:04,706 epoch 7 - iter 693/773 - loss 0.00876967 - time (sec): 42.16 - samples/sec: 2631.43 - lr: 0.000010 - momentum: 0.000000
2023-10-25 10:43:09,406 epoch 7 - iter 770/773 - loss 0.00915212 - time (sec): 46.86 - samples/sec: 2639.82 - lr: 0.000010 - momentum: 0.000000
2023-10-25 10:43:09,584 ----------------------------------------------------------------------------------------------------
2023-10-25 10:43:09,584 EPOCH 7 done: loss 0.0091 - lr: 0.000010
2023-10-25 10:43:12,629 DEV : loss 0.11861388385295868 - f1-score (micro avg) 0.7724
2023-10-25 10:43:12,647 ----------------------------------------------------------------------------------------------------
2023-10-25 10:43:17,325 epoch 8 - iter 77/773 - loss 0.00461380 - time (sec): 4.68 - samples/sec: 2636.24 - lr: 0.000010 - momentum: 0.000000
2023-10-25 10:43:21,940 epoch 8 - iter 154/773 - loss 0.00702621 - time (sec): 9.29 - samples/sec: 2672.72 - lr: 0.000009 - momentum: 0.000000
2023-10-25 10:43:26,655 epoch 8 - iter 231/773 - loss 0.00812508 - time (sec): 14.01 - samples/sec: 2598.23 - lr: 0.000009 - momentum: 0.000000
2023-10-25 10:43:31,362 epoch 8 - iter 308/773 - loss 0.00661780 - time (sec): 18.71 - samples/sec: 2595.72 - lr: 0.000009 - momentum: 0.000000
2023-10-25 10:43:35,901 epoch 8 - iter 385/773 - loss 0.00640004 - time (sec): 23.25 - samples/sec: 2629.17 - lr: 0.000008 - momentum: 0.000000
2023-10-25 10:43:40,469 epoch 8 - iter 462/773 - loss 0.00633717 - time (sec): 27.82 - samples/sec: 2675.90 - lr: 0.000008 - momentum: 0.000000
2023-10-25 10:43:45,129 epoch 8 - iter 539/773 - loss 0.00653842 - time (sec): 32.48 - samples/sec: 2677.43 - lr: 0.000008 - momentum: 0.000000
2023-10-25 10:43:49,745 epoch 8 - iter 616/773 - loss 0.00732939 - time (sec): 37.10 - samples/sec: 2673.52 - lr: 0.000007 - momentum: 0.000000
2023-10-25 10:43:54,376 epoch 8 - iter 693/773 - loss 0.00709463 - time (sec): 41.73 - samples/sec: 2665.60 - lr: 0.000007 - momentum: 0.000000
2023-10-25 10:43:59,079 epoch 8 - iter 770/773 - loss 0.00668988 - time (sec): 46.43 - samples/sec: 2665.10 - lr: 0.000007 - momentum: 0.000000
2023-10-25 10:43:59,266 ----------------------------------------------------------------------------------------------------
2023-10-25 10:43:59,266 EPOCH 8 done: loss 0.0067 - lr: 0.000007
2023-10-25 10:44:02,424 DEV : loss 0.10935225337743759 - f1-score (micro avg) 0.7901
2023-10-25 10:44:02,442 ----------------------------------------------------------------------------------------------------
2023-10-25 10:44:07,211 epoch 9 - iter 77/773 - loss 0.00280222 - time (sec): 4.77 - samples/sec: 2641.93 - lr: 0.000006 - momentum: 0.000000
2023-10-25 10:44:11,756 epoch 9 - iter 154/773 - loss 0.00290731 - time (sec): 9.31 - samples/sec: 2687.34 - lr: 0.000006 - momentum: 0.000000
2023-10-25 10:44:16,394 epoch 9 - iter 231/773 - loss 0.00405151 - time (sec): 13.95 - samples/sec: 2681.13 - lr: 0.000006 - momentum: 0.000000
2023-10-25 10:44:21,122 epoch 9 - iter 308/773 - loss 0.00435854 - time (sec): 18.68 - samples/sec: 2697.32 - lr: 0.000005 - momentum: 0.000000
2023-10-25 10:44:25,805 epoch 9 - iter 385/773 - loss 0.00434929 - time (sec): 23.36 - samples/sec: 2682.52 - lr: 0.000005 - momentum: 0.000000
2023-10-25 10:44:30,607 epoch 9 - iter 462/773 - loss 0.00405141 - time (sec): 28.16 - samples/sec: 2661.17 - lr: 0.000005 - momentum: 0.000000
2023-10-25 10:44:35,355 epoch 9 - iter 539/773 - loss 0.00398165 - time (sec): 32.91 - samples/sec: 2641.67 - lr: 0.000004 - momentum: 0.000000
2023-10-25 10:44:40,206 epoch 9 - iter 616/773 - loss 0.00428787 - time (sec): 37.76 - samples/sec: 2622.47 - lr: 0.000004 - momentum: 0.000000
2023-10-25 10:44:44,909 epoch 9 - iter 693/773 - loss 0.00416455 - time (sec): 42.47 - samples/sec: 2642.13 - lr: 0.000004 - momentum: 0.000000
2023-10-25 10:44:49,527 epoch 9 - iter 770/773 - loss 0.00393684 - time (sec): 47.08 - samples/sec: 2633.23 - lr: 0.000003 - momentum: 0.000000
2023-10-25 10:44:49,707 ----------------------------------------------------------------------------------------------------
2023-10-25 10:44:49,707 EPOCH 9 done: loss 0.0039 - lr: 0.000003
2023-10-25 10:44:52,325 DEV : loss 0.11163745075464249 - f1-score (micro avg) 0.7942
2023-10-25 10:44:52,342 ----------------------------------------------------------------------------------------------------
2023-10-25 10:44:57,023 epoch 10 - iter 77/773 - loss 0.00160620 - time (sec): 4.68 - samples/sec: 2530.57 - lr: 0.000003 - momentum: 0.000000
2023-10-25 10:45:01,740 epoch 10 - iter 154/773 - loss 0.00218583 - time (sec): 9.40 - samples/sec: 2501.82 - lr: 0.000003 - momentum: 0.000000
2023-10-25 10:45:06,385 epoch 10 - iter 231/773 - loss 0.00253136 - time (sec): 14.04 - samples/sec: 2524.83 - lr: 0.000002 - momentum: 0.000000
2023-10-25 10:45:10,823 epoch 10 - iter 308/773 - loss 0.00321432 - time (sec): 18.48 - samples/sec: 2583.04 - lr: 0.000002 - momentum: 0.000000
2023-10-25 10:45:15,327 epoch 10 - iter 385/773 - loss 0.00292953 - time (sec): 22.98 - samples/sec: 2578.23 - lr: 0.000002 - momentum: 0.000000
2023-10-25 10:45:20,048 epoch 10 - iter 462/773 - loss 0.00283533 - time (sec): 27.70 - samples/sec: 2604.20 - lr: 0.000001 - momentum: 0.000000
2023-10-25 10:45:24,719 epoch 10 - iter 539/773 - loss 0.00275001 - time (sec): 32.37 - samples/sec: 2628.38 - lr: 0.000001 - momentum: 0.000000
2023-10-25 10:45:29,618 epoch 10 - iter 616/773 - loss 0.00267312 - time (sec): 37.27 - samples/sec: 2636.23 - lr: 0.000001 - momentum: 0.000000
2023-10-25 10:45:34,364 epoch 10 - iter 693/773 - loss 0.00242572 - time (sec): 42.02 - samples/sec: 2648.91 - lr: 0.000000 - momentum: 0.000000
2023-10-25 10:45:39,113 epoch 10 - iter 770/773 - loss 0.00279658 - time (sec): 46.77 - samples/sec: 2642.08 - lr: 0.000000 - momentum: 0.000000
2023-10-25 10:45:39,310 ----------------------------------------------------------------------------------------------------
2023-10-25 10:45:39,310 EPOCH 10 done: loss 0.0029 - lr: 0.000000
2023-10-25 10:45:42,349 DEV : loss 0.11523404717445374 - f1-score (micro avg) 0.7884
2023-10-25 10:45:43,297 ----------------------------------------------------------------------------------------------------
2023-10-25 10:45:43,299 Loading model from best epoch ...
2023-10-25 10:45:45,437 SequenceTagger predicts: Dictionary with 13 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-BUILDING, B-BUILDING, E-BUILDING, I-BUILDING, S-STREET, B-STREET, E-STREET, I-STREET
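The 13-tag dictionary above is the BIOES scheme over the corpus's three entity types plus the outside tag O, which is also why the tagger's final linear layer has out_features=13. A sketch of how such a tag set is enumerated, assuming BIOES tagging (the helper is illustrative, not part of Flair):

```python
def bioes_tags(entity_types):
    """Build a BIOES tag set: O plus S-/B-/E-/I- variants per entity type."""
    tags = ["O"]
    for etype in entity_types:
        tags += [f"{prefix}-{etype}" for prefix in ("S", "B", "E", "I")]
    return tags

tags = bioes_tags(["LOC", "BUILDING", "STREET"])
print(len(tags))  # 13, matching out_features of the linear layer
```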
2023-10-25 10:45:55,712
Results:
- F-score (micro) 0.7656
- F-score (macro) 0.6513
- Accuracy 0.641
By class:
              precision    recall  f1-score   support

         LOC     0.8262    0.8140    0.8200       946
    BUILDING     0.5258    0.5514    0.5383       185
      STREET     0.7368    0.5000    0.5957        56

   micro avg     0.7732    0.7582    0.7656      1187
   macro avg     0.6963    0.6218    0.6513      1187
weighted avg     0.7751    0.7582    0.7655      1187
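The micro average pools true positives and prediction/support counts across all classes, while the macro average is the unweighted mean of the per-class F1 scores. The summary rows above can be cross-checked from the per-class rows; a sketch, reconstructing approximate counts from the rounded precision/recall/support printed in the report (so the macro value may differ from the report's in the last decimal place):

```python
# Per-class (precision, recall, support) from the report above.
per_class = {
    "LOC":      (0.8262, 0.8140, 946),
    "BUILDING": (0.5258, 0.5514, 185),
    "STREET":   (0.7368, 0.5000,  56),
}

def f1(p, r):
    """Harmonic mean of precision and recall."""
    return 2 * p * r / (p + r) if p + r else 0.0

# Macro F1: unweighted mean of per-class F1 scores.
macro_f1 = sum(f1(p, r) for p, r, _ in per_class.values()) / len(per_class)

# Micro F1: pool true positives and predicted/support counts across classes.
tp = sum(round(r * s) for _, r, s in per_class.values())
predicted = sum(round(round(r * s) / p) for p, r, s in per_class.values())
support = sum(s for _, _, s in per_class.values())
micro_f1 = 2 * tp / (predicted + support)

print(f"micro {micro_f1:.4f}")  # micro 0.7656
```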
2023-10-25 10:45:55,712 ----------------------------------------------------------------------------------------------------