2023-10-25 12:04:43,420 ----------------------------------------------------------------------------------------------------
2023-10-25 12:04:43,421 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(64001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-25 12:04:43,421 ----------------------------------------------------------------------------------------------------
2023-10-25 12:04:43,421 MultiCorpus: 20847 train + 1123 dev + 3350 test sentences
- NER_HIPE_2022 Corpus: 20847 train + 1123 dev + 3350 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/de/with_doc_seperator
2023-10-25 12:04:43,421 ----------------------------------------------------------------------------------------------------
2023-10-25 12:04:43,421 Train: 20847 sentences
2023-10-25 12:04:43,421 (train_with_dev=False, train_with_test=False)
2023-10-25 12:04:43,421 ----------------------------------------------------------------------------------------------------
2023-10-25 12:04:43,421 Training Params:
2023-10-25 12:04:43,421 - learning_rate: "5e-05"
2023-10-25 12:04:43,421 - mini_batch_size: "8"
2023-10-25 12:04:43,421 - max_epochs: "10"
2023-10-25 12:04:43,421 - shuffle: "True"
2023-10-25 12:04:43,421 ----------------------------------------------------------------------------------------------------
2023-10-25 12:04:43,421 Plugins:
2023-10-25 12:04:43,421 - TensorboardLogger
2023-10-25 12:04:43,421 - LinearScheduler | warmup_fraction: '0.1'
2023-10-25 12:04:43,421 ----------------------------------------------------------------------------------------------------
2023-10-25 12:04:43,422 Final evaluation on model from best epoch (best-model.pt)
2023-10-25 12:04:43,422 - metric: "('micro avg', 'f1-score')"
2023-10-25 12:04:43,422 ----------------------------------------------------------------------------------------------------
2023-10-25 12:04:43,422 Computation:
2023-10-25 12:04:43,422 - compute on device: cuda:0
2023-10-25 12:04:43,422 - embedding storage: none
2023-10-25 12:04:43,422 ----------------------------------------------------------------------------------------------------
2023-10-25 12:04:43,422 Model training base path: "hmbench-newseye/de-dbmdz/bert-base-historic-multilingual-64k-td-cased-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-2"
2023-10-25 12:04:43,422 ----------------------------------------------------------------------------------------------------
2023-10-25 12:04:43,422 ----------------------------------------------------------------------------------------------------
2023-10-25 12:04:43,422 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-25 12:04:57,329 epoch 1 - iter 260/2606 - loss 1.33404298 - time (sec): 13.91 - samples/sec: 2676.54 - lr: 0.000005 - momentum: 0.000000
2023-10-25 12:05:11,911 epoch 1 - iter 520/2606 - loss 0.81750725 - time (sec): 28.49 - samples/sec: 2698.06 - lr: 0.000010 - momentum: 0.000000
2023-10-25 12:05:26,285 epoch 1 - iter 780/2606 - loss 0.63734009 - time (sec): 42.86 - samples/sec: 2662.27 - lr: 0.000015 - momentum: 0.000000
2023-10-25 12:05:40,386 epoch 1 - iter 1040/2606 - loss 0.54688049 - time (sec): 56.96 - samples/sec: 2609.80 - lr: 0.000020 - momentum: 0.000000
2023-10-25 12:05:54,652 epoch 1 - iter 1300/2606 - loss 0.48271010 - time (sec): 71.23 - samples/sec: 2585.48 - lr: 0.000025 - momentum: 0.000000
2023-10-25 12:06:09,224 epoch 1 - iter 1560/2606 - loss 0.43686814 - time (sec): 85.80 - samples/sec: 2591.06 - lr: 0.000030 - momentum: 0.000000
2023-10-25 12:06:22,977 epoch 1 - iter 1820/2606 - loss 0.40324464 - time (sec): 99.55 - samples/sec: 2576.82 - lr: 0.000035 - momentum: 0.000000
2023-10-25 12:06:37,534 epoch 1 - iter 2080/2606 - loss 0.37960994 - time (sec): 114.11 - samples/sec: 2560.25 - lr: 0.000040 - momentum: 0.000000
2023-10-25 12:06:51,663 epoch 1 - iter 2340/2606 - loss 0.35867330 - time (sec): 128.24 - samples/sec: 2557.80 - lr: 0.000045 - momentum: 0.000000
2023-10-25 12:07:06,164 epoch 1 - iter 2600/2606 - loss 0.33892838 - time (sec): 142.74 - samples/sec: 2569.97 - lr: 0.000050 - momentum: 0.000000
2023-10-25 12:07:06,496 ----------------------------------------------------------------------------------------------------
2023-10-25 12:07:06,496 EPOCH 1 done: loss 0.3387 - lr: 0.000050
2023-10-25 12:07:10,522 DEV : loss 0.1899159997701645 - f1-score (micro avg) 0.3206
2023-10-25 12:07:10,548 saving best model
2023-10-25 12:07:11,071 ----------------------------------------------------------------------------------------------------
2023-10-25 12:07:25,878 epoch 2 - iter 260/2606 - loss 0.18263844 - time (sec): 14.81 - samples/sec: 2609.92 - lr: 0.000049 - momentum: 0.000000
2023-10-25 12:07:40,312 epoch 2 - iter 520/2606 - loss 0.17196157 - time (sec): 29.24 - samples/sec: 2598.58 - lr: 0.000049 - momentum: 0.000000
2023-10-25 12:07:54,450 epoch 2 - iter 780/2606 - loss 0.16810378 - time (sec): 43.38 - samples/sec: 2626.37 - lr: 0.000048 - momentum: 0.000000
2023-10-25 12:08:08,143 epoch 2 - iter 1040/2606 - loss 0.16310966 - time (sec): 57.07 - samples/sec: 2587.48 - lr: 0.000048 - momentum: 0.000000
2023-10-25 12:08:22,199 epoch 2 - iter 1300/2606 - loss 0.16748312 - time (sec): 71.13 - samples/sec: 2565.37 - lr: 0.000047 - momentum: 0.000000
2023-10-25 12:08:36,287 epoch 2 - iter 1560/2606 - loss 0.17242041 - time (sec): 85.21 - samples/sec: 2564.40 - lr: 0.000047 - momentum: 0.000000
2023-10-25 12:08:51,010 epoch 2 - iter 1820/2606 - loss 0.18294843 - time (sec): 99.94 - samples/sec: 2559.73 - lr: 0.000046 - momentum: 0.000000
2023-10-25 12:09:05,058 epoch 2 - iter 2080/2606 - loss 0.18014163 - time (sec): 113.99 - samples/sec: 2562.12 - lr: 0.000046 - momentum: 0.000000
2023-10-25 12:09:19,987 epoch 2 - iter 2340/2606 - loss 0.17697195 - time (sec): 128.91 - samples/sec: 2554.90 - lr: 0.000045 - momentum: 0.000000
2023-10-25 12:09:34,480 epoch 2 - iter 2600/2606 - loss 0.17543410 - time (sec): 143.41 - samples/sec: 2557.33 - lr: 0.000044 - momentum: 0.000000
2023-10-25 12:09:34,754 ----------------------------------------------------------------------------------------------------
2023-10-25 12:09:34,754 EPOCH 2 done: loss 0.1757 - lr: 0.000044
2023-10-25 12:09:41,662 DEV : loss 0.12365195900201797 - f1-score (micro avg) 0.1991
2023-10-25 12:09:41,687 ----------------------------------------------------------------------------------------------------
2023-10-25 12:09:55,576 epoch 3 - iter 260/2606 - loss 0.13097015 - time (sec): 13.89 - samples/sec: 2422.06 - lr: 0.000044 - momentum: 0.000000
2023-10-25 12:10:09,421 epoch 3 - iter 520/2606 - loss 0.13232082 - time (sec): 27.73 - samples/sec: 2442.10 - lr: 0.000043 - momentum: 0.000000
2023-10-25 12:10:24,291 epoch 3 - iter 780/2606 - loss 0.11661536 - time (sec): 42.60 - samples/sec: 2529.83 - lr: 0.000043 - momentum: 0.000000
2023-10-25 12:10:38,435 epoch 3 - iter 1040/2606 - loss 0.12084986 - time (sec): 56.75 - samples/sec: 2521.57 - lr: 0.000042 - momentum: 0.000000
2023-10-25 12:10:52,440 epoch 3 - iter 1300/2606 - loss 0.11793891 - time (sec): 70.75 - samples/sec: 2546.11 - lr: 0.000042 - momentum: 0.000000
2023-10-25 12:11:06,700 epoch 3 - iter 1560/2606 - loss 0.11620440 - time (sec): 85.01 - samples/sec: 2552.64 - lr: 0.000041 - momentum: 0.000000
2023-10-25 12:11:21,546 epoch 3 - iter 1820/2606 - loss 0.11625199 - time (sec): 99.86 - samples/sec: 2574.35 - lr: 0.000041 - momentum: 0.000000
2023-10-25 12:11:36,104 epoch 3 - iter 2080/2606 - loss 0.11570847 - time (sec): 114.42 - samples/sec: 2571.76 - lr: 0.000040 - momentum: 0.000000
2023-10-25 12:11:50,815 epoch 3 - iter 2340/2606 - loss 0.11725099 - time (sec): 129.13 - samples/sec: 2573.15 - lr: 0.000039 - momentum: 0.000000
2023-10-25 12:12:04,853 epoch 3 - iter 2600/2606 - loss 0.11640683 - time (sec): 143.16 - samples/sec: 2561.04 - lr: 0.000039 - momentum: 0.000000
2023-10-25 12:12:05,150 ----------------------------------------------------------------------------------------------------
2023-10-25 12:12:05,150 EPOCH 3 done: loss 0.1164 - lr: 0.000039
2023-10-25 12:12:12,345 DEV : loss 0.18110370635986328 - f1-score (micro avg) 0.3487
2023-10-25 12:12:12,371 saving best model
2023-10-25 12:12:13,034 ----------------------------------------------------------------------------------------------------
2023-10-25 12:12:27,120 epoch 4 - iter 260/2606 - loss 0.07314164 - time (sec): 14.08 - samples/sec: 2617.27 - lr: 0.000038 - momentum: 0.000000
2023-10-25 12:12:41,122 epoch 4 - iter 520/2606 - loss 0.07274429 - time (sec): 28.09 - samples/sec: 2607.65 - lr: 0.000038 - momentum: 0.000000
2023-10-25 12:12:55,178 epoch 4 - iter 780/2606 - loss 0.07820023 - time (sec): 42.14 - samples/sec: 2600.12 - lr: 0.000037 - momentum: 0.000000
2023-10-25 12:13:09,293 epoch 4 - iter 1040/2606 - loss 0.08079671 - time (sec): 56.26 - samples/sec: 2534.91 - lr: 0.000037 - momentum: 0.000000
2023-10-25 12:13:24,132 epoch 4 - iter 1300/2606 - loss 0.08272013 - time (sec): 71.10 - samples/sec: 2541.48 - lr: 0.000036 - momentum: 0.000000
2023-10-25 12:13:38,910 epoch 4 - iter 1560/2606 - loss 0.08294572 - time (sec): 85.88 - samples/sec: 2523.61 - lr: 0.000036 - momentum: 0.000000
2023-10-25 12:13:53,194 epoch 4 - iter 1820/2606 - loss 0.08005595 - time (sec): 100.16 - samples/sec: 2534.43 - lr: 0.000035 - momentum: 0.000000
2023-10-25 12:14:08,011 epoch 4 - iter 2080/2606 - loss 0.08143243 - time (sec): 114.98 - samples/sec: 2551.54 - lr: 0.000034 - momentum: 0.000000
2023-10-25 12:14:22,431 epoch 4 - iter 2340/2606 - loss 0.08041802 - time (sec): 129.40 - samples/sec: 2532.25 - lr: 0.000034 - momentum: 0.000000
2023-10-25 12:14:37,281 epoch 4 - iter 2600/2606 - loss 0.08144784 - time (sec): 144.25 - samples/sec: 2539.74 - lr: 0.000033 - momentum: 0.000000
2023-10-25 12:14:37,596 ----------------------------------------------------------------------------------------------------
2023-10-25 12:14:37,597 EPOCH 4 done: loss 0.0813 - lr: 0.000033
2023-10-25 12:14:44,681 DEV : loss 0.24672392010688782 - f1-score (micro avg) 0.3258
2023-10-25 12:14:44,706 ----------------------------------------------------------------------------------------------------
2023-10-25 12:14:58,782 epoch 5 - iter 260/2606 - loss 0.04975046 - time (sec): 14.07 - samples/sec: 2514.00 - lr: 0.000033 - momentum: 0.000000
2023-10-25 12:15:13,022 epoch 5 - iter 520/2606 - loss 0.05842153 - time (sec): 28.31 - samples/sec: 2512.62 - lr: 0.000032 - momentum: 0.000000
2023-10-25 12:15:27,708 epoch 5 - iter 780/2606 - loss 0.05600138 - time (sec): 43.00 - samples/sec: 2555.65 - lr: 0.000032 - momentum: 0.000000
2023-10-25 12:15:42,374 epoch 5 - iter 1040/2606 - loss 0.05571986 - time (sec): 57.67 - samples/sec: 2559.74 - lr: 0.000031 - momentum: 0.000000
2023-10-25 12:15:56,321 epoch 5 - iter 1300/2606 - loss 0.05410019 - time (sec): 71.61 - samples/sec: 2592.02 - lr: 0.000031 - momentum: 0.000000
2023-10-25 12:16:11,032 epoch 5 - iter 1560/2606 - loss 0.05554265 - time (sec): 86.32 - samples/sec: 2600.43 - lr: 0.000030 - momentum: 0.000000
2023-10-25 12:16:25,229 epoch 5 - iter 1820/2606 - loss 0.05623815 - time (sec): 100.52 - samples/sec: 2591.86 - lr: 0.000029 - momentum: 0.000000
2023-10-25 12:16:39,142 epoch 5 - iter 2080/2606 - loss 0.05640579 - time (sec): 114.43 - samples/sec: 2577.66 - lr: 0.000029 - momentum: 0.000000
2023-10-25 12:16:52,969 epoch 5 - iter 2340/2606 - loss 0.05583642 - time (sec): 128.26 - samples/sec: 2566.52 - lr: 0.000028 - momentum: 0.000000
2023-10-25 12:17:07,359 epoch 5 - iter 2600/2606 - loss 0.05546860 - time (sec): 142.65 - samples/sec: 2568.74 - lr: 0.000028 - momentum: 0.000000
2023-10-25 12:17:07,671 ----------------------------------------------------------------------------------------------------
2023-10-25 12:17:07,671 EPOCH 5 done: loss 0.0554 - lr: 0.000028
2023-10-25 12:17:14,198 DEV : loss 0.3652787506580353 - f1-score (micro avg) 0.3473
2023-10-25 12:17:14,227 ----------------------------------------------------------------------------------------------------
2023-10-25 12:17:28,718 epoch 6 - iter 260/2606 - loss 0.04669705 - time (sec): 14.49 - samples/sec: 2618.65 - lr: 0.000027 - momentum: 0.000000
2023-10-25 12:17:44,097 epoch 6 - iter 520/2606 - loss 0.06386000 - time (sec): 29.87 - samples/sec: 2579.82 - lr: 0.000027 - momentum: 0.000000
2023-10-25 12:17:59,389 epoch 6 - iter 780/2606 - loss 0.05599133 - time (sec): 45.16 - samples/sec: 2534.51 - lr: 0.000026 - momentum: 0.000000
2023-10-25 12:18:13,860 epoch 6 - iter 1040/2606 - loss 0.05300823 - time (sec): 59.63 - samples/sec: 2554.23 - lr: 0.000026 - momentum: 0.000000
2023-10-25 12:18:27,592 epoch 6 - iter 1300/2606 - loss 0.05110818 - time (sec): 73.36 - samples/sec: 2537.33 - lr: 0.000025 - momentum: 0.000000
2023-10-25 12:18:41,721 epoch 6 - iter 1560/2606 - loss 0.04968908 - time (sec): 87.49 - samples/sec: 2540.03 - lr: 0.000024 - momentum: 0.000000
2023-10-25 12:18:56,223 epoch 6 - iter 1820/2606 - loss 0.04814738 - time (sec): 101.99 - samples/sec: 2516.29 - lr: 0.000024 - momentum: 0.000000
2023-10-25 12:19:10,794 epoch 6 - iter 2080/2606 - loss 0.04656793 - time (sec): 116.57 - samples/sec: 2511.11 - lr: 0.000023 - momentum: 0.000000
2023-10-25 12:19:24,885 epoch 6 - iter 2340/2606 - loss 0.04599298 - time (sec): 130.66 - samples/sec: 2519.24 - lr: 0.000023 - momentum: 0.000000
2023-10-25 12:19:39,483 epoch 6 - iter 2600/2606 - loss 0.04534244 - time (sec): 145.25 - samples/sec: 2523.96 - lr: 0.000022 - momentum: 0.000000
2023-10-25 12:19:39,796 ----------------------------------------------------------------------------------------------------
2023-10-25 12:19:39,796 EPOCH 6 done: loss 0.0453 - lr: 0.000022
2023-10-25 12:19:46,429 DEV : loss 0.36484959721565247 - f1-score (micro avg) 0.3757
2023-10-25 12:19:46,470 saving best model
2023-10-25 12:19:47,101 ----------------------------------------------------------------------------------------------------
2023-10-25 12:20:02,047 epoch 7 - iter 260/2606 - loss 0.02579681 - time (sec): 14.94 - samples/sec: 2481.46 - lr: 0.000022 - momentum: 0.000000
2023-10-25 12:20:17,070 epoch 7 - iter 520/2606 - loss 0.02723529 - time (sec): 29.97 - samples/sec: 2514.98 - lr: 0.000021 - momentum: 0.000000
2023-10-25 12:20:31,473 epoch 7 - iter 780/2606 - loss 0.03443679 - time (sec): 44.37 - samples/sec: 2487.49 - lr: 0.000021 - momentum: 0.000000
2023-10-25 12:20:46,443 epoch 7 - iter 1040/2606 - loss 0.05293747 - time (sec): 59.34 - samples/sec: 2490.37 - lr: 0.000020 - momentum: 0.000000
2023-10-25 12:21:01,062 epoch 7 - iter 1300/2606 - loss 0.05896085 - time (sec): 73.96 - samples/sec: 2498.42 - lr: 0.000019 - momentum: 0.000000
2023-10-25 12:21:15,728 epoch 7 - iter 1560/2606 - loss 0.05488902 - time (sec): 88.63 - samples/sec: 2525.43 - lr: 0.000019 - momentum: 0.000000
2023-10-25 12:21:30,405 epoch 7 - iter 1820/2606 - loss 0.05514632 - time (sec): 103.30 - samples/sec: 2519.10 - lr: 0.000018 - momentum: 0.000000
2023-10-25 12:21:44,288 epoch 7 - iter 2080/2606 - loss 0.05673720 - time (sec): 117.19 - samples/sec: 2518.16 - lr: 0.000018 - momentum: 0.000000
2023-10-25 12:21:58,440 epoch 7 - iter 2340/2606 - loss 0.06075104 - time (sec): 131.34 - samples/sec: 2518.27 - lr: 0.000017 - momentum: 0.000000
2023-10-25 12:22:12,005 epoch 7 - iter 2600/2606 - loss 0.07080521 - time (sec): 144.90 - samples/sec: 2527.51 - lr: 0.000017 - momentum: 0.000000
2023-10-25 12:22:12,429 ----------------------------------------------------------------------------------------------------
2023-10-25 12:22:12,430 EPOCH 7 done: loss 0.0709 - lr: 0.000017
2023-10-25 12:22:18,685 DEV : loss 0.28045564889907837 - f1-score (micro avg) 0.1776
2023-10-25 12:22:18,711 ----------------------------------------------------------------------------------------------------
2023-10-25 12:22:32,882 epoch 8 - iter 260/2606 - loss 0.10479596 - time (sec): 14.17 - samples/sec: 2620.27 - lr: 0.000016 - momentum: 0.000000
2023-10-25 12:22:47,009 epoch 8 - iter 520/2606 - loss 0.07922867 - time (sec): 28.30 - samples/sec: 2666.93 - lr: 0.000016 - momentum: 0.000000
2023-10-25 12:23:00,754 epoch 8 - iter 780/2606 - loss 0.08213498 - time (sec): 42.04 - samples/sec: 2670.80 - lr: 0.000015 - momentum: 0.000000
2023-10-25 12:23:14,774 epoch 8 - iter 1040/2606 - loss 0.08392793 - time (sec): 56.06 - samples/sec: 2676.38 - lr: 0.000014 - momentum: 0.000000
2023-10-25 12:23:29,316 epoch 8 - iter 1300/2606 - loss 0.09298697 - time (sec): 70.60 - samples/sec: 2692.58 - lr: 0.000014 - momentum: 0.000000
2023-10-25 12:23:43,269 epoch 8 - iter 1560/2606 - loss 0.09808489 - time (sec): 84.56 - samples/sec: 2660.91 - lr: 0.000013 - momentum: 0.000000
2023-10-25 12:23:57,806 epoch 8 - iter 1820/2606 - loss 0.09890906 - time (sec): 99.09 - samples/sec: 2606.94 - lr: 0.000013 - momentum: 0.000000
2023-10-25 12:24:12,049 epoch 8 - iter 2080/2606 - loss 0.10345234 - time (sec): 113.34 - samples/sec: 2615.34 - lr: 0.000012 - momentum: 0.000000
2023-10-25 12:24:26,187 epoch 8 - iter 2340/2606 - loss 0.10411809 - time (sec): 127.48 - samples/sec: 2608.57 - lr: 0.000012 - momentum: 0.000000
2023-10-25 12:24:40,125 epoch 8 - iter 2600/2606 - loss 0.10325893 - time (sec): 141.41 - samples/sec: 2593.05 - lr: 0.000011 - momentum: 0.000000
2023-10-25 12:24:40,426 ----------------------------------------------------------------------------------------------------
2023-10-25 12:24:40,426 EPOCH 8 done: loss 0.1032 - lr: 0.000011
2023-10-25 12:24:46,844 DEV : loss 0.3105942904949188 - f1-score (micro avg) 0.1121
2023-10-25 12:24:46,869 ----------------------------------------------------------------------------------------------------
2023-10-25 12:25:01,469 epoch 9 - iter 260/2606 - loss 0.12293943 - time (sec): 14.60 - samples/sec: 2468.64 - lr: 0.000011 - momentum: 0.000000
2023-10-25 12:25:15,333 epoch 9 - iter 520/2606 - loss 0.12263835 - time (sec): 28.46 - samples/sec: 2559.07 - lr: 0.000010 - momentum: 0.000000
2023-10-25 12:25:29,557 epoch 9 - iter 780/2606 - loss 0.14482561 - time (sec): 42.69 - samples/sec: 2559.99 - lr: 0.000009 - momentum: 0.000000
2023-10-25 12:25:43,465 epoch 9 - iter 1040/2606 - loss 0.15345955 - time (sec): 56.59 - samples/sec: 2570.82 - lr: 0.000009 - momentum: 0.000000
2023-10-25 12:25:58,521 epoch 9 - iter 1300/2606 - loss 0.15045690 - time (sec): 71.65 - samples/sec: 2560.35 - lr: 0.000008 - momentum: 0.000000
2023-10-25 12:26:12,756 epoch 9 - iter 1560/2606 - loss 0.14920787 - time (sec): 85.89 - samples/sec: 2568.73 - lr: 0.000008 - momentum: 0.000000
2023-10-25 12:26:26,678 epoch 9 - iter 1820/2606 - loss 0.14989750 - time (sec): 99.81 - samples/sec: 2586.72 - lr: 0.000007 - momentum: 0.000000
2023-10-25 12:26:40,643 epoch 9 - iter 2080/2606 - loss 0.15181262 - time (sec): 113.77 - samples/sec: 2578.12 - lr: 0.000007 - momentum: 0.000000
2023-10-25 12:26:55,783 epoch 9 - iter 2340/2606 - loss 0.15048265 - time (sec): 128.91 - samples/sec: 2565.28 - lr: 0.000006 - momentum: 0.000000
2023-10-25 12:27:09,724 epoch 9 - iter 2600/2606 - loss 0.14790385 - time (sec): 142.85 - samples/sec: 2564.79 - lr: 0.000006 - momentum: 0.000000
2023-10-25 12:27:10,104 ----------------------------------------------------------------------------------------------------
2023-10-25 12:27:10,104 EPOCH 9 done: loss 0.1478 - lr: 0.000006
2023-10-25 12:27:16,359 DEV : loss 0.2825768291950226 - f1-score (micro avg) 0.0264
2023-10-25 12:27:16,386 ----------------------------------------------------------------------------------------------------
2023-10-25 12:27:31,009 epoch 10 - iter 260/2606 - loss 0.12349381 - time (sec): 14.62 - samples/sec: 2544.38 - lr: 0.000005 - momentum: 0.000000
2023-10-25 12:27:44,544 epoch 10 - iter 520/2606 - loss 0.12212158 - time (sec): 28.16 - samples/sec: 2556.67 - lr: 0.000004 - momentum: 0.000000
2023-10-25 12:28:00,425 epoch 10 - iter 780/2606 - loss 0.12260170 - time (sec): 44.04 - samples/sec: 2533.39 - lr: 0.000004 - momentum: 0.000000
2023-10-25 12:28:14,921 epoch 10 - iter 1040/2606 - loss 0.12686123 - time (sec): 58.53 - samples/sec: 2549.96 - lr: 0.000003 - momentum: 0.000000
2023-10-25 12:28:28,874 epoch 10 - iter 1300/2606 - loss 0.12305199 - time (sec): 72.49 - samples/sec: 2528.69 - lr: 0.000003 - momentum: 0.000000
2023-10-25 12:28:44,113 epoch 10 - iter 1560/2606 - loss 0.12360049 - time (sec): 87.73 - samples/sec: 2501.90 - lr: 0.000002 - momentum: 0.000000
2023-10-25 12:28:58,489 epoch 10 - iter 1820/2606 - loss 0.12493764 - time (sec): 102.10 - samples/sec: 2499.93 - lr: 0.000002 - momentum: 0.000000
2023-10-25 12:29:13,302 epoch 10 - iter 2080/2606 - loss 0.12640882 - time (sec): 116.91 - samples/sec: 2502.36 - lr: 0.000001 - momentum: 0.000000
2023-10-25 12:29:27,892 epoch 10 - iter 2340/2606 - loss 0.12676932 - time (sec): 131.50 - samples/sec: 2510.93 - lr: 0.000001 - momentum: 0.000000
2023-10-25 12:29:41,917 epoch 10 - iter 2600/2606 - loss 0.12636628 - time (sec): 145.53 - samples/sec: 2521.08 - lr: 0.000000 - momentum: 0.000000
2023-10-25 12:29:42,199 ----------------------------------------------------------------------------------------------------
2023-10-25 12:29:42,199 EPOCH 10 done: loss 0.1263 - lr: 0.000000
2023-10-25 12:29:49,286 DEV : loss 0.30260512232780457 - f1-score (micro avg) 0.0731
2023-10-25 12:29:49,854 ----------------------------------------------------------------------------------------------------
2023-10-25 12:29:49,855 Loading model from best epoch ...
2023-10-25 12:29:51,548 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
2023-10-25 12:30:03,057
Results:
- F-score (micro) 0.451
- F-score (macro) 0.2998
- Accuracy 0.2947
By class:
              precision    recall  f1-score   support

         LOC     0.4629    0.5964    0.5212      1214
         PER     0.4000    0.4356    0.4171       808
         ORG     0.2843    0.2408    0.2607       353
   HumanProd     0.0000    0.0000    0.0000        15

   micro avg     0.4210    0.4858    0.4510      2390
   macro avg     0.2868    0.3182    0.2998      2390
weighted avg     0.4124    0.4858    0.4443      2390
2023-10-25 12:30:03,057 ----------------------------------------------------------------------------------------------------
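
As a sanity check on the final results above, the following minimal sketch recomputes the micro and macro F-scores from the per-class table. It assumes the usual definitions: micro F1 as the harmonic mean of the micro-averaged precision and recall, and macro F1 as the unweighted mean of the per-class F1 values. The helper `f1` and the dictionary `per_class` are illustrative names, not part of Flair, and the inputs are the rounded values printed in the log, so the results only match to about three decimal places.

```python
def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    total = precision + recall
    return 0.0 if total == 0 else 2 * precision * recall / total


# (precision, recall) pairs copied from the "By class" table in the log.
per_class = {
    "LOC": (0.4629, 0.5964),
    "PER": (0.4000, 0.4356),
    "ORG": (0.2843, 0.2408),
    "HumanProd": (0.0000, 0.0000),
}

# Micro F1 from the "micro avg" precision and recall row.
micro_f1 = f1(0.4210, 0.4858)

# Macro F1 as the unweighted mean of the per-class F1 scores.
macro_f1 = sum(f1(p, r) for p, r in per_class.values()) / len(per_class)

print(f"micro F1 ~ {micro_f1:.4f}")
print(f"macro F1 ~ {macro_f1:.4f}")
```

The HumanProd class, with only 15 test mentions and no correct predictions, contributes an F1 of zero to the macro average, which is one reason the macro score sits well below the micro score.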