2023-10-25 10:22:52,222 ----------------------------------------------------------------------------------------------------
2023-10-25 10:22:52,223 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(64001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-25 10:22:52,223 ----------------------------------------------------------------------------------------------------
2023-10-25 10:22:52,223 MultiCorpus: 6183 train + 680 dev + 2113 test sentences
- NER_HIPE_2022 Corpus: 6183 train + 680 dev + 2113 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/topres19th/en/with_doc_seperator
2023-10-25 10:22:52,224 ----------------------------------------------------------------------------------------------------
2023-10-25 10:22:52,224 Train: 6183 sentences
2023-10-25 10:22:52,224 (train_with_dev=False, train_with_test=False)
2023-10-25 10:22:52,224 ----------------------------------------------------------------------------------------------------
2023-10-25 10:22:52,224 Training Params:
2023-10-25 10:22:52,224 - learning_rate: "5e-05"
2023-10-25 10:22:52,224 - mini_batch_size: "4"
2023-10-25 10:22:52,224 - max_epochs: "10"
2023-10-25 10:22:52,224 - shuffle: "True"
2023-10-25 10:22:52,224 ----------------------------------------------------------------------------------------------------
2023-10-25 10:22:52,224 Plugins:
2023-10-25 10:22:52,224 - TensorboardLogger
2023-10-25 10:22:52,224 - LinearScheduler | warmup_fraction: '0.1'
2023-10-25 10:22:52,224 ----------------------------------------------------------------------------------------------------
2023-10-25 10:22:52,224 Final evaluation on model from best epoch (best-model.pt)
2023-10-25 10:22:52,224 - metric: "('micro avg', 'f1-score')"
2023-10-25 10:22:52,224 ----------------------------------------------------------------------------------------------------
2023-10-25 10:22:52,224 Computation:
2023-10-25 10:22:52,224 - compute on device: cuda:0
2023-10-25 10:22:52,224 - embedding storage: none
2023-10-25 10:22:52,224 ----------------------------------------------------------------------------------------------------
2023-10-25 10:22:52,224 Model training base path: "hmbench-topres19th/en-dbmdz/bert-base-historic-multilingual-64k-td-cased-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-1"
2023-10-25 10:22:52,224 ----------------------------------------------------------------------------------------------------
2023-10-25 10:22:52,224 ----------------------------------------------------------------------------------------------------
2023-10-25 10:22:52,224 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-25 10:23:00,167 epoch 1 - iter 154/1546 - loss 1.59682713 - time (sec): 7.94 - samples/sec: 1589.82 - lr: 0.000005 - momentum: 0.000000
2023-10-25 10:23:08,149 epoch 1 - iter 308/1546 - loss 0.89117830 - time (sec): 15.92 - samples/sec: 1593.16 - lr: 0.000010 - momentum: 0.000000
2023-10-25 10:23:15,990 epoch 1 - iter 462/1546 - loss 0.64377555 - time (sec): 23.76 - samples/sec: 1582.88 - lr: 0.000015 - momentum: 0.000000
2023-10-25 10:23:23,785 epoch 1 - iter 616/1546 - loss 0.51331082 - time (sec): 31.56 - samples/sec: 1590.22 - lr: 0.000020 - momentum: 0.000000
2023-10-25 10:23:31,477 epoch 1 - iter 770/1546 - loss 0.44305966 - time (sec): 39.25 - samples/sec: 1569.11 - lr: 0.000025 - momentum: 0.000000
2023-10-25 10:23:39,530 epoch 1 - iter 924/1546 - loss 0.38864253 - time (sec): 47.30 - samples/sec: 1562.11 - lr: 0.000030 - momentum: 0.000000
2023-10-25 10:23:47,419 epoch 1 - iter 1078/1546 - loss 0.34915936 - time (sec): 55.19 - samples/sec: 1565.46 - lr: 0.000035 - momentum: 0.000000
2023-10-25 10:23:55,487 epoch 1 - iter 1232/1546 - loss 0.32266725 - time (sec): 63.26 - samples/sec: 1570.23 - lr: 0.000040 - momentum: 0.000000
2023-10-25 10:24:03,424 epoch 1 - iter 1386/1546 - loss 0.30046509 - time (sec): 71.20 - samples/sec: 1564.37 - lr: 0.000045 - momentum: 0.000000
2023-10-25 10:24:11,358 epoch 1 - iter 1540/1546 - loss 0.28008530 - time (sec): 79.13 - samples/sec: 1567.29 - lr: 0.000050 - momentum: 0.000000
2023-10-25 10:24:11,653 ----------------------------------------------------------------------------------------------------
2023-10-25 10:24:11,654 EPOCH 1 done: loss 0.2798 - lr: 0.000050
2023-10-25 10:24:15,019 DEV : loss 0.08405806124210358 - f1-score (micro avg) 0.5943
2023-10-25 10:24:15,036 saving best model
2023-10-25 10:24:15,582 ----------------------------------------------------------------------------------------------------
2023-10-25 10:24:23,571 epoch 2 - iter 154/1546 - loss 0.10858329 - time (sec): 7.99 - samples/sec: 1545.89 - lr: 0.000049 - momentum: 0.000000
2023-10-25 10:24:31,538 epoch 2 - iter 308/1546 - loss 0.10043248 - time (sec): 15.95 - samples/sec: 1532.59 - lr: 0.000049 - momentum: 0.000000
2023-10-25 10:24:39,629 epoch 2 - iter 462/1546 - loss 0.10238266 - time (sec): 24.05 - samples/sec: 1533.70 - lr: 0.000048 - momentum: 0.000000
2023-10-25 10:24:47,510 epoch 2 - iter 616/1546 - loss 0.09968080 - time (sec): 31.93 - samples/sec: 1546.93 - lr: 0.000048 - momentum: 0.000000
2023-10-25 10:24:55,393 epoch 2 - iter 770/1546 - loss 0.09894440 - time (sec): 39.81 - samples/sec: 1544.90 - lr: 0.000047 - momentum: 0.000000
2023-10-25 10:25:03,338 epoch 2 - iter 924/1546 - loss 0.09747046 - time (sec): 47.75 - samples/sec: 1546.64 - lr: 0.000047 - momentum: 0.000000
2023-10-25 10:25:11,221 epoch 2 - iter 1078/1546 - loss 0.09656863 - time (sec): 55.64 - samples/sec: 1553.87 - lr: 0.000046 - momentum: 0.000000
2023-10-25 10:25:19,121 epoch 2 - iter 1232/1546 - loss 0.09740415 - time (sec): 63.54 - samples/sec: 1559.14 - lr: 0.000046 - momentum: 0.000000
2023-10-25 10:25:27,235 epoch 2 - iter 1386/1546 - loss 0.10016234 - time (sec): 71.65 - samples/sec: 1559.71 - lr: 0.000045 - momentum: 0.000000
2023-10-25 10:25:35,143 epoch 2 - iter 1540/1546 - loss 0.10060445 - time (sec): 79.56 - samples/sec: 1557.57 - lr: 0.000044 - momentum: 0.000000
2023-10-25 10:25:35,445 ----------------------------------------------------------------------------------------------------
2023-10-25 10:25:35,445 EPOCH 2 done: loss 0.1005 - lr: 0.000044
2023-10-25 10:25:38,219 DEV : loss 0.06613297015428543 - f1-score (micro avg) 0.7377
2023-10-25 10:25:38,236 saving best model
2023-10-25 10:25:39,002 ----------------------------------------------------------------------------------------------------
2023-10-25 10:25:47,193 epoch 3 - iter 154/1546 - loss 0.06284020 - time (sec): 8.19 - samples/sec: 1524.28 - lr: 0.000044 - momentum: 0.000000
2023-10-25 10:25:55,106 epoch 3 - iter 308/1546 - loss 0.06051793 - time (sec): 16.10 - samples/sec: 1517.05 - lr: 0.000043 - momentum: 0.000000
2023-10-25 10:26:03,223 epoch 3 - iter 462/1546 - loss 0.06199724 - time (sec): 24.22 - samples/sec: 1508.30 - lr: 0.000043 - momentum: 0.000000
2023-10-25 10:26:11,285 epoch 3 - iter 616/1546 - loss 0.06901668 - time (sec): 32.28 - samples/sec: 1513.06 - lr: 0.000042 - momentum: 0.000000
2023-10-25 10:26:19,396 epoch 3 - iter 770/1546 - loss 0.06769128 - time (sec): 40.39 - samples/sec: 1513.76 - lr: 0.000042 - momentum: 0.000000
2023-10-25 10:26:27,337 epoch 3 - iter 924/1546 - loss 0.06807317 - time (sec): 48.33 - samples/sec: 1532.21 - lr: 0.000041 - momentum: 0.000000
2023-10-25 10:26:35,341 epoch 3 - iter 1078/1546 - loss 0.06992824 - time (sec): 56.34 - samples/sec: 1537.05 - lr: 0.000041 - momentum: 0.000000
2023-10-25 10:26:43,266 epoch 3 - iter 1232/1546 - loss 0.07185688 - time (sec): 64.26 - samples/sec: 1539.39 - lr: 0.000040 - momentum: 0.000000
2023-10-25 10:26:51,341 epoch 3 - iter 1386/1546 - loss 0.07155532 - time (sec): 72.34 - samples/sec: 1536.08 - lr: 0.000039 - momentum: 0.000000
2023-10-25 10:26:59,470 epoch 3 - iter 1540/1546 - loss 0.07128179 - time (sec): 80.47 - samples/sec: 1536.47 - lr: 0.000039 - momentum: 0.000000
2023-10-25 10:26:59,788 ----------------------------------------------------------------------------------------------------
2023-10-25 10:26:59,789 EPOCH 3 done: loss 0.0713 - lr: 0.000039
2023-10-25 10:27:02,908 DEV : loss 0.0856151133775711 - f1-score (micro avg) 0.7273
2023-10-25 10:27:02,924 ----------------------------------------------------------------------------------------------------
2023-10-25 10:27:10,888 epoch 4 - iter 154/1546 - loss 0.04453513 - time (sec): 7.96 - samples/sec: 1615.65 - lr: 0.000038 - momentum: 0.000000
2023-10-25 10:27:19,047 epoch 4 - iter 308/1546 - loss 0.04454757 - time (sec): 16.12 - samples/sec: 1517.28 - lr: 0.000038 - momentum: 0.000000
2023-10-25 10:27:27,292 epoch 4 - iter 462/1546 - loss 0.04667982 - time (sec): 24.37 - samples/sec: 1540.54 - lr: 0.000037 - momentum: 0.000000
2023-10-25 10:27:35,480 epoch 4 - iter 616/1546 - loss 0.04770651 - time (sec): 32.55 - samples/sec: 1550.36 - lr: 0.000037 - momentum: 0.000000
2023-10-25 10:27:43,398 epoch 4 - iter 770/1546 - loss 0.04711825 - time (sec): 40.47 - samples/sec: 1538.01 - lr: 0.000036 - momentum: 0.000000
2023-10-25 10:27:51,326 epoch 4 - iter 924/1546 - loss 0.04601775 - time (sec): 48.40 - samples/sec: 1546.43 - lr: 0.000036 - momentum: 0.000000
2023-10-25 10:27:59,380 epoch 4 - iter 1078/1546 - loss 0.04649332 - time (sec): 56.45 - samples/sec: 1556.67 - lr: 0.000035 - momentum: 0.000000
2023-10-25 10:28:07,340 epoch 4 - iter 1232/1546 - loss 0.04677162 - time (sec): 64.41 - samples/sec: 1553.58 - lr: 0.000034 - momentum: 0.000000
2023-10-25 10:28:15,375 epoch 4 - iter 1386/1546 - loss 0.04686556 - time (sec): 72.45 - samples/sec: 1546.65 - lr: 0.000034 - momentum: 0.000000
2023-10-25 10:28:23,211 epoch 4 - iter 1540/1546 - loss 0.04791150 - time (sec): 80.29 - samples/sec: 1541.79 - lr: 0.000033 - momentum: 0.000000
2023-10-25 10:28:23,524 ----------------------------------------------------------------------------------------------------
2023-10-25 10:28:23,524 EPOCH 4 done: loss 0.0478 - lr: 0.000033
2023-10-25 10:28:26,220 DEV : loss 0.09348937124013901 - f1-score (micro avg) 0.7521
2023-10-25 10:28:26,234 saving best model
2023-10-25 10:28:26,950 ----------------------------------------------------------------------------------------------------
2023-10-25 10:28:34,965 epoch 5 - iter 154/1546 - loss 0.04181508 - time (sec): 8.01 - samples/sec: 1421.60 - lr: 0.000033 - momentum: 0.000000
2023-10-25 10:28:42,948 epoch 5 - iter 308/1546 - loss 0.03973625 - time (sec): 16.00 - samples/sec: 1539.28 - lr: 0.000032 - momentum: 0.000000
2023-10-25 10:28:50,943 epoch 5 - iter 462/1546 - loss 0.03429687 - time (sec): 23.99 - samples/sec: 1559.42 - lr: 0.000032 - momentum: 0.000000
2023-10-25 10:28:59,030 epoch 5 - iter 616/1546 - loss 0.03909449 - time (sec): 32.08 - samples/sec: 1553.75 - lr: 0.000031 - momentum: 0.000000
2023-10-25 10:29:06,995 epoch 5 - iter 770/1546 - loss 0.03934781 - time (sec): 40.04 - samples/sec: 1549.00 - lr: 0.000031 - momentum: 0.000000
2023-10-25 10:29:14,916 epoch 5 - iter 924/1546 - loss 0.03682160 - time (sec): 47.96 - samples/sec: 1555.64 - lr: 0.000030 - momentum: 0.000000
2023-10-25 10:29:22,777 epoch 5 - iter 1078/1546 - loss 0.03663157 - time (sec): 55.82 - samples/sec: 1554.26 - lr: 0.000029 - momentum: 0.000000
2023-10-25 10:29:30,743 epoch 5 - iter 1232/1546 - loss 0.03744518 - time (sec): 63.79 - samples/sec: 1553.97 - lr: 0.000029 - momentum: 0.000000
2023-10-25 10:29:38,749 epoch 5 - iter 1386/1546 - loss 0.03709433 - time (sec): 71.80 - samples/sec: 1554.16 - lr: 0.000028 - momentum: 0.000000
2023-10-25 10:29:46,726 epoch 5 - iter 1540/1546 - loss 0.03613206 - time (sec): 79.77 - samples/sec: 1553.52 - lr: 0.000028 - momentum: 0.000000
2023-10-25 10:29:47,029 ----------------------------------------------------------------------------------------------------
2023-10-25 10:29:47,029 EPOCH 5 done: loss 0.0360 - lr: 0.000028
2023-10-25 10:29:49,629 DEV : loss 0.11338605731725693 - f1-score (micro avg) 0.7546
2023-10-25 10:29:49,648 saving best model
2023-10-25 10:29:50,361 ----------------------------------------------------------------------------------------------------
2023-10-25 10:29:58,359 epoch 6 - iter 154/1546 - loss 0.01816962 - time (sec): 8.00 - samples/sec: 1573.02 - lr: 0.000027 - momentum: 0.000000
2023-10-25 10:30:06,293 epoch 6 - iter 308/1546 - loss 0.02207942 - time (sec): 15.93 - samples/sec: 1566.57 - lr: 0.000027 - momentum: 0.000000
2023-10-25 10:30:14,440 epoch 6 - iter 462/1546 - loss 0.02501555 - time (sec): 24.08 - samples/sec: 1553.46 - lr: 0.000026 - momentum: 0.000000
2023-10-25 10:30:22,341 epoch 6 - iter 616/1546 - loss 0.02802152 - time (sec): 31.98 - samples/sec: 1561.74 - lr: 0.000026 - momentum: 0.000000
2023-10-25 10:30:30,311 epoch 6 - iter 770/1546 - loss 0.02771706 - time (sec): 39.95 - samples/sec: 1527.23 - lr: 0.000025 - momentum: 0.000000
2023-10-25 10:30:38,410 epoch 6 - iter 924/1546 - loss 0.02733007 - time (sec): 48.05 - samples/sec: 1521.43 - lr: 0.000024 - momentum: 0.000000
2023-10-25 10:30:46,504 epoch 6 - iter 1078/1546 - loss 0.02629383 - time (sec): 56.14 - samples/sec: 1526.19 - lr: 0.000024 - momentum: 0.000000
2023-10-25 10:30:54,581 epoch 6 - iter 1232/1546 - loss 0.02645982 - time (sec): 64.22 - samples/sec: 1541.41 - lr: 0.000023 - momentum: 0.000000
2023-10-25 10:31:02,692 epoch 6 - iter 1386/1546 - loss 0.02607912 - time (sec): 72.33 - samples/sec: 1541.29 - lr: 0.000023 - momentum: 0.000000
2023-10-25 10:31:11,054 epoch 6 - iter 1540/1546 - loss 0.02570720 - time (sec): 80.69 - samples/sec: 1535.15 - lr: 0.000022 - momentum: 0.000000
2023-10-25 10:31:11,360 ----------------------------------------------------------------------------------------------------
2023-10-25 10:31:11,360 EPOCH 6 done: loss 0.0258 - lr: 0.000022
2023-10-25 10:31:14,574 DEV : loss 0.11959201842546463 - f1-score (micro avg) 0.7621
2023-10-25 10:31:14,597 saving best model
2023-10-25 10:31:15,299 ----------------------------------------------------------------------------------------------------
2023-10-25 10:31:23,382 epoch 7 - iter 154/1546 - loss 0.01965882 - time (sec): 8.08 - samples/sec: 1508.42 - lr: 0.000022 - momentum: 0.000000
2023-10-25 10:31:31,337 epoch 7 - iter 308/1546 - loss 0.01786807 - time (sec): 16.04 - samples/sec: 1555.65 - lr: 0.000021 - momentum: 0.000000
2023-10-25 10:31:39,462 epoch 7 - iter 462/1546 - loss 0.01846085 - time (sec): 24.16 - samples/sec: 1579.01 - lr: 0.000021 - momentum: 0.000000
2023-10-25 10:31:47,507 epoch 7 - iter 616/1546 - loss 0.01777823 - time (sec): 32.21 - samples/sec: 1561.77 - lr: 0.000020 - momentum: 0.000000
2023-10-25 10:31:55,527 epoch 7 - iter 770/1546 - loss 0.01824766 - time (sec): 40.23 - samples/sec: 1552.45 - lr: 0.000019 - momentum: 0.000000
2023-10-25 10:32:03,806 epoch 7 - iter 924/1546 - loss 0.01845757 - time (sec): 48.50 - samples/sec: 1525.47 - lr: 0.000019 - momentum: 0.000000
2023-10-25 10:32:11,802 epoch 7 - iter 1078/1546 - loss 0.01958815 - time (sec): 56.50 - samples/sec: 1538.31 - lr: 0.000018 - momentum: 0.000000
2023-10-25 10:32:19,783 epoch 7 - iter 1232/1546 - loss 0.01951007 - time (sec): 64.48 - samples/sec: 1542.37 - lr: 0.000018 - momentum: 0.000000
2023-10-25 10:32:28,006 epoch 7 - iter 1386/1546 - loss 0.01948908 - time (sec): 72.70 - samples/sec: 1533.63 - lr: 0.000017 - momentum: 0.000000
2023-10-25 10:32:36,058 epoch 7 - iter 1540/1546 - loss 0.01926685 - time (sec): 80.76 - samples/sec: 1530.81 - lr: 0.000017 - momentum: 0.000000
2023-10-25 10:32:36,360 ----------------------------------------------------------------------------------------------------
2023-10-25 10:32:36,361 EPOCH 7 done: loss 0.0193 - lr: 0.000017
2023-10-25 10:32:39,099 DEV : loss 0.13256850838661194 - f1-score (micro avg) 0.7401
2023-10-25 10:32:39,116 ----------------------------------------------------------------------------------------------------
2023-10-25 10:32:47,138 epoch 8 - iter 154/1546 - loss 0.01514111 - time (sec): 8.02 - samples/sec: 1563.64 - lr: 0.000016 - momentum: 0.000000
2023-10-25 10:32:55,255 epoch 8 - iter 308/1546 - loss 0.01555329 - time (sec): 16.14 - samples/sec: 1597.67 - lr: 0.000016 - momentum: 0.000000
2023-10-25 10:33:03,313 epoch 8 - iter 462/1546 - loss 0.01647792 - time (sec): 24.20 - samples/sec: 1560.81 - lr: 0.000015 - momentum: 0.000000
2023-10-25 10:33:11,355 epoch 8 - iter 616/1546 - loss 0.01541964 - time (sec): 32.24 - samples/sec: 1536.45 - lr: 0.000014 - momentum: 0.000000
2023-10-25 10:33:19,158 epoch 8 - iter 770/1546 - loss 0.01457204 - time (sec): 40.04 - samples/sec: 1530.37 - lr: 0.000014 - momentum: 0.000000
2023-10-25 10:33:27,228 epoch 8 - iter 924/1546 - loss 0.01394284 - time (sec): 48.11 - samples/sec: 1520.42 - lr: 0.000013 - momentum: 0.000000
2023-10-25 10:33:35,360 epoch 8 - iter 1078/1546 - loss 0.01347465 - time (sec): 56.24 - samples/sec: 1510.05 - lr: 0.000013 - momentum: 0.000000
2023-10-25 10:33:43,555 epoch 8 - iter 1232/1546 - loss 0.01279755 - time (sec): 64.44 - samples/sec: 1528.63 - lr: 0.000012 - momentum: 0.000000
2023-10-25 10:33:51,990 epoch 8 - iter 1386/1546 - loss 0.01254666 - time (sec): 72.87 - samples/sec: 1528.85 - lr: 0.000012 - momentum: 0.000000
2023-10-25 10:33:59,892 epoch 8 - iter 1540/1546 - loss 0.01237466 - time (sec): 80.77 - samples/sec: 1532.22 - lr: 0.000011 - momentum: 0.000000
2023-10-25 10:34:00,209 ----------------------------------------------------------------------------------------------------
2023-10-25 10:34:00,209 EPOCH 8 done: loss 0.0123 - lr: 0.000011
2023-10-25 10:34:02,639 DEV : loss 0.14071506261825562 - f1-score (micro avg) 0.7454
2023-10-25 10:34:02,655 ----------------------------------------------------------------------------------------------------
2023-10-25 10:34:10,640 epoch 9 - iter 154/1546 - loss 0.00502645 - time (sec): 7.98 - samples/sec: 1448.38 - lr: 0.000011 - momentum: 0.000000
2023-10-25 10:34:18,629 epoch 9 - iter 308/1546 - loss 0.00689538 - time (sec): 15.97 - samples/sec: 1492.57 - lr: 0.000010 - momentum: 0.000000
2023-10-25 10:34:26,678 epoch 9 - iter 462/1546 - loss 0.00648425 - time (sec): 24.02 - samples/sec: 1525.76 - lr: 0.000009 - momentum: 0.000000
2023-10-25 10:34:34,738 epoch 9 - iter 616/1546 - loss 0.00704709 - time (sec): 32.08 - samples/sec: 1539.47 - lr: 0.000009 - momentum: 0.000000
2023-10-25 10:34:42,944 epoch 9 - iter 770/1546 - loss 0.00744508 - time (sec): 40.29 - samples/sec: 1553.43 - lr: 0.000008 - momentum: 0.000000
2023-10-25 10:34:50,993 epoch 9 - iter 924/1546 - loss 0.00672678 - time (sec): 48.34 - samples/sec: 1554.21 - lr: 0.000008 - momentum: 0.000000
2023-10-25 10:34:59,172 epoch 9 - iter 1078/1546 - loss 0.00676359 - time (sec): 56.51 - samples/sec: 1561.68 - lr: 0.000007 - momentum: 0.000000
2023-10-25 10:35:06,863 epoch 9 - iter 1232/1546 - loss 0.00666184 - time (sec): 64.21 - samples/sec: 1563.70 - lr: 0.000007 - momentum: 0.000000
2023-10-25 10:35:14,955 epoch 9 - iter 1386/1546 - loss 0.00711228 - time (sec): 72.30 - samples/sec: 1546.05 - lr: 0.000006 - momentum: 0.000000
2023-10-25 10:35:22,887 epoch 9 - iter 1540/1546 - loss 0.00784747 - time (sec): 80.23 - samples/sec: 1543.02 - lr: 0.000006 - momentum: 0.000000
2023-10-25 10:35:23,181 ----------------------------------------------------------------------------------------------------
2023-10-25 10:35:23,181 EPOCH 9 done: loss 0.0078 - lr: 0.000006
2023-10-25 10:35:25,925 DEV : loss 0.1457042396068573 - f1-score (micro avg) 0.7398
2023-10-25 10:35:25,943 ----------------------------------------------------------------------------------------------------
2023-10-25 10:35:34,044 epoch 10 - iter 154/1546 - loss 0.00461955 - time (sec): 8.10 - samples/sec: 1522.59 - lr: 0.000005 - momentum: 0.000000
2023-10-25 10:35:41,995 epoch 10 - iter 308/1546 - loss 0.00725351 - time (sec): 16.05 - samples/sec: 1462.48 - lr: 0.000004 - momentum: 0.000000
2023-10-25 10:35:49,857 epoch 10 - iter 462/1546 - loss 0.00548130 - time (sec): 23.91 - samples/sec: 1496.35 - lr: 0.000004 - momentum: 0.000000
2023-10-25 10:35:57,737 epoch 10 - iter 616/1546 - loss 0.00534302 - time (sec): 31.79 - samples/sec: 1512.74 - lr: 0.000003 - momentum: 0.000000
2023-10-25 10:36:05,860 epoch 10 - iter 770/1546 - loss 0.00521499 - time (sec): 39.92 - samples/sec: 1528.84 - lr: 0.000003 - momentum: 0.000000
2023-10-25 10:36:13,948 epoch 10 - iter 924/1546 - loss 0.00491525 - time (sec): 48.00 - samples/sec: 1528.03 - lr: 0.000002 - momentum: 0.000000
2023-10-25 10:36:21,904 epoch 10 - iter 1078/1546 - loss 0.00477326 - time (sec): 55.96 - samples/sec: 1528.56 - lr: 0.000002 - momentum: 0.000000
2023-10-25 10:36:29,926 epoch 10 - iter 1232/1546 - loss 0.00450425 - time (sec): 63.98 - samples/sec: 1535.19 - lr: 0.000001 - momentum: 0.000000
2023-10-25 10:36:38,128 epoch 10 - iter 1386/1546 - loss 0.00420305 - time (sec): 72.18 - samples/sec: 1534.30 - lr: 0.000001 - momentum: 0.000000
2023-10-25 10:36:46,085 epoch 10 - iter 1540/1546 - loss 0.00458320 - time (sec): 80.14 - samples/sec: 1544.64 - lr: 0.000000 - momentum: 0.000000
2023-10-25 10:36:46,385 ----------------------------------------------------------------------------------------------------
2023-10-25 10:36:46,386 EPOCH 10 done: loss 0.0046 - lr: 0.000000
2023-10-25 10:36:49,417 DEV : loss 0.14359833300113678 - f1-score (micro avg) 0.7453
2023-10-25 10:36:49,937 ----------------------------------------------------------------------------------------------------
2023-10-25 10:36:49,939 Loading model from best epoch ...
2023-10-25 10:36:51,996 SequenceTagger predicts: Dictionary with 13 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-BUILDING, B-BUILDING, E-BUILDING, I-BUILDING, S-STREET, B-STREET, E-STREET, I-STREET
2023-10-25 10:37:00,712
Results:
- F-score (micro) 0.7736
- F-score (macro) 0.6541
- Accuracy 0.648
By class:
              precision    recall  f1-score   support

         LOC     0.8731    0.7780    0.8228       946
    BUILDING     0.6718    0.4757    0.5570       185
      STREET     0.6383    0.5357    0.5825        56

   micro avg     0.8364    0.7195    0.7736      1187
   macro avg     0.7277    0.5965    0.6541      1187
weighted avg     0.8306    0.7195    0.7700      1187
2023-10-25 10:37:00,712 ----------------------------------------------------------------------------------------------------