2023-10-15 13:16:10,272 ----------------------------------------------------------------------------------------------------
2023-10-15 13:16:10,273 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-15 13:16:10,273 ----------------------------------------------------------------------------------------------------
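As a sanity check on the module summary above, the printed layer shapes fully determine the parameter count. A small sketch (the `linear` helper is just an illustrative tally, not Flair or PyTorch API) that reproduces the encoder size from those shapes:

```python
# Shapes as printed in the model summary above
hidden, vocab, max_pos, layers, inter, num_tags = 768, 32001, 512, 12, 3072, 17

def linear(in_f, out_f):
    """Parameters of a Linear layer with bias: weight matrix + bias vector."""
    return in_f * out_f + out_f

layer_norm = 2 * hidden  # LayerNorm weight + bias

embeddings = (vocab + max_pos + 2) * hidden + layer_norm
per_layer = (3 * linear(hidden, hidden)              # query/key/value
             + linear(hidden, hidden) + layer_norm   # BertSelfOutput
             + linear(hidden, inter)                  # BertIntermediate
             + linear(inter, hidden) + layer_norm)   # BertOutput
pooler = linear(hidden, hidden)

bert_params = embeddings + layers * per_layer + pooler  # ~110.6M
tagger_head = linear(hidden, num_tags)                  # the 768 -> 17 classifier
```

So the fine-tuned backbone has roughly 110.6M parameters, with only a ~13K-parameter tagging head on top.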
2023-10-15 13:16:10,273 MultiCorpus: 20847 train + 1123 dev + 3350 test sentences
- NER_HIPE_2022 Corpus: 20847 train + 1123 dev + 3350 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/de/with_doc_seperator
2023-10-15 13:16:10,273 ----------------------------------------------------------------------------------------------------
2023-10-15 13:16:10,273 Train: 20847 sentences
2023-10-15 13:16:10,273 (train_with_dev=False, train_with_test=False)
2023-10-15 13:16:10,273 ----------------------------------------------------------------------------------------------------
2023-10-15 13:16:10,273 Training Params:
2023-10-15 13:16:10,273 - learning_rate: "5e-05"
2023-10-15 13:16:10,273 - mini_batch_size: "4"
2023-10-15 13:16:10,273 - max_epochs: "10"
2023-10-15 13:16:10,273 - shuffle: "True"
2023-10-15 13:16:10,273 ----------------------------------------------------------------------------------------------------
2023-10-15 13:16:10,273 Plugins:
2023-10-15 13:16:10,273 - LinearScheduler | warmup_fraction: '0.1'
2023-10-15 13:16:10,273 ----------------------------------------------------------------------------------------------------
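The LinearScheduler with warmup_fraction 0.1 explains the lr column in the per-iteration lines that follow: with 5212 iterations per epoch over 10 epochs, the first epoch (10% of 52120 total steps) ramps the learning rate linearly up to 5e-05, after which it decays linearly to zero. A minimal re-implementation of that shape (an illustrative sketch, not Flair's actual scheduler code):

```python
def linear_schedule_lr(step, total_steps, peak_lr=5e-5, warmup_fraction=0.1):
    """Linear warmup to peak_lr over the first warmup_fraction of steps,
    then linear decay back to zero."""
    warmup_steps = int(total_steps * warmup_fraction)
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)

total_steps = 10 * 5212  # max_epochs x iterations per epoch
```

At step 521 this gives about 0.000005 and at step 5212 exactly 0.00005, matching the lr values logged during epoch 1.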
2023-10-15 13:16:10,273 Final evaluation on model from best epoch (best-model.pt)
2023-10-15 13:16:10,273 - metric: "('micro avg', 'f1-score')"
2023-10-15 13:16:10,273 ----------------------------------------------------------------------------------------------------
2023-10-15 13:16:10,273 Computation:
2023-10-15 13:16:10,273 - compute on device: cuda:0
2023-10-15 13:16:10,273 - embedding storage: none
2023-10-15 13:16:10,273 ----------------------------------------------------------------------------------------------------
2023-10-15 13:16:10,273 Model training base path: "hmbench-newseye/de-dbmdz/bert-base-historic-multilingual-cased-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-1"
2023-10-15 13:16:10,273 ----------------------------------------------------------------------------------------------------
2023-10-15 13:16:10,273 ----------------------------------------------------------------------------------------------------
2023-10-15 13:16:35,448 epoch 1 - iter 521/5212 - loss 1.40362758 - time (sec): 25.17 - samples/sec: 1455.51 - lr: 0.000005 - momentum: 0.000000
2023-10-15 13:17:00,458 epoch 1 - iter 1042/5212 - loss 0.89883300 - time (sec): 50.18 - samples/sec: 1460.27 - lr: 0.000010 - momentum: 0.000000
2023-10-15 13:17:25,878 epoch 1 - iter 1563/5212 - loss 0.67555628 - time (sec): 75.60 - samples/sec: 1488.02 - lr: 0.000015 - momentum: 0.000000
2023-10-15 13:17:50,812 epoch 1 - iter 2084/5212 - loss 0.57238927 - time (sec): 100.54 - samples/sec: 1477.04 - lr: 0.000020 - momentum: 0.000000
2023-10-15 13:18:16,704 epoch 1 - iter 2605/5212 - loss 0.49502228 - time (sec): 126.43 - samples/sec: 1492.43 - lr: 0.000025 - momentum: 0.000000
2023-10-15 13:18:42,337 epoch 1 - iter 3126/5212 - loss 0.45192361 - time (sec): 152.06 - samples/sec: 1483.23 - lr: 0.000030 - momentum: 0.000000
2023-10-15 13:19:08,050 epoch 1 - iter 3647/5212 - loss 0.42525465 - time (sec): 177.78 - samples/sec: 1458.09 - lr: 0.000035 - momentum: 0.000000
2023-10-15 13:19:37,108 epoch 1 - iter 4168/5212 - loss 0.40135861 - time (sec): 206.83 - samples/sec: 1438.06 - lr: 0.000040 - momentum: 0.000000
2023-10-15 13:20:03,562 epoch 1 - iter 4689/5212 - loss 0.38395729 - time (sec): 233.29 - samples/sec: 1429.03 - lr: 0.000045 - momentum: 0.000000
2023-10-15 13:20:30,449 epoch 1 - iter 5210/5212 - loss 0.37262070 - time (sec): 260.17 - samples/sec: 1411.90 - lr: 0.000050 - momentum: 0.000000
2023-10-15 13:20:30,541 ----------------------------------------------------------------------------------------------------
2023-10-15 13:20:30,542 EPOCH 1 done: loss 0.3725 - lr: 0.000050
2023-10-15 13:20:36,434 DEV : loss 0.11529939621686935 - f1-score (micro avg) 0.2459
2023-10-15 13:20:36,460 saving best model
2023-10-15 13:20:36,927 ----------------------------------------------------------------------------------------------------
2023-10-15 13:21:02,702 epoch 2 - iter 521/5212 - loss 0.19455856 - time (sec): 25.77 - samples/sec: 1478.33 - lr: 0.000049 - momentum: 0.000000
2023-10-15 13:21:28,495 epoch 2 - iter 1042/5212 - loss 0.20837326 - time (sec): 51.57 - samples/sec: 1394.34 - lr: 0.000049 - momentum: 0.000000
2023-10-15 13:21:56,301 epoch 2 - iter 1563/5212 - loss 0.19711389 - time (sec): 79.37 - samples/sec: 1372.97 - lr: 0.000048 - momentum: 0.000000
2023-10-15 13:22:22,909 epoch 2 - iter 2084/5212 - loss 0.19827196 - time (sec): 105.98 - samples/sec: 1387.68 - lr: 0.000048 - momentum: 0.000000
2023-10-15 13:22:50,780 epoch 2 - iter 2605/5212 - loss 0.19554159 - time (sec): 133.85 - samples/sec: 1376.38 - lr: 0.000047 - momentum: 0.000000
2023-10-15 13:23:16,034 epoch 2 - iter 3126/5212 - loss 0.19262159 - time (sec): 159.10 - samples/sec: 1367.76 - lr: 0.000047 - momentum: 0.000000
2023-10-15 13:23:41,047 epoch 2 - iter 3647/5212 - loss 0.19419779 - time (sec): 184.12 - samples/sec: 1389.83 - lr: 0.000046 - momentum: 0.000000
2023-10-15 13:24:06,335 epoch 2 - iter 4168/5212 - loss 0.19161465 - time (sec): 209.41 - samples/sec: 1401.90 - lr: 0.000046 - momentum: 0.000000
2023-10-15 13:24:34,437 epoch 2 - iter 4689/5212 - loss 0.19121284 - time (sec): 237.51 - samples/sec: 1404.82 - lr: 0.000045 - momentum: 0.000000
2023-10-15 13:25:02,829 epoch 2 - iter 5210/5212 - loss 0.19317263 - time (sec): 265.90 - samples/sec: 1381.36 - lr: 0.000044 - momentum: 0.000000
2023-10-15 13:25:02,932 ----------------------------------------------------------------------------------------------------
2023-10-15 13:25:02,932 EPOCH 2 done: loss 0.1931 - lr: 0.000044
2023-10-15 13:25:12,816 DEV : loss 0.17544493079185486 - f1-score (micro avg) 0.3654
2023-10-15 13:25:12,852 saving best model
2023-10-15 13:25:13,376 ----------------------------------------------------------------------------------------------------
2023-10-15 13:25:39,265 epoch 3 - iter 521/5212 - loss 0.15145173 - time (sec): 25.88 - samples/sec: 1458.63 - lr: 0.000044 - momentum: 0.000000
2023-10-15 13:26:04,335 epoch 3 - iter 1042/5212 - loss 0.15678187 - time (sec): 50.95 - samples/sec: 1438.61 - lr: 0.000043 - momentum: 0.000000
2023-10-15 13:26:30,991 epoch 3 - iter 1563/5212 - loss 0.15448299 - time (sec): 77.61 - samples/sec: 1389.38 - lr: 0.000043 - momentum: 0.000000
2023-10-15 13:26:56,639 epoch 3 - iter 2084/5212 - loss 0.15527194 - time (sec): 103.26 - samples/sec: 1392.24 - lr: 0.000042 - momentum: 0.000000
2023-10-15 13:27:25,339 epoch 3 - iter 2605/5212 - loss 0.15240117 - time (sec): 131.96 - samples/sec: 1368.28 - lr: 0.000042 - momentum: 0.000000
2023-10-15 13:27:54,043 epoch 3 - iter 3126/5212 - loss 0.14898541 - time (sec): 160.66 - samples/sec: 1356.81 - lr: 0.000041 - momentum: 0.000000
2023-10-15 13:28:19,990 epoch 3 - iter 3647/5212 - loss 0.14601801 - time (sec): 186.61 - samples/sec: 1379.17 - lr: 0.000041 - momentum: 0.000000
2023-10-15 13:28:47,027 epoch 3 - iter 4168/5212 - loss 0.14444689 - time (sec): 213.65 - samples/sec: 1371.63 - lr: 0.000040 - momentum: 0.000000
2023-10-15 13:29:14,124 epoch 3 - iter 4689/5212 - loss 0.14484556 - time (sec): 240.74 - samples/sec: 1374.81 - lr: 0.000039 - momentum: 0.000000
2023-10-15 13:29:40,292 epoch 3 - iter 5210/5212 - loss 0.14315344 - time (sec): 266.91 - samples/sec: 1376.40 - lr: 0.000039 - momentum: 0.000000
2023-10-15 13:29:40,389 ----------------------------------------------------------------------------------------------------
2023-10-15 13:29:40,389 EPOCH 3 done: loss 0.1431 - lr: 0.000039
2023-10-15 13:29:49,797 DEV : loss 0.23793098330497742 - f1-score (micro avg) 0.3084
2023-10-15 13:29:49,824 ----------------------------------------------------------------------------------------------------
2023-10-15 13:30:14,394 epoch 4 - iter 521/5212 - loss 0.11205687 - time (sec): 24.57 - samples/sec: 1434.64 - lr: 0.000038 - momentum: 0.000000
2023-10-15 13:30:39,652 epoch 4 - iter 1042/5212 - loss 0.10513806 - time (sec): 49.83 - samples/sec: 1419.82 - lr: 0.000038 - momentum: 0.000000
2023-10-15 13:31:05,128 epoch 4 - iter 1563/5212 - loss 0.11011535 - time (sec): 75.30 - samples/sec: 1448.52 - lr: 0.000037 - momentum: 0.000000
2023-10-15 13:31:30,130 epoch 4 - iter 2084/5212 - loss 0.11135520 - time (sec): 100.31 - samples/sec: 1455.61 - lr: 0.000037 - momentum: 0.000000
2023-10-15 13:31:54,730 epoch 4 - iter 2605/5212 - loss 0.11031903 - time (sec): 124.91 - samples/sec: 1449.83 - lr: 0.000036 - momentum: 0.000000
2023-10-15 13:32:19,870 epoch 4 - iter 3126/5212 - loss 0.10935951 - time (sec): 150.05 - samples/sec: 1461.11 - lr: 0.000036 - momentum: 0.000000
2023-10-15 13:32:44,907 epoch 4 - iter 3647/5212 - loss 0.10865845 - time (sec): 175.08 - samples/sec: 1459.09 - lr: 0.000035 - momentum: 0.000000
2023-10-15 13:33:10,118 epoch 4 - iter 4168/5212 - loss 0.10703262 - time (sec): 200.29 - samples/sec: 1465.07 - lr: 0.000034 - momentum: 0.000000
2023-10-15 13:33:35,323 epoch 4 - iter 4689/5212 - loss 0.10572271 - time (sec): 225.50 - samples/sec: 1457.14 - lr: 0.000034 - momentum: 0.000000
2023-10-15 13:34:01,141 epoch 4 - iter 5210/5212 - loss 0.10526184 - time (sec): 251.32 - samples/sec: 1461.60 - lr: 0.000033 - momentum: 0.000000
2023-10-15 13:34:01,232 ----------------------------------------------------------------------------------------------------
2023-10-15 13:34:01,233 EPOCH 4 done: loss 0.1053 - lr: 0.000033
2023-10-15 13:34:09,661 DEV : loss 0.23926250636577606 - f1-score (micro avg) 0.2991
2023-10-15 13:34:09,704 ----------------------------------------------------------------------------------------------------
2023-10-15 13:34:36,513 epoch 5 - iter 521/5212 - loss 0.08552624 - time (sec): 26.81 - samples/sec: 1340.65 - lr: 0.000033 - momentum: 0.000000
2023-10-15 13:35:01,593 epoch 5 - iter 1042/5212 - loss 0.07856810 - time (sec): 51.89 - samples/sec: 1363.29 - lr: 0.000032 - momentum: 0.000000
2023-10-15 13:35:26,794 epoch 5 - iter 1563/5212 - loss 0.08133709 - time (sec): 77.09 - samples/sec: 1389.39 - lr: 0.000032 - momentum: 0.000000
2023-10-15 13:35:52,056 epoch 5 - iter 2084/5212 - loss 0.08333790 - time (sec): 102.35 - samples/sec: 1406.19 - lr: 0.000031 - momentum: 0.000000
2023-10-15 13:36:17,357 epoch 5 - iter 2605/5212 - loss 0.08277969 - time (sec): 127.65 - samples/sec: 1409.26 - lr: 0.000031 - momentum: 0.000000
2023-10-15 13:36:43,034 epoch 5 - iter 3126/5212 - loss 0.08273593 - time (sec): 153.33 - samples/sec: 1420.12 - lr: 0.000030 - momentum: 0.000000
2023-10-15 13:37:08,780 epoch 5 - iter 3647/5212 - loss 0.08084529 - time (sec): 179.07 - samples/sec: 1425.88 - lr: 0.000029 - momentum: 0.000000
2023-10-15 13:37:34,843 epoch 5 - iter 4168/5212 - loss 0.08016018 - time (sec): 205.14 - samples/sec: 1436.62 - lr: 0.000029 - momentum: 0.000000
2023-10-15 13:38:00,799 epoch 5 - iter 4689/5212 - loss 0.07996177 - time (sec): 231.09 - samples/sec: 1442.23 - lr: 0.000028 - momentum: 0.000000
2023-10-15 13:38:25,393 epoch 5 - iter 5210/5212 - loss 0.08003152 - time (sec): 255.69 - samples/sec: 1436.83 - lr: 0.000028 - momentum: 0.000000
2023-10-15 13:38:25,484 ----------------------------------------------------------------------------------------------------
2023-10-15 13:38:25,484 EPOCH 5 done: loss 0.0800 - lr: 0.000028
2023-10-15 13:38:33,739 DEV : loss 0.3471781313419342 - f1-score (micro avg) 0.3072
2023-10-15 13:38:33,768 ----------------------------------------------------------------------------------------------------
2023-10-15 13:38:59,130 epoch 6 - iter 521/5212 - loss 0.04893610 - time (sec): 25.36 - samples/sec: 1550.29 - lr: 0.000027 - momentum: 0.000000
2023-10-15 13:39:24,147 epoch 6 - iter 1042/5212 - loss 0.05092956 - time (sec): 50.38 - samples/sec: 1516.20 - lr: 0.000027 - momentum: 0.000000
2023-10-15 13:39:50,524 epoch 6 - iter 1563/5212 - loss 0.05455800 - time (sec): 76.76 - samples/sec: 1509.47 - lr: 0.000026 - momentum: 0.000000
2023-10-15 13:40:16,811 epoch 6 - iter 2084/5212 - loss 0.05559083 - time (sec): 103.04 - samples/sec: 1460.28 - lr: 0.000026 - momentum: 0.000000
2023-10-15 13:40:41,525 epoch 6 - iter 2605/5212 - loss 0.05732132 - time (sec): 127.76 - samples/sec: 1434.33 - lr: 0.000025 - momentum: 0.000000
2023-10-15 13:41:07,304 epoch 6 - iter 3126/5212 - loss 0.05727400 - time (sec): 153.54 - samples/sec: 1440.88 - lr: 0.000024 - momentum: 0.000000
2023-10-15 13:41:32,019 epoch 6 - iter 3647/5212 - loss 0.05654496 - time (sec): 178.25 - samples/sec: 1428.82 - lr: 0.000024 - momentum: 0.000000
2023-10-15 13:41:57,953 epoch 6 - iter 4168/5212 - loss 0.05631661 - time (sec): 204.18 - samples/sec: 1442.93 - lr: 0.000023 - momentum: 0.000000
2023-10-15 13:42:23,594 epoch 6 - iter 4689/5212 - loss 0.05593561 - time (sec): 229.82 - samples/sec: 1445.76 - lr: 0.000023 - momentum: 0.000000
2023-10-15 13:42:48,376 epoch 6 - iter 5210/5212 - loss 0.05663169 - time (sec): 254.61 - samples/sec: 1442.84 - lr: 0.000022 - momentum: 0.000000
2023-10-15 13:42:48,471 ----------------------------------------------------------------------------------------------------
2023-10-15 13:42:48,471 EPOCH 6 done: loss 0.0566 - lr: 0.000022
2023-10-15 13:42:56,767 DEV : loss 0.3654634654521942 - f1-score (micro avg) 0.3514
2023-10-15 13:42:56,796 ----------------------------------------------------------------------------------------------------
2023-10-15 13:43:21,792 epoch 7 - iter 521/5212 - loss 0.03312481 - time (sec): 25.00 - samples/sec: 1410.93 - lr: 0.000022 - momentum: 0.000000
2023-10-15 13:43:47,410 epoch 7 - iter 1042/5212 - loss 0.03774246 - time (sec): 50.61 - samples/sec: 1425.50 - lr: 0.000021 - momentum: 0.000000
2023-10-15 13:44:12,732 epoch 7 - iter 1563/5212 - loss 0.04380652 - time (sec): 75.94 - samples/sec: 1443.44 - lr: 0.000021 - momentum: 0.000000
2023-10-15 13:44:37,716 epoch 7 - iter 2084/5212 - loss 0.04357735 - time (sec): 100.92 - samples/sec: 1453.08 - lr: 0.000020 - momentum: 0.000000
2023-10-15 13:45:02,712 epoch 7 - iter 2605/5212 - loss 0.04260736 - time (sec): 125.92 - samples/sec: 1453.59 - lr: 0.000019 - momentum: 0.000000
2023-10-15 13:45:27,992 epoch 7 - iter 3126/5212 - loss 0.04472481 - time (sec): 151.20 - samples/sec: 1454.29 - lr: 0.000019 - momentum: 0.000000
2023-10-15 13:45:54,498 epoch 7 - iter 3647/5212 - loss 0.04429013 - time (sec): 177.70 - samples/sec: 1454.84 - lr: 0.000018 - momentum: 0.000000
2023-10-15 13:46:19,450 epoch 7 - iter 4168/5212 - loss 0.04403234 - time (sec): 202.65 - samples/sec: 1452.76 - lr: 0.000018 - momentum: 0.000000
2023-10-15 13:46:44,374 epoch 7 - iter 4689/5212 - loss 0.04448230 - time (sec): 227.58 - samples/sec: 1451.83 - lr: 0.000017 - momentum: 0.000000
2023-10-15 13:47:09,346 epoch 7 - iter 5210/5212 - loss 0.04403781 - time (sec): 252.55 - samples/sec: 1453.77 - lr: 0.000017 - momentum: 0.000000
2023-10-15 13:47:09,455 ----------------------------------------------------------------------------------------------------
2023-10-15 13:47:09,455 EPOCH 7 done: loss 0.0440 - lr: 0.000017
2023-10-15 13:47:17,889 DEV : loss 0.34585681557655334 - f1-score (micro avg) 0.3311
2023-10-15 13:47:17,919 ----------------------------------------------------------------------------------------------------
2023-10-15 13:47:43,267 epoch 8 - iter 521/5212 - loss 0.02689241 - time (sec): 25.35 - samples/sec: 1476.10 - lr: 0.000016 - momentum: 0.000000
2023-10-15 13:48:08,186 epoch 8 - iter 1042/5212 - loss 0.02653419 - time (sec): 50.27 - samples/sec: 1445.14 - lr: 0.000016 - momentum: 0.000000
2023-10-15 13:48:33,822 epoch 8 - iter 1563/5212 - loss 0.02637713 - time (sec): 75.90 - samples/sec: 1441.02 - lr: 0.000015 - momentum: 0.000000
2023-10-15 13:48:59,594 epoch 8 - iter 2084/5212 - loss 0.02733328 - time (sec): 101.67 - samples/sec: 1424.32 - lr: 0.000014 - momentum: 0.000000
2023-10-15 13:49:24,714 epoch 8 - iter 2605/5212 - loss 0.02880069 - time (sec): 126.79 - samples/sec: 1426.83 - lr: 0.000014 - momentum: 0.000000
2023-10-15 13:49:51,571 epoch 8 - iter 3126/5212 - loss 0.02954909 - time (sec): 153.65 - samples/sec: 1435.69 - lr: 0.000013 - momentum: 0.000000
2023-10-15 13:50:16,561 epoch 8 - iter 3647/5212 - loss 0.03040527 - time (sec): 178.64 - samples/sec: 1444.08 - lr: 0.000013 - momentum: 0.000000
2023-10-15 13:50:41,614 epoch 8 - iter 4168/5212 - loss 0.03036981 - time (sec): 203.69 - samples/sec: 1438.53 - lr: 0.000012 - momentum: 0.000000
2023-10-15 13:51:06,615 epoch 8 - iter 4689/5212 - loss 0.02995803 - time (sec): 228.69 - samples/sec: 1442.97 - lr: 0.000012 - momentum: 0.000000
2023-10-15 13:51:33,708 epoch 8 - iter 5210/5212 - loss 0.03001161 - time (sec): 255.79 - samples/sec: 1435.88 - lr: 0.000011 - momentum: 0.000000
2023-10-15 13:51:33,815 ----------------------------------------------------------------------------------------------------
2023-10-15 13:51:33,816 EPOCH 8 done: loss 0.0300 - lr: 0.000011
2023-10-15 13:51:42,835 DEV : loss 0.43034762144088745 - f1-score (micro avg) 0.3493
2023-10-15 13:51:42,865 ----------------------------------------------------------------------------------------------------
2023-10-15 13:52:08,590 epoch 9 - iter 521/5212 - loss 0.02763096 - time (sec): 25.72 - samples/sec: 1534.61 - lr: 0.000011 - momentum: 0.000000
2023-10-15 13:52:33,552 epoch 9 - iter 1042/5212 - loss 0.02207242 - time (sec): 50.69 - samples/sec: 1519.59 - lr: 0.000010 - momentum: 0.000000
2023-10-15 13:52:58,597 epoch 9 - iter 1563/5212 - loss 0.02236283 - time (sec): 75.73 - samples/sec: 1460.68 - lr: 0.000009 - momentum: 0.000000
2023-10-15 13:53:23,759 epoch 9 - iter 2084/5212 - loss 0.02177159 - time (sec): 100.89 - samples/sec: 1469.42 - lr: 0.000009 - momentum: 0.000000
2023-10-15 13:53:49,224 epoch 9 - iter 2605/5212 - loss 0.02114199 - time (sec): 126.36 - samples/sec: 1462.95 - lr: 0.000008 - momentum: 0.000000
2023-10-15 13:54:14,015 epoch 9 - iter 3126/5212 - loss 0.02090215 - time (sec): 151.15 - samples/sec: 1458.83 - lr: 0.000008 - momentum: 0.000000
2023-10-15 13:54:39,333 epoch 9 - iter 3647/5212 - loss 0.02100536 - time (sec): 176.47 - samples/sec: 1454.71 - lr: 0.000007 - momentum: 0.000000
2023-10-15 13:55:04,484 epoch 9 - iter 4168/5212 - loss 0.02104071 - time (sec): 201.62 - samples/sec: 1453.99 - lr: 0.000007 - momentum: 0.000000
2023-10-15 13:55:30,243 epoch 9 - iter 4689/5212 - loss 0.02069058 - time (sec): 227.38 - samples/sec: 1439.28 - lr: 0.000006 - momentum: 0.000000
2023-10-15 13:55:56,291 epoch 9 - iter 5210/5212 - loss 0.02049929 - time (sec): 253.42 - samples/sec: 1449.41 - lr: 0.000006 - momentum: 0.000000
2023-10-15 13:55:56,382 ----------------------------------------------------------------------------------------------------
2023-10-15 13:55:56,383 EPOCH 9 done: loss 0.0205 - lr: 0.000006
2023-10-15 13:56:05,381 DEV : loss 0.5122669339179993 - f1-score (micro avg) 0.3381
2023-10-15 13:56:05,407 ----------------------------------------------------------------------------------------------------
2023-10-15 13:56:30,432 epoch 10 - iter 521/5212 - loss 0.01178875 - time (sec): 25.02 - samples/sec: 1530.92 - lr: 0.000005 - momentum: 0.000000
2023-10-15 13:56:55,856 epoch 10 - iter 1042/5212 - loss 0.01468480 - time (sec): 50.45 - samples/sec: 1507.44 - lr: 0.000004 - momentum: 0.000000
2023-10-15 13:57:21,319 epoch 10 - iter 1563/5212 - loss 0.01459527 - time (sec): 75.91 - samples/sec: 1516.01 - lr: 0.000004 - momentum: 0.000000
2023-10-15 13:57:46,537 epoch 10 - iter 2084/5212 - loss 0.01485247 - time (sec): 101.13 - samples/sec: 1506.02 - lr: 0.000003 - momentum: 0.000000
2023-10-15 13:58:11,468 epoch 10 - iter 2605/5212 - loss 0.01444242 - time (sec): 126.06 - samples/sec: 1494.49 - lr: 0.000003 - momentum: 0.000000
2023-10-15 13:58:36,105 epoch 10 - iter 3126/5212 - loss 0.01442687 - time (sec): 150.70 - samples/sec: 1471.55 - lr: 0.000002 - momentum: 0.000000
2023-10-15 13:59:01,128 epoch 10 - iter 3647/5212 - loss 0.01393897 - time (sec): 175.72 - samples/sec: 1466.96 - lr: 0.000002 - momentum: 0.000000
2023-10-15 13:59:26,149 epoch 10 - iter 4168/5212 - loss 0.01439542 - time (sec): 200.74 - samples/sec: 1466.86 - lr: 0.000001 - momentum: 0.000000
2023-10-15 13:59:51,374 epoch 10 - iter 4689/5212 - loss 0.01427652 - time (sec): 225.97 - samples/sec: 1470.47 - lr: 0.000001 - momentum: 0.000000
2023-10-15 14:00:15,042 epoch 10 - iter 5210/5212 - loss 0.01384805 - time (sec): 249.63 - samples/sec: 1471.63 - lr: 0.000000 - momentum: 0.000000
2023-10-15 14:00:15,126 ----------------------------------------------------------------------------------------------------
2023-10-15 14:00:15,126 EPOCH 10 done: loss 0.0138 - lr: 0.000000
2023-10-15 14:00:24,139 DEV : loss 0.46498292684555054 - f1-score (micro avg) 0.3499
2023-10-15 14:00:24,544 ----------------------------------------------------------------------------------------------------
2023-10-15 14:00:24,545 Loading model from best epoch ...
2023-10-15 14:00:26,145 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
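The 17 tags follow the BIOES scheme (Begin/Inside/End/Single plus O) over the four entity types LOC, PER, ORG, and HumanProd. A simplified decoder for such tag sequences (an illustrative sketch, not Flair's internal span extraction) might look like:

```python
def bioes_to_spans(tags):
    """Convert a BIOES tag sequence to (label, start, end) spans, end exclusive."""
    spans, start, label = [], None, None
    for i, tag in enumerate(tags):
        if tag == "O":
            start, label = None, None
            continue
        prefix, lab = tag.split("-", 1)
        if prefix == "S":                 # single-token entity
            spans.append((lab, i, i + 1))
            start, label = None, None
        elif prefix == "B":               # entity begins
            start, label = i, lab
        elif prefix == "E" and label == lab and start is not None:
            spans.append((lab, start, i + 1))  # entity ends
            start, label = None, None
        # "I-" just continues the currently open span
    return spans
```

For example, `["O", "S-LOC", "B-PER", "I-PER", "E-PER", "O"]` decodes to a one-token LOC span and a three-token PER span.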
2023-10-15 14:00:42,346
Results:
- F-score (micro) 0.2834
- F-score (macro) 0.1787
- Accuracy 0.166
By class:
              precision    recall  f1-score   support

         LOC     0.3432    0.3534    0.3482      1214
         PER     0.3094    0.2030    0.2451       808
         ORG     0.1345    0.1105    0.1213       353
   HumanProd     0.0000    0.0000    0.0000        15

   micro avg     0.3053    0.2644    0.2834      2390
   macro avg     0.1968    0.1667    0.1787      2390
weighted avg     0.2988    0.2644    0.2777      2390
2023-10-15 14:00:42,346 ----------------------------------------------------------------------------------------------------
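The three averages in the final report can be reproduced from the per-class rows: macro is the unweighted mean of per-class F1, weighted is the support-weighted mean, and micro is the F1 of the pooled precision/recall. A short check using the numbers reported above:

```python
# Per-class (precision, recall, support) as reported in the table above
per_class = {
    "LOC": (0.3432, 0.3534, 1214),
    "PER": (0.3094, 0.2030, 808),
    "ORG": (0.1345, 0.1105, 353),
    "HumanProd": (0.0000, 0.0000, 15),
}

def f1(p, r):
    """Harmonic mean of precision and recall (0 when both are 0)."""
    return 2 * p * r / (p + r) if p + r else 0.0

# macro avg: unweighted mean of per-class F1 scores
macro_f1 = sum(f1(p, r) for p, r, _ in per_class.values()) / len(per_class)
# weighted avg: per-class F1 weighted by support
total_support = sum(s for _, _, s in per_class.values())
weighted_f1 = sum(f1(p, r) * s for p, r, s in per_class.values()) / total_support
# micro avg: F1 of the pooled precision/recall reported in the table
micro_f1 = f1(0.3053, 0.2644)
```

These round to 0.1787, 0.2777, and 0.2834, matching the macro, weighted, and micro rows.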