flair-hipe-2022-ajmc-de / training.log
2023-10-17 08:46:12,766 ----------------------------------------------------------------------------------------------------
2023-10-17 08:46:12,767 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=25, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-17 08:46:12,767 ----------------------------------------------------------------------------------------------------
2023-10-17 08:46:12,767 MultiCorpus: 1100 train + 206 dev + 240 test sentences
- NER_HIPE_2022 Corpus: 1100 train + 206 dev + 240 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/ajmc/de/with_doc_seperator
2023-10-17 08:46:12,767 ----------------------------------------------------------------------------------------------------
2023-10-17 08:46:12,767 Train: 1100 sentences
2023-10-17 08:46:12,768 (train_with_dev=False, train_with_test=False)
2023-10-17 08:46:12,768 ----------------------------------------------------------------------------------------------------
2023-10-17 08:46:12,768 Training Params:
2023-10-17 08:46:12,768 - learning_rate: "3e-05"
2023-10-17 08:46:12,768 - mini_batch_size: "4"
2023-10-17 08:46:12,768 - max_epochs: "10"
2023-10-17 08:46:12,768 - shuffle: "True"
2023-10-17 08:46:12,768 ----------------------------------------------------------------------------------------------------
2023-10-17 08:46:12,768 Plugins:
2023-10-17 08:46:12,768 - TensorboardLogger
2023-10-17 08:46:12,768 - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 08:46:12,768 ----------------------------------------------------------------------------------------------------
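Note on the schedule: with 1100 training sentences and a mini-batch size of 4, each epoch has 275 iterations, so 10 epochs give 2750 optimizer steps. The `lr` column in the per-iteration lines is consistent with a linear warmup over the first 10% of those steps to the peak of 3e-05, followed by a linear decay to zero. The sketch below reproduces that shape. The function name `linear_schedule_lr` and the 1-based step convention are assumptions made for illustration, not Flair's implementation.

```python
def linear_schedule_lr(step, peak_lr=3e-05, total_steps=2750, warmup_fraction=0.1):
    """Approximate learning rate after `step` optimizer updates (1-based).

    Hypothetical helper: linear warmup to peak_lr, then linear decay to 0.
    """
    warmup_steps = round(total_steps * warmup_fraction)
    if step <= warmup_steps:
        return peak_lr * step / warmup_steps
    return peak_lr * max(0, total_steps - step) / (total_steps - warmup_steps)


steps_per_epoch = 1100 // 4          # 275 iterations per epoch, as in "iter 27/275"
total_steps = steps_per_epoch * 10   # 2750 steps over max_epochs = 10

# Compare against a few logged values (formatted like the log's lr column).
for epoch, it in [(1, 27), (1, 270), (2, 27), (5, 27), (10, 270)]:
    step = (epoch - 1) * steps_per_epoch + it
    print(f"epoch {epoch} iter {it}: lr {linear_schedule_lr(step):.6f}")
```

Under these assumptions the printed values line up with the logged `lr` entries for the same iterations.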
2023-10-17 08:46:12,768 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 08:46:12,768 - metric: "('micro avg', 'f1-score')"
2023-10-17 08:46:12,768 ----------------------------------------------------------------------------------------------------
2023-10-17 08:46:12,768 Computation:
2023-10-17 08:46:12,768 - compute on device: cuda:0
2023-10-17 08:46:12,768 - embedding storage: none
2023-10-17 08:46:12,768 ----------------------------------------------------------------------------------------------------
2023-10-17 08:46:12,768 Model training base path: "hmbench-ajmc/de-hmteams/teams-base-historic-multilingual-discriminator-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-4"
2023-10-17 08:46:12,768 ----------------------------------------------------------------------------------------------------
2023-10-17 08:46:12,768 ----------------------------------------------------------------------------------------------------
2023-10-17 08:46:12,768 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-17 08:46:14,067 epoch 1 - iter 27/275 - loss 3.97024551 - time (sec): 1.30 - samples/sec: 1816.98 - lr: 0.000003 - momentum: 0.000000
2023-10-17 08:46:15,306 epoch 1 - iter 54/275 - loss 3.41470820 - time (sec): 2.54 - samples/sec: 1694.36 - lr: 0.000006 - momentum: 0.000000
2023-10-17 08:46:16,558 epoch 1 - iter 81/275 - loss 2.68789197 - time (sec): 3.79 - samples/sec: 1760.77 - lr: 0.000009 - momentum: 0.000000
2023-10-17 08:46:17,774 epoch 1 - iter 108/275 - loss 2.21443759 - time (sec): 5.01 - samples/sec: 1747.17 - lr: 0.000012 - momentum: 0.000000
2023-10-17 08:46:19,034 epoch 1 - iter 135/275 - loss 1.87610511 - time (sec): 6.26 - samples/sec: 1745.65 - lr: 0.000015 - momentum: 0.000000
2023-10-17 08:46:20,265 epoch 1 - iter 162/275 - loss 1.63283108 - time (sec): 7.50 - samples/sec: 1762.93 - lr: 0.000018 - momentum: 0.000000
2023-10-17 08:46:21,520 epoch 1 - iter 189/275 - loss 1.44276987 - time (sec): 8.75 - samples/sec: 1797.17 - lr: 0.000021 - momentum: 0.000000
2023-10-17 08:46:22,795 epoch 1 - iter 216/275 - loss 1.28758424 - time (sec): 10.03 - samples/sec: 1819.62 - lr: 0.000023 - momentum: 0.000000
2023-10-17 08:46:24,026 epoch 1 - iter 243/275 - loss 1.19494444 - time (sec): 11.26 - samples/sec: 1791.18 - lr: 0.000026 - momentum: 0.000000
2023-10-17 08:46:25,261 epoch 1 - iter 270/275 - loss 1.10556620 - time (sec): 12.49 - samples/sec: 1790.23 - lr: 0.000029 - momentum: 0.000000
2023-10-17 08:46:25,489 ----------------------------------------------------------------------------------------------------
2023-10-17 08:46:25,489 EPOCH 1 done: loss 1.0898 - lr: 0.000029
2023-10-17 08:46:26,237 DEV : loss 0.2100459337234497 - f1-score (micro avg) 0.7394
2023-10-17 08:46:26,242 saving best model
2023-10-17 08:46:26,590 ----------------------------------------------------------------------------------------------------
2023-10-17 08:46:27,824 epoch 2 - iter 27/275 - loss 0.17970721 - time (sec): 1.23 - samples/sec: 1835.24 - lr: 0.000030 - momentum: 0.000000
2023-10-17 08:46:29,079 epoch 2 - iter 54/275 - loss 0.18007533 - time (sec): 2.49 - samples/sec: 1801.59 - lr: 0.000029 - momentum: 0.000000
2023-10-17 08:46:30,296 epoch 2 - iter 81/275 - loss 0.18204146 - time (sec): 3.70 - samples/sec: 1799.96 - lr: 0.000029 - momentum: 0.000000
2023-10-17 08:46:31,517 epoch 2 - iter 108/275 - loss 0.18586872 - time (sec): 4.93 - samples/sec: 1836.50 - lr: 0.000029 - momentum: 0.000000
2023-10-17 08:46:32,804 epoch 2 - iter 135/275 - loss 0.18371927 - time (sec): 6.21 - samples/sec: 1805.97 - lr: 0.000028 - momentum: 0.000000
2023-10-17 08:46:34,029 epoch 2 - iter 162/275 - loss 0.18284554 - time (sec): 7.44 - samples/sec: 1819.86 - lr: 0.000028 - momentum: 0.000000
2023-10-17 08:46:35,293 epoch 2 - iter 189/275 - loss 0.17931458 - time (sec): 8.70 - samples/sec: 1810.73 - lr: 0.000028 - momentum: 0.000000
2023-10-17 08:46:36,602 epoch 2 - iter 216/275 - loss 0.18162776 - time (sec): 10.01 - samples/sec: 1831.89 - lr: 0.000027 - momentum: 0.000000
2023-10-17 08:46:37,845 epoch 2 - iter 243/275 - loss 0.17965500 - time (sec): 11.25 - samples/sec: 1822.02 - lr: 0.000027 - momentum: 0.000000
2023-10-17 08:46:39,081 epoch 2 - iter 270/275 - loss 0.17595326 - time (sec): 12.49 - samples/sec: 1797.58 - lr: 0.000027 - momentum: 0.000000
2023-10-17 08:46:39,300 ----------------------------------------------------------------------------------------------------
2023-10-17 08:46:39,300 EPOCH 2 done: loss 0.1765 - lr: 0.000027
2023-10-17 08:46:39,946 DEV : loss 0.20042622089385986 - f1-score (micro avg) 0.7796
2023-10-17 08:46:39,951 saving best model
2023-10-17 08:46:40,414 ----------------------------------------------------------------------------------------------------
2023-10-17 08:46:41,662 epoch 3 - iter 27/275 - loss 0.12635593 - time (sec): 1.25 - samples/sec: 1690.00 - lr: 0.000026 - momentum: 0.000000
2023-10-17 08:46:42,881 epoch 3 - iter 54/275 - loss 0.09902783 - time (sec): 2.46 - samples/sec: 1597.56 - lr: 0.000026 - momentum: 0.000000
2023-10-17 08:46:44,101 epoch 3 - iter 81/275 - loss 0.09432280 - time (sec): 3.68 - samples/sec: 1680.69 - lr: 0.000026 - momentum: 0.000000
2023-10-17 08:46:45,347 epoch 3 - iter 108/275 - loss 0.10335773 - time (sec): 4.93 - samples/sec: 1757.59 - lr: 0.000025 - momentum: 0.000000
2023-10-17 08:46:46,572 epoch 3 - iter 135/275 - loss 0.09804934 - time (sec): 6.16 - samples/sec: 1747.46 - lr: 0.000025 - momentum: 0.000000
2023-10-17 08:46:47,787 epoch 3 - iter 162/275 - loss 0.09737870 - time (sec): 7.37 - samples/sec: 1775.22 - lr: 0.000025 - momentum: 0.000000
2023-10-17 08:46:49,052 epoch 3 - iter 189/275 - loss 0.09737990 - time (sec): 8.64 - samples/sec: 1765.38 - lr: 0.000024 - momentum: 0.000000
2023-10-17 08:46:50,289 epoch 3 - iter 216/275 - loss 0.10009286 - time (sec): 9.87 - samples/sec: 1777.82 - lr: 0.000024 - momentum: 0.000000
2023-10-17 08:46:51,552 epoch 3 - iter 243/275 - loss 0.10060457 - time (sec): 11.14 - samples/sec: 1789.43 - lr: 0.000024 - momentum: 0.000000
2023-10-17 08:46:52,790 epoch 3 - iter 270/275 - loss 0.10670080 - time (sec): 12.37 - samples/sec: 1800.99 - lr: 0.000023 - momentum: 0.000000
2023-10-17 08:46:53,026 ----------------------------------------------------------------------------------------------------
2023-10-17 08:46:53,026 EPOCH 3 done: loss 0.1055 - lr: 0.000023
2023-10-17 08:46:53,665 DEV : loss 0.14525996148586273 - f1-score (micro avg) 0.852
2023-10-17 08:46:53,670 saving best model
2023-10-17 08:46:54,106 ----------------------------------------------------------------------------------------------------
2023-10-17 08:46:55,432 epoch 4 - iter 27/275 - loss 0.07663780 - time (sec): 1.32 - samples/sec: 1604.50 - lr: 0.000023 - momentum: 0.000000
2023-10-17 08:46:56,710 epoch 4 - iter 54/275 - loss 0.05673337 - time (sec): 2.60 - samples/sec: 1664.79 - lr: 0.000023 - momentum: 0.000000
2023-10-17 08:46:57,963 epoch 4 - iter 81/275 - loss 0.08993934 - time (sec): 3.85 - samples/sec: 1680.69 - lr: 0.000022 - momentum: 0.000000
2023-10-17 08:46:59,202 epoch 4 - iter 108/275 - loss 0.08785841 - time (sec): 5.09 - samples/sec: 1712.71 - lr: 0.000022 - momentum: 0.000000
2023-10-17 08:47:00,555 epoch 4 - iter 135/275 - loss 0.08309543 - time (sec): 6.45 - samples/sec: 1682.77 - lr: 0.000022 - momentum: 0.000000
2023-10-17 08:47:01,879 epoch 4 - iter 162/275 - loss 0.08392456 - time (sec): 7.77 - samples/sec: 1709.62 - lr: 0.000021 - momentum: 0.000000
2023-10-17 08:47:03,101 epoch 4 - iter 189/275 - loss 0.08150298 - time (sec): 8.99 - samples/sec: 1713.53 - lr: 0.000021 - momentum: 0.000000
2023-10-17 08:47:04,325 epoch 4 - iter 216/275 - loss 0.08090320 - time (sec): 10.22 - samples/sec: 1750.66 - lr: 0.000021 - momentum: 0.000000
2023-10-17 08:47:05,557 epoch 4 - iter 243/275 - loss 0.08222903 - time (sec): 11.45 - samples/sec: 1756.15 - lr: 0.000020 - momentum: 0.000000
2023-10-17 08:47:06,785 epoch 4 - iter 270/275 - loss 0.08283375 - time (sec): 12.68 - samples/sec: 1757.16 - lr: 0.000020 - momentum: 0.000000
2023-10-17 08:47:07,008 ----------------------------------------------------------------------------------------------------
2023-10-17 08:47:07,008 EPOCH 4 done: loss 0.0815 - lr: 0.000020
2023-10-17 08:47:07,649 DEV : loss 0.16326691210269928 - f1-score (micro avg) 0.8735
2023-10-17 08:47:07,654 saving best model
2023-10-17 08:47:08,086 ----------------------------------------------------------------------------------------------------
2023-10-17 08:47:09,299 epoch 5 - iter 27/275 - loss 0.11167857 - time (sec): 1.21 - samples/sec: 2006.63 - lr: 0.000020 - momentum: 0.000000
2023-10-17 08:47:10,524 epoch 5 - iter 54/275 - loss 0.08355736 - time (sec): 2.44 - samples/sec: 1883.54 - lr: 0.000019 - momentum: 0.000000
2023-10-17 08:47:11,747 epoch 5 - iter 81/275 - loss 0.06609019 - time (sec): 3.66 - samples/sec: 1807.76 - lr: 0.000019 - momentum: 0.000000
2023-10-17 08:47:13,030 epoch 5 - iter 108/275 - loss 0.07976555 - time (sec): 4.94 - samples/sec: 1735.78 - lr: 0.000019 - momentum: 0.000000
2023-10-17 08:47:14,249 epoch 5 - iter 135/275 - loss 0.07909386 - time (sec): 6.16 - samples/sec: 1767.66 - lr: 0.000018 - momentum: 0.000000
2023-10-17 08:47:15,546 epoch 5 - iter 162/275 - loss 0.07437644 - time (sec): 7.46 - samples/sec: 1760.74 - lr: 0.000018 - momentum: 0.000000
2023-10-17 08:47:16,766 epoch 5 - iter 189/275 - loss 0.07040829 - time (sec): 8.68 - samples/sec: 1780.58 - lr: 0.000018 - momentum: 0.000000
2023-10-17 08:47:17,986 epoch 5 - iter 216/275 - loss 0.06799222 - time (sec): 9.90 - samples/sec: 1811.00 - lr: 0.000017 - momentum: 0.000000
2023-10-17 08:47:19,246 epoch 5 - iter 243/275 - loss 0.06299278 - time (sec): 11.16 - samples/sec: 1805.20 - lr: 0.000017 - momentum: 0.000000
2023-10-17 08:47:20,482 epoch 5 - iter 270/275 - loss 0.06246345 - time (sec): 12.40 - samples/sec: 1806.75 - lr: 0.000017 - momentum: 0.000000
2023-10-17 08:47:20,713 ----------------------------------------------------------------------------------------------------
2023-10-17 08:47:20,713 EPOCH 5 done: loss 0.0624 - lr: 0.000017
2023-10-17 08:47:21,345 DEV : loss 0.16908515989780426 - f1-score (micro avg) 0.8803
2023-10-17 08:47:21,350 saving best model
2023-10-17 08:47:21,779 ----------------------------------------------------------------------------------------------------
2023-10-17 08:47:22,983 epoch 6 - iter 27/275 - loss 0.05658757 - time (sec): 1.20 - samples/sec: 1666.25 - lr: 0.000016 - momentum: 0.000000
2023-10-17 08:47:24,261 epoch 6 - iter 54/275 - loss 0.05408551 - time (sec): 2.48 - samples/sec: 1693.91 - lr: 0.000016 - momentum: 0.000000
2023-10-17 08:47:25,553 epoch 6 - iter 81/275 - loss 0.06296102 - time (sec): 3.77 - samples/sec: 1725.26 - lr: 0.000016 - momentum: 0.000000
2023-10-17 08:47:26,766 epoch 6 - iter 108/275 - loss 0.06062024 - time (sec): 4.98 - samples/sec: 1734.48 - lr: 0.000015 - momentum: 0.000000
2023-10-17 08:47:27,990 epoch 6 - iter 135/275 - loss 0.05487807 - time (sec): 6.21 - samples/sec: 1761.68 - lr: 0.000015 - momentum: 0.000000
2023-10-17 08:47:29,247 epoch 6 - iter 162/275 - loss 0.04790647 - time (sec): 7.46 - samples/sec: 1758.75 - lr: 0.000015 - momentum: 0.000000
2023-10-17 08:47:30,503 epoch 6 - iter 189/275 - loss 0.04791047 - time (sec): 8.72 - samples/sec: 1759.19 - lr: 0.000014 - momentum: 0.000000
2023-10-17 08:47:31,762 epoch 6 - iter 216/275 - loss 0.04903596 - time (sec): 9.98 - samples/sec: 1765.27 - lr: 0.000014 - momentum: 0.000000
2023-10-17 08:47:33,033 epoch 6 - iter 243/275 - loss 0.05332217 - time (sec): 11.25 - samples/sec: 1784.97 - lr: 0.000014 - momentum: 0.000000
2023-10-17 08:47:34,255 epoch 6 - iter 270/275 - loss 0.05091613 - time (sec): 12.47 - samples/sec: 1787.09 - lr: 0.000013 - momentum: 0.000000
2023-10-17 08:47:34,480 ----------------------------------------------------------------------------------------------------
2023-10-17 08:47:34,481 EPOCH 6 done: loss 0.0501 - lr: 0.000013
2023-10-17 08:47:35,157 DEV : loss 0.17602381110191345 - f1-score (micro avg) 0.8822
2023-10-17 08:47:35,167 saving best model
2023-10-17 08:47:35,710 ----------------------------------------------------------------------------------------------------
2023-10-17 08:47:37,152 epoch 7 - iter 27/275 - loss 0.08149249 - time (sec): 1.44 - samples/sec: 1431.14 - lr: 0.000013 - momentum: 0.000000
2023-10-17 08:47:38,645 epoch 7 - iter 54/275 - loss 0.05422051 - time (sec): 2.93 - samples/sec: 1503.81 - lr: 0.000013 - momentum: 0.000000
2023-10-17 08:47:40,059 epoch 7 - iter 81/275 - loss 0.06668234 - time (sec): 4.35 - samples/sec: 1541.04 - lr: 0.000012 - momentum: 0.000000
2023-10-17 08:47:41,325 epoch 7 - iter 108/275 - loss 0.05693708 - time (sec): 5.61 - samples/sec: 1591.18 - lr: 0.000012 - momentum: 0.000000
2023-10-17 08:47:42,586 epoch 7 - iter 135/275 - loss 0.04821359 - time (sec): 6.87 - samples/sec: 1607.07 - lr: 0.000012 - momentum: 0.000000
2023-10-17 08:47:43,887 epoch 7 - iter 162/275 - loss 0.04377735 - time (sec): 8.18 - samples/sec: 1638.57 - lr: 0.000011 - momentum: 0.000000
2023-10-17 08:47:45,191 epoch 7 - iter 189/275 - loss 0.04185787 - time (sec): 9.48 - samples/sec: 1673.78 - lr: 0.000011 - momentum: 0.000000
2023-10-17 08:47:46,422 epoch 7 - iter 216/275 - loss 0.04208472 - time (sec): 10.71 - samples/sec: 1678.53 - lr: 0.000011 - momentum: 0.000000
2023-10-17 08:47:47,681 epoch 7 - iter 243/275 - loss 0.04099767 - time (sec): 11.97 - samples/sec: 1674.21 - lr: 0.000010 - momentum: 0.000000
2023-10-17 08:47:49,017 epoch 7 - iter 270/275 - loss 0.04134562 - time (sec): 13.31 - samples/sec: 1678.74 - lr: 0.000010 - momentum: 0.000000
2023-10-17 08:47:49,256 ----------------------------------------------------------------------------------------------------
2023-10-17 08:47:49,256 EPOCH 7 done: loss 0.0406 - lr: 0.000010
2023-10-17 08:47:49,919 DEV : loss 0.18893340229988098 - f1-score (micro avg) 0.8723
2023-10-17 08:47:49,923 ----------------------------------------------------------------------------------------------------
2023-10-17 08:47:51,156 epoch 8 - iter 27/275 - loss 0.00906963 - time (sec): 1.23 - samples/sec: 1960.66 - lr: 0.000010 - momentum: 0.000000
2023-10-17 08:47:52,384 epoch 8 - iter 54/275 - loss 0.02783979 - time (sec): 2.46 - samples/sec: 1887.13 - lr: 0.000009 - momentum: 0.000000
2023-10-17 08:47:53,638 epoch 8 - iter 81/275 - loss 0.02990417 - time (sec): 3.71 - samples/sec: 1848.33 - lr: 0.000009 - momentum: 0.000000
2023-10-17 08:47:54,853 epoch 8 - iter 108/275 - loss 0.02627080 - time (sec): 4.93 - samples/sec: 1826.83 - lr: 0.000009 - momentum: 0.000000
2023-10-17 08:47:56,119 epoch 8 - iter 135/275 - loss 0.02263615 - time (sec): 6.19 - samples/sec: 1866.59 - lr: 0.000008 - momentum: 0.000000
2023-10-17 08:47:57,325 epoch 8 - iter 162/275 - loss 0.02311608 - time (sec): 7.40 - samples/sec: 1842.73 - lr: 0.000008 - momentum: 0.000000
2023-10-17 08:47:58,559 epoch 8 - iter 189/275 - loss 0.02151264 - time (sec): 8.63 - samples/sec: 1843.38 - lr: 0.000008 - momentum: 0.000000
2023-10-17 08:47:59,779 epoch 8 - iter 216/275 - loss 0.02896352 - time (sec): 9.85 - samples/sec: 1837.29 - lr: 0.000007 - momentum: 0.000000
2023-10-17 08:48:01,054 epoch 8 - iter 243/275 - loss 0.02972546 - time (sec): 11.13 - samples/sec: 1825.46 - lr: 0.000007 - momentum: 0.000000
2023-10-17 08:48:02,272 epoch 8 - iter 270/275 - loss 0.02940927 - time (sec): 12.35 - samples/sec: 1816.05 - lr: 0.000007 - momentum: 0.000000
2023-10-17 08:48:02,503 ----------------------------------------------------------------------------------------------------
2023-10-17 08:48:02,503 EPOCH 8 done: loss 0.0301 - lr: 0.000007
2023-10-17 08:48:03,194 DEV : loss 0.17884737253189087 - f1-score (micro avg) 0.8905
2023-10-17 08:48:03,198 saving best model
2023-10-17 08:48:03,654 ----------------------------------------------------------------------------------------------------
2023-10-17 08:48:04,846 epoch 9 - iter 27/275 - loss 0.03662375 - time (sec): 1.19 - samples/sec: 2005.60 - lr: 0.000006 - momentum: 0.000000
2023-10-17 08:48:06,020 epoch 9 - iter 54/275 - loss 0.02152339 - time (sec): 2.36 - samples/sec: 2035.85 - lr: 0.000006 - momentum: 0.000000
2023-10-17 08:48:07,185 epoch 9 - iter 81/275 - loss 0.02197956 - time (sec): 3.53 - samples/sec: 1949.37 - lr: 0.000006 - momentum: 0.000000
2023-10-17 08:48:08,371 epoch 9 - iter 108/275 - loss 0.02212008 - time (sec): 4.72 - samples/sec: 1896.48 - lr: 0.000005 - momentum: 0.000000
2023-10-17 08:48:09,590 epoch 9 - iter 135/275 - loss 0.02084789 - time (sec): 5.93 - samples/sec: 1851.36 - lr: 0.000005 - momentum: 0.000000
2023-10-17 08:48:10,800 epoch 9 - iter 162/275 - loss 0.02603522 - time (sec): 7.14 - samples/sec: 1858.08 - lr: 0.000005 - momentum: 0.000000
2023-10-17 08:48:12,005 epoch 9 - iter 189/275 - loss 0.02658731 - time (sec): 8.35 - samples/sec: 1857.87 - lr: 0.000004 - momentum: 0.000000
2023-10-17 08:48:13,260 epoch 9 - iter 216/275 - loss 0.02480973 - time (sec): 9.60 - samples/sec: 1835.23 - lr: 0.000004 - momentum: 0.000000
2023-10-17 08:48:14,480 epoch 9 - iter 243/275 - loss 0.02327959 - time (sec): 10.82 - samples/sec: 1840.65 - lr: 0.000004 - momentum: 0.000000
2023-10-17 08:48:15,704 epoch 9 - iter 270/275 - loss 0.02433170 - time (sec): 12.05 - samples/sec: 1849.42 - lr: 0.000003 - momentum: 0.000000
2023-10-17 08:48:15,935 ----------------------------------------------------------------------------------------------------
2023-10-17 08:48:15,935 EPOCH 9 done: loss 0.0239 - lr: 0.000003
2023-10-17 08:48:16,663 DEV : loss 0.1809743344783783 - f1-score (micro avg) 0.8884
2023-10-17 08:48:16,668 ----------------------------------------------------------------------------------------------------
2023-10-17 08:48:17,828 epoch 10 - iter 27/275 - loss 0.01640093 - time (sec): 1.16 - samples/sec: 1842.79 - lr: 0.000003 - momentum: 0.000000
2023-10-17 08:48:19,001 epoch 10 - iter 54/275 - loss 0.01996468 - time (sec): 2.33 - samples/sec: 1952.99 - lr: 0.000003 - momentum: 0.000000
2023-10-17 08:48:20,159 epoch 10 - iter 81/275 - loss 0.02308692 - time (sec): 3.49 - samples/sec: 1934.94 - lr: 0.000002 - momentum: 0.000000
2023-10-17 08:48:21,308 epoch 10 - iter 108/275 - loss 0.01854556 - time (sec): 4.64 - samples/sec: 1864.98 - lr: 0.000002 - momentum: 0.000000
2023-10-17 08:48:22,472 epoch 10 - iter 135/275 - loss 0.01695391 - time (sec): 5.80 - samples/sec: 1868.82 - lr: 0.000002 - momentum: 0.000000
2023-10-17 08:48:23,639 epoch 10 - iter 162/275 - loss 0.02042256 - time (sec): 6.97 - samples/sec: 1872.87 - lr: 0.000001 - momentum: 0.000000
2023-10-17 08:48:24,861 epoch 10 - iter 189/275 - loss 0.01872876 - time (sec): 8.19 - samples/sec: 1894.03 - lr: 0.000001 - momentum: 0.000000
2023-10-17 08:48:26,090 epoch 10 - iter 216/275 - loss 0.02260026 - time (sec): 9.42 - samples/sec: 1892.11 - lr: 0.000001 - momentum: 0.000000
2023-10-17 08:48:27,338 epoch 10 - iter 243/275 - loss 0.02105999 - time (sec): 10.67 - samples/sec: 1887.12 - lr: 0.000000 - momentum: 0.000000
2023-10-17 08:48:28,558 epoch 10 - iter 270/275 - loss 0.01967608 - time (sec): 11.89 - samples/sec: 1879.11 - lr: 0.000000 - momentum: 0.000000
2023-10-17 08:48:28,795 ----------------------------------------------------------------------------------------------------
2023-10-17 08:48:28,795 EPOCH 10 done: loss 0.0199 - lr: 0.000000
2023-10-17 08:48:29,444 DEV : loss 0.18379689753055573 - f1-score (micro avg) 0.8828
2023-10-17 08:48:29,797 ----------------------------------------------------------------------------------------------------
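Note on checkpointing: "saving best model" appears only after epochs whose dev micro F1 exceeds every earlier epoch, which is why epochs 7, 9 and 10 do not save and the final evaluation loads the epoch 8 checkpoint. A minimal sketch of that selection rule, using the dev scores logged above, follows. The helper name `epochs_that_save` and the strict `>` comparison are assumptions, not taken from Flair's source.

```python
# Dev micro-F1 per epoch, copied from the "DEV :" lines of this log.
dev_f1 = [0.7394, 0.7796, 0.852, 0.8735, 0.8803,
          0.8822, 0.8723, 0.8905, 0.8884, 0.8828]


def epochs_that_save(scores):
    """Return the 1-based epochs at which a new best score is reached."""
    best = float("-inf")
    saved = []
    for epoch, score in enumerate(scores, start=1):
        if score > best:  # assumption: strict improvement triggers a save
            best = score
            saved.append(epoch)
    return saved


saved_epochs = epochs_that_save(dev_f1)
best_epoch = saved_epochs[-1]
print(saved_epochs, best_epoch)
```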
2023-10-17 08:48:29,799 Loading model from best epoch ...
2023-10-17 08:48:31,311 SequenceTagger predicts: Dictionary with 25 tags: O, S-scope, B-scope, E-scope, I-scope, S-pers, B-pers, E-pers, I-pers, S-work, B-work, E-work, I-work, S-loc, B-loc, E-loc, I-loc, S-object, B-object, E-object, I-object, S-date, B-date, E-date, I-date
2023-10-17 08:48:31,953
Results:
- F-score (micro) 0.9062
- F-score (macro) 0.6759
- Accuracy 0.8469
By class:
              precision    recall  f1-score   support

       scope     0.9023    0.8920    0.8971       176
        pers     0.9680    0.9453    0.9565       128
        work     0.8533    0.8649    0.8591        74
         loc     1.0000    0.5000    0.6667         2
      object     0.0000    0.0000    0.0000         2

   micro avg     0.9147    0.8979    0.9062       382
   macro avg     0.7447    0.6404    0.6759       382
weighted avg     0.9106    0.8979    0.9038       382
2023-10-17 08:48:31,954 ----------------------------------------------------------------------------------------------------
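Note on the averages: the gap between micro F1 (0.9062) and macro F1 (0.6759) comes from the two classes with a support of 2 (`loc`, `object`), which count as much as `scope` and `pers` in the unweighted macro mean. The sketch below recomputes both figures from the rounded values in the table. It is plain arithmetic on the reported numbers, not a re-evaluation of the model, and the variable names are illustrative.

```python
# (precision, recall, f1, support) per class, copied from the table above.
per_class = {
    "scope":  (0.9023, 0.8920, 0.8971, 176),
    "pers":   (0.9680, 0.9453, 0.9565, 128),
    "work":   (0.8533, 0.8649, 0.8591, 74),
    "loc":    (1.0000, 0.5000, 0.6667, 2),
    "object": (0.0000, 0.0000, 0.0000, 2),
}

# Macro F1: unweighted mean of the per-class F1 scores.
macro_f1 = sum(f1 for _, _, f1, _ in per_class.values()) / len(per_class)

# Micro F1: harmonic mean of the pooled (micro) precision and recall.
micro_p, micro_r = 0.9147, 0.8979
micro_f1 = 2 * micro_p * micro_r / (micro_p + micro_r)

total_support = sum(s for _, _, _, s in per_class.values())

print(f"macro F1 = {macro_f1:.4f}")
print(f"micro F1 = {micro_f1:.4f}")
print(f"support  = {total_support}")
```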