|
2023-10-13 08:43:02,471 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 08:43:02,472 Model: "SequenceTagger( |
|
(embeddings): TransformerWordEmbeddings( |
|
(model): BertModel( |
|
(embeddings): BertEmbeddings( |
|
(word_embeddings): Embedding(32001, 768) |
|
(position_embeddings): Embedding(512, 768) |
|
(token_type_embeddings): Embedding(2, 768) |
|
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
(encoder): BertEncoder( |
|
(layer): ModuleList( |
|
(0-11): 12 x BertLayer( |
|
(attention): BertAttention( |
|
(self): BertSelfAttention( |
|
(query): Linear(in_features=768, out_features=768, bias=True) |
|
(key): Linear(in_features=768, out_features=768, bias=True) |
|
(value): Linear(in_features=768, out_features=768, bias=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
(output): BertSelfOutput( |
|
(dense): Linear(in_features=768, out_features=768, bias=True) |
|
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
(intermediate): BertIntermediate( |
|
(dense): Linear(in_features=768, out_features=3072, bias=True) |
|
(intermediate_act_fn): GELUActivation() |
|
) |
|
(output): BertOutput( |
|
(dense): Linear(in_features=3072, out_features=768, bias=True) |
|
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
) |
|
) |
|
(pooler): BertPooler( |
|
(dense): Linear(in_features=768, out_features=768, bias=True) |
|
(activation): Tanh() |
|
) |
|
) |
|
) |
|
(locked_dropout): LockedDropout(p=0.5) |
|
(linear): Linear(in_features=768, out_features=25, bias=True) |
|
(loss_function): CrossEntropyLoss() |
|
)" |
|
2023-10-13 08:43:02,472 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 08:43:02,472 MultiCorpus: 1100 train + 206 dev + 240 test sentences |
|
- NER_HIPE_2022 Corpus: 1100 train + 206 dev + 240 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/ajmc/de/with_doc_seperator |
|
2023-10-13 08:43:02,472 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 08:43:02,472 Train: 1100 sentences |
|
2023-10-13 08:43:02,472 (train_with_dev=False, train_with_test=False) |
|
2023-10-13 08:43:02,472 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 08:43:02,472 Training Params: |
|
2023-10-13 08:43:02,472 - learning_rate: "5e-05" |
|
2023-10-13 08:43:02,472 - mini_batch_size: "4" |
|
2023-10-13 08:43:02,472 - max_epochs: "10" |
|
2023-10-13 08:43:02,472 - shuffle: "True" |
|
2023-10-13 08:43:02,472 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 08:43:02,473 Plugins: |
|
2023-10-13 08:43:02,473 - LinearScheduler | warmup_fraction: '0.1' |
|
2023-10-13 08:43:02,473 ---------------------------------------------------------------------------------------------------- |
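Note on the `lr` column: the values printed in the per-iteration lines below are consistent with a linear warmup over the first 10% of updates followed by a linear decay to zero. The sketch below reproduces them under stated assumptions. `logged_lr` is a hypothetical helper, not a Flair function; the total of 275 x 10 = 2750 updates is taken from the `iter N/275` and `max_epochs` entries; the one-step lag (the printed rate being the one in effect after `step - 1` updates) is inferred from the printed numbers, not from Flair's source.

```python
BASE_LR = 5e-05          # learning_rate from "Training Params"
STEPS_PER_EPOCH = 275    # from "iter N/275" (1100 sentences / mini_batch_size 4)
MAX_EPOCHS = 10          # max_epochs from "Training Params"
WARMUP_FRACTION = 0.1    # from the LinearScheduler plugin line


def logged_lr(epoch: int, iteration: int) -> float:
    """Learning rate as it would appear in the log at a given epoch/iteration.

    Assumption: linear warmup then linear decay, evaluated after
    (global step - 1) completed updates.
    """
    total_steps = STEPS_PER_EPOCH * MAX_EPOCHS
    warmup_steps = round(total_steps * WARMUP_FRACTION)
    completed = (epoch - 1) * STEPS_PER_EPOCH + iteration - 1
    if completed < warmup_steps:
        # Warmup: rate grows linearly from 0 to BASE_LR.
        return BASE_LR * completed / warmup_steps
    # Decay: rate shrinks linearly from BASE_LR to 0.
    remaining = max(0, total_steps - completed)
    return BASE_LR * remaining / (total_steps - warmup_steps)


if __name__ == "__main__":
    for epoch, iteration in [(1, 27), (1, 108), (2, 27), (6, 270), (10, 270)]:
        print(f"epoch {epoch} - iter {iteration}/275 - lr: {logged_lr(epoch, iteration):.6f}")
```

With these assumptions the sketch prints 0.000005, 0.000019, 0.000049, 0.000022 and 0.000000 for the five sampled lines, matching the corresponding entries in this log.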
|
2023-10-13 08:43:02,473 Final evaluation on model from best epoch (best-model.pt) |
|
2023-10-13 08:43:02,473 - metric: "('micro avg', 'f1-score')" |
|
2023-10-13 08:43:02,473 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 08:43:02,473 Computation: |
|
2023-10-13 08:43:02,473 - compute on device: cuda:0 |
|
2023-10-13 08:43:02,473 - embedding storage: none |
|
2023-10-13 08:43:02,473 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 08:43:02,473 Model training base path: "hmbench-ajmc/de-dbmdz/bert-base-historic-multilingual-cased-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-4" |
|
2023-10-13 08:43:02,473 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 08:43:03,736 epoch 1 - iter 27/275 - loss 3.26762455 - time (sec): 1.26 - samples/sec: 1709.25 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-13 08:43:04,936 epoch 1 - iter 54/275 - loss 2.61858234 - time (sec): 2.46 - samples/sec: 1794.70 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-13 08:43:06,121 epoch 1 - iter 81/275 - loss 2.06451088 - time (sec): 3.65 - samples/sec: 1808.60 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-13 08:43:07,305 epoch 1 - iter 108/275 - loss 1.70976718 - time (sec): 4.83 - samples/sec: 1817.03 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-13 08:43:08,553 epoch 1 - iter 135/275 - loss 1.46718450 - time (sec): 6.08 - samples/sec: 1837.67 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-13 08:43:09,740 epoch 1 - iter 162/275 - loss 1.29811689 - time (sec): 7.27 - samples/sec: 1839.88 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-13 08:43:10,956 epoch 1 - iter 189/275 - loss 1.15204599 - time (sec): 8.48 - samples/sec: 1853.59 - lr: 0.000034 - momentum: 0.000000 |
|
2023-10-13 08:43:12,195 epoch 1 - iter 216/275 - loss 1.04030680 - time (sec): 9.72 - samples/sec: 1855.80 - lr: 0.000039 - momentum: 0.000000 |
|
2023-10-13 08:43:13,467 epoch 1 - iter 243/275 - loss 0.96060765 - time (sec): 10.99 - samples/sec: 1834.61 - lr: 0.000044 - momentum: 0.000000 |
|
2023-10-13 08:43:14,708 epoch 1 - iter 270/275 - loss 0.89047507 - time (sec): 12.23 - samples/sec: 1822.47 - lr: 0.000049 - momentum: 0.000000 |
|
2023-10-13 08:43:14,957 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 08:43:14,957 EPOCH 1 done: loss 0.8777 - lr: 0.000049 |
|
2023-10-13 08:43:15,936 DEV : loss 0.2361612170934677 - f1-score (micro avg) 0.6943 |
|
2023-10-13 08:43:15,945 saving best model |
|
2023-10-13 08:43:16,296 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 08:43:17,666 epoch 2 - iter 27/275 - loss 0.25324016 - time (sec): 1.37 - samples/sec: 1521.92 - lr: 0.000049 - momentum: 0.000000 |
|
2023-10-13 08:43:19,054 epoch 2 - iter 54/275 - loss 0.21752413 - time (sec): 2.76 - samples/sec: 1589.66 - lr: 0.000049 - momentum: 0.000000 |
|
2023-10-13 08:43:20,273 epoch 2 - iter 81/275 - loss 0.20318272 - time (sec): 3.98 - samples/sec: 1633.17 - lr: 0.000048 - momentum: 0.000000 |
|
2023-10-13 08:43:21,423 epoch 2 - iter 108/275 - loss 0.18840612 - time (sec): 5.13 - samples/sec: 1725.84 - lr: 0.000048 - momentum: 0.000000 |
|
2023-10-13 08:43:22,590 epoch 2 - iter 135/275 - loss 0.17939364 - time (sec): 6.29 - samples/sec: 1758.63 - lr: 0.000047 - momentum: 0.000000 |
|
2023-10-13 08:43:23,760 epoch 2 - iter 162/275 - loss 0.17560780 - time (sec): 7.46 - samples/sec: 1809.56 - lr: 0.000047 - momentum: 0.000000 |
|
2023-10-13 08:43:24,917 epoch 2 - iter 189/275 - loss 0.16513976 - time (sec): 8.62 - samples/sec: 1813.69 - lr: 0.000046 - momentum: 0.000000 |
|
2023-10-13 08:43:26,123 epoch 2 - iter 216/275 - loss 0.16618392 - time (sec): 9.83 - samples/sec: 1836.15 - lr: 0.000046 - momentum: 0.000000 |
|
2023-10-13 08:43:27,341 epoch 2 - iter 243/275 - loss 0.16274574 - time (sec): 11.04 - samples/sec: 1835.74 - lr: 0.000045 - momentum: 0.000000 |
|
2023-10-13 08:43:28,528 epoch 2 - iter 270/275 - loss 0.16043632 - time (sec): 12.23 - samples/sec: 1821.92 - lr: 0.000045 - momentum: 0.000000 |
|
2023-10-13 08:43:28,760 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 08:43:28,760 EPOCH 2 done: loss 0.1631 - lr: 0.000045 |
|
2023-10-13 08:43:29,496 DEV : loss 0.13491378724575043 - f1-score (micro avg) 0.8185 |
|
2023-10-13 08:43:29,504 saving best model |
|
2023-10-13 08:43:30,065 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 08:43:31,469 epoch 3 - iter 27/275 - loss 0.09269037 - time (sec): 1.40 - samples/sec: 1632.01 - lr: 0.000044 - momentum: 0.000000 |
|
2023-10-13 08:43:32,883 epoch 3 - iter 54/275 - loss 0.10066397 - time (sec): 2.82 - samples/sec: 1592.28 - lr: 0.000043 - momentum: 0.000000 |
|
2023-10-13 08:43:34,176 epoch 3 - iter 81/275 - loss 0.10912877 - time (sec): 4.11 - samples/sec: 1629.65 - lr: 0.000043 - momentum: 0.000000 |
|
2023-10-13 08:43:35,461 epoch 3 - iter 108/275 - loss 0.10894748 - time (sec): 5.39 - samples/sec: 1665.22 - lr: 0.000042 - momentum: 0.000000 |
|
2023-10-13 08:43:36,721 epoch 3 - iter 135/275 - loss 0.10675830 - time (sec): 6.65 - samples/sec: 1687.14 - lr: 0.000042 - momentum: 0.000000 |
|
2023-10-13 08:43:37,984 epoch 3 - iter 162/275 - loss 0.10865347 - time (sec): 7.92 - samples/sec: 1698.31 - lr: 0.000041 - momentum: 0.000000 |
|
2023-10-13 08:43:39,250 epoch 3 - iter 189/275 - loss 0.10904253 - time (sec): 9.18 - samples/sec: 1736.02 - lr: 0.000041 - momentum: 0.000000 |
|
2023-10-13 08:43:40,472 epoch 3 - iter 216/275 - loss 0.10403065 - time (sec): 10.40 - samples/sec: 1741.79 - lr: 0.000040 - momentum: 0.000000 |
|
2023-10-13 08:43:41,782 epoch 3 - iter 243/275 - loss 0.09802057 - time (sec): 11.71 - samples/sec: 1741.04 - lr: 0.000040 - momentum: 0.000000 |
|
2023-10-13 08:43:43,048 epoch 3 - iter 270/275 - loss 0.10304729 - time (sec): 12.98 - samples/sec: 1729.18 - lr: 0.000039 - momentum: 0.000000 |
|
2023-10-13 08:43:43,272 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 08:43:43,272 EPOCH 3 done: loss 0.1023 - lr: 0.000039 |
|
2023-10-13 08:43:43,929 DEV : loss 0.15517204999923706 - f1-score (micro avg) 0.8565 |
|
2023-10-13 08:43:43,935 saving best model |
|
2023-10-13 08:43:44,434 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 08:43:45,562 epoch 4 - iter 27/275 - loss 0.04730955 - time (sec): 1.13 - samples/sec: 1946.03 - lr: 0.000038 - momentum: 0.000000 |
|
2023-10-13 08:43:46,720 epoch 4 - iter 54/275 - loss 0.05859413 - time (sec): 2.29 - samples/sec: 1886.10 - lr: 0.000038 - momentum: 0.000000 |
|
2023-10-13 08:43:47,890 epoch 4 - iter 81/275 - loss 0.07722230 - time (sec): 3.45 - samples/sec: 1839.70 - lr: 0.000037 - momentum: 0.000000 |
|
2023-10-13 08:43:49,046 epoch 4 - iter 108/275 - loss 0.08290820 - time (sec): 4.61 - samples/sec: 1839.41 - lr: 0.000037 - momentum: 0.000000 |
|
2023-10-13 08:43:50,209 epoch 4 - iter 135/275 - loss 0.07883405 - time (sec): 5.77 - samples/sec: 1853.08 - lr: 0.000036 - momentum: 0.000000 |
|
2023-10-13 08:43:51,350 epoch 4 - iter 162/275 - loss 0.07919026 - time (sec): 6.91 - samples/sec: 1874.21 - lr: 0.000036 - momentum: 0.000000 |
|
2023-10-13 08:43:52,473 epoch 4 - iter 189/275 - loss 0.07762734 - time (sec): 8.04 - samples/sec: 1856.64 - lr: 0.000035 - momentum: 0.000000 |
|
2023-10-13 08:43:53,639 epoch 4 - iter 216/275 - loss 0.07935127 - time (sec): 9.20 - samples/sec: 1865.31 - lr: 0.000035 - momentum: 0.000000 |
|
2023-10-13 08:43:54,847 epoch 4 - iter 243/275 - loss 0.08193059 - time (sec): 10.41 - samples/sec: 1894.59 - lr: 0.000034 - momentum: 0.000000 |
|
2023-10-13 08:43:56,056 epoch 4 - iter 270/275 - loss 0.08075966 - time (sec): 11.62 - samples/sec: 1920.53 - lr: 0.000034 - momentum: 0.000000 |
|
2023-10-13 08:43:56,269 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 08:43:56,269 EPOCH 4 done: loss 0.0794 - lr: 0.000034 |
|
2023-10-13 08:43:56,992 DEV : loss 0.15865449607372284 - f1-score (micro avg) 0.8635 |
|
2023-10-13 08:43:56,999 saving best model |
|
2023-10-13 08:43:57,489 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 08:43:58,703 epoch 5 - iter 27/275 - loss 0.06511906 - time (sec): 1.21 - samples/sec: 1823.46 - lr: 0.000033 - momentum: 0.000000 |
|
2023-10-13 08:43:59,970 epoch 5 - iter 54/275 - loss 0.05423925 - time (sec): 2.48 - samples/sec: 1859.97 - lr: 0.000032 - momentum: 0.000000 |
|
2023-10-13 08:44:01,167 epoch 5 - iter 81/275 - loss 0.05481274 - time (sec): 3.68 - samples/sec: 1834.82 - lr: 0.000032 - momentum: 0.000000 |
|
2023-10-13 08:44:02,366 epoch 5 - iter 108/275 - loss 0.06460615 - time (sec): 4.87 - samples/sec: 1869.79 - lr: 0.000031 - momentum: 0.000000 |
|
2023-10-13 08:44:03,583 epoch 5 - iter 135/275 - loss 0.05906795 - time (sec): 6.09 - samples/sec: 1876.83 - lr: 0.000031 - momentum: 0.000000 |
|
2023-10-13 08:44:04,760 epoch 5 - iter 162/275 - loss 0.05559834 - time (sec): 7.27 - samples/sec: 1865.62 - lr: 0.000030 - momentum: 0.000000 |
|
2023-10-13 08:44:05,937 epoch 5 - iter 189/275 - loss 0.05630927 - time (sec): 8.45 - samples/sec: 1828.54 - lr: 0.000030 - momentum: 0.000000 |
|
2023-10-13 08:44:07,134 epoch 5 - iter 216/275 - loss 0.06238579 - time (sec): 9.64 - samples/sec: 1840.92 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-13 08:44:08,318 epoch 5 - iter 243/275 - loss 0.05897466 - time (sec): 10.83 - samples/sec: 1838.93 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-13 08:44:09,512 epoch 5 - iter 270/275 - loss 0.06050901 - time (sec): 12.02 - samples/sec: 1848.42 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-13 08:44:09,745 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 08:44:09,745 EPOCH 5 done: loss 0.0608 - lr: 0.000028 |
|
2023-10-13 08:44:10,438 DEV : loss 0.17816683650016785 - f1-score (micro avg) 0.8638 |
|
2023-10-13 08:44:10,445 saving best model |
|
2023-10-13 08:44:10,922 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 08:44:12,282 epoch 6 - iter 27/275 - loss 0.02081309 - time (sec): 1.36 - samples/sec: 1660.71 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-13 08:44:13,665 epoch 6 - iter 54/275 - loss 0.03422600 - time (sec): 2.74 - samples/sec: 1680.74 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-13 08:44:15,026 epoch 6 - iter 81/275 - loss 0.05469774 - time (sec): 4.10 - samples/sec: 1680.05 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-13 08:44:16,379 epoch 6 - iter 108/275 - loss 0.05159438 - time (sec): 5.45 - samples/sec: 1636.99 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-13 08:44:17,759 epoch 6 - iter 135/275 - loss 0.04869846 - time (sec): 6.83 - samples/sec: 1659.27 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-13 08:44:19,111 epoch 6 - iter 162/275 - loss 0.04743475 - time (sec): 8.19 - samples/sec: 1654.66 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-13 08:44:20,480 epoch 6 - iter 189/275 - loss 0.04989248 - time (sec): 9.55 - samples/sec: 1658.84 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-13 08:44:21,869 epoch 6 - iter 216/275 - loss 0.04818983 - time (sec): 10.94 - samples/sec: 1639.75 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-13 08:44:23,230 epoch 6 - iter 243/275 - loss 0.04658753 - time (sec): 12.31 - samples/sec: 1645.97 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-13 08:44:24,614 epoch 6 - iter 270/275 - loss 0.04485098 - time (sec): 13.69 - samples/sec: 1639.87 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-13 08:44:24,888 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 08:44:24,888 EPOCH 6 done: loss 0.0456 - lr: 0.000022 |
|
2023-10-13 08:44:25,580 DEV : loss 0.1625245064496994 - f1-score (micro avg) 0.8709 |
|
2023-10-13 08:44:25,587 saving best model |
|
2023-10-13 08:44:26,030 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 08:44:27,414 epoch 7 - iter 27/275 - loss 0.02007844 - time (sec): 1.38 - samples/sec: 1410.89 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-13 08:44:28,779 epoch 7 - iter 54/275 - loss 0.02694298 - time (sec): 2.75 - samples/sec: 1503.12 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-13 08:44:30,159 epoch 7 - iter 81/275 - loss 0.03953833 - time (sec): 4.13 - samples/sec: 1564.02 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-13 08:44:31,613 epoch 7 - iter 108/275 - loss 0.04578602 - time (sec): 5.58 - samples/sec: 1580.73 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-13 08:44:33,000 epoch 7 - iter 135/275 - loss 0.03901358 - time (sec): 6.97 - samples/sec: 1584.86 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-13 08:44:34,325 epoch 7 - iter 162/275 - loss 0.03827773 - time (sec): 8.29 - samples/sec: 1601.22 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-13 08:44:35,648 epoch 7 - iter 189/275 - loss 0.03517038 - time (sec): 9.62 - samples/sec: 1633.74 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-13 08:44:37,011 epoch 7 - iter 216/275 - loss 0.03656872 - time (sec): 10.98 - samples/sec: 1655.72 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-13 08:44:38,258 epoch 7 - iter 243/275 - loss 0.03458599 - time (sec): 12.23 - samples/sec: 1648.73 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-13 08:44:39,467 epoch 7 - iter 270/275 - loss 0.03211470 - time (sec): 13.44 - samples/sec: 1657.37 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-13 08:44:39,689 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 08:44:39,690 EPOCH 7 done: loss 0.0321 - lr: 0.000017 |
|
2023-10-13 08:44:40,378 DEV : loss 0.16852889955043793 - f1-score (micro avg) 0.8699 |
|
2023-10-13 08:44:40,385 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 08:44:41,675 epoch 8 - iter 27/275 - loss 0.01991863 - time (sec): 1.29 - samples/sec: 1682.09 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-13 08:44:43,008 epoch 8 - iter 54/275 - loss 0.02203765 - time (sec): 2.62 - samples/sec: 1617.53 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-13 08:44:44,372 epoch 8 - iter 81/275 - loss 0.01842223 - time (sec): 3.99 - samples/sec: 1611.20 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-13 08:44:45,591 epoch 8 - iter 108/275 - loss 0.02703636 - time (sec): 5.20 - samples/sec: 1682.24 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-13 08:44:46,783 epoch 8 - iter 135/275 - loss 0.02656983 - time (sec): 6.40 - samples/sec: 1708.78 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-13 08:44:47,956 epoch 8 - iter 162/275 - loss 0.02337674 - time (sec): 7.57 - samples/sec: 1718.38 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-13 08:44:49,110 epoch 8 - iter 189/275 - loss 0.02320125 - time (sec): 8.72 - samples/sec: 1751.86 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-13 08:44:50,279 epoch 8 - iter 216/275 - loss 0.02203139 - time (sec): 9.89 - samples/sec: 1793.87 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-13 08:44:51,495 epoch 8 - iter 243/275 - loss 0.02298342 - time (sec): 11.11 - samples/sec: 1810.66 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-13 08:44:52,735 epoch 8 - iter 270/275 - loss 0.02201397 - time (sec): 12.35 - samples/sec: 1803.23 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-13 08:44:52,964 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 08:44:52,964 EPOCH 8 done: loss 0.0262 - lr: 0.000011 |
|
2023-10-13 08:44:53,644 DEV : loss 0.1576564908027649 - f1-score (micro avg) 0.8867 |
|
2023-10-13 08:44:53,649 saving best model |
|
2023-10-13 08:44:54,211 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 08:44:55,422 epoch 9 - iter 27/275 - loss 0.00305494 - time (sec): 1.21 - samples/sec: 1659.86 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-13 08:44:56,646 epoch 9 - iter 54/275 - loss 0.04012301 - time (sec): 2.43 - samples/sec: 1716.76 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-13 08:44:57,903 epoch 9 - iter 81/275 - loss 0.02902370 - time (sec): 3.69 - samples/sec: 1731.43 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-13 08:44:59,152 epoch 9 - iter 108/275 - loss 0.02389707 - time (sec): 4.94 - samples/sec: 1771.21 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-13 08:45:00,405 epoch 9 - iter 135/275 - loss 0.01980409 - time (sec): 6.19 - samples/sec: 1787.82 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-13 08:45:01,624 epoch 9 - iter 162/275 - loss 0.01933285 - time (sec): 7.41 - samples/sec: 1804.29 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-13 08:45:02,935 epoch 9 - iter 189/275 - loss 0.01697799 - time (sec): 8.72 - samples/sec: 1773.38 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-13 08:45:04,413 epoch 9 - iter 216/275 - loss 0.01651821 - time (sec): 10.20 - samples/sec: 1750.13 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-13 08:45:05,931 epoch 9 - iter 243/275 - loss 0.01563807 - time (sec): 11.72 - samples/sec: 1727.71 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-13 08:45:07,391 epoch 9 - iter 270/275 - loss 0.01736356 - time (sec): 13.18 - samples/sec: 1691.76 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-13 08:45:07,672 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 08:45:07,673 EPOCH 9 done: loss 0.0172 - lr: 0.000006 |
|
2023-10-13 08:45:08,365 DEV : loss 0.15656420588493347 - f1-score (micro avg) 0.8764 |
|
2023-10-13 08:45:08,372 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 08:45:09,826 epoch 10 - iter 27/275 - loss 0.00119089 - time (sec): 1.45 - samples/sec: 1396.00 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-13 08:45:11,229 epoch 10 - iter 54/275 - loss 0.01293146 - time (sec): 2.86 - samples/sec: 1532.63 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-13 08:45:12,587 epoch 10 - iter 81/275 - loss 0.00924602 - time (sec): 4.21 - samples/sec: 1543.93 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-13 08:45:13,955 epoch 10 - iter 108/275 - loss 0.01041877 - time (sec): 5.58 - samples/sec: 1563.54 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-13 08:45:15,386 epoch 10 - iter 135/275 - loss 0.01078816 - time (sec): 7.01 - samples/sec: 1613.49 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-13 08:45:16,722 epoch 10 - iter 162/275 - loss 0.01072570 - time (sec): 8.35 - samples/sec: 1583.39 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-13 08:45:17,913 epoch 10 - iter 189/275 - loss 0.00990525 - time (sec): 9.54 - samples/sec: 1638.19 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-13 08:45:19,088 epoch 10 - iter 216/275 - loss 0.01335076 - time (sec): 10.71 - samples/sec: 1671.76 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-13 08:45:20,259 epoch 10 - iter 243/275 - loss 0.01220671 - time (sec): 11.89 - samples/sec: 1676.88 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-13 08:45:21,443 epoch 10 - iter 270/275 - loss 0.01244867 - time (sec): 13.07 - samples/sec: 1708.53 - lr: 0.000000 - momentum: 0.000000 |
|
2023-10-13 08:45:21,663 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 08:45:21,664 EPOCH 10 done: loss 0.0126 - lr: 0.000000 |
|
2023-10-13 08:45:22,746 DEV : loss 0.1590714156627655 - f1-score (micro avg) 0.8803 |
|
2023-10-13 08:45:23,094 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 08:45:23,095 Loading model from best epoch ... |
|
2023-10-13 08:45:24,509 SequenceTagger predicts: Dictionary with 25 tags: O, S-scope, B-scope, E-scope, I-scope, S-pers, B-pers, E-pers, I-pers, S-work, B-work, E-work, I-work, S-loc, B-loc, E-loc, I-loc, S-object, B-object, E-object, I-object, S-date, B-date, E-date, I-date |
|
2023-10-13 08:45:25,142 |
|
Results: |
|
- F-score (micro) 0.9045 |
|
- F-score (macro) 0.8061 |
|
- Accuracy 0.8378 |
|
|
|
By class: |
|
              precision    recall  f1-score   support |
|


       scope     0.8902    0.8750    0.8825       176 |
|
        pers     0.9672    0.9219    0.9440       128 |
|
        work     0.9167    0.8919    0.9041        74 |
|
         loc     0.6667    1.0000    0.8000         2 |
|
      object     0.5000    0.5000    0.5000         2 |
|


   micro avg     0.9167    0.8927    0.9045       382 |
|
   macro avg     0.7881    0.8378    0.8061       382 |
|
weighted avg     0.9179    0.8927    0.9049       382 |
|
|
|
2023-10-13 08:45:25,142 ---------------------------------------------------------------------------------------------------- |
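Note on the averages: the summary rows in the "By class" table can be cross-checked from the per-class rows. The sketch below is plain arithmetic on the rounded values printed above, not a re-run of the evaluation; the variable names are illustrative. It assumes the usual definitions: macro F1 is the unweighted mean of per-class F1, weighted F1 is the support-weighted mean, and micro F1 is the harmonic mean of micro precision and micro recall.

```python
# Per-class rows copied from the table: (precision, recall, f1, support).
rows = {
    "scope":  (0.8902, 0.8750, 0.8825, 176),
    "pers":   (0.9672, 0.9219, 0.9440, 128),
    "work":   (0.9167, 0.8919, 0.9041, 74),
    "loc":    (0.6667, 1.0000, 0.8000, 2),
    "object": (0.5000, 0.5000, 0.5000, 2),
}

total_support = sum(support for _, _, _, support in rows.values())

# Macro F1: every class counts equally, so the two 2-example classes
# (loc, object) pull the score down noticeably.
macro_f1 = sum(f1 for _, _, f1, _ in rows.values()) / len(rows)

# Weighted F1: each class counts in proportion to its support.
weighted_f1 = sum(f1 * support for _, _, f1, support in rows.values()) / total_support

# Micro F1: harmonic mean of the micro-averaged precision and recall rows.
micro_p, micro_r = 0.9167, 0.8927
micro_f1 = 2 * micro_p * micro_r / (micro_p + micro_r)

if __name__ == "__main__":
    print(f"support   {total_support}")
    print(f"macro F1  {macro_f1:.4f}")
    print(f"weighted  {weighted_f1:.4f}")
    print(f"micro F1  {micro_f1:.4f}")
```

The recomputed figures agree with the table to four decimals (382, 0.8061, 0.9049, 0.9045). The gap between micro and macro F1 comes from `loc` and `object`, which have only two test entities each.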
|
|