2023-10-23 15:04:11,701 ----------------------------------------------------------------------------------------------------
2023-10-23 15:04:11,701 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(64001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=25, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-23 15:04:11,702 ----------------------------------------------------------------------------------------------------
2023-10-23 15:04:11,702 MultiCorpus: 1100 train + 206 dev + 240 test sentences
 - NER_HIPE_2022 Corpus: 1100 train + 206 dev + 240 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/ajmc/de/with_doc_seperator
2023-10-23 15:04:11,702 ----------------------------------------------------------------------------------------------------
2023-10-23 15:04:11,702 Train: 1100 sentences
2023-10-23 15:04:11,702 (train_with_dev=False, train_with_test=False)
2023-10-23 15:04:11,702 ----------------------------------------------------------------------------------------------------
2023-10-23 15:04:11,702 Training Params:
2023-10-23 15:04:11,702 - learning_rate: "5e-05"
2023-10-23 15:04:11,702 - mini_batch_size: "4"
2023-10-23 15:04:11,702 - max_epochs: "10"
2023-10-23 15:04:11,702 - shuffle: "True"
2023-10-23 15:04:11,702 ----------------------------------------------------------------------------------------------------
2023-10-23 15:04:11,702 Plugins:
2023-10-23 15:04:11,702 - TensorboardLogger
2023-10-23 15:04:11,702 - LinearScheduler | warmup_fraction: '0.1'
2023-10-23 15:04:11,702 ----------------------------------------------------------------------------------------------------
2023-10-23 15:04:11,702 Final evaluation on model from best epoch (best-model.pt)
2023-10-23 15:04:11,702 - metric: "('micro avg', 'f1-score')"
2023-10-23 15:04:11,702 ----------------------------------------------------------------------------------------------------
2023-10-23 15:04:11,702 Computation:
2023-10-23 15:04:11,702 - compute on device: cuda:0
2023-10-23 15:04:11,702 - embedding storage: none
2023-10-23 15:04:11,702 ----------------------------------------------------------------------------------------------------
2023-10-23 15:04:11,702 Model training base path: "hmbench-ajmc/de-dbmdz/bert-base-historic-multilingual-64k-td-cased-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-2"
2023-10-23 15:04:11,702 ----------------------------------------------------------------------------------------------------
2023-10-23 15:04:11,702 ----------------------------------------------------------------------------------------------------
2023-10-23 15:04:11,703 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-23 15:04:13,115 epoch 1 - iter 27/275 - loss 3.09062004 - time (sec): 1.41 - samples/sec: 1625.25 - lr: 0.000005 - momentum: 0.000000
2023-10-23 15:04:14,519 epoch 1 - iter 54/275 - loss 2.26631567 - time (sec): 2.82 - samples/sec: 1570.56 - lr: 0.000010 - momentum: 0.000000
2023-10-23 15:04:15,927 epoch 1 - iter 81/275 - loss 1.80032959 - time (sec): 4.22 - samples/sec: 1507.26 - lr: 0.000015 - momentum: 0.000000
2023-10-23 15:04:17,339 epoch 1 - iter 108/275 - loss 1.47722673 - time (sec): 5.64 - samples/sec: 1487.19 - lr: 0.000019 - momentum: 0.000000
2023-10-23 15:04:18,743 epoch 1 - iter 135/275 - loss 1.25336352 - time (sec): 7.04 - samples/sec: 1535.80 - lr: 0.000024 - momentum: 0.000000
2023-10-23 15:04:20,157 epoch 1 - iter 162/275 - loss 1.08649001 - time (sec): 8.45 - samples/sec: 1557.89 - lr: 0.000029 - momentum: 0.000000
2023-10-23 15:04:21,576 epoch 1 - iter 189/275 - loss 0.96593062 - time (sec): 9.87 - samples/sec: 1571.29 - lr: 0.000034 - momentum: 0.000000
2023-10-23 15:04:22,971 epoch 1 - iter 216/275 - loss 0.86888523 - time (sec): 11.27 - samples/sec: 1583.34 - lr: 0.000039 - momentum: 0.000000
2023-10-23 15:04:24,379 epoch 1 - iter 243/275 - loss 0.79534613 - time (sec): 12.68 - samples/sec: 1578.04 - lr: 0.000044 - momentum: 0.000000
2023-10-23 15:04:25,785 epoch 1 - iter 270/275 - loss 0.73745460 - time (sec): 14.08 - samples/sec: 1584.83 - lr: 0.000049 - momentum: 0.000000
2023-10-23 15:04:26,044 ----------------------------------------------------------------------------------------------------
2023-10-23 15:04:26,044 EPOCH 1 done: loss 0.7297 - lr: 0.000049
2023-10-23 15:04:26,465 DEV : loss 0.15482567250728607 - f1-score (micro avg) 0.8014
2023-10-23 15:04:26,471 saving best model
2023-10-23 15:04:26,870 ----------------------------------------------------------------------------------------------------
2023-10-23 15:04:28,269 epoch 2 - iter 27/275 - loss 0.10188481 - time (sec): 1.40 - samples/sec: 1553.08 - lr: 0.000049 - momentum: 0.000000
2023-10-23 15:04:29,674 epoch 2 - iter 54/275 - loss 0.13605400 - time (sec): 2.80 - samples/sec: 1537.01 - lr: 0.000049 - momentum: 0.000000
2023-10-23 15:04:31,076 epoch 2 - iter 81/275 - loss 0.15526551 - time (sec): 4.21 - samples/sec: 1533.08 - lr: 0.000048 - momentum: 0.000000
2023-10-23 15:04:32,482 epoch 2 - iter 108/275 - loss 0.16346409 - time (sec): 5.61 - samples/sec: 1564.35 - lr: 0.000048 - momentum: 0.000000
2023-10-23 15:04:33,897 epoch 2 - iter 135/275 - loss 0.16626491 - time (sec): 7.03 - samples/sec: 1550.86 - lr: 0.000047 - momentum: 0.000000
2023-10-23 15:04:35,382 epoch 2 - iter 162/275 - loss 0.15580745 - time (sec): 8.51 - samples/sec: 1527.87 - lr: 0.000047 - momentum: 0.000000
2023-10-23 15:04:37,014 epoch 2 - iter 189/275 - loss 0.14912426 - time (sec): 10.14 - samples/sec: 1492.16 - lr: 0.000046 - momentum: 0.000000
2023-10-23 15:04:38,507 epoch 2 - iter 216/275 - loss 0.15015653 - time (sec): 11.64 - samples/sec: 1525.37 - lr: 0.000046 - momentum: 0.000000
2023-10-23 15:04:39,964 epoch 2 - iter 243/275 - loss 0.14211656 - time (sec): 13.09 - samples/sec: 1519.75 - lr: 0.000045 - momentum: 0.000000
2023-10-23 15:04:41,448 epoch 2 - iter 270/275 - loss 0.14876165 - time (sec): 14.58 - samples/sec: 1532.25 - lr: 0.000045 - momentum: 0.000000
2023-10-23 15:04:41,728 ----------------------------------------------------------------------------------------------------
2023-10-23 15:04:41,728 EPOCH 2 done: loss 0.1508 - lr: 0.000045
2023-10-23 15:04:42,262 DEV : loss 0.14162753522396088 - f1-score (micro avg) 0.8295
2023-10-23 15:04:42,268 saving best model
2023-10-23 15:04:42,820 ----------------------------------------------------------------------------------------------------
2023-10-23 15:04:44,270 epoch 3 - iter 27/275 - loss 0.05795766 - time (sec): 1.45 - samples/sec: 1562.30 - lr: 0.000044 - momentum: 0.000000
2023-10-23 15:04:45,708 epoch 3 - iter 54/275 - loss 0.07123708 - time (sec): 2.88 - samples/sec: 1507.75 - lr: 0.000043 - momentum: 0.000000
2023-10-23 15:04:47,147 epoch 3 - iter 81/275 - loss 0.07562506 - time (sec): 4.32 - samples/sec: 1453.33 - lr: 0.000043 - momentum: 0.000000
2023-10-23 15:04:48,592 epoch 3 - iter 108/275 - loss 0.08895203 - time (sec): 5.77 - samples/sec: 1514.19 - lr: 0.000042 - momentum: 0.000000
2023-10-23 15:04:50,045 epoch 3 - iter 135/275 - loss 0.08914446 - time (sec): 7.22 - samples/sec: 1541.48 - lr: 0.000042 - momentum: 0.000000
2023-10-23 15:04:51,500 epoch 3 - iter 162/275 - loss 0.09699020 - time (sec): 8.68 - samples/sec: 1554.76 - lr: 0.000041 - momentum: 0.000000
2023-10-23 15:04:52,942 epoch 3 - iter 189/275 - loss 0.09933554 - time (sec): 10.12 - samples/sec: 1546.17 - lr: 0.000041 - momentum: 0.000000
2023-10-23 15:04:54,372 epoch 3 - iter 216/275 - loss 0.10321007 - time (sec): 11.55 - samples/sec: 1530.90 - lr: 0.000040 - momentum: 0.000000
2023-10-23 15:04:55,836 epoch 3 - iter 243/275 - loss 0.10045946 - time (sec): 13.01 - samples/sec: 1528.74 - lr: 0.000040 - momentum: 0.000000
2023-10-23 15:04:57,325 epoch 3 - iter 270/275 - loss 0.09955021 - time (sec): 14.50 - samples/sec: 1542.20 - lr: 0.000039 - momentum: 0.000000
2023-10-23 15:04:57,600 ----------------------------------------------------------------------------------------------------
2023-10-23 15:04:57,600 EPOCH 3 done: loss 0.0998 - lr: 0.000039
2023-10-23 15:04:58,142 DEV : loss 0.13985690474510193 - f1-score (micro avg) 0.8487
2023-10-23 15:04:58,147 saving best model
2023-10-23 15:04:58,687 ----------------------------------------------------------------------------------------------------
2023-10-23 15:05:00,179 epoch 4 - iter 27/275 - loss 0.07335491 - time (sec): 1.49 - samples/sec: 1501.18 - lr: 0.000038 - momentum: 0.000000
2023-10-23 15:05:01,665 epoch 4 - iter 54/275 - loss 0.07343576 - time (sec): 2.97 - samples/sec: 1478.87 - lr: 0.000038 - momentum: 0.000000
2023-10-23 15:05:03,148 epoch 4 - iter 81/275 - loss 0.09432281 - time (sec): 4.46 - samples/sec: 1558.96 - lr: 0.000037 - momentum: 0.000000
2023-10-23 15:05:04,632 epoch 4 - iter 108/275 - loss 0.09400070 - time (sec): 5.94 - samples/sec: 1550.70 - lr: 0.000037 - momentum: 0.000000
2023-10-23 15:05:06,081 epoch 4 - iter 135/275 - loss 0.09192554 - time (sec): 7.39 - samples/sec: 1527.35 - lr: 0.000036 - momentum: 0.000000
2023-10-23 15:05:07,488 epoch 4 - iter 162/275 - loss 0.08488052 - time (sec): 8.80 - samples/sec: 1554.03 - lr: 0.000036 - momentum: 0.000000
2023-10-23 15:05:08,968 epoch 4 - iter 189/275 - loss 0.07770991 - time (sec): 10.28 - samples/sec: 1528.34 - lr: 0.000035 - momentum: 0.000000
2023-10-23 15:05:10,488 epoch 4 - iter 216/275 - loss 0.07242444 - time (sec): 11.80 - samples/sec: 1529.97 - lr: 0.000035 - momentum: 0.000000
2023-10-23 15:05:11,969 epoch 4 - iter 243/275 - loss 0.07127409 - time (sec): 13.28 - samples/sec: 1518.66 - lr: 0.000034 - momentum: 0.000000
2023-10-23 15:05:13,451 epoch 4 - iter 270/275 - loss 0.06627057 - time (sec): 14.76 - samples/sec: 1518.72 - lr: 0.000034 - momentum: 0.000000
2023-10-23 15:05:13,728 ----------------------------------------------------------------------------------------------------
2023-10-23 15:05:13,729 EPOCH 4 done: loss 0.0652 - lr: 0.000034
2023-10-23 15:05:14,261 DEV : loss 0.22312763333320618 - f1-score (micro avg) 0.8245
2023-10-23 15:05:14,267 ----------------------------------------------------------------------------------------------------
2023-10-23 15:05:15,764 epoch 5 - iter 27/275 - loss 0.09253662 - time (sec): 1.50 - samples/sec: 1415.37 - lr: 0.000033 - momentum: 0.000000
2023-10-23 15:05:17,244 epoch 5 - iter 54/275 - loss 0.06035346 - time (sec): 2.98 - samples/sec: 1512.20 - lr: 0.000032 - momentum: 0.000000
2023-10-23 15:05:18,744 epoch 5 - iter 81/275 - loss 0.05627402 - time (sec): 4.48 - samples/sec: 1487.60 - lr: 0.000032 - momentum: 0.000000
2023-10-23 15:05:20,240 epoch 5 - iter 108/275 - loss 0.05054570 - time (sec): 5.97 - samples/sec: 1487.64 - lr: 0.000031 - momentum: 0.000000
2023-10-23 15:05:21,720 epoch 5 - iter 135/275 - loss 0.05016426 - time (sec): 7.45 - samples/sec: 1476.94 - lr: 0.000031 - momentum: 0.000000
2023-10-23 15:05:23,229 epoch 5 - iter 162/275 - loss 0.04995958 - time (sec): 8.96 - samples/sec: 1504.46 - lr: 0.000030 - momentum: 0.000000
2023-10-23 15:05:24,713 epoch 5 - iter 189/275 - loss 0.04817291 - time (sec): 10.45 - samples/sec: 1506.81 - lr: 0.000030 - momentum: 0.000000
2023-10-23 15:05:26,195 epoch 5 - iter 216/275 - loss 0.04583219 - time (sec): 11.93 - samples/sec: 1497.32 - lr: 0.000029 - momentum: 0.000000
2023-10-23 15:05:27,694 epoch 5 - iter 243/275 - loss 0.04620345 - time (sec): 13.43 - samples/sec: 1499.18 - lr: 0.000029 - momentum: 0.000000
2023-10-23 15:05:29,203 epoch 5 - iter 270/275 - loss 0.04938114 - time (sec): 14.94 - samples/sec: 1502.22 - lr: 0.000028 - momentum: 0.000000
2023-10-23 15:05:29,481 ----------------------------------------------------------------------------------------------------
2023-10-23 15:05:29,481 EPOCH 5 done: loss 0.0491 - lr: 0.000028
2023-10-23 15:05:30,016 DEV : loss 0.14948585629463196 - f1-score (micro avg) 0.8732
2023-10-23 15:05:30,022 saving best model
2023-10-23 15:05:30,567 ----------------------------------------------------------------------------------------------------
2023-10-23 15:05:32,055 epoch 6 - iter 27/275 - loss 0.03491608 - time (sec): 1.48 - samples/sec: 1662.31 - lr: 0.000027 - momentum: 0.000000
2023-10-23 15:05:33,550 epoch 6 - iter 54/275 - loss 0.02230697 - time (sec): 2.98 - samples/sec: 1655.66 - lr: 0.000027 - momentum: 0.000000
2023-10-23 15:05:35,032 epoch 6 - iter 81/275 - loss 0.03450145 - time (sec): 4.46 - samples/sec: 1607.41 - lr: 0.000026 - momentum: 0.000000
2023-10-23 15:05:36,496 epoch 6 - iter 108/275 - loss 0.03612551 - time (sec): 5.92 - samples/sec: 1522.74 - lr: 0.000026 - momentum: 0.000000
2023-10-23 15:05:37,988 epoch 6 - iter 135/275 - loss 0.03255043 - time (sec): 7.42 - samples/sec: 1513.41 - lr: 0.000025 - momentum: 0.000000
2023-10-23 15:05:39,463 epoch 6 - iter 162/275 - loss 0.03302444 - time (sec): 8.89 - samples/sec: 1533.60 - lr: 0.000025 - momentum: 0.000000
2023-10-23 15:05:40,981 epoch 6 - iter 189/275 - loss 0.03473278 - time (sec): 10.41 - samples/sec: 1537.88 - lr: 0.000024 - momentum: 0.000000
2023-10-23 15:05:42,464 epoch 6 - iter 216/275 - loss 0.03260253 - time (sec): 11.89 - samples/sec: 1524.64 - lr: 0.000024 - momentum: 0.000000
2023-10-23 15:05:43,966 epoch 6 - iter 243/275 - loss 0.03197077 - time (sec): 13.39 - samples/sec: 1520.81 - lr: 0.000023 - momentum: 0.000000
2023-10-23 15:05:45,446 epoch 6 - iter 270/275 - loss 0.03099582 - time (sec): 14.87 - samples/sec: 1508.59 - lr: 0.000022 - momentum: 0.000000
2023-10-23 15:05:45,714 ----------------------------------------------------------------------------------------------------
2023-10-23 15:05:45,715 EPOCH 6 done: loss 0.0312 - lr: 0.000022
2023-10-23 15:05:46,255 DEV : loss 0.16261592507362366 - f1-score (micro avg) 0.8809
2023-10-23 15:05:46,261 saving best model
2023-10-23 15:05:46,805 ----------------------------------------------------------------------------------------------------
2023-10-23 15:05:48,295 epoch 7 - iter 27/275 - loss 0.02230212 - time (sec): 1.49 - samples/sec: 1337.77 - lr: 0.000022 - momentum: 0.000000
2023-10-23 15:05:49,807 epoch 7 - iter 54/275 - loss 0.01168582 - time (sec): 3.00 - samples/sec: 1451.66 - lr: 0.000021 - momentum: 0.000000
2023-10-23 15:05:51,312 epoch 7 - iter 81/275 - loss 0.01546050 - time (sec): 4.51 - samples/sec: 1493.85 - lr: 0.000021 - momentum: 0.000000
2023-10-23 15:05:52,794 epoch 7 - iter 108/275 - loss 0.01737495 - time (sec): 5.99 - samples/sec: 1455.73 - lr: 0.000020 - momentum: 0.000000
2023-10-23 15:05:54,281 epoch 7 - iter 135/275 - loss 0.01592705 - time (sec): 7.47 - samples/sec: 1510.00 - lr: 0.000020 - momentum: 0.000000
2023-10-23 15:05:55,761 epoch 7 - iter 162/275 - loss 0.01962550 - time (sec): 8.95 - samples/sec: 1512.46 - lr: 0.000019 - momentum: 0.000000
2023-10-23 15:05:57,239 epoch 7 - iter 189/275 - loss 0.02153254 - time (sec): 10.43 - samples/sec: 1503.17 - lr: 0.000019 - momentum: 0.000000
2023-10-23 15:05:58,732 epoch 7 - iter 216/275 - loss 0.02104602 - time (sec): 11.93 - samples/sec: 1520.42 - lr: 0.000018 - momentum: 0.000000
2023-10-23 15:06:00,216 epoch 7 - iter 243/275 - loss 0.02356550 - time (sec): 13.41 - samples/sec: 1520.07 - lr: 0.000017 - momentum: 0.000000
2023-10-23 15:06:01,617 epoch 7 - iter 270/275 - loss 0.02281954 - time (sec): 14.81 - samples/sec: 1513.29 - lr: 0.000017 - momentum: 0.000000
2023-10-23 15:06:01,875 ----------------------------------------------------------------------------------------------------
2023-10-23 15:06:01,875 EPOCH 7 done: loss 0.0230 - lr: 0.000017
2023-10-23 15:06:02,418 DEV : loss 0.17113645374774933 - f1-score (micro avg) 0.8838
2023-10-23 15:06:02,423 saving best model
2023-10-23 15:06:03,086 ----------------------------------------------------------------------------------------------------
2023-10-23 15:06:04,523 epoch 8 - iter 27/275 - loss 0.00527097 - time (sec): 1.44 - samples/sec: 1599.29 - lr: 0.000016 - momentum: 0.000000
2023-10-23 15:06:06,023 epoch 8 - iter 54/275 - loss 0.01448787 - time (sec): 2.94 - samples/sec: 1580.16 - lr: 0.000016 - momentum: 0.000000
2023-10-23 15:06:07,509 epoch 8 - iter 81/275 - loss 0.01640281 - time (sec): 4.42 - samples/sec: 1518.68 - lr: 0.000015 - momentum: 0.000000
2023-10-23 15:06:08,996 epoch 8 - iter 108/275 - loss 0.01491556 - time (sec): 5.91 - samples/sec: 1529.30 - lr: 0.000015 - momentum: 0.000000
2023-10-23 15:06:10,504 epoch 8 - iter 135/275 - loss 0.01640144 - time (sec): 7.42 - samples/sec: 1547.19 - lr: 0.000014 - momentum: 0.000000
2023-10-23 15:06:11,988 epoch 8 - iter 162/275 - loss 0.01626040 - time (sec): 8.90 - samples/sec: 1551.77 - lr: 0.000014 - momentum: 0.000000
2023-10-23 15:06:13,377 epoch 8 - iter 189/275 - loss 0.01444249 - time (sec): 10.29 - samples/sec: 1548.57 - lr: 0.000013 - momentum: 0.000000
2023-10-23 15:06:14,780 epoch 8 - iter 216/275 - loss 0.01330704 - time (sec): 11.69 - samples/sec: 1547.77 - lr: 0.000012 - momentum: 0.000000
2023-10-23 15:06:16,173 epoch 8 - iter 243/275 - loss 0.01869803 - time (sec): 13.09 - samples/sec: 1546.18 - lr: 0.000012 - momentum: 0.000000
2023-10-23 15:06:17,587 epoch 8 - iter 270/275 - loss 0.01872302 - time (sec): 14.50 - samples/sec: 1537.62 - lr: 0.000011 - momentum: 0.000000
2023-10-23 15:06:17,852 ----------------------------------------------------------------------------------------------------
2023-10-23 15:06:17,852 EPOCH 8 done: loss 0.0193 - lr: 0.000011
2023-10-23 15:06:18,388 DEV : loss 0.17924140393733978 - f1-score (micro avg) 0.8843
2023-10-23 15:06:18,394 saving best model
2023-10-23 15:06:18,937 ----------------------------------------------------------------------------------------------------
2023-10-23 15:06:20,363 epoch 9 - iter 27/275 - loss 0.01760637 - time (sec): 1.42 - samples/sec: 1388.70 - lr: 0.000011 - momentum: 0.000000
2023-10-23 15:06:21,769 epoch 9 - iter 54/275 - loss 0.01734596 - time (sec): 2.83 - samples/sec: 1536.84 - lr: 0.000010 - momentum: 0.000000
2023-10-23 15:06:23,184 epoch 9 - iter 81/275 - loss 0.01138205 - time (sec): 4.24 - samples/sec: 1590.70 - lr: 0.000010 - momentum: 0.000000
2023-10-23 15:06:24,582 epoch 9 - iter 108/275 - loss 0.00872858 - time (sec): 5.64 - samples/sec: 1563.70 - lr: 0.000009 - momentum: 0.000000
2023-10-23 15:06:25,992 epoch 9 - iter 135/275 - loss 0.00768086 - time (sec): 7.05 - samples/sec: 1570.15 - lr: 0.000009 - momentum: 0.000000
2023-10-23 15:06:27,392 epoch 9 - iter 162/275 - loss 0.00726630 - time (sec): 8.45 - samples/sec: 1562.29 - lr: 0.000008 - momentum: 0.000000
2023-10-23 15:06:28,785 epoch 9 - iter 189/275 - loss 0.00733705 - time (sec): 9.84 - samples/sec: 1591.73 - lr: 0.000007 - momentum: 0.000000
2023-10-23 15:06:30,185 epoch 9 - iter 216/275 - loss 0.00788343 - time (sec): 11.24 - samples/sec: 1591.31 - lr: 0.000007 - momentum: 0.000000
2023-10-23 15:06:31,583 epoch 9 - iter 243/275 - loss 0.00764454 - time (sec): 12.64 - samples/sec: 1583.96 - lr: 0.000006 - momentum: 0.000000
2023-10-23 15:06:32,968 epoch 9 - iter 270/275 - loss 0.00997646 - time (sec): 14.03 - samples/sec: 1592.63 - lr: 0.000006 - momentum: 0.000000
2023-10-23 15:06:33,229 ----------------------------------------------------------------------------------------------------
2023-10-23 15:06:33,229 EPOCH 9 done: loss 0.0101 - lr: 0.000006
2023-10-23 15:06:33,774 DEV : loss 0.18590492010116577 - f1-score (micro avg) 0.8908
2023-10-23 15:06:33,780 saving best model
2023-10-23 15:06:34,323 ----------------------------------------------------------------------------------------------------
2023-10-23 15:06:35,730 epoch 10 - iter 27/275 - loss 0.00873020 - time (sec): 1.40 - samples/sec: 1558.54 - lr: 0.000005 - momentum: 0.000000
2023-10-23 15:06:37,123 epoch 10 - iter 54/275 - loss 0.00485999 - time (sec): 2.80 - samples/sec: 1560.54 - lr: 0.000005 - momentum: 0.000000
2023-10-23 15:06:38,524 epoch 10 - iter 81/275 - loss 0.00647040 - time (sec): 4.20 - samples/sec: 1579.87 - lr: 0.000004 - momentum: 0.000000
2023-10-23 15:06:39,957 epoch 10 - iter 108/275 - loss 0.00495255 - time (sec): 5.63 - samples/sec: 1577.28 - lr: 0.000004 - momentum: 0.000000
2023-10-23 15:06:41,474 epoch 10 - iter 135/275 - loss 0.00423381 - time (sec): 7.15 - samples/sec: 1553.75 - lr: 0.000003 - momentum: 0.000000
2023-10-23 15:06:42,974 epoch 10 - iter 162/275 - loss 0.00453805 - time (sec): 8.65 - samples/sec: 1590.16 - lr: 0.000002 - momentum: 0.000000
2023-10-23 15:06:44,467 epoch 10 - iter 189/275 - loss 0.00528885 - time (sec): 10.14 - samples/sec: 1605.55 - lr: 0.000002 - momentum: 0.000000
2023-10-23 15:06:45,879 epoch 10 - iter 216/275 - loss 0.00676939 - time (sec): 11.55 - samples/sec: 1591.78 - lr: 0.000001 - momentum: 0.000000
2023-10-23 15:06:47,262 epoch 10 - iter 243/275 - loss 0.00828759 - time (sec): 12.93 - samples/sec: 1578.93 - lr: 0.000001 - momentum: 0.000000
2023-10-23 15:06:48,659 epoch 10 - iter 270/275 - loss 0.00784917 - time (sec): 14.33 - samples/sec: 1563.55 - lr: 0.000000 - momentum: 0.000000
2023-10-23 15:06:48,914 ----------------------------------------------------------------------------------------------------
2023-10-23 15:06:48,914 EPOCH 10 done: loss 0.0077 - lr: 0.000000
2023-10-23 15:06:49,449 DEV : loss 0.18461209535598755 - f1-score (micro avg) 0.891
2023-10-23 15:06:49,455 saving best model
2023-10-23 15:06:50,404 ----------------------------------------------------------------------------------------------------
2023-10-23 15:06:50,405 Loading model from best epoch ...
2023-10-23 15:06:52,312 SequenceTagger predicts: Dictionary with 25 tags: O, S-scope, B-scope, E-scope, I-scope, S-pers, B-pers, E-pers, I-pers, S-work, B-work, E-work, I-work, S-loc, B-loc, E-loc, I-loc, S-object, B-object, E-object, I-object, S-date, B-date, E-date, I-date
2023-10-23 15:06:52,837 Results:
- F-score (micro) 0.9179
- F-score (macro) 0.8813
- Accuracy 0.8606

By class:
              precision    recall  f1-score   support

       scope     0.8994    0.9148    0.9070       176
        pers     0.9760    0.9531    0.9644       128
        work     0.8462    0.8919    0.8684        74
      object     1.0000    1.0000    1.0000         2
         loc     1.0000    0.5000    0.6667         2

   micro avg     0.9143    0.9215    0.9179       382
   macro avg     0.9443    0.8520    0.8813       382
weighted avg     0.9158    0.9215    0.9180       382

2023-10-23 15:06:52,838 ----------------------------------------------------------------------------------------------------
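The `lr` column in the log comes from the `LinearScheduler` plugin with `warmup_fraction: '0.1'`: the learning rate ramps linearly from 0 to the peak 5e-05 over the first 10% of the 10 × 275 = 2750 optimizer steps, then decays linearly back to 0. A minimal sketch of that schedule (the helper function is illustrative, not Flair's internal code):

```python
# Sketch of the linear warmup/decay schedule seen in the lr column above.
# Assumes 2750 total steps (10 epochs x 275 batches), peak lr 5e-05,
# warmup_fraction 0.1; the function name is hypothetical.
def linear_schedule_lr(step, total_steps=2750, peak_lr=5e-5, warmup_fraction=0.1):
    """Learning rate after `step` optimizer steps."""
    warmup_steps = int(total_steps * warmup_fraction)  # 275 steps
    if step <= warmup_steps:
        # linear warmup: 0 -> peak_lr
        return peak_lr * step / warmup_steps
    # linear decay: peak_lr -> 0 over the remaining steps
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)

# Spot-check against logged values: epoch 1 iter 27 logs lr 0.000005,
# epoch 1 iter 270 logs lr 0.000049, epoch 10 iter 270 logs lr 0.000000.
for step in (27, 270, 2745):
    print(f"step {step}: lr {linear_schedule_lr(step):.6f}")
```

The spot-checked steps reproduce the logged values to the six decimal places the log prints, which is consistent with the schedule peaking exactly at the end of epoch 1 (step 275).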
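The summary rows of the final table can be cross-checked from the per-class rows: micro F1 is the harmonic mean of the micro-averaged precision and recall, and macro F1 is the unweighted mean of the per-class F1 scores. A quick verification in plain Python (values copied from the log; this is arithmetic, not Flair code):

```python
# Cross-check the micro/macro F1 rows of the final evaluation table.
per_class_f1 = {
    "scope": 0.9070,
    "pers": 0.9644,
    "work": 0.8684,
    "object": 1.0000,
    "loc": 0.6667,
}

# micro avg row: precision 0.9143, recall 0.9215
micro_p, micro_r = 0.9143, 0.9215
micro_f1 = 2 * micro_p * micro_r / (micro_p + micro_r)  # harmonic mean

# macro avg row: unweighted mean of per-class F1
macro_f1 = sum(per_class_f1.values()) / len(per_class_f1)

print(round(micro_f1, 4), round(macro_f1, 4))  # → 0.9179 0.8813
```

Both recomputed values match the logged F-score (micro) 0.9179 and F-score (macro) 0.8813; the macro row is dragged down by the two-instance `loc` class (recall 0.5), which is why it sits well below the micro average.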