2023-10-16 19:55:09,277 ----------------------------------------------------------------------------------------------------
2023-10-16 19:55:09,278 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-16 19:55:09,278 ----------------------------------------------------------------------------------------------------
2023-10-16 19:55:09,278 MultiCorpus: 1085 train + 148 dev + 364 test sentences
 - NER_HIPE_2022 Corpus: 1085 train + 148 dev + 364 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/sv/with_doc_seperator
2023-10-16 19:55:09,278 ----------------------------------------------------------------------------------------------------
2023-10-16 19:55:09,278 Train:  1085 sentences
2023-10-16 19:55:09,278         (train_with_dev=False, train_with_test=False)
2023-10-16 19:55:09,278 ----------------------------------------------------------------------------------------------------
2023-10-16 19:55:09,278 Training Params:
2023-10-16 19:55:09,278  - learning_rate: "3e-05"
2023-10-16 19:55:09,278  - mini_batch_size: "4"
2023-10-16 19:55:09,278  - max_epochs: "10"
2023-10-16 19:55:09,278  - shuffle: "True"
2023-10-16 19:55:09,278 ----------------------------------------------------------------------------------------------------
2023-10-16 19:55:09,278 Plugins:
2023-10-16 19:55:09,278  - LinearScheduler | warmup_fraction: '0.1'
2023-10-16 19:55:09,278 ----------------------------------------------------------------------------------------------------
2023-10-16 19:55:09,278 Final evaluation on model from best epoch (best-model.pt)
2023-10-16 19:55:09,278  - metric: "('micro avg', 'f1-score')"
2023-10-16 19:55:09,279 ----------------------------------------------------------------------------------------------------
2023-10-16 19:55:09,279 Computation:
2023-10-16 19:55:09,279  - compute on device: cuda:0
2023-10-16 19:55:09,279  - embedding storage: none
2023-10-16 19:55:09,279 ----------------------------------------------------------------------------------------------------
2023-10-16 19:55:09,279 Model training base path: "hmbench-newseye/sv-dbmdz/bert-base-historic-multilingual-cased-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-3"
2023-10-16 19:55:09,279 ----------------------------------------------------------------------------------------------------
2023-10-16 19:55:09,279 ----------------------------------------------------------------------------------------------------
2023-10-16 19:55:10,798 epoch 1 - iter 27/272 - loss 2.83234957 - time (sec): 1.52 - samples/sec: 3290.84 - lr: 0.000003 - momentum: 0.000000
2023-10-16 19:55:12,264 epoch 1 - iter 54/272 - loss 2.43687139 - time (sec): 2.98 - samples/sec: 3407.94 - lr: 0.000006 - momentum: 0.000000
2023-10-16 19:55:13,757 epoch 1 - iter 81/272 - loss 1.87521019 - time (sec): 4.48 - samples/sec: 3379.15 - lr: 0.000009 - momentum: 0.000000
2023-10-16 19:55:15,248 epoch 1 - iter 108/272 - loss 1.54195388 - time (sec): 5.97 - samples/sec: 3360.98 - lr: 0.000012 - momentum: 0.000000
2023-10-16 19:55:16,670 epoch 1 - iter 135/272 - loss 1.34849777 - time (sec): 7.39 - samples/sec: 3310.21 - lr: 0.000015 - momentum: 0.000000
2023-10-16 19:55:18,237 epoch 1 - iter 162/272 - loss 1.16896217 - time (sec): 8.96 - samples/sec: 3315.75 - lr: 0.000018 - momentum: 0.000000
2023-10-16 19:55:19,779 epoch 1 - iter 189/272 - loss 1.04849700 - time (sec): 10.50 - samples/sec: 3299.64 - lr: 0.000021 - momentum: 0.000000
2023-10-16 19:55:21,259 epoch 1 - iter 216/272 - loss 0.95395801 - time (sec): 11.98 - samples/sec: 3306.92 - lr: 0.000024 - momentum: 0.000000
2023-10-16 19:55:22,843 epoch 1 - iter 243/272 - loss 0.87116032 - time (sec): 13.56 - samples/sec: 3302.68 - lr: 0.000027 - momentum: 0.000000
2023-10-16 19:55:24,711 epoch 1 - iter 270/272 - loss 0.77807401 - time (sec): 15.43 - samples/sec: 3356.32 - lr: 0.000030 - momentum: 0.000000
2023-10-16 19:55:24,812 ----------------------------------------------------------------------------------------------------
2023-10-16 19:55:24,812 EPOCH 1 done: loss 0.7756 - lr: 0.000030
2023-10-16 19:55:25,806 DEV : loss 0.16201870143413544 - f1-score (micro avg)  0.5989
2023-10-16 19:55:25,810 saving best model
2023-10-16 19:55:26,150 ----------------------------------------------------------------------------------------------------
2023-10-16 19:55:27,855 epoch 2 - iter 27/272 - loss 0.16691729 - time (sec): 1.70 - samples/sec: 3558.22 - lr: 0.000030 - momentum: 0.000000
2023-10-16 19:55:29,354 epoch 2 - iter 54/272 - loss 0.18391069 - time (sec): 3.20 - samples/sec: 3606.93 - lr: 0.000029 - momentum: 0.000000
2023-10-16 19:55:30,864 epoch 2 - iter 81/272 - loss 0.18206767 - time (sec): 4.71 - samples/sec: 3430.44 - lr: 0.000029 - momentum: 0.000000
2023-10-16 19:55:32,454 epoch 2 - iter 108/272 - loss 0.17658993 - time (sec): 6.30 - samples/sec: 3436.87 - lr: 0.000029 - momentum: 0.000000
2023-10-16 19:55:33,900 epoch 2 - iter 135/272 - loss 0.16868206 - time (sec): 7.75 - samples/sec: 3408.36 - lr: 0.000028 - momentum: 0.000000
2023-10-16 19:55:35,478 epoch 2 - iter 162/272 - loss 0.16939540 - time (sec): 9.33 - samples/sec: 3362.54 - lr: 0.000028 - momentum: 0.000000
2023-10-16 19:55:37,031 epoch 2 - iter 189/272 - loss 0.16409239 - time (sec): 10.88 - samples/sec: 3355.17 - lr: 0.000028 - momentum: 0.000000
2023-10-16 19:55:38,595 epoch 2 - iter 216/272 - loss 0.16010105 - time (sec): 12.44 - samples/sec: 3376.29 - lr: 0.000027 - momentum: 0.000000
2023-10-16 19:55:39,951 epoch 2 - iter 243/272 - loss 0.15825024 - time (sec): 13.80 - samples/sec: 3338.77 - lr: 0.000027 - momentum: 0.000000
2023-10-16 19:55:41,608 epoch 2 - iter 270/272 - loss 0.15185793 - time (sec): 15.46 - samples/sec: 3353.38 - lr: 0.000027 - momentum: 0.000000
2023-10-16 19:55:41,692 ----------------------------------------------------------------------------------------------------
2023-10-16 19:55:41,692 EPOCH 2 done: loss 0.1516 - lr: 0.000027
2023-10-16 19:55:43,108 DEV : loss 0.11580488085746765 - f1-score (micro avg)  0.784
2023-10-16 19:55:43,112 saving best model
2023-10-16 19:55:43,564 ----------------------------------------------------------------------------------------------------
2023-10-16 19:55:45,007 epoch 3 - iter 27/272 - loss 0.08808373 - time (sec): 1.44 - samples/sec: 3170.05 - lr: 0.000026 - momentum: 0.000000
2023-10-16 19:55:46,648 epoch 3 - iter 54/272 - loss 0.09481473 - time (sec): 3.08 - samples/sec: 3413.77 - lr: 0.000026 - momentum: 0.000000
2023-10-16 19:55:48,227 epoch 3 - iter 81/272 - loss 0.10163387 - time (sec): 4.66 - samples/sec: 3434.61 - lr: 0.000026 - momentum: 0.000000
2023-10-16 19:55:49,795 epoch 3 - iter 108/272 - loss 0.09866374 - time (sec): 6.23 - samples/sec: 3487.16 - lr: 0.000025 - momentum: 0.000000
2023-10-16 19:55:51,436 epoch 3 - iter 135/272 - loss 0.09301795 - time (sec): 7.87 - samples/sec: 3426.99 - lr: 0.000025 - momentum: 0.000000
2023-10-16 19:55:53,033 epoch 3 - iter 162/272 - loss 0.09724660 - time (sec): 9.47 - samples/sec: 3404.26 - lr: 0.000025 - momentum: 0.000000
2023-10-16 19:55:54,679 epoch 3 - iter 189/272 - loss 0.09184127 - time (sec): 11.11 - samples/sec: 3375.12 - lr: 0.000024 - momentum: 0.000000
2023-10-16 19:55:56,293 epoch 3 - iter 216/272 - loss 0.09024370 - time (sec): 12.72 - samples/sec: 3366.13 - lr: 0.000024 - momentum: 0.000000
2023-10-16 19:55:57,711 epoch 3 - iter 243/272 - loss 0.08931322 - time (sec): 14.14 - samples/sec: 3314.10 - lr: 0.000024 - momentum: 0.000000
2023-10-16 19:55:59,194 epoch 3 - iter 270/272 - loss 0.08847740 - time (sec): 15.63 - samples/sec: 3318.50 - lr: 0.000023 - momentum: 0.000000
2023-10-16 19:55:59,280 ----------------------------------------------------------------------------------------------------
2023-10-16 19:55:59,280 EPOCH 3 done: loss 0.0888 - lr: 0.000023
2023-10-16 19:56:00,703 DEV : loss 0.10064025223255157 - f1-score (micro avg)  0.7821
2023-10-16 19:56:00,707 ----------------------------------------------------------------------------------------------------
2023-10-16 19:56:02,368 epoch 4 - iter 27/272 - loss 0.03703618 - time (sec): 1.66 - samples/sec: 3340.49 - lr: 0.000023 - momentum: 0.000000
2023-10-16 19:56:04,136 epoch 4 - iter 54/272 - loss 0.04293959 - time (sec): 3.43 - samples/sec: 3405.06 - lr: 0.000023 - momentum: 0.000000
2023-10-16 19:56:05,592 epoch 4 - iter 81/272 - loss 0.04695382 - time (sec): 4.88 - samples/sec: 3376.98 - lr: 0.000022 - momentum: 0.000000
2023-10-16 19:56:07,075 epoch 4 - iter 108/272 - loss 0.05388166 - time (sec): 6.37 - samples/sec: 3376.09 - lr: 0.000022 - momentum: 0.000000
2023-10-16 19:56:08,545 epoch 4 - iter 135/272 - loss 0.05558070 - time (sec): 7.84 - samples/sec: 3384.14 - lr: 0.000022 - momentum: 0.000000
2023-10-16 19:56:10,279 epoch 4 - iter 162/272 - loss 0.05373334 - time (sec): 9.57 - samples/sec: 3268.66 - lr: 0.000021 - momentum: 0.000000
2023-10-16 19:56:11,860 epoch 4 - iter 189/272 - loss 0.05246304 - time (sec): 11.15 - samples/sec: 3309.54 - lr: 0.000021 - momentum: 0.000000
2023-10-16 19:56:13,335 epoch 4 - iter 216/272 - loss 0.05390096 - time (sec): 12.63 - samples/sec: 3265.12 - lr: 0.000021 - momentum: 0.000000
2023-10-16 19:56:14,867 epoch 4 - iter 243/272 - loss 0.05448464 - time (sec): 14.16 - samples/sec: 3269.11 - lr: 0.000020 - momentum: 0.000000
2023-10-16 19:56:16,429 epoch 4 - iter 270/272 - loss 0.05501421 - time (sec): 15.72 - samples/sec: 3274.45 - lr: 0.000020 - momentum: 0.000000
2023-10-16 19:56:16,551 ----------------------------------------------------------------------------------------------------
2023-10-16 19:56:16,551 EPOCH 4 done: loss 0.0547 - lr: 0.000020
2023-10-16 19:56:18,001 DEV : loss 0.11307715624570847 - f1-score (micro avg)  0.8199
2023-10-16 19:56:18,005 saving best model
2023-10-16 19:56:18,442 ----------------------------------------------------------------------------------------------------
2023-10-16 19:56:19,980 epoch 5 - iter 27/272 - loss 0.04691907 - time (sec): 1.54 - samples/sec: 3349.16 - lr: 0.000020 - momentum: 0.000000
2023-10-16 19:56:21,422 epoch 5 - iter 54/272 - loss 0.04225722 - time (sec): 2.98 - samples/sec: 3282.75 - lr: 0.000019 - momentum: 0.000000
2023-10-16 19:56:22,868 epoch 5 - iter 81/272 - loss 0.04022072 - time (sec): 4.43 - samples/sec: 3258.89 - lr: 0.000019 - momentum: 0.000000
2023-10-16 19:56:24,349 epoch 5 - iter 108/272 - loss 0.03973764 - time (sec): 5.91 - samples/sec: 3269.21 - lr: 0.000019 - momentum: 0.000000
2023-10-16 19:56:25,917 epoch 5 - iter 135/272 - loss 0.03914369 - time (sec): 7.47 - samples/sec: 3244.34 - lr: 0.000018 - momentum: 0.000000
2023-10-16 19:56:27,596 epoch 5 - iter 162/272 - loss 0.03750230 - time (sec): 9.15 - samples/sec: 3243.35 - lr: 0.000018 - momentum: 0.000000
2023-10-16 19:56:29,363 epoch 5 - iter 189/272 - loss 0.03658967 - time (sec): 10.92 - samples/sec: 3281.76 - lr: 0.000018 - momentum: 0.000000
2023-10-16 19:56:30,973 epoch 5 - iter 216/272 - loss 0.03680208 - time (sec): 12.53 - samples/sec: 3305.68 - lr: 0.000017 - momentum: 0.000000
2023-10-16 19:56:32,478 epoch 5 - iter 243/272 - loss 0.03570734 - time (sec): 14.03 - samples/sec: 3283.66 - lr: 0.000017 - momentum: 0.000000
2023-10-16 19:56:34,142 epoch 5 - iter 270/272 - loss 0.03533262 - time (sec): 15.70 - samples/sec: 3287.98 - lr: 0.000017 - momentum: 0.000000
2023-10-16 19:56:34,276 ----------------------------------------------------------------------------------------------------
2023-10-16 19:56:34,276 EPOCH 5 done: loss 0.0353 - lr: 0.000017
2023-10-16 19:56:35,722 DEV : loss 0.12297820299863815 - f1-score (micro avg)  0.8154
2023-10-16 19:56:35,727 ----------------------------------------------------------------------------------------------------
2023-10-16 19:56:37,444 epoch 6 - iter 27/272 - loss 0.01991166 - time (sec): 1.72 - samples/sec: 3322.31 - lr: 0.000016 - momentum: 0.000000
2023-10-16 19:56:38,878 epoch 6 - iter 54/272 - loss 0.01908545 - time (sec): 3.15 - samples/sec: 3148.40 - lr: 0.000016 - momentum: 0.000000
2023-10-16 19:56:40,414 epoch 6 - iter 81/272 - loss 0.02105140 - time (sec): 4.69 - samples/sec: 3163.24 - lr: 0.000016 - momentum: 0.000000
2023-10-16 19:56:41,821 epoch 6 - iter 108/272 - loss 0.02398416 - time (sec): 6.09 - samples/sec: 3170.19 - lr: 0.000015 - momentum: 0.000000
2023-10-16 19:56:43,253 epoch 6 - iter 135/272 - loss 0.02455143 - time (sec): 7.53 - samples/sec: 3197.84 - lr: 0.000015 - momentum: 0.000000
2023-10-16 19:56:44,862 epoch 6 - iter 162/272 - loss 0.02537861 - time (sec): 9.13 - samples/sec: 3200.04 - lr: 0.000015 - momentum: 0.000000
2023-10-16 19:56:46,456 epoch 6 - iter 189/272 - loss 0.02638712 - time (sec): 10.73 - samples/sec: 3231.46 - lr: 0.000014 - momentum: 0.000000
2023-10-16 19:56:48,089 epoch 6 - iter 216/272 - loss 0.02536443 - time (sec): 12.36 - samples/sec: 3289.75 - lr: 0.000014 - momentum: 0.000000
2023-10-16 19:56:49,795 epoch 6 - iter 243/272 - loss 0.02486984 - time (sec): 14.07 - samples/sec: 3329.80 - lr: 0.000014 - momentum: 0.000000
2023-10-16 19:56:51,361 epoch 6 - iter 270/272 - loss 0.02422424 - time (sec): 15.63 - samples/sec: 3310.76 - lr: 0.000013 - momentum: 0.000000
2023-10-16 19:56:51,454 ----------------------------------------------------------------------------------------------------
2023-10-16 19:56:51,455 EPOCH 6 done: loss 0.0248 - lr: 0.000013
2023-10-16 19:56:52,874 DEV : loss 0.12834559381008148 - f1-score (micro avg)  0.8392
2023-10-16 19:56:52,878 saving best model
2023-10-16 19:56:53,283 ----------------------------------------------------------------------------------------------------
2023-10-16 19:56:54,889 epoch 7 - iter 27/272 - loss 0.01509354 - time (sec): 1.60 - samples/sec: 3342.76 - lr: 0.000013 - momentum: 0.000000
2023-10-16 19:56:56,559 epoch 7 - iter 54/272 - loss 0.02142213 - time (sec): 3.27 - samples/sec: 3407.39 - lr: 0.000013 - momentum: 0.000000
2023-10-16 19:56:58,145 epoch 7 - iter 81/272 - loss 0.01832991 - time (sec): 4.86 - samples/sec: 3396.16 - lr: 0.000012 - momentum: 0.000000
2023-10-16 19:56:59,710 epoch 7 - iter 108/272 - loss 0.01959823 - time (sec): 6.42 - samples/sec: 3400.56 - lr: 0.000012 - momentum: 0.000000
2023-10-16 19:57:01,243 epoch 7 - iter 135/272 - loss 0.01831873 - time (sec): 7.95 - samples/sec: 3428.75 - lr: 0.000012 - momentum: 0.000000
2023-10-16 19:57:02,746 epoch 7 - iter 162/272 - loss 0.02072289 - time (sec): 9.46 - samples/sec: 3380.77 - lr: 0.000011 - momentum: 0.000000
2023-10-16 19:57:04,279 epoch 7 - iter 189/272 - loss 0.01977148 - time (sec): 10.99 - samples/sec: 3358.94 - lr: 0.000011 - momentum: 0.000000
2023-10-16 19:57:05,886 epoch 7 - iter 216/272 - loss 0.01911425 - time (sec): 12.60 - samples/sec: 3368.28 - lr: 0.000011 - momentum: 0.000000
2023-10-16 19:57:07,420 epoch 7 - iter 243/272 - loss 0.01880223 - time (sec): 14.13 - samples/sec: 3359.24 - lr: 0.000010 - momentum: 0.000000
2023-10-16 19:57:08,916 epoch 7 - iter 270/272 - loss 0.02047643 - time (sec): 15.63 - samples/sec: 3318.25 - lr: 0.000010 - momentum: 0.000000
2023-10-16 19:57:09,000 ----------------------------------------------------------------------------------------------------
2023-10-16 19:57:09,000 EPOCH 7 done: loss 0.0205 - lr: 0.000010
2023-10-16 19:57:10,600 DEV : loss 0.1370716243982315 - f1-score (micro avg)  0.8439
2023-10-16 19:57:10,604 saving best model
2023-10-16 19:57:11,024 ----------------------------------------------------------------------------------------------------
2023-10-16 19:57:12,531 epoch 8 - iter 27/272 - loss 0.01570058 - time (sec): 1.51 - samples/sec: 3260.07 - lr: 0.000010 - momentum: 0.000000
2023-10-16 19:57:14,135 epoch 8 - iter 54/272 - loss 0.01138672 - time (sec): 3.11 - samples/sec: 3476.00 - lr: 0.000009 - momentum: 0.000000
2023-10-16 19:57:15,765 epoch 8 - iter 81/272 - loss 0.01524252 - time (sec): 4.74 - samples/sec: 3507.80 - lr: 0.000009 - momentum: 0.000000
2023-10-16 19:57:17,378 epoch 8 - iter 108/272 - loss 0.01459537 - time (sec): 6.35 - samples/sec: 3513.07 - lr: 0.000009 - momentum: 0.000000
2023-10-16 19:57:18,810 epoch 8 - iter 135/272 - loss 0.01560028 - time (sec): 7.78 - samples/sec: 3519.74 - lr: 0.000008 - momentum: 0.000000
2023-10-16 19:57:20,474 epoch 8 - iter 162/272 - loss 0.01486699 - time (sec): 9.45 - samples/sec: 3456.85 - lr: 0.000008 - momentum: 0.000000
2023-10-16 19:57:21,873 epoch 8 - iter 189/272 - loss 0.01430914 - time (sec): 10.85 - samples/sec: 3448.67 - lr: 0.000008 - momentum: 0.000000
2023-10-16 19:57:23,213 epoch 8 - iter 216/272 - loss 0.01524647 - time (sec): 12.19 - samples/sec: 3422.66 - lr: 0.000007 - momentum: 0.000000
2023-10-16 19:57:24,686 epoch 8 - iter 243/272 - loss 0.01493429 - time (sec): 13.66 - samples/sec: 3398.56 - lr: 0.000007 - momentum: 0.000000
2023-10-16 19:57:26,272 epoch 8 - iter 270/272 - loss 0.01498748 - time (sec): 15.25 - samples/sec: 3398.46 - lr: 0.000007 - momentum: 0.000000
2023-10-16 19:57:26,355 ----------------------------------------------------------------------------------------------------
2023-10-16 19:57:26,355 EPOCH 8 done: loss 0.0153 - lr: 0.000007
2023-10-16 19:57:27,779 DEV : loss 0.15467961132526398 - f1-score (micro avg)  0.8199
2023-10-16 19:57:27,783 ----------------------------------------------------------------------------------------------------
2023-10-16 19:57:29,287 epoch 9 - iter 27/272 - loss 0.02729236 - time (sec): 1.50 - samples/sec: 3297.39 - lr: 0.000006 - momentum: 0.000000
2023-10-16 19:57:31,004 epoch 9 - iter 54/272 - loss 0.01595842 - time (sec): 3.22 - samples/sec: 3355.44 - lr: 0.000006 - momentum: 0.000000
2023-10-16 19:57:32,462 epoch 9 - iter 81/272 - loss 0.01475055 - time (sec): 4.68 - samples/sec: 3375.42 - lr: 0.000006 - momentum: 0.000000
2023-10-16 19:57:34,011 epoch 9 - iter 108/272 - loss 0.01321516 - time (sec): 6.23 - samples/sec: 3437.62 - lr: 0.000005 - momentum: 0.000000
2023-10-16 19:57:35,432 epoch 9 - iter 135/272 - loss 0.01577781 - time (sec): 7.65 - samples/sec: 3366.56 - lr: 0.000005 - momentum: 0.000000
2023-10-16 19:57:37,001 epoch 9 - iter 162/272 - loss 0.01510896 - time (sec): 9.22 - samples/sec: 3434.11 - lr: 0.000005 - momentum: 0.000000
2023-10-16 19:57:38,374 epoch 9 - iter 189/272 - loss 0.01394347 - time (sec): 10.59 - samples/sec: 3375.39 - lr: 0.000004 - momentum: 0.000000
2023-10-16 19:57:39,957 epoch 9 - iter 216/272 - loss 0.01343107 - time (sec): 12.17 - samples/sec: 3371.56 - lr: 0.000004 - momentum: 0.000000
2023-10-16 19:57:41,408 epoch 9 - iter 243/272 - loss 0.01270919 - time (sec): 13.62 - samples/sec: 3347.79 - lr: 0.000004 - momentum: 0.000000
2023-10-16 19:57:43,155 epoch 9 - iter 270/272 - loss 0.01263245 - time (sec): 15.37 - samples/sec: 3366.15 - lr: 0.000003 - momentum: 0.000000
2023-10-16 19:57:43,246 ----------------------------------------------------------------------------------------------------
2023-10-16 19:57:43,246 EPOCH 9 done: loss 0.0126 - lr: 0.000003
2023-10-16 19:57:44,689 DEV : loss 0.15767242014408112 - f1-score (micro avg)  0.833
2023-10-16 19:57:44,694 ----------------------------------------------------------------------------------------------------
2023-10-16 19:57:46,340 epoch 10 - iter 27/272 - loss 0.00835601 - time (sec): 1.64 - samples/sec: 3466.33 - lr: 0.000003 - momentum: 0.000000
2023-10-16 19:57:47,783 epoch 10 - iter 54/272 - loss 0.00485101 - time (sec): 3.09 - samples/sec: 3283.50 - lr: 0.000003 - momentum: 0.000000
2023-10-16 19:57:49,267 epoch 10 - iter 81/272 - loss 0.00584821 - time (sec): 4.57 - samples/sec: 3248.55 - lr: 0.000002 - momentum: 0.000000
2023-10-16 19:57:50,925 epoch 10 - iter 108/272 - loss 0.00558808 - time (sec): 6.23 - samples/sec: 3275.67 - lr: 0.000002 - momentum: 0.000000
2023-10-16 19:57:52,542 epoch 10 - iter 135/272 - loss 0.00481861 - time (sec): 7.85 - samples/sec: 3322.11 - lr: 0.000002 - momentum: 0.000000
2023-10-16 19:57:54,193 epoch 10 - iter 162/272 - loss 0.00836834 - time (sec): 9.50 - samples/sec: 3313.17 - lr: 0.000001 - momentum: 0.000000
2023-10-16 19:57:55,900 epoch 10 - iter 189/272 - loss 0.00829871 - time (sec): 11.20 - samples/sec: 3295.75 - lr: 0.000001 - momentum: 0.000000
2023-10-16 19:57:57,412 epoch 10 - iter 216/272 - loss 0.00879782 - time (sec): 12.72 - samples/sec: 3265.60 - lr: 0.000001 - momentum: 0.000000
2023-10-16 19:57:58,914 epoch 10 - iter 243/272 - loss 0.00937661 - time (sec): 14.22 - samples/sec: 3261.54 - lr: 0.000000 - momentum: 0.000000
2023-10-16 19:58:00,543 epoch 10 - iter 270/272 - loss 0.00993645 - time (sec): 15.85 - samples/sec: 3265.36 - lr: 0.000000 - momentum: 0.000000
2023-10-16 19:58:00,639 ----------------------------------------------------------------------------------------------------
2023-10-16 19:58:00,640 EPOCH 10 done: loss 0.0099 - lr: 0.000000
2023-10-16 19:58:02,063 DEV : loss 0.16342675685882568 - f1-score (micro avg)  0.819
2023-10-16 19:58:02,408 ----------------------------------------------------------------------------------------------------
2023-10-16 19:58:02,409 Loading model from best epoch ...
2023-10-16 19:58:04,004 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd, S-ORG, B-ORG, E-ORG, I-ORG
2023-10-16 19:58:05,970 
Results:
- F-score (micro) 0.7955
- F-score (macro) 0.7533
- Accuracy 0.6786

By class:
              precision    recall  f1-score   support

         LOC     0.8212    0.8686    0.8442       312
         PER     0.7344    0.8510    0.7884       208
         ORG     0.5000    0.4000    0.4444        55
   HumanProd     0.8800    1.0000    0.9362        22

   micro avg     0.7688    0.8241    0.7955       597
   macro avg     0.7339    0.7799    0.7533       597
weighted avg     0.7636    0.8241    0.7913       597

2023-10-16 19:58:05,970 ----------------------------------------------------------------------------------------------------
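The run above can be approximately reproduced from the hyperparameters and corpus named in the log. The sketch below is an assumption-laden reconstruction, not the original script: the Flair API calls (`NER_HIPE_2022`, `TransformerWordEmbeddings`, `SequenceTagger`, `ModelTrainer.fine_tune`) follow the Flair 0.12-era interface, and the `train()` function is deliberately not executed here since it needs the `flair` package, a download of the corpus, and a CUDA device. The module-level part only records the logged hyperparameters and cross-checks the final evaluation table: micro-F1 should equal the harmonic mean of micro precision and recall, and macro-F1 the unweighted mean of the per-class F1 scores.

```python
# Hyperparameters exactly as recorded under "Training Params" / "Plugins".
CONFIG = {
    "learning_rate": 3e-05,
    "mini_batch_size": 4,
    "max_epochs": 10,
    "shuffle": True,
    "warmup_fraction": 0.1,  # LinearScheduler plugin
}


def harmonic_mean(p: float, r: float) -> float:
    """F1 is the harmonic mean of precision and recall."""
    return 2 * p * r / (p + r)


# Sanity-check the "By class" table from the final evaluation:
# micro avg: precision 0.7688, recall 0.8241  ->  f1 0.7955
micro_f1 = harmonic_mean(0.7688, 0.8241)
# macro avg: unweighted mean of the four per-class F1 scores
macro_f1 = (0.8442 + 0.7884 + 0.4444 + 0.9362) / 4  # LOC, PER, ORG, HumanProd


def train():
    """Hedged sketch of the fine-tuning call that likely produced this log.

    Assumes Flair 0.12-era APIs; requires `flair` installed and cuda:0.
    """
    from flair.datasets import NER_HIPE_2022
    from flair.embeddings import TransformerWordEmbeddings
    from flair.models import SequenceTagger
    from flair.trainers import ModelTrainer

    # 1085 train / 148 dev / 364 test sentences, per the log header.
    corpus = NER_HIPE_2022(dataset_name="newseye", language="sv")
    label_dict = corpus.make_label_dictionary(label_type="ner")

    # Last layer only, first-subtoken pooling, per the base-path naming
    # (…-poolingfirst-layers-1-…).
    embeddings = TransformerWordEmbeddings(
        model="dbmdz/bert-base-historic-multilingual-cased",
        layers="-1",
        subtoken_pooling="first",
        fine_tune=True,
    )

    # Linear head over the embeddings, no CRF (…-crfFalse-…), 17 output tags.
    tagger = SequenceTagger(
        hidden_size=256,  # unused without an RNN; kept for the signature
        embeddings=embeddings,
        tag_dictionary=label_dict,
        tag_type="ner",
        use_crf=False,
        use_rnn=False,
    )

    trainer = ModelTrainer(tagger, corpus)
    trainer.fine_tune(
        "hmbench-newseye/sv-dbmdz/bert-base-historic-multilingual-cased-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-3",
        learning_rate=CONFIG["learning_rate"],
        mini_batch_size=CONFIG["mini_batch_size"],
        max_epochs=CONFIG["max_epochs"],
    )
```

The metric cross-check confirms the table is internally consistent: `micro_f1` comes out ≈ 0.7955 and `macro_f1` ≈ 0.7533, matching the logged F-scores. The larger micro-vs-macro gap is driven by the weak ORG class (F1 0.4444 on only 55 support).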