2023-10-16 22:05:02,568 ----------------------------------------------------------------------------------------------------
2023-10-16 22:05:02,569 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-16 22:05:02,569 ----------------------------------------------------------------------------------------------------
2023-10-16 22:05:02,569 MultiCorpus: 6183 train + 680 dev + 2113 test sentences
 - NER_HIPE_2022 Corpus: 6183 train + 680 dev + 2113 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/topres19th/en/with_doc_seperator
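The parameter count implied by the shapes printed above can be tallied without loading the model. This is a sketch that re-derives the counts purely from the logged in/out features (constants `H`, `FF`, etc. are read off the dump, not from any API):

```python
# Tally trainable parameters from the printed module shapes (no model download needed).
H, FF, VOCAB, MAXPOS, LAYERS, TAGS = 768, 3072, 32001, 512, 12, 13

def linear(n_in, n_out):
    # weight matrix plus bias vector
    return n_in * n_out + n_out

# word/position/token-type embeddings + embedding LayerNorm (weight + bias)
embeddings = VOCAB * H + MAXPOS * H + 2 * H + 2 * H
per_layer = (
    4 * linear(H, H)   # query, key, value, attention output projections
    + 2 * H            # attention-output LayerNorm
    + linear(H, FF)    # intermediate dense
    + linear(FF, H)    # output dense
    + 2 * H            # output LayerNorm
)
pooler = linear(H, H)
head = linear(H, TAGS)  # final 13-tag projection from the SequenceTagger

total = embeddings + LAYERS * per_layer + pooler + head
print(f"{total:,}")  # 110,628,109 (~110.6M parameters)
```

So the printed architecture corresponds to a standard BERT-base encoder (~110M parameters) plus a 13-way tag head.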
2023-10-16 22:05:02,569 ----------------------------------------------------------------------------------------------------
2023-10-16 22:05:02,569 Train: 6183 sentences
2023-10-16 22:05:02,569 (train_with_dev=False, train_with_test=False)
2023-10-16 22:05:02,569 ----------------------------------------------------------------------------------------------------
2023-10-16 22:05:02,569 Training Params:
2023-10-16 22:05:02,569 - learning_rate: "5e-05"
2023-10-16 22:05:02,569 - mini_batch_size: "8"
2023-10-16 22:05:02,569 - max_epochs: "10"
2023-10-16 22:05:02,569 - shuffle: "True"
2023-10-16 22:05:02,569 ----------------------------------------------------------------------------------------------------
2023-10-16 22:05:02,569 Plugins:
2023-10-16 22:05:02,569 - LinearScheduler | warmup_fraction: '0.1'
2023-10-16 22:05:02,569 ----------------------------------------------------------------------------------------------------
2023-10-16 22:05:02,570 Final evaluation on model from best epoch (best-model.pt)
2023-10-16 22:05:02,570 - metric: "('micro avg', 'f1-score')"
2023-10-16 22:05:02,570 ----------------------------------------------------------------------------------------------------
2023-10-16 22:05:02,570 Computation:
2023-10-16 22:05:02,570 - compute on device: cuda:0
2023-10-16 22:05:02,570 - embedding storage: none
2023-10-16 22:05:02,570 ----------------------------------------------------------------------------------------------------
2023-10-16 22:05:02,570 Model training base path: "hmbench-topres19th/en-dbmdz/bert-base-historic-multilingual-cased-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-2"
2023-10-16 22:05:02,570 ----------------------------------------------------------------------------------------------------
2023-10-16 22:05:02,570 ----------------------------------------------------------------------------------------------------
2023-10-16 22:05:07,108 epoch 1 - iter 77/773 - loss 2.07744315 - time (sec): 4.54 - samples/sec: 2610.32 - lr: 0.000005 - momentum: 0.000000
2023-10-16 22:05:11,553 epoch 1 - iter 154/773 - loss 1.17743079 - time (sec): 8.98 - samples/sec: 2642.75 - lr: 0.000010 - momentum: 0.000000
2023-10-16 22:05:16,195 epoch 1 - iter 231/773 - loss 0.80857450 - time (sec): 13.62 - samples/sec: 2721.79 - lr: 0.000015 - momentum: 0.000000
2023-10-16 22:05:20,634 epoch 1 - iter 308/773 - loss 0.64594888 - time (sec): 18.06 - samples/sec: 2721.06 - lr: 0.000020 - momentum: 0.000000
2023-10-16 22:05:25,021 epoch 1 - iter 385/773 - loss 0.54443571 - time (sec): 22.45 - samples/sec: 2715.07 - lr: 0.000025 - momentum: 0.000000
2023-10-16 22:05:29,309 epoch 1 - iter 462/773 - loss 0.47012773 - time (sec): 26.74 - samples/sec: 2734.10 - lr: 0.000030 - momentum: 0.000000
2023-10-16 22:05:33,752 epoch 1 - iter 539/773 - loss 0.41651204 - time (sec): 31.18 - samples/sec: 2757.19 - lr: 0.000035 - momentum: 0.000000
2023-10-16 22:05:38,386 epoch 1 - iter 616/773 - loss 0.37671127 - time (sec): 35.82 - samples/sec: 2773.34 - lr: 0.000040 - momentum: 0.000000
2023-10-16 22:05:43,095 epoch 1 - iter 693/773 - loss 0.34712982 - time (sec): 40.52 - samples/sec: 2743.12 - lr: 0.000045 - momentum: 0.000000
2023-10-16 22:05:47,514 epoch 1 - iter 770/773 - loss 0.32154236 - time (sec): 44.94 - samples/sec: 2757.20 - lr: 0.000050 - momentum: 0.000000
2023-10-16 22:05:47,670 ----------------------------------------------------------------------------------------------------
2023-10-16 22:05:47,670 EPOCH 1 done: loss 0.3208 - lr: 0.000050
2023-10-16 22:05:49,710 DEV : loss 0.0772036612033844 - f1-score (micro avg)  0.6643
2023-10-16 22:05:49,722 saving best model
2023-10-16 22:05:50,064 ----------------------------------------------------------------------------------------------------
2023-10-16 22:05:54,575 epoch 2 - iter 77/773 - loss 0.07753873 - time (sec): 4.51 - samples/sec: 2686.50 - lr: 0.000049 - momentum: 0.000000
2023-10-16 22:05:58,993 epoch 2 - iter 154/773 - loss 0.08814964 - time (sec): 8.93 - samples/sec: 2682.41 - lr: 0.000049 - momentum: 0.000000
2023-10-16 22:06:03,719 epoch 2 - iter 231/773 - loss 0.08367414 - time (sec): 13.65 - samples/sec: 2695.54 - lr: 0.000048 - momentum: 0.000000
2023-10-16 22:06:08,199 epoch 2 - iter 308/773 - loss 0.08164912 - time (sec): 18.13 - samples/sec: 2694.67 - lr: 0.000048 - momentum: 0.000000
2023-10-16 22:06:12,828 epoch 2 - iter 385/773 - loss 0.08321751 - time (sec): 22.76 - samples/sec: 2705.47 - lr: 0.000047 - momentum: 0.000000
2023-10-16 22:06:17,450 epoch 2 - iter 462/773 - loss 0.08271449 - time (sec): 27.38 - samples/sec: 2672.02 - lr: 0.000047 - momentum: 0.000000
2023-10-16 22:06:22,030 epoch 2 - iter 539/773 - loss 0.08056035 - time (sec): 31.96 - samples/sec: 2676.73 - lr: 0.000046 - momentum: 0.000000
2023-10-16 22:06:26,558 epoch 2 - iter 616/773 - loss 0.08048201 - time (sec): 36.49 - samples/sec: 2680.20 - lr: 0.000046 - momentum: 0.000000
2023-10-16 22:06:31,001 epoch 2 - iter 693/773 - loss 0.07801088 - time (sec): 40.94 - samples/sec: 2693.49 - lr: 0.000045 - momentum: 0.000000
2023-10-16 22:06:35,711 epoch 2 - iter 770/773 - loss 0.07752429 - time (sec): 45.65 - samples/sec: 2713.00 - lr: 0.000044 - momentum: 0.000000
2023-10-16 22:06:35,870 ----------------------------------------------------------------------------------------------------
2023-10-16 22:06:35,870 EPOCH 2 done: loss 0.0773 - lr: 0.000044
2023-10-16 22:06:37,935 DEV : loss 0.06809011846780777 - f1-score (micro avg)  0.7089
2023-10-16 22:06:37,947 saving best model
2023-10-16 22:06:38,714 ----------------------------------------------------------------------------------------------------
2023-10-16 22:06:43,428 epoch 3 - iter 77/773 - loss 0.05164133 - time (sec): 4.71 - samples/sec: 2719.74 - lr: 0.000044 - momentum: 0.000000
2023-10-16 22:06:47,856 epoch 3 - iter 154/773 - loss 0.05614533 - time (sec): 9.14 - samples/sec: 2744.96 - lr: 0.000043 - momentum: 0.000000
2023-10-16 22:06:52,660 epoch 3 - iter 231/773 - loss 0.05743499 - time (sec): 13.94 - samples/sec: 2704.33 - lr: 0.000043 - momentum: 0.000000
2023-10-16 22:06:57,007 epoch 3 - iter 308/773 - loss 0.05405146 - time (sec): 18.29 - samples/sec: 2728.43 - lr: 0.000042 - momentum: 0.000000
2023-10-16 22:07:01,583 epoch 3 - iter 385/773 - loss 0.05263408 - time (sec): 22.87 - samples/sec: 2735.35 - lr: 0.000042 - momentum: 0.000000
2023-10-16 22:07:06,253 epoch 3 - iter 462/773 - loss 0.05351123 - time (sec): 27.54 - samples/sec: 2719.53 - lr: 0.000041 - momentum: 0.000000
2023-10-16 22:07:10,771 epoch 3 - iter 539/773 - loss 0.05305526 - time (sec): 32.05 - samples/sec: 2720.58 - lr: 0.000041 - momentum: 0.000000
2023-10-16 22:07:15,136 epoch 3 - iter 616/773 - loss 0.05227538 - time (sec): 36.42 - samples/sec: 2714.42 - lr: 0.000040 - momentum: 0.000000
2023-10-16 22:07:19,672 epoch 3 - iter 693/773 - loss 0.05203641 - time (sec): 40.96 - samples/sec: 2731.14 - lr: 0.000039 - momentum: 0.000000
2023-10-16 22:07:24,115 epoch 3 - iter 770/773 - loss 0.05094133 - time (sec): 45.40 - samples/sec: 2730.25 - lr: 0.000039 - momentum: 0.000000
2023-10-16 22:07:24,271 ----------------------------------------------------------------------------------------------------
2023-10-16 22:07:24,271 EPOCH 3 done: loss 0.0508 - lr: 0.000039
2023-10-16 22:07:26,321 DEV : loss 0.08064333349466324 - f1-score (micro avg)  0.784
2023-10-16 22:07:26,333 saving best model
2023-10-16 22:07:26,782 ----------------------------------------------------------------------------------------------------
2023-10-16 22:07:31,371 epoch 4 - iter 77/773 - loss 0.04532304 - time (sec): 4.59 - samples/sec: 2713.19 - lr: 0.000038 - momentum: 0.000000
2023-10-16 22:07:35,796 epoch 4 - iter 154/773 - loss 0.04515752 - time (sec): 9.01 - samples/sec: 2642.61 - lr: 0.000038 - momentum: 0.000000
2023-10-16 22:07:40,584 epoch 4 - iter 231/773 - loss 0.04011628 - time (sec): 13.80 - samples/sec: 2609.12 - lr: 0.000037 - momentum: 0.000000
2023-10-16 22:07:45,151 epoch 4 - iter 308/773 - loss 0.03827052 - time (sec): 18.37 - samples/sec: 2625.80 - lr: 0.000037 - momentum: 0.000000
2023-10-16 22:07:49,673 epoch 4 - iter 385/773 - loss 0.03710721 - time (sec): 22.89 - samples/sec: 2663.63 - lr: 0.000036 - momentum: 0.000000
2023-10-16 22:07:54,043 epoch 4 - iter 462/773 - loss 0.03902851 - time (sec): 27.26 - samples/sec: 2684.80 - lr: 0.000036 - momentum: 0.000000
2023-10-16 22:07:58,389 epoch 4 - iter 539/773 - loss 0.03842162 - time (sec): 31.60 - samples/sec: 2707.56 - lr: 0.000035 - momentum: 0.000000
2023-10-16 22:08:03,155 epoch 4 - iter 616/773 - loss 0.03858095 - time (sec): 36.37 - samples/sec: 2700.81 - lr: 0.000034 - momentum: 0.000000
2023-10-16 22:08:07,677 epoch 4 - iter 693/773 - loss 0.03732367 - time (sec): 40.89 - samples/sec: 2711.50 - lr: 0.000034 - momentum: 0.000000
2023-10-16 22:08:12,393 epoch 4 - iter 770/773 - loss 0.03734734 - time (sec): 45.61 - samples/sec: 2716.36 - lr: 0.000033 - momentum: 0.000000
2023-10-16 22:08:12,558 ----------------------------------------------------------------------------------------------------
2023-10-16 22:08:12,558 EPOCH 4 done: loss 0.0373 - lr: 0.000033
2023-10-16 22:08:14,643 DEV : loss 0.08836734294891357 - f1-score (micro avg)  0.7582
2023-10-16 22:08:14,656 ----------------------------------------------------------------------------------------------------
2023-10-16 22:08:19,114 epoch 5 - iter 77/773 - loss 0.02324528 - time (sec): 4.46 - samples/sec: 2818.91 - lr: 0.000033 - momentum: 0.000000
2023-10-16 22:08:23,457 epoch 5 - iter 154/773 - loss 0.02467072 - time (sec): 8.80 - samples/sec: 2799.12 - lr: 0.000032 - momentum: 0.000000
2023-10-16 22:08:27,881 epoch 5 - iter 231/773 - loss 0.02448679 - time (sec): 13.22 - samples/sec: 2725.54 - lr: 0.000032 - momentum: 0.000000
2023-10-16 22:08:32,569 epoch 5 - iter 308/773 - loss 0.02574532 - time (sec): 17.91 - samples/sec: 2746.57 - lr: 0.000031 - momentum: 0.000000
2023-10-16 22:08:37,181 epoch 5 - iter 385/773 - loss 0.02467239 - time (sec): 22.52 - samples/sec: 2743.95 - lr: 0.000031 - momentum: 0.000000
2023-10-16 22:08:41,801 epoch 5 - iter 462/773 - loss 0.02435955 - time (sec): 27.14 - samples/sec: 2755.55 - lr: 0.000030 - momentum: 0.000000
2023-10-16 22:08:46,576 epoch 5 - iter 539/773 - loss 0.02374115 - time (sec): 31.92 - samples/sec: 2740.25 - lr: 0.000029 - momentum: 0.000000
2023-10-16 22:08:51,103 epoch 5 - iter 616/773 - loss 0.02505935 - time (sec): 36.45 - samples/sec: 2737.71 - lr: 0.000029 - momentum: 0.000000
2023-10-16 22:08:55,524 epoch 5 - iter 693/773 - loss 0.02470818 - time (sec): 40.87 - samples/sec: 2737.62 - lr: 0.000028 - momentum: 0.000000
2023-10-16 22:08:59,942 epoch 5 - iter 770/773 - loss 0.02388204 - time (sec): 45.29 - samples/sec: 2736.97 - lr: 0.000028 - momentum: 0.000000
2023-10-16 22:09:00,093 ----------------------------------------------------------------------------------------------------
2023-10-16 22:09:00,094 EPOCH 5 done: loss 0.0239 - lr: 0.000028
2023-10-16 22:09:02,150 DEV : loss 0.10162093490362167 - f1-score (micro avg)  0.7743
2023-10-16 22:09:02,163 ----------------------------------------------------------------------------------------------------
2023-10-16 22:09:06,579 epoch 6 - iter 77/773 - loss 0.01159246 - time (sec): 4.42 - samples/sec: 2796.00 - lr: 0.000027 - momentum: 0.000000
2023-10-16 22:09:11,152 epoch 6 - iter 154/773 - loss 0.01350875 - time (sec): 8.99 - samples/sec: 2651.15 - lr: 0.000027 - momentum: 0.000000
2023-10-16 22:09:15,655 epoch 6 - iter 231/773 - loss 0.01496486 - time (sec): 13.49 - samples/sec: 2665.48 - lr: 0.000026 - momentum: 0.000000
2023-10-16 22:09:20,180 epoch 6 - iter 308/773 - loss 0.01695021 - time (sec): 18.02 - samples/sec: 2700.59 - lr: 0.000026 - momentum: 0.000000
2023-10-16 22:09:24,538 epoch 6 - iter 385/773 - loss 0.01737618 - time (sec): 22.37 - samples/sec: 2712.80 - lr: 0.000025 - momentum: 0.000000
2023-10-16 22:09:28,871 epoch 6 - iter 462/773 - loss 0.01752556 - time (sec): 26.71 - samples/sec: 2719.44 - lr: 0.000024 - momentum: 0.000000
2023-10-16 22:09:33,395 epoch 6 - iter 539/773 - loss 0.01769065 - time (sec): 31.23 - samples/sec: 2719.13 - lr: 0.000024 - momentum: 0.000000
2023-10-16 22:09:38,097 epoch 6 - iter 616/773 - loss 0.01836114 - time (sec): 35.93 - samples/sec: 2717.44 - lr: 0.000023 - momentum: 0.000000
2023-10-16 22:09:42,421 epoch 6 - iter 693/773 - loss 0.01786052 - time (sec): 40.26 - samples/sec: 2719.08 - lr: 0.000023 - momentum: 0.000000
2023-10-16 22:09:47,260 epoch 6 - iter 770/773 - loss 0.01820946 - time (sec): 45.10 - samples/sec: 2742.13 - lr: 0.000022 - momentum: 0.000000
2023-10-16 22:09:47,448 ----------------------------------------------------------------------------------------------------
2023-10-16 22:09:47,448 EPOCH 6 done: loss 0.0182 - lr: 0.000022
2023-10-16 22:09:49,464 DEV : loss 0.10438579320907593 - f1-score (micro avg)  0.7863
2023-10-16 22:09:49,476 saving best model
2023-10-16 22:09:49,941 ----------------------------------------------------------------------------------------------------
2023-10-16 22:09:54,815 epoch 7 - iter 77/773 - loss 0.01326801 - time (sec): 4.87 - samples/sec: 2596.22 - lr: 0.000022 - momentum: 0.000000
2023-10-16 22:09:59,436 epoch 7 - iter 154/773 - loss 0.01133806 - time (sec): 9.49 - samples/sec: 2698.66 - lr: 0.000021 - momentum: 0.000000
2023-10-16 22:10:03,998 epoch 7 - iter 231/773 - loss 0.01166043 - time (sec): 14.05 - samples/sec: 2696.24 - lr: 0.000021 - momentum: 0.000000
2023-10-16 22:10:08,505 epoch 7 - iter 308/773 - loss 0.01086055 - time (sec): 18.56 - samples/sec: 2685.92 - lr: 0.000020 - momentum: 0.000000
2023-10-16 22:10:13,116 epoch 7 - iter 385/773 - loss 0.01053431 - time (sec): 23.17 - samples/sec: 2692.16 - lr: 0.000019 - momentum: 0.000000
2023-10-16 22:10:17,458 epoch 7 - iter 462/773 - loss 0.01175421 - time (sec): 27.51 - samples/sec: 2700.81 - lr: 0.000019 - momentum: 0.000000
2023-10-16 22:10:21,786 epoch 7 - iter 539/773 - loss 0.01094026 - time (sec): 31.84 - samples/sec: 2713.28 - lr: 0.000018 - momentum: 0.000000
2023-10-16 22:10:26,348 epoch 7 - iter 616/773 - loss 0.01147580 - time (sec): 36.40 - samples/sec: 2708.94 - lr: 0.000018 - momentum: 0.000000
2023-10-16 22:10:30,934 epoch 7 - iter 693/773 - loss 0.01097587 - time (sec): 40.99 - samples/sec: 2718.33 - lr: 0.000017 - momentum: 0.000000
2023-10-16 22:10:35,494 epoch 7 - iter 770/773 - loss 0.01075542 - time (sec): 45.55 - samples/sec: 2719.57 - lr: 0.000017 - momentum: 0.000000
2023-10-16 22:10:35,663 ----------------------------------------------------------------------------------------------------
2023-10-16 22:10:35,663 EPOCH 7 done: loss 0.0109 - lr: 0.000017
2023-10-16 22:10:37,719 DEV : loss 0.10951930284500122 - f1-score (micro avg)  0.7983
2023-10-16 22:10:37,731 saving best model
2023-10-16 22:10:38,198 ----------------------------------------------------------------------------------------------------
2023-10-16 22:10:42,724 epoch 8 - iter 77/773 - loss 0.00728149 - time (sec): 4.52 - samples/sec: 2766.74 - lr: 0.000016 - momentum: 0.000000
2023-10-16 22:10:47,351 epoch 8 - iter 154/773 - loss 0.00617068 - time (sec): 9.15 - samples/sec: 2771.20 - lr: 0.000016 - momentum: 0.000000
2023-10-16 22:10:51,639 epoch 8 - iter 231/773 - loss 0.00654267 - time (sec): 13.44 - samples/sec: 2812.34 - lr: 0.000015 - momentum: 0.000000
2023-10-16 22:10:56,231 epoch 8 - iter 308/773 - loss 0.00737351 - time (sec): 18.03 - samples/sec: 2828.58 - lr: 0.000014 - momentum: 0.000000
2023-10-16 22:11:00,609 epoch 8 - iter 385/773 - loss 0.00732684 - time (sec): 22.41 - samples/sec: 2793.63 - lr: 0.000014 - momentum: 0.000000
2023-10-16 22:11:05,075 epoch 8 - iter 462/773 - loss 0.00803301 - time (sec): 26.88 - samples/sec: 2785.65 - lr: 0.000013 - momentum: 0.000000
2023-10-16 22:11:09,435 epoch 8 - iter 539/773 - loss 0.00840404 - time (sec): 31.24 - samples/sec: 2780.54 - lr: 0.000013 - momentum: 0.000000
2023-10-16 22:11:13,860 epoch 8 - iter 616/773 - loss 0.00808847 - time (sec): 35.66 - samples/sec: 2772.46 - lr: 0.000012 - momentum: 0.000000
2023-10-16 22:11:18,291 epoch 8 - iter 693/773 - loss 0.00861262 - time (sec): 40.09 - samples/sec: 2766.77 - lr: 0.000012 - momentum: 0.000000
2023-10-16 22:11:23,147 epoch 8 - iter 770/773 - loss 0.00822302 - time (sec): 44.95 - samples/sec: 2756.21 - lr: 0.000011 - momentum: 0.000000
2023-10-16 22:11:23,327 ----------------------------------------------------------------------------------------------------
2023-10-16 22:11:23,327 EPOCH 8 done: loss 0.0082 - lr: 0.000011
2023-10-16 22:11:25,361 DEV : loss 0.11846552044153214 - f1-score (micro avg)  0.7943
2023-10-16 22:11:25,374 ----------------------------------------------------------------------------------------------------
2023-10-16 22:11:30,050 epoch 9 - iter 77/773 - loss 0.00433896 - time (sec): 4.67 - samples/sec: 2677.26 - lr: 0.000011 - momentum: 0.000000
2023-10-16 22:11:34,429 epoch 9 - iter 154/773 - loss 0.00498112 - time (sec): 9.05 - samples/sec: 2697.81 - lr: 0.000010 - momentum: 0.000000
2023-10-16 22:11:38,833 epoch 9 - iter 231/773 - loss 0.00373937 - time (sec): 13.46 - samples/sec: 2717.39 - lr: 0.000009 - momentum: 0.000000
2023-10-16 22:11:43,181 epoch 9 - iter 308/773 - loss 0.00412343 - time (sec): 17.81 - samples/sec: 2732.31 - lr: 0.000009 - momentum: 0.000000
2023-10-16 22:11:47,729 epoch 9 - iter 385/773 - loss 0.00375047 - time (sec): 22.35 - samples/sec: 2718.05 - lr: 0.000008 - momentum: 0.000000
2023-10-16 22:11:52,475 epoch 9 - iter 462/773 - loss 0.00369782 - time (sec): 27.10 - samples/sec: 2717.22 - lr: 0.000008 - momentum: 0.000000
2023-10-16 22:11:56,947 epoch 9 - iter 539/773 - loss 0.00383622 - time (sec): 31.57 - samples/sec: 2733.87 - lr: 0.000007 - momentum: 0.000000
2023-10-16 22:12:01,471 epoch 9 - iter 616/773 - loss 0.00413650 - time (sec): 36.10 - samples/sec: 2732.35 - lr: 0.000007 - momentum: 0.000000
2023-10-16 22:12:05,987 epoch 9 - iter 693/773 - loss 0.00456700 - time (sec): 40.61 - samples/sec: 2740.68 - lr: 0.000006 - momentum: 0.000000
2023-10-16 22:12:10,465 epoch 9 - iter 770/773 - loss 0.00488437 - time (sec): 45.09 - samples/sec: 2749.39 - lr: 0.000006 - momentum: 0.000000
2023-10-16 22:12:10,633 ----------------------------------------------------------------------------------------------------
2023-10-16 22:12:10,634 EPOCH 9 done: loss 0.0049 - lr: 0.000006
2023-10-16 22:12:12,648 DEV : loss 0.11912991851568222 - f1-score (micro avg)  0.7859
2023-10-16 22:12:12,661 ----------------------------------------------------------------------------------------------------
2023-10-16 22:12:17,152 epoch 10 - iter 77/773 - loss 0.00307045 - time (sec): 4.49 - samples/sec: 2767.84 - lr: 0.000005 - momentum: 0.000000
2023-10-16 22:12:21,594 epoch 10 - iter 154/773 - loss 0.00429958 - time (sec): 8.93 - samples/sec: 2791.36 - lr: 0.000005 - momentum: 0.000000
2023-10-16 22:12:26,080 epoch 10 - iter 231/773 - loss 0.00374969 - time (sec): 13.42 - samples/sec: 2745.64 - lr: 0.000004 - momentum: 0.000000
2023-10-16 22:12:30,759 epoch 10 - iter 308/773 - loss 0.00293141 - time (sec): 18.10 - samples/sec: 2763.47 - lr: 0.000003 - momentum: 0.000000
2023-10-16 22:12:35,215 epoch 10 - iter 385/773 - loss 0.00303829 - time (sec): 22.55 - samples/sec: 2752.65 - lr: 0.000003 - momentum: 0.000000
2023-10-16 22:12:39,757 epoch 10 - iter 462/773 - loss 0.00299411 - time (sec): 27.09 - samples/sec: 2766.56 - lr: 0.000002 - momentum: 0.000000
2023-10-16 22:12:44,528 epoch 10 - iter 539/773 - loss 0.00315115 - time (sec): 31.87 - samples/sec: 2730.07 - lr: 0.000002 - momentum: 0.000000
2023-10-16 22:12:48,932 epoch 10 - iter 616/773 - loss 0.00309341 - time (sec): 36.27 - samples/sec: 2742.38 - lr: 0.000001 - momentum: 0.000000
2023-10-16 22:12:53,383 epoch 10 - iter 693/773 - loss 0.00291500 - time (sec): 40.72 - samples/sec: 2734.97 - lr: 0.000001 - momentum: 0.000000
2023-10-16 22:12:57,938 epoch 10 - iter 770/773 - loss 0.00293636 - time (sec): 45.28 - samples/sec: 2730.69 - lr: 0.000000 - momentum: 0.000000
2023-10-16 22:12:58,116 ----------------------------------------------------------------------------------------------------
2023-10-16 22:12:58,116 EPOCH 10 done: loss 0.0029 - lr: 0.000000
2023-10-16 22:13:00,131 DEV : loss 0.12098120898008347 - f1-score (micro avg)  0.7832
2023-10-16 22:13:00,473 ----------------------------------------------------------------------------------------------------
2023-10-16 22:13:00,475 Loading model from best epoch ...
2023-10-16 22:13:01,959 SequenceTagger predicts: Dictionary with 13 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-BUILDING, B-BUILDING, E-BUILDING, I-BUILDING, S-STREET, B-STREET, E-STREET, I-STREET
2023-10-16 22:13:07,934 Results:
- F-score (micro) 0.8009
- F-score (macro) 0.6968
- Accuracy 0.6881

By class:
              precision    recall  f1-score   support

         LOC     0.8549    0.8531    0.8540       946
    BUILDING     0.6000    0.4703    0.5273       185
      STREET     0.7222    0.6964    0.7091        56

   micro avg     0.8163    0.7860    0.8009      1187
   macro avg     0.7257    0.6733    0.6968      1187
weighted avg     0.8089    0.7860    0.7962      1187

2023-10-16 22:13:07,935 ----------------------------------------------------------------------------------------------------
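The reported micro F-score is the harmonic mean of the micro-averaged precision and recall from the final table; a quick check using the values as logged:

```python
# Verify that "F-score (micro) 0.8009" follows from the micro avg row of the table.
precision, recall = 0.8163, 0.7860  # micro avg precision and recall from the log
f1 = 2 * precision * recall / (precision + recall)
print(round(f1, 4))  # 0.8009
```

The same identity holds per class, e.g. LOC: 2 · 0.8549 · 0.8531 / (0.8549 + 0.8531) ≈ 0.8540.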