2023-10-16 22:05:02,568 ----------------------------------------------------------------------------------------------------
2023-10-16 22:05:02,569 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-16 22:05:02,569 ----------------------------------------------------------------------------------------------------
2023-10-16 22:05:02,569 MultiCorpus: 6183 train + 680 dev + 2113 test sentences
- NER_HIPE_2022 Corpus: 6183 train + 680 dev + 2113 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/topres19th/en/with_doc_seperator
2023-10-16 22:05:02,569 ----------------------------------------------------------------------------------------------------
2023-10-16 22:05:02,569 Train: 6183 sentences
2023-10-16 22:05:02,569 (train_with_dev=False, train_with_test=False)
2023-10-16 22:05:02,569 ----------------------------------------------------------------------------------------------------
2023-10-16 22:05:02,569 Training Params:
2023-10-16 22:05:02,569 - learning_rate: "5e-05"
2023-10-16 22:05:02,569 - mini_batch_size: "8"
2023-10-16 22:05:02,569 - max_epochs: "10"
2023-10-16 22:05:02,569 - shuffle: "True"
2023-10-16 22:05:02,569 ----------------------------------------------------------------------------------------------------
2023-10-16 22:05:02,569 Plugins:
2023-10-16 22:05:02,569 - LinearScheduler | warmup_fraction: '0.1'
2023-10-16 22:05:02,569 ----------------------------------------------------------------------------------------------------
2023-10-16 22:05:02,570 Final evaluation on model from best epoch (best-model.pt)
2023-10-16 22:05:02,570 - metric: "('micro avg', 'f1-score')"
2023-10-16 22:05:02,570 ----------------------------------------------------------------------------------------------------
2023-10-16 22:05:02,570 Computation:
2023-10-16 22:05:02,570 - compute on device: cuda:0
2023-10-16 22:05:02,570 - embedding storage: none
2023-10-16 22:05:02,570 ----------------------------------------------------------------------------------------------------
2023-10-16 22:05:02,570 Model training base path: "hmbench-topres19th/en-dbmdz/bert-base-historic-multilingual-cased-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-2"
2023-10-16 22:05:02,570 ----------------------------------------------------------------------------------------------------
2023-10-16 22:05:02,570 ----------------------------------------------------------------------------------------------------
2023-10-16 22:05:07,108 epoch 1 - iter 77/773 - loss 2.07744315 - time (sec): 4.54 - samples/sec: 2610.32 - lr: 0.000005 - momentum: 0.000000
2023-10-16 22:05:11,553 epoch 1 - iter 154/773 - loss 1.17743079 - time (sec): 8.98 - samples/sec: 2642.75 - lr: 0.000010 - momentum: 0.000000
2023-10-16 22:05:16,195 epoch 1 - iter 231/773 - loss 0.80857450 - time (sec): 13.62 - samples/sec: 2721.79 - lr: 0.000015 - momentum: 0.000000
2023-10-16 22:05:20,634 epoch 1 - iter 308/773 - loss 0.64594888 - time (sec): 18.06 - samples/sec: 2721.06 - lr: 0.000020 - momentum: 0.000000
2023-10-16 22:05:25,021 epoch 1 - iter 385/773 - loss 0.54443571 - time (sec): 22.45 - samples/sec: 2715.07 - lr: 0.000025 - momentum: 0.000000
2023-10-16 22:05:29,309 epoch 1 - iter 462/773 - loss 0.47012773 - time (sec): 26.74 - samples/sec: 2734.10 - lr: 0.000030 - momentum: 0.000000
2023-10-16 22:05:33,752 epoch 1 - iter 539/773 - loss 0.41651204 - time (sec): 31.18 - samples/sec: 2757.19 - lr: 0.000035 - momentum: 0.000000
2023-10-16 22:05:38,386 epoch 1 - iter 616/773 - loss 0.37671127 - time (sec): 35.82 - samples/sec: 2773.34 - lr: 0.000040 - momentum: 0.000000
2023-10-16 22:05:43,095 epoch 1 - iter 693/773 - loss 0.34712982 - time (sec): 40.52 - samples/sec: 2743.12 - lr: 0.000045 - momentum: 0.000000
2023-10-16 22:05:47,514 epoch 1 - iter 770/773 - loss 0.32154236 - time (sec): 44.94 - samples/sec: 2757.20 - lr: 0.000050 - momentum: 0.000000
2023-10-16 22:05:47,670 ----------------------------------------------------------------------------------------------------
2023-10-16 22:05:47,670 EPOCH 1 done: loss 0.3208 - lr: 0.000050
2023-10-16 22:05:49,710 DEV : loss 0.0772036612033844 - f1-score (micro avg) 0.6643
2023-10-16 22:05:49,722 saving best model
2023-10-16 22:05:50,064 ----------------------------------------------------------------------------------------------------
2023-10-16 22:05:54,575 epoch 2 - iter 77/773 - loss 0.07753873 - time (sec): 4.51 - samples/sec: 2686.50 - lr: 0.000049 - momentum: 0.000000
2023-10-16 22:05:58,993 epoch 2 - iter 154/773 - loss 0.08814964 - time (sec): 8.93 - samples/sec: 2682.41 - lr: 0.000049 - momentum: 0.000000
2023-10-16 22:06:03,719 epoch 2 - iter 231/773 - loss 0.08367414 - time (sec): 13.65 - samples/sec: 2695.54 - lr: 0.000048 - momentum: 0.000000
2023-10-16 22:06:08,199 epoch 2 - iter 308/773 - loss 0.08164912 - time (sec): 18.13 - samples/sec: 2694.67 - lr: 0.000048 - momentum: 0.000000
2023-10-16 22:06:12,828 epoch 2 - iter 385/773 - loss 0.08321751 - time (sec): 22.76 - samples/sec: 2705.47 - lr: 0.000047 - momentum: 0.000000
2023-10-16 22:06:17,450 epoch 2 - iter 462/773 - loss 0.08271449 - time (sec): 27.38 - samples/sec: 2672.02 - lr: 0.000047 - momentum: 0.000000
2023-10-16 22:06:22,030 epoch 2 - iter 539/773 - loss 0.08056035 - time (sec): 31.96 - samples/sec: 2676.73 - lr: 0.000046 - momentum: 0.000000
2023-10-16 22:06:26,558 epoch 2 - iter 616/773 - loss 0.08048201 - time (sec): 36.49 - samples/sec: 2680.20 - lr: 0.000046 - momentum: 0.000000
2023-10-16 22:06:31,001 epoch 2 - iter 693/773 - loss 0.07801088 - time (sec): 40.94 - samples/sec: 2693.49 - lr: 0.000045 - momentum: 0.000000
2023-10-16 22:06:35,711 epoch 2 - iter 770/773 - loss 0.07752429 - time (sec): 45.65 - samples/sec: 2713.00 - lr: 0.000044 - momentum: 0.000000
2023-10-16 22:06:35,870 ----------------------------------------------------------------------------------------------------
2023-10-16 22:06:35,870 EPOCH 2 done: loss 0.0773 - lr: 0.000044
2023-10-16 22:06:37,935 DEV : loss 0.06809011846780777 - f1-score (micro avg) 0.7089
2023-10-16 22:06:37,947 saving best model
2023-10-16 22:06:38,714 ----------------------------------------------------------------------------------------------------
2023-10-16 22:06:43,428 epoch 3 - iter 77/773 - loss 0.05164133 - time (sec): 4.71 - samples/sec: 2719.74 - lr: 0.000044 - momentum: 0.000000
2023-10-16 22:06:47,856 epoch 3 - iter 154/773 - loss 0.05614533 - time (sec): 9.14 - samples/sec: 2744.96 - lr: 0.000043 - momentum: 0.000000
2023-10-16 22:06:52,660 epoch 3 - iter 231/773 - loss 0.05743499 - time (sec): 13.94 - samples/sec: 2704.33 - lr: 0.000043 - momentum: 0.000000
2023-10-16 22:06:57,007 epoch 3 - iter 308/773 - loss 0.05405146 - time (sec): 18.29 - samples/sec: 2728.43 - lr: 0.000042 - momentum: 0.000000
2023-10-16 22:07:01,583 epoch 3 - iter 385/773 - loss 0.05263408 - time (sec): 22.87 - samples/sec: 2735.35 - lr: 0.000042 - momentum: 0.000000
2023-10-16 22:07:06,253 epoch 3 - iter 462/773 - loss 0.05351123 - time (sec): 27.54 - samples/sec: 2719.53 - lr: 0.000041 - momentum: 0.000000
2023-10-16 22:07:10,771 epoch 3 - iter 539/773 - loss 0.05305526 - time (sec): 32.05 - samples/sec: 2720.58 - lr: 0.000041 - momentum: 0.000000
2023-10-16 22:07:15,136 epoch 3 - iter 616/773 - loss 0.05227538 - time (sec): 36.42 - samples/sec: 2714.42 - lr: 0.000040 - momentum: 0.000000
2023-10-16 22:07:19,672 epoch 3 - iter 693/773 - loss 0.05203641 - time (sec): 40.96 - samples/sec: 2731.14 - lr: 0.000039 - momentum: 0.000000
2023-10-16 22:07:24,115 epoch 3 - iter 770/773 - loss 0.05094133 - time (sec): 45.40 - samples/sec: 2730.25 - lr: 0.000039 - momentum: 0.000000
2023-10-16 22:07:24,271 ----------------------------------------------------------------------------------------------------
2023-10-16 22:07:24,271 EPOCH 3 done: loss 0.0508 - lr: 0.000039
2023-10-16 22:07:26,321 DEV : loss 0.08064333349466324 - f1-score (micro avg) 0.784
2023-10-16 22:07:26,333 saving best model
2023-10-16 22:07:26,782 ----------------------------------------------------------------------------------------------------
2023-10-16 22:07:31,371 epoch 4 - iter 77/773 - loss 0.04532304 - time (sec): 4.59 - samples/sec: 2713.19 - lr: 0.000038 - momentum: 0.000000
2023-10-16 22:07:35,796 epoch 4 - iter 154/773 - loss 0.04515752 - time (sec): 9.01 - samples/sec: 2642.61 - lr: 0.000038 - momentum: 0.000000
2023-10-16 22:07:40,584 epoch 4 - iter 231/773 - loss 0.04011628 - time (sec): 13.80 - samples/sec: 2609.12 - lr: 0.000037 - momentum: 0.000000
2023-10-16 22:07:45,151 epoch 4 - iter 308/773 - loss 0.03827052 - time (sec): 18.37 - samples/sec: 2625.80 - lr: 0.000037 - momentum: 0.000000
2023-10-16 22:07:49,673 epoch 4 - iter 385/773 - loss 0.03710721 - time (sec): 22.89 - samples/sec: 2663.63 - lr: 0.000036 - momentum: 0.000000
2023-10-16 22:07:54,043 epoch 4 - iter 462/773 - loss 0.03902851 - time (sec): 27.26 - samples/sec: 2684.80 - lr: 0.000036 - momentum: 0.000000
2023-10-16 22:07:58,389 epoch 4 - iter 539/773 - loss 0.03842162 - time (sec): 31.60 - samples/sec: 2707.56 - lr: 0.000035 - momentum: 0.000000
2023-10-16 22:08:03,155 epoch 4 - iter 616/773 - loss 0.03858095 - time (sec): 36.37 - samples/sec: 2700.81 - lr: 0.000034 - momentum: 0.000000
2023-10-16 22:08:07,677 epoch 4 - iter 693/773 - loss 0.03732367 - time (sec): 40.89 - samples/sec: 2711.50 - lr: 0.000034 - momentum: 0.000000
2023-10-16 22:08:12,393 epoch 4 - iter 770/773 - loss 0.03734734 - time (sec): 45.61 - samples/sec: 2716.36 - lr: 0.000033 - momentum: 0.000000
2023-10-16 22:08:12,558 ----------------------------------------------------------------------------------------------------
2023-10-16 22:08:12,558 EPOCH 4 done: loss 0.0373 - lr: 0.000033
2023-10-16 22:08:14,643 DEV : loss 0.08836734294891357 - f1-score (micro avg) 0.7582
2023-10-16 22:08:14,656 ----------------------------------------------------------------------------------------------------
2023-10-16 22:08:19,114 epoch 5 - iter 77/773 - loss 0.02324528 - time (sec): 4.46 - samples/sec: 2818.91 - lr: 0.000033 - momentum: 0.000000
2023-10-16 22:08:23,457 epoch 5 - iter 154/773 - loss 0.02467072 - time (sec): 8.80 - samples/sec: 2799.12 - lr: 0.000032 - momentum: 0.000000
2023-10-16 22:08:27,881 epoch 5 - iter 231/773 - loss 0.02448679 - time (sec): 13.22 - samples/sec: 2725.54 - lr: 0.000032 - momentum: 0.000000
2023-10-16 22:08:32,569 epoch 5 - iter 308/773 - loss 0.02574532 - time (sec): 17.91 - samples/sec: 2746.57 - lr: 0.000031 - momentum: 0.000000
2023-10-16 22:08:37,181 epoch 5 - iter 385/773 - loss 0.02467239 - time (sec): 22.52 - samples/sec: 2743.95 - lr: 0.000031 - momentum: 0.000000
2023-10-16 22:08:41,801 epoch 5 - iter 462/773 - loss 0.02435955 - time (sec): 27.14 - samples/sec: 2755.55 - lr: 0.000030 - momentum: 0.000000
2023-10-16 22:08:46,576 epoch 5 - iter 539/773 - loss 0.02374115 - time (sec): 31.92 - samples/sec: 2740.25 - lr: 0.000029 - momentum: 0.000000
2023-10-16 22:08:51,103 epoch 5 - iter 616/773 - loss 0.02505935 - time (sec): 36.45 - samples/sec: 2737.71 - lr: 0.000029 - momentum: 0.000000
2023-10-16 22:08:55,524 epoch 5 - iter 693/773 - loss 0.02470818 - time (sec): 40.87 - samples/sec: 2737.62 - lr: 0.000028 - momentum: 0.000000
2023-10-16 22:08:59,942 epoch 5 - iter 770/773 - loss 0.02388204 - time (sec): 45.29 - samples/sec: 2736.97 - lr: 0.000028 - momentum: 0.000000
2023-10-16 22:09:00,093 ----------------------------------------------------------------------------------------------------
2023-10-16 22:09:00,094 EPOCH 5 done: loss 0.0239 - lr: 0.000028
2023-10-16 22:09:02,150 DEV : loss 0.10162093490362167 - f1-score (micro avg) 0.7743
2023-10-16 22:09:02,163 ----------------------------------------------------------------------------------------------------
2023-10-16 22:09:06,579 epoch 6 - iter 77/773 - loss 0.01159246 - time (sec): 4.42 - samples/sec: 2796.00 - lr: 0.000027 - momentum: 0.000000
2023-10-16 22:09:11,152 epoch 6 - iter 154/773 - loss 0.01350875 - time (sec): 8.99 - samples/sec: 2651.15 - lr: 0.000027 - momentum: 0.000000
2023-10-16 22:09:15,655 epoch 6 - iter 231/773 - loss 0.01496486 - time (sec): 13.49 - samples/sec: 2665.48 - lr: 0.000026 - momentum: 0.000000
2023-10-16 22:09:20,180 epoch 6 - iter 308/773 - loss 0.01695021 - time (sec): 18.02 - samples/sec: 2700.59 - lr: 0.000026 - momentum: 0.000000
2023-10-16 22:09:24,538 epoch 6 - iter 385/773 - loss 0.01737618 - time (sec): 22.37 - samples/sec: 2712.80 - lr: 0.000025 - momentum: 0.000000
2023-10-16 22:09:28,871 epoch 6 - iter 462/773 - loss 0.01752556 - time (sec): 26.71 - samples/sec: 2719.44 - lr: 0.000024 - momentum: 0.000000
2023-10-16 22:09:33,395 epoch 6 - iter 539/773 - loss 0.01769065 - time (sec): 31.23 - samples/sec: 2719.13 - lr: 0.000024 - momentum: 0.000000
2023-10-16 22:09:38,097 epoch 6 - iter 616/773 - loss 0.01836114 - time (sec): 35.93 - samples/sec: 2717.44 - lr: 0.000023 - momentum: 0.000000
2023-10-16 22:09:42,421 epoch 6 - iter 693/773 - loss 0.01786052 - time (sec): 40.26 - samples/sec: 2719.08 - lr: 0.000023 - momentum: 0.000000
2023-10-16 22:09:47,260 epoch 6 - iter 770/773 - loss 0.01820946 - time (sec): 45.10 - samples/sec: 2742.13 - lr: 0.000022 - momentum: 0.000000
2023-10-16 22:09:47,448 ----------------------------------------------------------------------------------------------------
2023-10-16 22:09:47,448 EPOCH 6 done: loss 0.0182 - lr: 0.000022
2023-10-16 22:09:49,464 DEV : loss 0.10438579320907593 - f1-score (micro avg) 0.7863
2023-10-16 22:09:49,476 saving best model
2023-10-16 22:09:49,941 ----------------------------------------------------------------------------------------------------
2023-10-16 22:09:54,815 epoch 7 - iter 77/773 - loss 0.01326801 - time (sec): 4.87 - samples/sec: 2596.22 - lr: 0.000022 - momentum: 0.000000
2023-10-16 22:09:59,436 epoch 7 - iter 154/773 - loss 0.01133806 - time (sec): 9.49 - samples/sec: 2698.66 - lr: 0.000021 - momentum: 0.000000
2023-10-16 22:10:03,998 epoch 7 - iter 231/773 - loss 0.01166043 - time (sec): 14.05 - samples/sec: 2696.24 - lr: 0.000021 - momentum: 0.000000
2023-10-16 22:10:08,505 epoch 7 - iter 308/773 - loss 0.01086055 - time (sec): 18.56 - samples/sec: 2685.92 - lr: 0.000020 - momentum: 0.000000
2023-10-16 22:10:13,116 epoch 7 - iter 385/773 - loss 0.01053431 - time (sec): 23.17 - samples/sec: 2692.16 - lr: 0.000019 - momentum: 0.000000
2023-10-16 22:10:17,458 epoch 7 - iter 462/773 - loss 0.01175421 - time (sec): 27.51 - samples/sec: 2700.81 - lr: 0.000019 - momentum: 0.000000
2023-10-16 22:10:21,786 epoch 7 - iter 539/773 - loss 0.01094026 - time (sec): 31.84 - samples/sec: 2713.28 - lr: 0.000018 - momentum: 0.000000
2023-10-16 22:10:26,348 epoch 7 - iter 616/773 - loss 0.01147580 - time (sec): 36.40 - samples/sec: 2708.94 - lr: 0.000018 - momentum: 0.000000
2023-10-16 22:10:30,934 epoch 7 - iter 693/773 - loss 0.01097587 - time (sec): 40.99 - samples/sec: 2718.33 - lr: 0.000017 - momentum: 0.000000
2023-10-16 22:10:35,494 epoch 7 - iter 770/773 - loss 0.01075542 - time (sec): 45.55 - samples/sec: 2719.57 - lr: 0.000017 - momentum: 0.000000
2023-10-16 22:10:35,663 ----------------------------------------------------------------------------------------------------
2023-10-16 22:10:35,663 EPOCH 7 done: loss 0.0109 - lr: 0.000017
2023-10-16 22:10:37,719 DEV : loss 0.10951930284500122 - f1-score (micro avg) 0.7983
2023-10-16 22:10:37,731 saving best model
2023-10-16 22:10:38,198 ----------------------------------------------------------------------------------------------------
2023-10-16 22:10:42,724 epoch 8 - iter 77/773 - loss 0.00728149 - time (sec): 4.52 - samples/sec: 2766.74 - lr: 0.000016 - momentum: 0.000000
2023-10-16 22:10:47,351 epoch 8 - iter 154/773 - loss 0.00617068 - time (sec): 9.15 - samples/sec: 2771.20 - lr: 0.000016 - momentum: 0.000000
2023-10-16 22:10:51,639 epoch 8 - iter 231/773 - loss 0.00654267 - time (sec): 13.44 - samples/sec: 2812.34 - lr: 0.000015 - momentum: 0.000000
2023-10-16 22:10:56,231 epoch 8 - iter 308/773 - loss 0.00737351 - time (sec): 18.03 - samples/sec: 2828.58 - lr: 0.000014 - momentum: 0.000000
2023-10-16 22:11:00,609 epoch 8 - iter 385/773 - loss 0.00732684 - time (sec): 22.41 - samples/sec: 2793.63 - lr: 0.000014 - momentum: 0.000000
2023-10-16 22:11:05,075 epoch 8 - iter 462/773 - loss 0.00803301 - time (sec): 26.88 - samples/sec: 2785.65 - lr: 0.000013 - momentum: 0.000000
2023-10-16 22:11:09,435 epoch 8 - iter 539/773 - loss 0.00840404 - time (sec): 31.24 - samples/sec: 2780.54 - lr: 0.000013 - momentum: 0.000000
2023-10-16 22:11:13,860 epoch 8 - iter 616/773 - loss 0.00808847 - time (sec): 35.66 - samples/sec: 2772.46 - lr: 0.000012 - momentum: 0.000000
2023-10-16 22:11:18,291 epoch 8 - iter 693/773 - loss 0.00861262 - time (sec): 40.09 - samples/sec: 2766.77 - lr: 0.000012 - momentum: 0.000000
2023-10-16 22:11:23,147 epoch 8 - iter 770/773 - loss 0.00822302 - time (sec): 44.95 - samples/sec: 2756.21 - lr: 0.000011 - momentum: 0.000000
2023-10-16 22:11:23,327 ----------------------------------------------------------------------------------------------------
2023-10-16 22:11:23,327 EPOCH 8 done: loss 0.0082 - lr: 0.000011
2023-10-16 22:11:25,361 DEV : loss 0.11846552044153214 - f1-score (micro avg) 0.7943
2023-10-16 22:11:25,374 ----------------------------------------------------------------------------------------------------
2023-10-16 22:11:30,050 epoch 9 - iter 77/773 - loss 0.00433896 - time (sec): 4.67 - samples/sec: 2677.26 - lr: 0.000011 - momentum: 0.000000
2023-10-16 22:11:34,429 epoch 9 - iter 154/773 - loss 0.00498112 - time (sec): 9.05 - samples/sec: 2697.81 - lr: 0.000010 - momentum: 0.000000
2023-10-16 22:11:38,833 epoch 9 - iter 231/773 - loss 0.00373937 - time (sec): 13.46 - samples/sec: 2717.39 - lr: 0.000009 - momentum: 0.000000
2023-10-16 22:11:43,181 epoch 9 - iter 308/773 - loss 0.00412343 - time (sec): 17.81 - samples/sec: 2732.31 - lr: 0.000009 - momentum: 0.000000
2023-10-16 22:11:47,729 epoch 9 - iter 385/773 - loss 0.00375047 - time (sec): 22.35 - samples/sec: 2718.05 - lr: 0.000008 - momentum: 0.000000
2023-10-16 22:11:52,475 epoch 9 - iter 462/773 - loss 0.00369782 - time (sec): 27.10 - samples/sec: 2717.22 - lr: 0.000008 - momentum: 0.000000
2023-10-16 22:11:56,947 epoch 9 - iter 539/773 - loss 0.00383622 - time (sec): 31.57 - samples/sec: 2733.87 - lr: 0.000007 - momentum: 0.000000
2023-10-16 22:12:01,471 epoch 9 - iter 616/773 - loss 0.00413650 - time (sec): 36.10 - samples/sec: 2732.35 - lr: 0.000007 - momentum: 0.000000
2023-10-16 22:12:05,987 epoch 9 - iter 693/773 - loss 0.00456700 - time (sec): 40.61 - samples/sec: 2740.68 - lr: 0.000006 - momentum: 0.000000
2023-10-16 22:12:10,465 epoch 9 - iter 770/773 - loss 0.00488437 - time (sec): 45.09 - samples/sec: 2749.39 - lr: 0.000006 - momentum: 0.000000
2023-10-16 22:12:10,633 ----------------------------------------------------------------------------------------------------
2023-10-16 22:12:10,634 EPOCH 9 done: loss 0.0049 - lr: 0.000006
2023-10-16 22:12:12,648 DEV : loss 0.11912991851568222 - f1-score (micro avg) 0.7859
2023-10-16 22:12:12,661 ----------------------------------------------------------------------------------------------------
2023-10-16 22:12:17,152 epoch 10 - iter 77/773 - loss 0.00307045 - time (sec): 4.49 - samples/sec: 2767.84 - lr: 0.000005 - momentum: 0.000000
2023-10-16 22:12:21,594 epoch 10 - iter 154/773 - loss 0.00429958 - time (sec): 8.93 - samples/sec: 2791.36 - lr: 0.000005 - momentum: 0.000000
2023-10-16 22:12:26,080 epoch 10 - iter 231/773 - loss 0.00374969 - time (sec): 13.42 - samples/sec: 2745.64 - lr: 0.000004 - momentum: 0.000000
2023-10-16 22:12:30,759 epoch 10 - iter 308/773 - loss 0.00293141 - time (sec): 18.10 - samples/sec: 2763.47 - lr: 0.000003 - momentum: 0.000000
2023-10-16 22:12:35,215 epoch 10 - iter 385/773 - loss 0.00303829 - time (sec): 22.55 - samples/sec: 2752.65 - lr: 0.000003 - momentum: 0.000000
2023-10-16 22:12:39,757 epoch 10 - iter 462/773 - loss 0.00299411 - time (sec): 27.09 - samples/sec: 2766.56 - lr: 0.000002 - momentum: 0.000000
2023-10-16 22:12:44,528 epoch 10 - iter 539/773 - loss 0.00315115 - time (sec): 31.87 - samples/sec: 2730.07 - lr: 0.000002 - momentum: 0.000000
2023-10-16 22:12:48,932 epoch 10 - iter 616/773 - loss 0.00309341 - time (sec): 36.27 - samples/sec: 2742.38 - lr: 0.000001 - momentum: 0.000000
2023-10-16 22:12:53,383 epoch 10 - iter 693/773 - loss 0.00291500 - time (sec): 40.72 - samples/sec: 2734.97 - lr: 0.000001 - momentum: 0.000000
2023-10-16 22:12:57,938 epoch 10 - iter 770/773 - loss 0.00293636 - time (sec): 45.28 - samples/sec: 2730.69 - lr: 0.000000 - momentum: 0.000000
2023-10-16 22:12:58,116 ----------------------------------------------------------------------------------------------------
2023-10-16 22:12:58,116 EPOCH 10 done: loss 0.0029 - lr: 0.000000
2023-10-16 22:13:00,131 DEV : loss 0.12098120898008347 - f1-score (micro avg) 0.7832
2023-10-16 22:13:00,473 ----------------------------------------------------------------------------------------------------
2023-10-16 22:13:00,475 Loading model from best epoch ...
2023-10-16 22:13:01,959 SequenceTagger predicts: Dictionary with 13 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-BUILDING, B-BUILDING, E-BUILDING, I-BUILDING, S-STREET, B-STREET, E-STREET, I-STREET
2023-10-16 22:13:07,934
Results:
- F-score (micro) 0.8009
- F-score (macro) 0.6968
- Accuracy 0.6881
By class:
              precision    recall  f1-score   support

         LOC     0.8549    0.8531    0.8540       946
    BUILDING     0.6000    0.4703    0.5273       185
      STREET     0.7222    0.6964    0.7091        56

   micro avg     0.8163    0.7860    0.8009      1187
   macro avg     0.7257    0.6733    0.6968      1187
weighted avg     0.8089    0.7860    0.7962      1187
2023-10-16 22:13:07,935 ----------------------------------------------------------------------------------------------------