2023-10-14 08:28:49,609 ----------------------------------------------------------------------------------------------------
2023-10-14 08:28:49,610 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-14 08:28:49,610 ----------------------------------------------------------------------------------------------------
2023-10-14 08:28:49,610 MultiCorpus: 5777 train + 722 dev + 723 test sentences
 - NER_ICDAR_EUROPEANA Corpus: 5777 train + 722 dev + 723 test sentences - /root/.flair/datasets/ner_icdar_europeana/nl
2023-10-14 08:28:49,610 ----------------------------------------------------------------------------------------------------
2023-10-14 08:28:49,610 Train: 5777 sentences
2023-10-14 08:28:49,610 (train_with_dev=False, train_with_test=False)
2023-10-14 08:28:49,610 ----------------------------------------------------------------------------------------------------
2023-10-14 08:28:49,610 Training Params:
2023-10-14 08:28:49,610  - learning_rate: "3e-05"
2023-10-14 08:28:49,610  - mini_batch_size: "8"
2023-10-14 08:28:49,610  - max_epochs: "10"
2023-10-14 08:28:49,610  - shuffle: "True"
2023-10-14 08:28:49,611 ----------------------------------------------------------------------------------------------------
2023-10-14 08:28:49,611 Plugins:
2023-10-14 08:28:49,611  - LinearScheduler | warmup_fraction: '0.1'
2023-10-14 08:28:49,611 ----------------------------------------------------------------------------------------------------
2023-10-14 08:28:49,611 Final evaluation on model from best epoch (best-model.pt)
2023-10-14 08:28:49,611  - metric: "('micro avg', 'f1-score')"
2023-10-14 08:28:49,611 ----------------------------------------------------------------------------------------------------
2023-10-14 08:28:49,611 Computation:
2023-10-14 08:28:49,611  - compute on device: cuda:0
2023-10-14 08:28:49,611  - embedding storage: none
2023-10-14 08:28:49,611 ----------------------------------------------------------------------------------------------------
2023-10-14 08:28:49,611 Model training base path: "hmbench-icdar/nl-dbmdz/bert-base-historic-multilingual-cased-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-1"
2023-10-14 08:28:49,611 ----------------------------------------------------------------------------------------------------
2023-10-14 08:28:49,611 ----------------------------------------------------------------------------------------------------
2023-10-14 08:28:55,399 epoch 1 - iter 72/723 - loss 2.34500859 - time (sec): 5.79 - samples/sec: 2928.05 - lr: 0.000003 - momentum: 0.000000
2023-10-14 08:29:01,032 epoch 1 - iter 144/723 - loss 1.40146179 - time (sec): 11.42 - samples/sec: 2958.85 - lr: 0.000006 - momentum: 0.000000
2023-10-14 08:29:07,150 epoch 1 - iter 216/723 - loss 0.98472208 - time (sec): 17.54 - samples/sec: 2978.52 - lr: 0.000009 - momentum: 0.000000
2023-10-14 08:29:12,983 epoch 1 - iter 288/723 - loss 0.78948184 - time (sec): 23.37 - samples/sec: 2988.51 - lr: 0.000012 - momentum: 0.000000
2023-10-14 08:29:18,706 epoch 1 - iter 360/723 - loss 0.67269994 - time (sec): 29.09 - samples/sec: 2981.53 - lr: 0.000015 - momentum: 0.000000
2023-10-14 08:29:24,406 epoch 1 - iter 432/723 - loss 0.60136728 - time (sec): 34.79 - samples/sec: 2938.07 - lr: 0.000018 - momentum: 0.000000
2023-10-14 08:29:30,603 epoch 1 - iter 504/723 - loss 0.53425447 - time (sec): 40.99 - samples/sec: 2947.40 - lr: 0.000021 - momentum: 0.000000
2023-10-14 08:29:37,056 epoch 1 - iter 576/723 - loss 0.48656421 - time (sec): 47.44 - samples/sec: 2926.64 - lr: 0.000024 - momentum: 0.000000
2023-10-14 08:29:43,158 epoch 1 - iter 648/723 - loss 0.44962888 - time (sec): 53.55 - samples/sec: 2930.20 - lr: 0.000027 - momentum: 0.000000
2023-10-14 08:29:49,227 epoch 1 - iter 720/723 - loss 0.41686927 - time (sec): 59.62 - samples/sec: 2944.90 - lr: 0.000030 - momentum: 0.000000
2023-10-14 08:29:49,488 ----------------------------------------------------------------------------------------------------
2023-10-14 08:29:49,488 EPOCH 1 done: loss 0.4157 - lr: 0.000030
2023-10-14 08:29:52,663 DEV : loss 0.16736090183258057 - f1-score (micro avg) 0.5276
2023-10-14 08:29:52,698 saving best model
2023-10-14 08:29:53,052 ----------------------------------------------------------------------------------------------------
2023-10-14 08:29:58,787 epoch 2 - iter 72/723 - loss 0.15097610 - time (sec): 5.73 - samples/sec: 3028.16 - lr: 0.000030 - momentum: 0.000000
2023-10-14 08:30:04,682 epoch 2 - iter 144/723 - loss 0.13684182 - time (sec): 11.63 - samples/sec: 2960.60 - lr: 0.000029 - momentum: 0.000000
2023-10-14 08:30:10,736 epoch 2 - iter 216/723 - loss 0.13585466 - time (sec): 17.68 - samples/sec: 2945.67 - lr: 0.000029 - momentum: 0.000000
2023-10-14 08:30:16,269 epoch 2 - iter 288/723 - loss 0.12788984 - time (sec): 23.22 - samples/sec: 2985.08 - lr: 0.000029 - momentum: 0.000000
2023-10-14 08:30:22,637 epoch 2 - iter 360/723 - loss 0.12683567 - time (sec): 29.58 - samples/sec: 2963.23 - lr: 0.000028 - momentum: 0.000000
2023-10-14 08:30:28,367 epoch 2 - iter 432/723 - loss 0.12355013 - time (sec): 35.31 - samples/sec: 2965.98 - lr: 0.000028 - momentum: 0.000000
2023-10-14 08:30:34,488 epoch 2 - iter 504/723 - loss 0.12414425 - time (sec): 41.44 - samples/sec: 2962.13 - lr: 0.000028 - momentum: 0.000000
2023-10-14 08:30:39,830 epoch 2 - iter 576/723 - loss 0.12115131 - time (sec): 46.78 - samples/sec: 2963.61 - lr: 0.000027 - momentum: 0.000000
2023-10-14 08:30:46,149 epoch 2 - iter 648/723 - loss 0.11808836 - time (sec): 53.10 - samples/sec: 2972.27 - lr: 0.000027 - momentum: 0.000000
2023-10-14 08:30:52,188 epoch 2 - iter 720/723 - loss 0.11598087 - time (sec): 59.14 - samples/sec: 2970.40 - lr: 0.000027 - momentum: 0.000000
2023-10-14 08:30:52,397 ----------------------------------------------------------------------------------------------------
2023-10-14 08:30:52,397 EPOCH 2 done: loss 0.1160 - lr: 0.000027
2023-10-14 08:30:56,289 DEV : loss 0.09956270456314087 - f1-score (micro avg) 0.7529
2023-10-14 08:30:56,304 saving best model
2023-10-14 08:30:56,754 ----------------------------------------------------------------------------------------------------
2023-10-14 08:31:02,841 epoch 3 - iter 72/723 - loss 0.07480008 - time (sec): 6.09 - samples/sec: 2911.61 - lr: 0.000026 - momentum: 0.000000
2023-10-14 08:31:08,895 epoch 3 - iter 144/723 - loss 0.06907493 - time (sec): 12.14 - samples/sec: 2929.73 - lr: 0.000026 - momentum: 0.000000
2023-10-14 08:31:14,786 epoch 3 - iter 216/723 - loss 0.07126848 - time (sec): 18.03 - samples/sec: 2950.59 - lr: 0.000026 - momentum: 0.000000
2023-10-14 08:31:20,680 epoch 3 - iter 288/723 - loss 0.06958853 - time (sec): 23.92 - samples/sec: 2960.31 - lr: 0.000025 - momentum: 0.000000
2023-10-14 08:31:26,523 epoch 3 - iter 360/723 - loss 0.06902856 - time (sec): 29.77 - samples/sec: 2974.29 - lr: 0.000025 - momentum: 0.000000
2023-10-14 08:31:32,101 epoch 3 - iter 432/723 - loss 0.06866383 - time (sec): 35.35 - samples/sec: 3001.94 - lr: 0.000025 - momentum: 0.000000
2023-10-14 08:31:38,269 epoch 3 - iter 504/723 - loss 0.07120681 - time (sec): 41.51 - samples/sec: 2963.35 - lr: 0.000024 - momentum: 0.000000
2023-10-14 08:31:44,320 epoch 3 - iter 576/723 - loss 0.06999573 - time (sec): 47.56 - samples/sec: 2969.47 - lr: 0.000024 - momentum: 0.000000
2023-10-14 08:31:50,039 epoch 3 - iter 648/723 - loss 0.06994277 - time (sec): 53.28 - samples/sec: 2982.09 - lr: 0.000024 - momentum: 0.000000
2023-10-14 08:31:56,214 epoch 3 - iter 720/723 - loss 0.06941747 - time (sec): 59.46 - samples/sec: 2956.51 - lr: 0.000023 - momentum: 0.000000
2023-10-14 08:31:56,392 ----------------------------------------------------------------------------------------------------
2023-10-14 08:31:56,392 EPOCH 3 done: loss 0.0694 - lr: 0.000023
2023-10-14 08:31:59,885 DEV : loss 0.09209852665662766 - f1-score (micro avg) 0.7741
2023-10-14 08:31:59,909 saving best model
2023-10-14 08:32:00,444 ----------------------------------------------------------------------------------------------------
2023-10-14 08:32:07,148 epoch 4 - iter 72/723 - loss 0.04111019 - time (sec): 6.70 - samples/sec: 2691.45 - lr: 0.000023 - momentum: 0.000000
2023-10-14 08:32:13,780 epoch 4 - iter 144/723 - loss 0.03743022 - time (sec): 13.33 - samples/sec: 2745.82 - lr: 0.000023 - momentum: 0.000000
2023-10-14 08:32:19,590 epoch 4 - iter 216/723 - loss 0.04032334 - time (sec): 19.14 - samples/sec: 2778.83 - lr: 0.000022 - momentum: 0.000000
2023-10-14 08:32:25,877 epoch 4 - iter 288/723 - loss 0.04209697 - time (sec): 25.43 - samples/sec: 2803.23 - lr: 0.000022 - momentum: 0.000000
2023-10-14 08:32:31,717 epoch 4 - iter 360/723 - loss 0.04334243 - time (sec): 31.27 - samples/sec: 2834.19 - lr: 0.000022 - momentum: 0.000000
2023-10-14 08:32:37,415 epoch 4 - iter 432/723 - loss 0.04378885 - time (sec): 36.97 - samples/sec: 2846.93 - lr: 0.000021 - momentum: 0.000000
2023-10-14 08:32:43,029 epoch 4 - iter 504/723 - loss 0.04269892 - time (sec): 42.58 - samples/sec: 2864.00 - lr: 0.000021 - momentum: 0.000000
2023-10-14 08:32:49,370 epoch 4 - iter 576/723 - loss 0.04332948 - time (sec): 48.92 - samples/sec: 2871.55 - lr: 0.000021 - momentum: 0.000000
2023-10-14 08:32:55,288 epoch 4 - iter 648/723 - loss 0.04450354 - time (sec): 54.84 - samples/sec: 2867.88 - lr: 0.000020 - momentum: 0.000000
2023-10-14 08:33:01,292 epoch 4 - iter 720/723 - loss 0.04532438 - time (sec): 60.84 - samples/sec: 2888.17 - lr: 0.000020 - momentum: 0.000000
2023-10-14 08:33:01,510 ----------------------------------------------------------------------------------------------------
2023-10-14 08:33:01,510 EPOCH 4 done: loss 0.0459 - lr: 0.000020
2023-10-14 08:33:04,963 DEV : loss 0.09347887337207794 - f1-score (micro avg) 0.7897
2023-10-14 08:33:04,979 saving best model
2023-10-14 08:33:05,498 ----------------------------------------------------------------------------------------------------
2023-10-14 08:33:11,770 epoch 5 - iter 72/723 - loss 0.02961405 - time (sec): 6.27 - samples/sec: 2830.72 - lr: 0.000020 - momentum: 0.000000
2023-10-14 08:33:17,831 epoch 5 - iter 144/723 - loss 0.03413594 - time (sec): 12.33 - samples/sec: 2887.48 - lr: 0.000019 - momentum: 0.000000
2023-10-14 08:33:23,316 epoch 5 - iter 216/723 - loss 0.03172083 - time (sec): 17.82 - samples/sec: 2945.68 - lr: 0.000019 - momentum: 0.000000
2023-10-14 08:33:29,114 epoch 5 - iter 288/723 - loss 0.03207149 - time (sec): 23.62 - samples/sec: 2961.38 - lr: 0.000019 - momentum: 0.000000
2023-10-14 08:33:34,760 epoch 5 - iter 360/723 - loss 0.03130345 - time (sec): 29.26 - samples/sec: 2995.97 - lr: 0.000018 - momentum: 0.000000
2023-10-14 08:33:41,220 epoch 5 - iter 432/723 - loss 0.03111424 - time (sec): 35.72 - samples/sec: 2964.12 - lr: 0.000018 - momentum: 0.000000
2023-10-14 08:33:46,931 epoch 5 - iter 504/723 - loss 0.03173225 - time (sec): 41.43 - samples/sec: 2962.14 - lr: 0.000018 - momentum: 0.000000
2023-10-14 08:33:52,899 epoch 5 - iter 576/723 - loss 0.03178811 - time (sec): 47.40 - samples/sec: 2966.06 - lr: 0.000017 - momentum: 0.000000
2023-10-14 08:33:59,266 epoch 5 - iter 648/723 - loss 0.03309214 - time (sec): 53.77 - samples/sec: 2947.98 - lr: 0.000017 - momentum: 0.000000
2023-10-14 08:34:04,879 epoch 5 - iter 720/723 - loss 0.03278333 - time (sec): 59.38 - samples/sec: 2955.61 - lr: 0.000017 - momentum: 0.000000
2023-10-14 08:34:05,223 ----------------------------------------------------------------------------------------------------
2023-10-14 08:34:05,223 EPOCH 5 done: loss 0.0327 - lr: 0.000017
2023-10-14 08:34:09,605 DEV : loss 0.1093878448009491 - f1-score (micro avg) 0.8056
2023-10-14 08:34:09,627 saving best model
2023-10-14 08:34:10,174 ----------------------------------------------------------------------------------------------------
2023-10-14 08:34:16,234 epoch 6 - iter 72/723 - loss 0.02664406 - time (sec): 6.06 - samples/sec: 2967.43 - lr: 0.000016 - momentum: 0.000000
2023-10-14 08:34:22,424 epoch 6 - iter 144/723 - loss 0.02610538 - time (sec): 12.25 - samples/sec: 2951.86 - lr: 0.000016 - momentum: 0.000000
2023-10-14 08:34:28,422 epoch 6 - iter 216/723 - loss 0.02632130 - time (sec): 18.25 - samples/sec: 2958.67 - lr: 0.000016 - momentum: 0.000000
2023-10-14 08:34:34,905 epoch 6 - iter 288/723 - loss 0.02918746 - time (sec): 24.73 - samples/sec: 2906.90 - lr: 0.000015 - momentum: 0.000000
2023-10-14 08:34:40,499 epoch 6 - iter 360/723 - loss 0.02786739 - time (sec): 30.32 - samples/sec: 2923.14 - lr: 0.000015 - momentum: 0.000000
2023-10-14 08:34:46,405 epoch 6 - iter 432/723 - loss 0.02591985 - time (sec): 36.23 - samples/sec: 2914.88 - lr: 0.000015 - momentum: 0.000000
2023-10-14 08:34:52,512 epoch 6 - iter 504/723 - loss 0.02585531 - time (sec): 42.34 - samples/sec: 2913.63 - lr: 0.000014 - momentum: 0.000000
2023-10-14 08:34:58,755 epoch 6 - iter 576/723 - loss 0.02673939 - time (sec): 48.58 - samples/sec: 2916.01 - lr: 0.000014 - momentum: 0.000000
2023-10-14 08:35:04,619 epoch 6 - iter 648/723 - loss 0.02626021 - time (sec): 54.44 - samples/sec: 2908.49 - lr: 0.000014 - momentum: 0.000000
2023-10-14 08:35:10,343 epoch 6 - iter 720/723 - loss 0.02645765 - time (sec): 60.17 - samples/sec: 2919.78 - lr: 0.000013 - momentum: 0.000000
2023-10-14 08:35:10,568 ----------------------------------------------------------------------------------------------------
2023-10-14 08:35:10,568 EPOCH 6 done: loss 0.0264 - lr: 0.000013
2023-10-14 08:35:14,104 DEV : loss 0.1391119509935379 - f1-score (micro avg) 0.7855
2023-10-14 08:35:14,124 ----------------------------------------------------------------------------------------------------
2023-10-14 08:35:20,351 epoch 7 - iter 72/723 - loss 0.01350535 - time (sec): 6.23 - samples/sec: 2796.08 - lr: 0.000013 - momentum: 0.000000
2023-10-14 08:35:26,465 epoch 7 - iter 144/723 - loss 0.01712040 - time (sec): 12.34 - samples/sec: 2766.13 - lr: 0.000013 - momentum: 0.000000
2023-10-14 08:35:33,034 epoch 7 - iter 216/723 - loss 0.01751483 - time (sec): 18.91 - samples/sec: 2774.49 - lr: 0.000012 - momentum: 0.000000
2023-10-14 08:35:39,319 epoch 7 - iter 288/723 - loss 0.01623115 - time (sec): 25.19 - samples/sec: 2798.65 - lr: 0.000012 - momentum: 0.000000
2023-10-14 08:35:45,189 epoch 7 - iter 360/723 - loss 0.01745382 - time (sec): 31.06 - samples/sec: 2833.38 - lr: 0.000012 - momentum: 0.000000
2023-10-14 08:35:51,313 epoch 7 - iter 432/723 - loss 0.01897664 - time (sec): 37.19 - samples/sec: 2863.45 - lr: 0.000011 - momentum: 0.000000
2023-10-14 08:35:56,993 epoch 7 - iter 504/723 - loss 0.01893676 - time (sec): 42.87 - samples/sec: 2874.41 - lr: 0.000011 - momentum: 0.000000
2023-10-14 08:36:02,983 epoch 7 - iter 576/723 - loss 0.01873928 - time (sec): 48.86 - samples/sec: 2894.89 - lr: 0.000011 - momentum: 0.000000
2023-10-14 08:36:08,710 epoch 7 - iter 648/723 - loss 0.01874669 - time (sec): 54.59 - samples/sec: 2894.98 - lr: 0.000010 - momentum: 0.000000
2023-10-14 08:36:14,582 epoch 7 - iter 720/723 - loss 0.01849634 - time (sec): 60.46 - samples/sec: 2906.14 - lr: 0.000010 - momentum: 0.000000
2023-10-14 08:36:14,763 ----------------------------------------------------------------------------------------------------
2023-10-14 08:36:14,763 EPOCH 7 done: loss 0.0185 - lr: 0.000010
2023-10-14 08:36:18,275 DEV : loss 0.17075838148593903 - f1-score (micro avg) 0.8055
2023-10-14 08:36:18,295 ----------------------------------------------------------------------------------------------------
2023-10-14 08:36:24,144 epoch 8 - iter 72/723 - loss 0.01760211 - time (sec): 5.85 - samples/sec: 2982.67 - lr: 0.000010 - momentum: 0.000000
2023-10-14 08:36:30,981 epoch 8 - iter 144/723 - loss 0.01623072 - time (sec): 12.68 - samples/sec: 2781.67 - lr: 0.000009 - momentum: 0.000000
2023-10-14 08:36:36,973 epoch 8 - iter 216/723 - loss 0.01533673 - time (sec): 18.68 - samples/sec: 2833.48 - lr: 0.000009 - momentum: 0.000000
2023-10-14 08:36:42,799 epoch 8 - iter 288/723 - loss 0.01601558 - time (sec): 24.50 - samples/sec: 2864.13 - lr: 0.000009 - momentum: 0.000000
2023-10-14 08:36:49,045 epoch 8 - iter 360/723 - loss 0.01466925 - time (sec): 30.75 - samples/sec: 2897.17 - lr: 0.000008 - momentum: 0.000000
2023-10-14 08:36:54,836 epoch 8 - iter 432/723 - loss 0.01402939 - time (sec): 36.54 - samples/sec: 2898.47 - lr: 0.000008 - momentum: 0.000000
2023-10-14 08:37:00,513 epoch 8 - iter 504/723 - loss 0.01405464 - time (sec): 42.22 - samples/sec: 2926.51 - lr: 0.000008 - momentum: 0.000000
2023-10-14 08:37:06,081 epoch 8 - iter 576/723 - loss 0.01491234 - time (sec): 47.78 - samples/sec: 2932.12 - lr: 0.000007 - momentum: 0.000000
2023-10-14 08:37:12,626 epoch 8 - iter 648/723 - loss 0.01502329 - time (sec): 54.33 - samples/sec: 2916.30 - lr: 0.000007 - momentum: 0.000000
2023-10-14 08:37:18,676 epoch 8 - iter 720/723 - loss 0.01521053 - time (sec): 60.38 - samples/sec: 2912.50 - lr: 0.000007 - momentum: 0.000000
2023-10-14 08:37:18,861 ----------------------------------------------------------------------------------------------------
2023-10-14 08:37:18,861 EPOCH 8 done: loss 0.0152 - lr: 0.000007
2023-10-14 08:37:23,265 DEV : loss 0.17406047880649567 - f1-score (micro avg) 0.7968
2023-10-14 08:37:23,286 ----------------------------------------------------------------------------------------------------
2023-10-14 08:37:29,454 epoch 9 - iter 72/723 - loss 0.01059210 - time (sec): 6.17 - samples/sec: 2956.17 - lr: 0.000006 - momentum: 0.000000
2023-10-14 08:37:36,031 epoch 9 - iter 144/723 - loss 0.01193666 - time (sec): 12.74 - samples/sec: 2885.31 - lr: 0.000006 - momentum: 0.000000
2023-10-14 08:37:42,240 epoch 9 - iter 216/723 - loss 0.01045503 - time (sec): 18.95 - samples/sec: 2951.17 - lr: 0.000006 - momentum: 0.000000
2023-10-14 08:37:47,987 epoch 9 - iter 288/723 - loss 0.00976298 - time (sec): 24.70 - samples/sec: 2920.37 - lr: 0.000005 - momentum: 0.000000
2023-10-14 08:37:54,314 epoch 9 - iter 360/723 - loss 0.01051185 - time (sec): 31.03 - samples/sec: 2926.68 - lr: 0.000005 - momentum: 0.000000
2023-10-14 08:37:59,729 epoch 9 - iter 432/723 - loss 0.01007391 - time (sec): 36.44 - samples/sec: 2939.41 - lr: 0.000005 - momentum: 0.000000
2023-10-14 08:38:05,709 epoch 9 - iter 504/723 - loss 0.01053006 - time (sec): 42.42 - samples/sec: 2926.20 - lr: 0.000004 - momentum: 0.000000
2023-10-14 08:38:11,121 epoch 9 - iter 576/723 - loss 0.01019871 - time (sec): 47.83 - samples/sec: 2929.84 - lr: 0.000004 - momentum: 0.000000
2023-10-14 08:38:17,094 epoch 9 - iter 648/723 - loss 0.01035051 - time (sec): 53.81 - samples/sec: 2927.27 - lr: 0.000004 - momentum: 0.000000
2023-10-14 08:38:23,337 epoch 9 - iter 720/723 - loss 0.01068224 - time (sec): 60.05 - samples/sec: 2925.47 - lr: 0.000003 - momentum: 0.000000
2023-10-14 08:38:23,534 ----------------------------------------------------------------------------------------------------
2023-10-14 08:38:23,534 EPOCH 9 done: loss 0.0107 - lr: 0.000003
2023-10-14 08:38:27,028 DEV : loss 0.1967579573392868 - f1-score (micro avg) 0.7972
2023-10-14 08:38:27,047 ----------------------------------------------------------------------------------------------------
2023-10-14 08:38:33,194 epoch 10 - iter 72/723 - loss 0.00260707 - time (sec): 6.15 - samples/sec: 2990.09 - lr: 0.000003 - momentum: 0.000000
2023-10-14 08:38:38,655 epoch 10 - iter 144/723 - loss 0.00602904 - time (sec): 11.61 - samples/sec: 2988.23 - lr: 0.000003 - momentum: 0.000000
2023-10-14 08:38:44,750 epoch 10 - iter 216/723 - loss 0.01005193 - time (sec): 17.70 - samples/sec: 2974.13 - lr: 0.000002 - momentum: 0.000000
2023-10-14 08:38:51,428 epoch 10 - iter 288/723 - loss 0.00877443 - time (sec): 24.38 - samples/sec: 2908.63 - lr: 0.000002 - momentum: 0.000000
2023-10-14 08:38:56,998 epoch 10 - iter 360/723 - loss 0.00837203 - time (sec): 29.95 - samples/sec: 2933.31 - lr: 0.000002 - momentum: 0.000000
2023-10-14 08:39:03,575 epoch 10 - iter 432/723 - loss 0.00824030 - time (sec): 36.53 - samples/sec: 2931.89 - lr: 0.000001 - momentum: 0.000000
2023-10-14 08:39:09,222 epoch 10 - iter 504/723 - loss 0.00865644 - time (sec): 42.17 - samples/sec: 2942.99 - lr: 0.000001 - momentum: 0.000000
2023-10-14 08:39:14,978 epoch 10 - iter 576/723 - loss 0.00897306 - time (sec): 47.93 - samples/sec: 2944.84 - lr: 0.000001 - momentum: 0.000000
2023-10-14 08:39:20,723 epoch 10 - iter 648/723 - loss 0.00883926 - time (sec): 53.67 - samples/sec: 2940.14 - lr: 0.000000 - momentum: 0.000000
2023-10-14 08:39:26,903 epoch 10 - iter 720/723 - loss 0.00871898 - time (sec): 59.85 - samples/sec: 2938.23 - lr: 0.000000 - momentum: 0.000000
2023-10-14 08:39:27,070 ----------------------------------------------------------------------------------------------------
2023-10-14 08:39:27,070 EPOCH 10 done: loss 0.0088 - lr: 0.000000
2023-10-14 08:39:30,587 DEV : loss 0.2016027718782425 - f1-score (micro avg) 0.7978
2023-10-14 08:39:30,972 ----------------------------------------------------------------------------------------------------
2023-10-14 08:39:30,973 Loading model from best epoch ...
2023-10-14 08:39:32,712 SequenceTagger predicts: Dictionary with 13 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG
2023-10-14 08:39:35,860 Results:
- F-score (micro) 0.7954
- F-score (macro) 0.6863
- Accuracy 0.6765

By class:
              precision    recall  f1-score   support

         PER     0.7629    0.8610    0.8090       482
         LOC     0.8997    0.7838    0.8378       458
         ORG     0.4355    0.3913    0.4122        69

   micro avg     0.7970    0.7939    0.7954      1009
   macro avg     0.6994    0.6787    0.6863      1009
weighted avg     0.8026    0.7939    0.7949      1009

2023-10-14 08:39:35,861 ----------------------------------------------------------------------------------------------------