2023-10-14 11:08:39,421 ----------------------------------------------------------------------------------------------------
2023-10-14 11:08:39,422 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-14 11:08:39,422 ----------------------------------------------------------------------------------------------------
2023-10-14 11:08:39,422 MultiCorpus: 5777 train + 722 dev + 723 test sentences
 - NER_ICDAR_EUROPEANA Corpus: 5777 train + 722 dev + 723 test sentences - /root/.flair/datasets/ner_icdar_europeana/nl
2023-10-14 11:08:39,422 ----------------------------------------------------------------------------------------------------
2023-10-14 11:08:39,422 Train: 5777 sentences
2023-10-14 11:08:39,422 (train_with_dev=False, train_with_test=False)
2023-10-14 11:08:39,423 ----------------------------------------------------------------------------------------------------
2023-10-14 11:08:39,423 Training Params:
2023-10-14 11:08:39,423 - learning_rate: "5e-05"
2023-10-14 11:08:39,423 - mini_batch_size: "8"
2023-10-14 11:08:39,423 - max_epochs: "10"
2023-10-14 11:08:39,423 - shuffle: "True"
2023-10-14 11:08:39,423 ----------------------------------------------------------------------------------------------------
2023-10-14 11:08:39,423 Plugins:
2023-10-14 11:08:39,423 - LinearScheduler | warmup_fraction: '0.1'
2023-10-14 11:08:39,423 ----------------------------------------------------------------------------------------------------
2023-10-14 11:08:39,423 Final evaluation on model from best epoch (best-model.pt)
2023-10-14 11:08:39,423 - metric: "('micro avg', 'f1-score')"
2023-10-14 11:08:39,423 ----------------------------------------------------------------------------------------------------
2023-10-14 11:08:39,423 Computation:
2023-10-14 11:08:39,423 - compute on device: cuda:0
2023-10-14 11:08:39,423 - embedding storage: none
2023-10-14 11:08:39,423 ----------------------------------------------------------------------------------------------------
2023-10-14 11:08:39,423 Model training base path: "hmbench-icdar/nl-dbmdz/bert-base-historic-multilingual-cased-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-4"
2023-10-14 11:08:39,423 ----------------------------------------------------------------------------------------------------
2023-10-14 11:08:39,423 ----------------------------------------------------------------------------------------------------
2023-10-14 11:08:45,872 epoch 1 - iter 72/723 - loss 1.95446486 - time (sec): 6.45 - samples/sec: 2891.94 - lr: 0.000005 - momentum: 0.000000
2023-10-14 11:08:51,688 epoch 1 - iter 144/723 - loss 1.15070028 - time (sec): 12.26 - samples/sec: 2945.36 - lr: 0.000010 - momentum: 0.000000
2023-10-14 11:08:57,784 epoch 1 - iter 216/723 - loss 0.84752208 - time (sec): 18.36 - samples/sec: 2902.56 - lr: 0.000015 - momentum: 0.000000
2023-10-14 11:09:03,660 epoch 1 - iter 288/723 - loss 0.68506420 - time (sec): 24.24 - samples/sec: 2896.34 - lr: 0.000020 - momentum: 0.000000
2023-10-14 11:09:09,759 epoch 1 - iter 360/723 - loss 0.57910742 - time (sec): 30.34 - samples/sec: 2910.32 - lr: 0.000025 - momentum: 0.000000
2023-10-14 11:09:15,522 epoch 1 - iter 432/723 - loss 0.50869393 - time (sec): 36.10 - samples/sec: 2939.24 - lr: 0.000030 - momentum: 0.000000
2023-10-14 11:09:21,253 epoch 1 - iter 504/723 - loss 0.45633381 - time (sec): 41.83 - samples/sec: 2955.22 - lr: 0.000035 - momentum: 0.000000
2023-10-14 11:09:27,720 epoch 1 - iter 576/723 - loss 0.41501574 - time (sec): 48.30 - samples/sec: 2953.32 - lr: 0.000040 - momentum: 0.000000
2023-10-14 11:09:33,809 epoch 1 - iter 648/723 - loss 0.38295739 - time (sec): 54.39 - samples/sec: 2940.77 - lr: 0.000045 - momentum: 0.000000
2023-10-14 11:09:38,985 epoch 1 - iter 720/723 - loss 0.36045102 - time (sec): 59.56 - samples/sec: 2949.21 - lr: 0.000050 - momentum: 0.000000
2023-10-14 11:09:39,190 ----------------------------------------------------------------------------------------------------
2023-10-14 11:09:39,190 EPOCH 1 done: loss 0.3598 - lr: 0.000050
2023-10-14 11:09:42,683 DEV : loss 0.11006532609462738 - f1-score (micro avg) 0.7259
2023-10-14 11:09:42,700 saving best model
2023-10-14 11:09:43,090 ----------------------------------------------------------------------------------------------------
2023-10-14 11:09:48,823 epoch 2 - iter 72/723 - loss 0.11097378 - time (sec): 5.73 - samples/sec: 2830.47 - lr: 0.000049 - momentum: 0.000000
2023-10-14 11:09:54,889 epoch 2 - iter 144/723 - loss 0.10276963 - time (sec): 11.80 - samples/sec: 2864.35 - lr: 0.000049 - momentum: 0.000000
2023-10-14 11:10:01,074 epoch 2 - iter 216/723 - loss 0.10943754 - time (sec): 17.98 - samples/sec: 2878.27 - lr: 0.000048 - momentum: 0.000000
2023-10-14 11:10:07,800 epoch 2 - iter 288/723 - loss 0.10336155 - time (sec): 24.71 - samples/sec: 2868.76 - lr: 0.000048 - momentum: 0.000000
2023-10-14 11:10:13,888 epoch 2 - iter 360/723 - loss 0.09910848 - time (sec): 30.80 - samples/sec: 2884.35 - lr: 0.000047 - momentum: 0.000000
2023-10-14 11:10:19,661 epoch 2 - iter 432/723 - loss 0.09836247 - time (sec): 36.57 - samples/sec: 2887.37 - lr: 0.000047 - momentum: 0.000000
2023-10-14 11:10:25,255 epoch 2 - iter 504/723 - loss 0.09901695 - time (sec): 42.16 - samples/sec: 2896.44 - lr: 0.000046 - momentum: 0.000000
2023-10-14 11:10:30,985 epoch 2 - iter 576/723 - loss 0.09643764 - time (sec): 47.89 - samples/sec: 2911.46 - lr: 0.000046 - momentum: 0.000000
2023-10-14 11:10:37,114 epoch 2 - iter 648/723 - loss 0.09508996 - time (sec): 54.02 - samples/sec: 2910.76 - lr: 0.000045 - momentum: 0.000000
2023-10-14 11:10:43,191 epoch 2 - iter 720/723 - loss 0.09515557 - time (sec): 60.10 - samples/sec: 2923.45 - lr: 0.000044 - momentum: 0.000000
2023-10-14 11:10:43,424 ----------------------------------------------------------------------------------------------------
2023-10-14 11:10:43,425 EPOCH 2 done: loss 0.0950 - lr: 0.000044
2023-10-14 11:10:46,938 DEV : loss 0.11145603656768799 - f1-score (micro avg) 0.6008
2023-10-14 11:10:46,954 ----------------------------------------------------------------------------------------------------
2023-10-14 11:10:53,078 epoch 3 - iter 72/723 - loss 0.05697663 - time (sec): 6.12 - samples/sec: 2960.64 - lr: 0.000044 - momentum: 0.000000
2023-10-14 11:10:59,115 epoch 3 - iter 144/723 - loss 0.05590903 - time (sec): 12.16 - samples/sec: 2928.48 - lr: 0.000043 - momentum: 0.000000
2023-10-14 11:11:04,869 epoch 3 - iter 216/723 - loss 0.06028807 - time (sec): 17.91 - samples/sec: 2898.16 - lr: 0.000043 - momentum: 0.000000
2023-10-14 11:11:10,528 epoch 3 - iter 288/723 - loss 0.06014774 - time (sec): 23.57 - samples/sec: 2937.38 - lr: 0.000042 - momentum: 0.000000
2023-10-14 11:11:16,547 epoch 3 - iter 360/723 - loss 0.05942668 - time (sec): 29.59 - samples/sec: 2963.42 - lr: 0.000042 - momentum: 0.000000
2023-10-14 11:11:22,397 epoch 3 - iter 432/723 - loss 0.06112432 - time (sec): 35.44 - samples/sec: 2966.69 - lr: 0.000041 - momentum: 0.000000
2023-10-14 11:11:28,720 epoch 3 - iter 504/723 - loss 0.06151410 - time (sec): 41.77 - samples/sec: 2965.99 - lr: 0.000041 - momentum: 0.000000
2023-10-14 11:11:34,166 epoch 3 - iter 576/723 - loss 0.06139185 - time (sec): 47.21 - samples/sec: 2972.35 - lr: 0.000040 - momentum: 0.000000
2023-10-14 11:11:40,094 epoch 3 - iter 648/723 - loss 0.06076136 - time (sec): 53.14 - samples/sec: 2963.14 - lr: 0.000039 - momentum: 0.000000
2023-10-14 11:11:46,678 epoch 3 - iter 720/723 - loss 0.06160248 - time (sec): 59.72 - samples/sec: 2936.77 - lr: 0.000039 - momentum: 0.000000
2023-10-14 11:11:47,005 ----------------------------------------------------------------------------------------------------
2023-10-14 11:11:47,005 EPOCH 3 done: loss 0.0615 - lr: 0.000039
2023-10-14 11:11:50,496 DEV : loss 0.09180538356304169 - f1-score (micro avg) 0.8029
2023-10-14 11:11:50,515 saving best model
2023-10-14 11:11:50,997 ----------------------------------------------------------------------------------------------------
2023-10-14 11:11:57,031 epoch 4 - iter 72/723 - loss 0.03286789 - time (sec): 6.03 - samples/sec: 2911.68 - lr: 0.000038 - momentum: 0.000000
2023-10-14 11:12:03,391 epoch 4 - iter 144/723 - loss 0.04890866 - time (sec): 12.39 - samples/sec: 2893.01 - lr: 0.000038 - momentum: 0.000000
2023-10-14 11:12:09,237 epoch 4 - iter 216/723 - loss 0.04888331 - time (sec): 18.24 - samples/sec: 2894.47 - lr: 0.000037 - momentum: 0.000000
2023-10-14 11:12:15,546 epoch 4 - iter 288/723 - loss 0.04495173 - time (sec): 24.55 - samples/sec: 2878.44 - lr: 0.000037 - momentum: 0.000000
2023-10-14 11:12:21,075 epoch 4 - iter 360/723 - loss 0.04443698 - time (sec): 30.08 - samples/sec: 2897.91 - lr: 0.000036 - momentum: 0.000000
2023-10-14 11:12:27,144 epoch 4 - iter 432/723 - loss 0.04236909 - time (sec): 36.15 - samples/sec: 2924.18 - lr: 0.000036 - momentum: 0.000000
2023-10-14 11:12:33,112 epoch 4 - iter 504/723 - loss 0.04277641 - time (sec): 42.11 - samples/sec: 2914.76 - lr: 0.000035 - momentum: 0.000000
2023-10-14 11:12:39,122 epoch 4 - iter 576/723 - loss 0.04268064 - time (sec): 48.12 - samples/sec: 2919.51 - lr: 0.000034 - momentum: 0.000000
2023-10-14 11:12:45,234 epoch 4 - iter 648/723 - loss 0.04288486 - time (sec): 54.24 - samples/sec: 2923.74 - lr: 0.000034 - momentum: 0.000000
2023-10-14 11:12:51,242 epoch 4 - iter 720/723 - loss 0.04212052 - time (sec): 60.24 - samples/sec: 2914.87 - lr: 0.000033 - momentum: 0.000000
2023-10-14 11:12:51,437 ----------------------------------------------------------------------------------------------------
2023-10-14 11:12:51,437 EPOCH 4 done: loss 0.0422 - lr: 0.000033
2023-10-14 11:12:55,395 DEV : loss 0.08765760809183121 - f1-score (micro avg) 0.8267
2023-10-14 11:12:55,411 saving best model
2023-10-14 11:12:55,905 ----------------------------------------------------------------------------------------------------
2023-10-14 11:13:02,279 epoch 5 - iter 72/723 - loss 0.03255457 - time (sec): 6.37 - samples/sec: 2890.41 - lr: 0.000033 - momentum: 0.000000
2023-10-14 11:13:07,749 epoch 5 - iter 144/723 - loss 0.02936104 - time (sec): 11.84 - samples/sec: 2992.68 - lr: 0.000032 - momentum: 0.000000
2023-10-14 11:13:14,058 epoch 5 - iter 216/723 - loss 0.02869964 - time (sec): 18.15 - samples/sec: 2982.64 - lr: 0.000032 - momentum: 0.000000
2023-10-14 11:13:19,868 epoch 5 - iter 288/723 - loss 0.03183997 - time (sec): 23.96 - samples/sec: 2956.14 - lr: 0.000031 - momentum: 0.000000
2023-10-14 11:13:25,531 epoch 5 - iter 360/723 - loss 0.03090078 - time (sec): 29.62 - samples/sec: 2958.85 - lr: 0.000031 - momentum: 0.000000
2023-10-14 11:13:31,005 epoch 5 - iter 432/723 - loss 0.03133124 - time (sec): 35.10 - samples/sec: 2956.71 - lr: 0.000030 - momentum: 0.000000
2023-10-14 11:13:37,056 epoch 5 - iter 504/723 - loss 0.03078959 - time (sec): 41.15 - samples/sec: 2963.95 - lr: 0.000029 - momentum: 0.000000
2023-10-14 11:13:43,185 epoch 5 - iter 576/723 - loss 0.03127458 - time (sec): 47.28 - samples/sec: 2955.69 - lr: 0.000029 - momentum: 0.000000
2023-10-14 11:13:49,383 epoch 5 - iter 648/723 - loss 0.03267352 - time (sec): 53.48 - samples/sec: 2959.58 - lr: 0.000028 - momentum: 0.000000
2023-10-14 11:13:55,255 epoch 5 - iter 720/723 - loss 0.03166451 - time (sec): 59.35 - samples/sec: 2960.54 - lr: 0.000028 - momentum: 0.000000
2023-10-14 11:13:55,425 ----------------------------------------------------------------------------------------------------
2023-10-14 11:13:55,425 EPOCH 5 done: loss 0.0316 - lr: 0.000028
2023-10-14 11:13:58,953 DEV : loss 0.13083083927631378 - f1-score (micro avg) 0.802
2023-10-14 11:13:58,970 ----------------------------------------------------------------------------------------------------
2023-10-14 11:14:05,067 epoch 6 - iter 72/723 - loss 0.01949715 - time (sec): 6.10 - samples/sec: 2846.45 - lr: 0.000027 - momentum: 0.000000
2023-10-14 11:14:11,288 epoch 6 - iter 144/723 - loss 0.02577060 - time (sec): 12.32 - samples/sec: 2847.72 - lr: 0.000027 - momentum: 0.000000
2023-10-14 11:14:17,115 epoch 6 - iter 216/723 - loss 0.02325824 - time (sec): 18.14 - samples/sec: 2893.21 - lr: 0.000026 - momentum: 0.000000
2023-10-14 11:14:23,395 epoch 6 - iter 288/723 - loss 0.02320719 - time (sec): 24.42 - samples/sec: 2887.98 - lr: 0.000026 - momentum: 0.000000
2023-10-14 11:14:29,178 epoch 6 - iter 360/723 - loss 0.02320219 - time (sec): 30.21 - samples/sec: 2894.14 - lr: 0.000025 - momentum: 0.000000
2023-10-14 11:14:35,636 epoch 6 - iter 432/723 - loss 0.02204302 - time (sec): 36.66 - samples/sec: 2864.92 - lr: 0.000024 - momentum: 0.000000
2023-10-14 11:14:41,853 epoch 6 - iter 504/723 - loss 0.02223350 - time (sec): 42.88 - samples/sec: 2865.35 - lr: 0.000024 - momentum: 0.000000
2023-10-14 11:14:48,295 epoch 6 - iter 576/723 - loss 0.02115934 - time (sec): 49.32 - samples/sec: 2883.97 - lr: 0.000023 - momentum: 0.000000
2023-10-14 11:14:54,075 epoch 6 - iter 648/723 - loss 0.02227967 - time (sec): 55.10 - samples/sec: 2887.62 - lr: 0.000023 - momentum: 0.000000
2023-10-14 11:14:59,656 epoch 6 - iter 720/723 - loss 0.02225091 - time (sec): 60.68 - samples/sec: 2896.19 - lr: 0.000022 - momentum: 0.000000
2023-10-14 11:14:59,822 ----------------------------------------------------------------------------------------------------
2023-10-14 11:14:59,822 EPOCH 6 done: loss 0.0222 - lr: 0.000022
2023-10-14 11:15:03,409 DEV : loss 0.14740866422653198 - f1-score (micro avg) 0.8142
2023-10-14 11:15:03,424 ----------------------------------------------------------------------------------------------------
2023-10-14 11:15:09,622 epoch 7 - iter 72/723 - loss 0.00957452 - time (sec): 6.20 - samples/sec: 2831.54 - lr: 0.000022 - momentum: 0.000000
2023-10-14 11:15:16,166 epoch 7 - iter 144/723 - loss 0.01188012 - time (sec): 12.74 - samples/sec: 2876.66 - lr: 0.000021 - momentum: 0.000000
2023-10-14 11:15:21,813 epoch 7 - iter 216/723 - loss 0.01261768 - time (sec): 18.39 - samples/sec: 2910.92 - lr: 0.000021 - momentum: 0.000000
2023-10-14 11:15:27,957 epoch 7 - iter 288/723 - loss 0.01369299 - time (sec): 24.53 - samples/sec: 2931.95 - lr: 0.000020 - momentum: 0.000000
2023-10-14 11:15:33,566 epoch 7 - iter 360/723 - loss 0.01532146 - time (sec): 30.14 - samples/sec: 2935.86 - lr: 0.000019 - momentum: 0.000000
2023-10-14 11:15:39,009 epoch 7 - iter 432/723 - loss 0.01513085 - time (sec): 35.58 - samples/sec: 2953.20 - lr: 0.000019 - momentum: 0.000000
2023-10-14 11:15:45,475 epoch 7 - iter 504/723 - loss 0.01624201 - time (sec): 42.05 - samples/sec: 2945.23 - lr: 0.000018 - momentum: 0.000000
2023-10-14 11:15:51,426 epoch 7 - iter 576/723 - loss 0.01671837 - time (sec): 48.00 - samples/sec: 2955.70 - lr: 0.000018 - momentum: 0.000000
2023-10-14 11:15:56,996 epoch 7 - iter 648/723 - loss 0.01672154 - time (sec): 53.57 - samples/sec: 2970.48 - lr: 0.000017 - momentum: 0.000000
2023-10-14 11:16:03,131 epoch 7 - iter 720/723 - loss 0.01649884 - time (sec): 59.71 - samples/sec: 2944.54 - lr: 0.000017 - momentum: 0.000000
2023-10-14 11:16:03,334 ----------------------------------------------------------------------------------------------------
2023-10-14 11:16:03,334 EPOCH 7 done: loss 0.0170 - lr: 0.000017
2023-10-14 11:16:07,244 DEV : loss 0.16563206911087036 - f1-score (micro avg) 0.8131
2023-10-14 11:16:07,261 ----------------------------------------------------------------------------------------------------
2023-10-14 11:16:13,097 epoch 8 - iter 72/723 - loss 0.01815901 - time (sec): 5.83 - samples/sec: 2864.79 - lr: 0.000016 - momentum: 0.000000
2023-10-14 11:16:19,181 epoch 8 - iter 144/723 - loss 0.01454238 - time (sec): 11.92 - samples/sec: 2902.43 - lr: 0.000016 - momentum: 0.000000
2023-10-14 11:16:25,206 epoch 8 - iter 216/723 - loss 0.01396639 - time (sec): 17.94 - samples/sec: 2916.65 - lr: 0.000015 - momentum: 0.000000
2023-10-14 11:16:31,319 epoch 8 - iter 288/723 - loss 0.01233712 - time (sec): 24.06 - samples/sec: 2896.51 - lr: 0.000014 - momentum: 0.000000
2023-10-14 11:16:37,272 epoch 8 - iter 360/723 - loss 0.01149011 - time (sec): 30.01 - samples/sec: 2930.63 - lr: 0.000014 - momentum: 0.000000
2023-10-14 11:16:42,721 epoch 8 - iter 432/723 - loss 0.01140242 - time (sec): 35.46 - samples/sec: 2948.80 - lr: 0.000013 - momentum: 0.000000
2023-10-14 11:16:49,014 epoch 8 - iter 504/723 - loss 0.01267287 - time (sec): 41.75 - samples/sec: 2939.73 - lr: 0.000013 - momentum: 0.000000
2023-10-14 11:16:54,938 epoch 8 - iter 576/723 - loss 0.01271844 - time (sec): 47.68 - samples/sec: 2944.93 - lr: 0.000012 - momentum: 0.000000
2023-10-14 11:17:00,516 epoch 8 - iter 648/723 - loss 0.01192778 - time (sec): 53.25 - samples/sec: 2958.87 - lr: 0.000012 - momentum: 0.000000
2023-10-14 11:17:06,796 epoch 8 - iter 720/723 - loss 0.01221300 - time (sec): 59.53 - samples/sec: 2946.63 - lr: 0.000011 - momentum: 0.000000
2023-10-14 11:17:07,022 ----------------------------------------------------------------------------------------------------
2023-10-14 11:17:07,022 EPOCH 8 done: loss 0.0122 - lr: 0.000011
2023-10-14 11:17:10,588 DEV : loss 0.1542436182498932 - f1-score (micro avg) 0.8318
2023-10-14 11:17:10,605 saving best model
2023-10-14 11:17:11,147 ----------------------------------------------------------------------------------------------------
2023-10-14 11:17:17,372 epoch 9 - iter 72/723 - loss 0.00632595 - time (sec): 6.22 - samples/sec: 2919.02 - lr: 0.000011 - momentum: 0.000000
2023-10-14 11:17:23,181 epoch 9 - iter 144/723 - loss 0.00583846 - time (sec): 12.03 - samples/sec: 2936.80 - lr: 0.000010 - momentum: 0.000000
2023-10-14 11:17:29,380 epoch 9 - iter 216/723 - loss 0.00746569 - time (sec): 18.23 - samples/sec: 2858.87 - lr: 0.000009 - momentum: 0.000000
2023-10-14 11:17:36,031 epoch 9 - iter 288/723 - loss 0.00781052 - time (sec): 24.88 - samples/sec: 2868.30 - lr: 0.000009 - momentum: 0.000000
2023-10-14 11:17:41,837 epoch 9 - iter 360/723 - loss 0.00707142 - time (sec): 30.69 - samples/sec: 2892.67 - lr: 0.000008 - momentum: 0.000000
2023-10-14 11:17:47,948 epoch 9 - iter 432/723 - loss 0.00709441 - time (sec): 36.80 - samples/sec: 2898.81 - lr: 0.000008 - momentum: 0.000000
2023-10-14 11:17:53,401 epoch 9 - iter 504/723 - loss 0.00711054 - time (sec): 42.25 - samples/sec: 2923.32 - lr: 0.000007 - momentum: 0.000000
2023-10-14 11:17:59,535 epoch 9 - iter 576/723 - loss 0.00729666 - time (sec): 48.39 - samples/sec: 2923.90 - lr: 0.000007 - momentum: 0.000000
2023-10-14 11:18:05,332 epoch 9 - iter 648/723 - loss 0.00763742 - time (sec): 54.18 - samples/sec: 2924.87 - lr: 0.000006 - momentum: 0.000000
2023-10-14 11:18:11,175 epoch 9 - iter 720/723 - loss 0.00760018 - time (sec): 60.03 - samples/sec: 2927.43 - lr: 0.000006 - momentum: 0.000000
2023-10-14 11:18:11,415 ----------------------------------------------------------------------------------------------------
2023-10-14 11:18:11,415 EPOCH 9 done: loss 0.0077 - lr: 0.000006
2023-10-14 11:18:14,900 DEV : loss 0.16931197047233582 - f1-score (micro avg) 0.826
2023-10-14 11:18:14,916 ----------------------------------------------------------------------------------------------------
2023-10-14 11:18:20,563 epoch 10 - iter 72/723 - loss 0.00226973 - time (sec): 5.65 - samples/sec: 2956.98 - lr: 0.000005 - momentum: 0.000000
2023-10-14 11:18:26,764 epoch 10 - iter 144/723 - loss 0.00486341 - time (sec): 11.85 - samples/sec: 2931.85 - lr: 0.000004 - momentum: 0.000000
2023-10-14 11:18:33,052 epoch 10 - iter 216/723 - loss 0.00637801 - time (sec): 18.13 - samples/sec: 2889.19 - lr: 0.000004 - momentum: 0.000000
2023-10-14 11:18:39,200 epoch 10 - iter 288/723 - loss 0.00616245 - time (sec): 24.28 - samples/sec: 2929.82 - lr: 0.000003 - momentum: 0.000000
2023-10-14 11:18:45,320 epoch 10 - iter 360/723 - loss 0.00540335 - time (sec): 30.40 - samples/sec: 2947.65 - lr: 0.000003 - momentum: 0.000000
2023-10-14 11:18:51,198 epoch 10 - iter 432/723 - loss 0.00563038 - time (sec): 36.28 - samples/sec: 2945.76 - lr: 0.000002 - momentum: 0.000000
2023-10-14 11:18:56,544 epoch 10 - iter 504/723 - loss 0.00536321 - time (sec): 41.63 - samples/sec: 2944.91 - lr: 0.000002 - momentum: 0.000000
2023-10-14 11:19:02,296 epoch 10 - iter 576/723 - loss 0.00493682 - time (sec): 47.38 - samples/sec: 2940.31 - lr: 0.000001 - momentum: 0.000000
2023-10-14 11:19:08,584 epoch 10 - iter 648/723 - loss 0.00541293 - time (sec): 53.67 - samples/sec: 2939.73 - lr: 0.000001 - momentum: 0.000000
2023-10-14 11:19:14,511 epoch 10 - iter 720/723 - loss 0.00514294 - time (sec): 59.59 - samples/sec: 2944.81 - lr: 0.000000 - momentum: 0.000000
2023-10-14 11:19:14,745 ----------------------------------------------------------------------------------------------------
2023-10-14 11:19:14,745 EPOCH 10 done: loss 0.0052 - lr: 0.000000
2023-10-14 11:19:18,797 DEV : loss 0.17554564774036407 - f1-score (micro avg) 0.8284
2023-10-14 11:19:19,239 ----------------------------------------------------------------------------------------------------
2023-10-14 11:19:19,240 Loading model from best epoch ...
2023-10-14 11:19:20,833 SequenceTagger predicts: Dictionary with 13 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG
2023-10-14 11:19:24,050 Results:
- F-score (micro) 0.8061
- F-score (macro) 0.6941
- Accuracy 0.6877

By class:
              precision    recall  f1-score   support

         PER     0.7714    0.8610    0.8137       482
         LOC     0.8868    0.8210    0.8526       458
         ORG     0.4643    0.3768    0.4160        69

   micro avg     0.8026    0.8097    0.8061      1009
   macro avg     0.7075    0.6863    0.6941      1009
weighted avg     0.8028    0.8097    0.8042      1009

2023-10-14 11:19:24,050 ----------------------------------------------------------------------------------------------------
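
The F-scores in the final results above can be cross-checked from the reported precision and recall. The following is a minimal sketch, not part of the training run: it assumes the standard harmonic-mean definition F1 = 2PR / (P + R) and an unweighted mean over classes for the macro score. The helper name `f1` and the dictionary `by_class` are illustrative, and the inputs are the four-decimal values copied from the log, so agreement is only expected up to rounding.

```python
def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall) per class, copied from the "By class" table in the log.
by_class = {
    "PER": (0.7714, 0.8610),
    "LOC": (0.8868, 0.8210),
    "ORG": (0.4643, 0.3768),
}

# Per-class F1, recomputed from the rounded precision/recall values.
class_f1 = {label: f1(p, r) for label, (p, r) in by_class.items()}

# Macro F1: unweighted mean of the per-class F1 scores.
macro_f1 = sum(class_f1.values()) / len(class_f1)

# Micro F1: recomputed from the "micro avg" precision and recall in the log.
micro_f1 = f1(0.8026, 0.8097)

for label, score in class_f1.items():
    print(f"{label}: {score:.4f}")
print(f"macro: {macro_f1:.4f}")
print(f"micro: {micro_f1:.4f}")
```

The recomputed values should match the logged f1-score column to within rounding. The ORG class has only 69 supporting entities and the lowest F1, which is why the macro average falls well below the micro average.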