2023-10-16 18:27:24,156 ----------------------------------------------------------------------------------------------------
2023-10-16 18:27:24,157 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-16 18:27:24,157 ----------------------------------------------------------------------------------------------------
2023-10-16 18:27:24,157 MultiCorpus: 1166 train + 165 dev + 415 test sentences
 - NER_HIPE_2022 Corpus: 1166 train + 165 dev + 415 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fi/with_doc_seperator
2023-10-16 18:27:24,157 ----------------------------------------------------------------------------------------------------
2023-10-16 18:27:24,157 Train:  1166 sentences
2023-10-16 18:27:24,157 (train_with_dev=False, train_with_test=False)
2023-10-16 18:27:24,157 ----------------------------------------------------------------------------------------------------
2023-10-16 18:27:24,158 Training Params:
2023-10-16 18:27:24,158  - learning_rate: "3e-05"
2023-10-16 18:27:24,158  - mini_batch_size: "8"
2023-10-16 18:27:24,158  - max_epochs: "10"
2023-10-16 18:27:24,158  - shuffle: "True"
2023-10-16 18:27:24,158 ----------------------------------------------------------------------------------------------------
2023-10-16 18:27:24,158 Plugins:
2023-10-16 18:27:24,158  - LinearScheduler | warmup_fraction: '0.1'
2023-10-16 18:27:24,158 ----------------------------------------------------------------------------------------------------
2023-10-16 18:27:24,158 Final evaluation on model from best epoch (best-model.pt)
2023-10-16 18:27:24,158  - metric: "('micro avg', 'f1-score')"
2023-10-16 18:27:24,158 ----------------------------------------------------------------------------------------------------
2023-10-16 18:27:24,158 Computation:
2023-10-16 18:27:24,158  - compute on device: cuda:0
2023-10-16 18:27:24,158  - embedding storage: none
2023-10-16 18:27:24,158 ----------------------------------------------------------------------------------------------------
2023-10-16 18:27:24,158 Model training base path: "hmbench-newseye/fi-dbmdz/bert-base-historic-multilingual-cased-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-3"
2023-10-16 18:27:24,158 ----------------------------------------------------------------------------------------------------
2023-10-16 18:27:24,158 ----------------------------------------------------------------------------------------------------
2023-10-16 18:27:25,317 epoch 1 - iter 14/146 - loss 2.93513404 - time (sec): 1.16 - samples/sec: 3436.39 - lr: 0.000003 - momentum: 0.000000
2023-10-16 18:27:26,505 epoch 1 - iter 28/146 - loss 2.77389228 - time (sec): 2.35 - samples/sec: 3099.80 - lr: 0.000006 - momentum: 0.000000
2023-10-16 18:27:28,131 epoch 1 - iter 42/146 - loss 2.21274536 - time (sec): 3.97 - samples/sec: 3082.74 - lr: 0.000008 - momentum: 0.000000
2023-10-16 18:27:29,576 epoch 1 - iter 56/146 - loss 1.86758775 - time (sec): 5.42 - samples/sec: 3038.81 - lr: 0.000011 - momentum: 0.000000
2023-10-16 18:27:30,809 epoch 1 - iter 70/146 - loss 1.63388705 - time (sec): 6.65 - samples/sec: 2992.94 - lr: 0.000014 - momentum: 0.000000
2023-10-16 18:27:32,063 epoch 1 - iter 84/146 - loss 1.51508448 - time (sec): 7.90 - samples/sec: 2991.87 - lr: 0.000017 - momentum: 0.000000
2023-10-16 18:27:33,978 epoch 1 - iter 98/146 - loss 1.33298731 - time (sec): 9.82 - samples/sec: 2942.53 - lr: 0.000020 - momentum: 0.000000
2023-10-16 18:27:35,323 epoch 1 - iter 112/146 - loss 1.20821560 - time (sec): 11.16 - samples/sec: 2986.86 - lr: 0.000023 - momentum: 0.000000
2023-10-16 18:27:36,996 epoch 1 - iter 126/146 - loss 1.09396532 - time (sec): 12.84 - samples/sec: 2965.34 - lr: 0.000026 - momentum: 0.000000
2023-10-16 18:27:38,420 epoch 1 - iter 140/146 - loss 1.00755548 - time (sec): 14.26 - samples/sec: 2973.03 - lr: 0.000029 - momentum: 0.000000
2023-10-16 18:27:39,093 ----------------------------------------------------------------------------------------------------
2023-10-16 18:27:39,094 EPOCH 1 done: loss 0.9748 - lr: 0.000029
2023-10-16 18:27:39,905 DEV : loss 0.21248747408390045 - f1-score (micro avg) 0.4671
2023-10-16 18:27:39,911 saving best model
2023-10-16 18:27:40,379 ----------------------------------------------------------------------------------------------------
2023-10-16 18:27:41,907 epoch 2 - iter 14/146 - loss 0.25741814 - time (sec): 1.53 - samples/sec: 3144.33 - lr: 0.000030 - momentum: 0.000000
2023-10-16 18:27:43,594 epoch 2 - iter 28/146 - loss 0.25094580 - time (sec): 3.21 - samples/sec: 2960.80 - lr: 0.000029 - momentum: 0.000000
2023-10-16 18:27:44,866 epoch 2 - iter 42/146 - loss 0.25091985 - time (sec): 4.49 - samples/sec: 2947.16 - lr: 0.000029 - momentum: 0.000000
2023-10-16 18:27:46,323 epoch 2 - iter 56/146 - loss 0.23948423 - time (sec): 5.94 - samples/sec: 2914.89 - lr: 0.000029 - momentum: 0.000000
2023-10-16 18:27:47,677 epoch 2 - iter 70/146 - loss 0.23078029 - time (sec): 7.30 - samples/sec: 2876.03 - lr: 0.000028 - momentum: 0.000000
2023-10-16 18:27:49,415 epoch 2 - iter 84/146 - loss 0.25123902 - time (sec): 9.03 - samples/sec: 2871.89 - lr: 0.000028 - momentum: 0.000000
2023-10-16 18:27:50,999 epoch 2 - iter 98/146 - loss 0.24030979 - time (sec): 10.62 - samples/sec: 2881.48 - lr: 0.000028 - momentum: 0.000000
2023-10-16 18:27:52,231 epoch 2 - iter 112/146 - loss 0.23215042 - time (sec): 11.85 - samples/sec: 2885.68 - lr: 0.000027 - momentum: 0.000000
2023-10-16 18:27:53,529 epoch 2 - iter 126/146 - loss 0.22725978 - time (sec): 13.15 - samples/sec: 2926.76 - lr: 0.000027 - momentum: 0.000000
2023-10-16 18:27:55,155 epoch 2 - iter 140/146 - loss 0.21838231 - time (sec): 14.77 - samples/sec: 2921.39 - lr: 0.000027 - momentum: 0.000000
2023-10-16 18:27:55,621 ----------------------------------------------------------------------------------------------------
2023-10-16 18:27:55,621 EPOCH 2 done: loss 0.2171 - lr: 0.000027
2023-10-16 18:27:56,858 DEV : loss 0.1434197872877121 - f1-score (micro avg) 0.6021
2023-10-16 18:27:56,863 saving best model
2023-10-16 18:27:57,673 ----------------------------------------------------------------------------------------------------
2023-10-16 18:27:59,758 epoch 3 - iter 14/146 - loss 0.18575960 - time (sec): 2.08 - samples/sec: 2489.05 - lr: 0.000026 - momentum: 0.000000
2023-10-16 18:28:01,045 epoch 3 - iter 28/146 - loss 0.18887110 - time (sec): 3.37 - samples/sec: 2772.46 - lr: 0.000026 - momentum: 0.000000
2023-10-16 18:28:02,583 epoch 3 - iter 42/146 - loss 0.17188404 - time (sec): 4.91 - samples/sec: 2876.25 - lr: 0.000026 - momentum: 0.000000
2023-10-16 18:28:04,041 epoch 3 - iter 56/146 - loss 0.15696548 - time (sec): 6.37 - samples/sec: 2941.00 - lr: 0.000025 - momentum: 0.000000
2023-10-16 18:28:05,630 epoch 3 - iter 70/146 - loss 0.14502167 - time (sec): 7.96 - samples/sec: 2928.47 - lr: 0.000025 - momentum: 0.000000
2023-10-16 18:28:06,896 epoch 3 - iter 84/146 - loss 0.14099742 - time (sec): 9.22 - samples/sec: 2934.25 - lr: 0.000025 - momentum: 0.000000
2023-10-16 18:28:08,331 epoch 3 - iter 98/146 - loss 0.13734271 - time (sec): 10.66 - samples/sec: 2924.49 - lr: 0.000024 - momentum: 0.000000
2023-10-16 18:28:09,528 epoch 3 - iter 112/146 - loss 0.13345700 - time (sec): 11.85 - samples/sec: 2950.12 - lr: 0.000024 - momentum: 0.000000
2023-10-16 18:28:11,017 epoch 3 - iter 126/146 - loss 0.13132961 - time (sec): 13.34 - samples/sec: 2945.01 - lr: 0.000024 - momentum: 0.000000
2023-10-16 18:28:12,211 epoch 3 - iter 140/146 - loss 0.12890036 - time (sec): 14.54 - samples/sec: 2958.40 - lr: 0.000024 - momentum: 0.000000
2023-10-16 18:28:12,652 ----------------------------------------------------------------------------------------------------
2023-10-16 18:28:12,652 EPOCH 3 done: loss 0.1276 - lr: 0.000024
2023-10-16 18:28:13,931 DEV : loss 0.11977185308933258 - f1-score (micro avg) 0.6652
2023-10-16 18:28:13,937 saving best model
2023-10-16 18:28:14,450 ----------------------------------------------------------------------------------------------------
2023-10-16 18:28:15,788 epoch 4 - iter 14/146 - loss 0.07996901 - time (sec): 1.34 - samples/sec: 2911.28 - lr: 0.000023 - momentum: 0.000000
2023-10-16 18:28:17,109 epoch 4 - iter 28/146 - loss 0.08203620 - time (sec): 2.66 - samples/sec: 2996.09 - lr: 0.000023 - momentum: 0.000000
2023-10-16 18:28:18,473 epoch 4 - iter 42/146 - loss 0.09414321 - time (sec): 4.02 - samples/sec: 2900.82 - lr: 0.000022 - momentum: 0.000000
2023-10-16 18:28:20,069 epoch 4 - iter 56/146 - loss 0.08273818 - time (sec): 5.62 - samples/sec: 2934.25 - lr: 0.000022 - momentum: 0.000000
2023-10-16 18:28:21,354 epoch 4 - iter 70/146 - loss 0.08429318 - time (sec): 6.90 - samples/sec: 2935.80 - lr: 0.000022 - momentum: 0.000000
2023-10-16 18:28:22,724 epoch 4 - iter 84/146 - loss 0.08662969 - time (sec): 8.27 - samples/sec: 2941.49 - lr: 0.000021 - momentum: 0.000000
2023-10-16 18:28:24,096 epoch 4 - iter 98/146 - loss 0.08750805 - time (sec): 9.64 - samples/sec: 2935.41 - lr: 0.000021 - momentum: 0.000000
2023-10-16 18:28:25,651 epoch 4 - iter 112/146 - loss 0.08853932 - time (sec): 11.20 - samples/sec: 2939.66 - lr: 0.000021 - momentum: 0.000000
2023-10-16 18:28:26,988 epoch 4 - iter 126/146 - loss 0.08671444 - time (sec): 12.54 - samples/sec: 2961.01 - lr: 0.000021 - momentum: 0.000000
2023-10-16 18:28:28,792 epoch 4 - iter 140/146 - loss 0.08282451 - time (sec): 14.34 - samples/sec: 2977.22 - lr: 0.000020 - momentum: 0.000000
2023-10-16 18:28:29,335 ----------------------------------------------------------------------------------------------------
2023-10-16 18:28:29,336 EPOCH 4 done: loss 0.0822 - lr: 0.000020
2023-10-16 18:28:30,578 DEV : loss 0.10638927668333054 - f1-score (micro avg) 0.7168
2023-10-16 18:28:30,583 saving best model
2023-10-16 18:28:31,176 ----------------------------------------------------------------------------------------------------
2023-10-16 18:28:32,716 epoch 5 - iter 14/146 - loss 0.10632934 - time (sec): 1.54 - samples/sec: 2747.36 - lr: 0.000020 - momentum: 0.000000
2023-10-16 18:28:34,221 epoch 5 - iter 28/146 - loss 0.07986467 - time (sec): 3.04 - samples/sec: 2757.96 - lr: 0.000019 - momentum: 0.000000
2023-10-16 18:28:35,865 epoch 5 - iter 42/146 - loss 0.06670757 - time (sec): 4.68 - samples/sec: 2733.59 - lr: 0.000019 - momentum: 0.000000
2023-10-16 18:28:37,137 epoch 5 - iter 56/146 - loss 0.06636050 - time (sec): 5.96 - samples/sec: 2773.06 - lr: 0.000019 - momentum: 0.000000
2023-10-16 18:28:38,628 epoch 5 - iter 70/146 - loss 0.06634048 - time (sec): 7.45 - samples/sec: 2793.51 - lr: 0.000018 - momentum: 0.000000
2023-10-16 18:28:40,017 epoch 5 - iter 84/146 - loss 0.06502579 - time (sec): 8.84 - samples/sec: 2819.90 - lr: 0.000018 - momentum: 0.000000
2023-10-16 18:28:41,496 epoch 5 - iter 98/146 - loss 0.06323217 - time (sec): 10.32 - samples/sec: 2827.88 - lr: 0.000018 - momentum: 0.000000
2023-10-16 18:28:42,923 epoch 5 - iter 112/146 - loss 0.06339836 - time (sec): 11.74 - samples/sec: 2888.38 - lr: 0.000018 - momentum: 0.000000
2023-10-16 18:28:44,351 epoch 5 - iter 126/146 - loss 0.06263725 - time (sec): 13.17 - samples/sec: 2901.58 - lr: 0.000017 - momentum: 0.000000
2023-10-16 18:28:45,678 epoch 5 - iter 140/146 - loss 0.06146587 - time (sec): 14.50 - samples/sec: 2909.62 - lr: 0.000017 - momentum: 0.000000
2023-10-16 18:28:46,393 ----------------------------------------------------------------------------------------------------
2023-10-16 18:28:46,393 EPOCH 5 done: loss 0.0599 - lr: 0.000017
2023-10-16 18:28:47,691 DEV : loss 0.11089115589857101 - f1-score (micro avg) 0.7484
2023-10-16 18:28:47,696 saving best model
2023-10-16 18:28:48,226 ----------------------------------------------------------------------------------------------------
2023-10-16 18:28:50,125 epoch 6 - iter 14/146 - loss 0.03612688 - time (sec): 1.90 - samples/sec: 2635.80 - lr: 0.000016 - momentum: 0.000000
2023-10-16 18:28:51,353 epoch 6 - iter 28/146 - loss 0.03686816 - time (sec): 3.13 - samples/sec: 2817.15 - lr: 0.000016 - momentum: 0.000000
2023-10-16 18:28:52,703 epoch 6 - iter 42/146 - loss 0.03616211 - time (sec): 4.48 - samples/sec: 2829.65 - lr: 0.000016 - momentum: 0.000000
2023-10-16 18:28:54,290 epoch 6 - iter 56/146 - loss 0.03447287 - time (sec): 6.06 - samples/sec: 2763.87 - lr: 0.000015 - momentum: 0.000000
2023-10-16 18:28:55,642 epoch 6 - iter 70/146 - loss 0.03490954 - time (sec): 7.41 - samples/sec: 2901.81 - lr: 0.000015 - momentum: 0.000000
2023-10-16 18:28:56,811 epoch 6 - iter 84/146 - loss 0.03505656 - time (sec): 8.58 - samples/sec: 2936.17 - lr: 0.000015 - momentum: 0.000000
2023-10-16 18:28:58,271 epoch 6 - iter 98/146 - loss 0.03312980 - time (sec): 10.04 - samples/sec: 2955.51 - lr: 0.000015 - momentum: 0.000000
2023-10-16 18:28:59,521 epoch 6 - iter 112/146 - loss 0.03670852 - time (sec): 11.29 - samples/sec: 2951.86 - lr: 0.000014 - momentum: 0.000000
2023-10-16 18:29:01,313 epoch 6 - iter 126/146 - loss 0.03829386 - time (sec): 13.09 - samples/sec: 2986.27 - lr: 0.000014 - momentum: 0.000000
2023-10-16 18:29:02,439 epoch 6 - iter 140/146 - loss 0.04191597 - time (sec): 14.21 - samples/sec: 2979.92 - lr: 0.000014 - momentum: 0.000000
2023-10-16 18:29:03,316 ----------------------------------------------------------------------------------------------------
2023-10-16 18:29:03,316 EPOCH 6 done: loss 0.0433 - lr: 0.000014
2023-10-16 18:29:04,619 DEV : loss 0.12725397944450378 - f1-score (micro avg) 0.7152
2023-10-16 18:29:04,626 ----------------------------------------------------------------------------------------------------
2023-10-16 18:29:06,333 epoch 7 - iter 14/146 - loss 0.03280998 - time (sec): 1.71 - samples/sec: 3099.99 - lr: 0.000013 - momentum: 0.000000
2023-10-16 18:29:07,641 epoch 7 - iter 28/146 - loss 0.02824876 - time (sec): 3.01 - samples/sec: 3047.06 - lr: 0.000013 - momentum: 0.000000
2023-10-16 18:29:09,289 epoch 7 - iter 42/146 - loss 0.02707074 - time (sec): 4.66 - samples/sec: 2956.74 - lr: 0.000012 - momentum: 0.000000
2023-10-16 18:29:10,811 epoch 7 - iter 56/146 - loss 0.02889321 - time (sec): 6.18 - samples/sec: 2871.09 - lr: 0.000012 - momentum: 0.000000
2023-10-16 18:29:12,358 epoch 7 - iter 70/146 - loss 0.03251612 - time (sec): 7.73 - samples/sec: 2848.77 - lr: 0.000012 - momentum: 0.000000
2023-10-16 18:29:13,920 epoch 7 - iter 84/146 - loss 0.02978633 - time (sec): 9.29 - samples/sec: 2857.28 - lr: 0.000012 - momentum: 0.000000
2023-10-16 18:29:15,091 epoch 7 - iter 98/146 - loss 0.02989419 - time (sec): 10.46 - samples/sec: 2884.06 - lr: 0.000011 - momentum: 0.000000
2023-10-16 18:29:16,500 epoch 7 - iter 112/146 - loss 0.02868727 - time (sec): 11.87 - samples/sec: 2879.60 - lr: 0.000011 - momentum: 0.000000
2023-10-16 18:29:17,885 epoch 7 - iter 126/146 - loss 0.03081760 - time (sec): 13.26 - samples/sec: 2928.91 - lr: 0.000011 - momentum: 0.000000
2023-10-16 18:29:19,190 epoch 7 - iter 140/146 - loss 0.03274991 - time (sec): 14.56 - samples/sec: 2931.80 - lr: 0.000010 - momentum: 0.000000
2023-10-16 18:29:19,925 ----------------------------------------------------------------------------------------------------
2023-10-16 18:29:19,926 EPOCH 7 done: loss 0.0321 - lr: 0.000010
2023-10-16 18:29:21,217 DEV : loss 0.12044133991003036 - f1-score (micro avg) 0.766
2023-10-16 18:29:21,222 saving best model
2023-10-16 18:29:21,806 ----------------------------------------------------------------------------------------------------
2023-10-16 18:29:23,128 epoch 8 - iter 14/146 - loss 0.02741513 - time (sec): 1.32 - samples/sec: 3165.44 - lr: 0.000010 - momentum: 0.000000
2023-10-16 18:29:24,529 epoch 8 - iter 28/146 - loss 0.01990358 - time (sec): 2.72 - samples/sec: 3158.95 - lr: 0.000009 - momentum: 0.000000
2023-10-16 18:29:26,092 epoch 8 - iter 42/146 - loss 0.02296771 - time (sec): 4.28 - samples/sec: 2976.98 - lr: 0.000009 - momentum: 0.000000
2023-10-16 18:29:27,603 epoch 8 - iter 56/146 - loss 0.02253618 - time (sec): 5.80 - samples/sec: 2889.22 - lr: 0.000009 - momentum: 0.000000
2023-10-16 18:29:29,125 epoch 8 - iter 70/146 - loss 0.02333529 - time (sec): 7.32 - samples/sec: 2902.57 - lr: 0.000009 - momentum: 0.000000
2023-10-16 18:29:30,335 epoch 8 - iter 84/146 - loss 0.02425044 - time (sec): 8.53 - samples/sec: 2940.76 - lr: 0.000008 - momentum: 0.000000
2023-10-16 18:29:32,104 epoch 8 - iter 98/146 - loss 0.02550230 - time (sec): 10.30 - samples/sec: 2902.86 - lr: 0.000008 - momentum: 0.000000
2023-10-16 18:29:33,562 epoch 8 - iter 112/146 - loss 0.02765254 - time (sec): 11.75 - samples/sec: 2930.06 - lr: 0.000008 - momentum: 0.000000
2023-10-16 18:29:34,824 epoch 8 - iter 126/146 - loss 0.02724778 - time (sec): 13.02 - samples/sec: 2926.68 - lr: 0.000007 - momentum: 0.000000
2023-10-16 18:29:36,217 epoch 8 - iter 140/146 - loss 0.02645279 - time (sec): 14.41 - samples/sec: 2956.07 - lr: 0.000007 - momentum: 0.000000
2023-10-16 18:29:36,861 ----------------------------------------------------------------------------------------------------
2023-10-16 18:29:36,861 EPOCH 8 done: loss 0.0261 - lr: 0.000007
2023-10-16 18:29:38,320 DEV : loss 0.12693804502487183 - f1-score (micro avg) 0.7526
2023-10-16 18:29:38,324 ----------------------------------------------------------------------------------------------------
2023-10-16 18:29:39,576 epoch 9 - iter 14/146 - loss 0.01650818 - time (sec): 1.25 - samples/sec: 3366.30 - lr: 0.000006 - momentum: 0.000000
2023-10-16 18:29:41,284 epoch 9 - iter 28/146 - loss 0.02351489 - time (sec): 2.96 - samples/sec: 2916.17 - lr: 0.000006 - momentum: 0.000000
2023-10-16 18:29:42,771 epoch 9 - iter 42/146 - loss 0.02544422 - time (sec): 4.45 - samples/sec: 2897.68 - lr: 0.000006 - momentum: 0.000000
2023-10-16 18:29:44,255 epoch 9 - iter 56/146 - loss 0.02923101 - time (sec): 5.93 - samples/sec: 2977.74 - lr: 0.000006 - momentum: 0.000000
2023-10-16 18:29:45,955 epoch 9 - iter 70/146 - loss 0.02603799 - time (sec): 7.63 - samples/sec: 2934.50 - lr: 0.000005 - momentum: 0.000000
2023-10-16 18:29:47,435 epoch 9 - iter 84/146 - loss 0.02466823 - time (sec): 9.11 - samples/sec: 2929.35 - lr: 0.000005 - momentum: 0.000000
2023-10-16 18:29:48,776 epoch 9 - iter 98/146 - loss 0.02570875 - time (sec): 10.45 - samples/sec: 2950.34 - lr: 0.000005 - momentum: 0.000000
2023-10-16 18:29:50,231 epoch 9 - iter 112/146 - loss 0.02418524 - time (sec): 11.91 - samples/sec: 2947.42 - lr: 0.000004 - momentum: 0.000000
2023-10-16 18:29:51,536 epoch 9 - iter 126/146 - loss 0.02378156 - time (sec): 13.21 - samples/sec: 2941.77 - lr: 0.000004 - momentum: 0.000000
2023-10-16 18:29:53,032 epoch 9 - iter 140/146 - loss 0.02245718 - time (sec): 14.71 - samples/sec: 2913.45 - lr: 0.000004 - momentum: 0.000000
2023-10-16 18:29:53,512 ----------------------------------------------------------------------------------------------------
2023-10-16 18:29:53,512 EPOCH 9 done: loss 0.0224 - lr: 0.000004
2023-10-16 18:29:54,818 DEV : loss 0.13801662623882294 - f1-score (micro avg) 0.7417
2023-10-16 18:29:54,822 ----------------------------------------------------------------------------------------------------
2023-10-16 18:29:56,258 epoch 10 - iter 14/146 - loss 0.00940816 - time (sec): 1.43 - samples/sec: 3070.97 - lr: 0.000003 - momentum: 0.000000
2023-10-16 18:29:57,875 epoch 10 - iter 28/146 - loss 0.01387647 - time (sec): 3.05 - samples/sec: 3120.84 - lr: 0.000003 - momentum: 0.000000
2023-10-16 18:29:59,255 epoch 10 - iter 42/146 - loss 0.02669714 - time (sec): 4.43 - samples/sec: 3025.78 - lr: 0.000003 - momentum: 0.000000
2023-10-16 18:30:00,595 epoch 10 - iter 56/146 - loss 0.02370339 - time (sec): 5.77 - samples/sec: 3054.06 - lr: 0.000002 - momentum: 0.000000
2023-10-16 18:30:02,063 epoch 10 - iter 70/146 - loss 0.02208689 - time (sec): 7.24 - samples/sec: 2976.98 - lr: 0.000002 - momentum: 0.000000
2023-10-16 18:30:03,748 epoch 10 - iter 84/146 - loss 0.02139677 - time (sec): 8.92 - samples/sec: 3007.22 - lr: 0.000002 - momentum: 0.000000
2023-10-16 18:30:05,069 epoch 10 - iter 98/146 - loss 0.01968660 - time (sec): 10.25 - samples/sec: 3022.30 - lr: 0.000001 - momentum: 0.000000
2023-10-16 18:30:06,371 epoch 10 - iter 112/146 - loss 0.01902306 - time (sec): 11.55 - samples/sec: 2995.40 - lr: 0.000001 - momentum: 0.000000
2023-10-16 18:30:07,887 epoch 10 - iter 126/146 - loss 0.01748688 - time (sec): 13.06 - samples/sec: 2969.77 - lr: 0.000001 - momentum: 0.000000
2023-10-16 18:30:09,235 epoch 10 - iter 140/146 - loss 0.01846671 - time (sec): 14.41 - samples/sec: 2990.60 - lr: 0.000000 - momentum: 0.000000
2023-10-16 18:30:09,714 ----------------------------------------------------------------------------------------------------
2023-10-16 18:30:09,714 EPOCH 10 done: loss 0.0184 - lr: 0.000000
2023-10-16 18:30:11,005 DEV : loss 0.144253209233284 - f1-score (micro avg) 0.7318
2023-10-16 18:30:11,430 ----------------------------------------------------------------------------------------------------
2023-10-16 18:30:11,431 Loading model from best epoch ...
2023-10-16 18:30:13,049 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
2023-10-16 18:30:15,723 
Results:
- F-score (micro) 0.7512
- F-score (macro) 0.6702
- Accuracy 0.6244

By class:
              precision    recall  f1-score   support

         PER     0.7962    0.8420    0.8184       348
         LOC     0.6503    0.8123    0.7223       261
         ORG     0.5000    0.4231    0.4583        52
   HumanProd     0.6818    0.6818    0.6818        22

   micro avg     0.7132    0.7936    0.7512       683
   macro avg     0.6571    0.6898    0.6702       683
weighted avg     0.7142    0.7936    0.7499       683

2023-10-16 18:30:15,723 ----------------------------------------------------------------------------------------------------
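As a quick sanity check on the final results table, the macro and micro F-scores can be recomputed from the logged figures. This is a sketch working from the rounded values printed in the log (the trainer computes them from raw counts), so the micro figure can drift in the last decimal place:

```python
# Per-class F1 scores copied from the "By class" table in the log.
per_class_f1 = {
    "PER": 0.8184,
    "LOC": 0.7223,
    "ORG": 0.4583,
    "HumanProd": 0.6818,
}

# Macro F1: unweighted mean of the per-class F1 scores.
macro_f1 = sum(per_class_f1.values()) / len(per_class_f1)

# Micro F1: harmonic mean of the aggregate precision and recall
# (the "micro avg" row above).
micro_p, micro_r = 0.7132, 0.7936
micro_f1 = 2 * micro_p * micro_r / (micro_p + micro_r)

print(f"macro F1: {macro_f1:.4f}")  # matches the logged 0.6702
print(f"micro F1: {micro_f1:.4f}")  # close to the logged 0.7512 (rounded inputs)
```

Macro averaging treats the small ORG and HumanProd classes equally with PER and LOC, which is why it sits well below the micro score here.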