2023-10-16 18:40:58,252 ----------------------------------------------------------------------------------------------------
2023-10-16 18:40:58,253 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-16 18:40:58,253 ----------------------------------------------------------------------------------------------------
2023-10-16 18:40:58,253 MultiCorpus: 1166 train + 165 dev + 415 test sentences
 - NER_HIPE_2022 Corpus: 1166 train + 165 dev + 415 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fi/with_doc_seperator
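The module shapes printed above fully determine the model's parameter count. As a quick sanity check (a sketch added here, not part of the original log output), they can be tallied by hand without loading any weights:

```python
# Tally the parameters implied by the module shapes printed in the log above.
# All numbers come straight from the printed architecture; nothing is loaded.

H, FF, VOCAB, POS, TYPES, LAYERS, TAGS = 768, 3072, 32001, 512, 2, 12, 17

def linear(n_in, n_out):
    return n_in * n_out + n_out           # weight matrix + bias vector

def layer_norm(n):
    return 2 * n                          # scale + shift

embeddings = VOCAB * H + POS * H + TYPES * H + layer_norm(H)

per_layer = (
    4 * linear(H, H)                      # query, key, value, attention output
    + layer_norm(H)                       # attention LayerNorm
    + linear(H, FF)                       # intermediate (768 -> 3072)
    + linear(FF, H)                       # output (3072 -> 768)
    + layer_norm(H)                       # output LayerNorm
)

pooler = linear(H, H)                     # BertPooler dense
tagger_head = linear(H, TAGS)             # (linear): Linear(768, 17)

total = embeddings + LAYERS * per_layer + pooler + tagger_head
print(total)
```

This lands in the ~110M-parameter range typical of a BERT-base encoder plus a small tagging head.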
2023-10-16 18:40:58,254 ----------------------------------------------------------------------------------------------------
2023-10-16 18:40:58,254 Train: 1166 sentences
2023-10-16 18:40:58,254 (train_with_dev=False, train_with_test=False)
2023-10-16 18:40:58,254 ----------------------------------------------------------------------------------------------------
2023-10-16 18:40:58,254 Training Params:
2023-10-16 18:40:58,254 - learning_rate: "3e-05"
2023-10-16 18:40:58,254 - mini_batch_size: "8"
2023-10-16 18:40:58,254 - max_epochs: "10"
2023-10-16 18:40:58,254 - shuffle: "True"
2023-10-16 18:40:58,254 ----------------------------------------------------------------------------------------------------
2023-10-16 18:40:58,254 Plugins:
2023-10-16 18:40:58,254 - LinearScheduler | warmup_fraction: '0.1'
2023-10-16 18:40:58,254 ----------------------------------------------------------------------------------------------------
2023-10-16 18:40:58,254 Final evaluation on model from best epoch (best-model.pt)
2023-10-16 18:40:58,254 - metric: "('micro avg', 'f1-score')"
2023-10-16 18:40:58,254 ----------------------------------------------------------------------------------------------------
2023-10-16 18:40:58,254 Computation:
2023-10-16 18:40:58,254 - compute on device: cuda:0
2023-10-16 18:40:58,254 - embedding storage: none
2023-10-16 18:40:58,254 ----------------------------------------------------------------------------------------------------
2023-10-16 18:40:58,254 Model training base path: "hmbench-newseye/fi-dbmdz/bert-base-historic-multilingual-cased-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-4"
2023-10-16 18:40:58,254 ----------------------------------------------------------------------------------------------------
2023-10-16 18:40:58,254 ----------------------------------------------------------------------------------------------------
2023-10-16 18:40:59,717 epoch 1 - iter 14/146 - loss 2.97390099 - time (sec): 1.46 - samples/sec: 3017.24 - lr: 0.000003 - momentum: 0.000000
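The LinearScheduler plugin explains the `lr` column in the iteration lines that follow: with 146 batches per epoch over 10 epochs (1460 steps) and warmup_fraction 0.1, the rate climbs linearly to the peak of 3e-05 over the first 146 steps, then decays linearly to zero. A minimal sketch of that schedule (a hypothetical helper written here for illustration, not Flair's actual scheduler code):

```python
# Sketch of a linear warmup-then-decay schedule matching the logged lr values.
# Assumes 10 epochs x 146 iterations and warmup_fraction = 0.1, as configured above.

PEAK_LR = 3e-05
TOTAL_STEPS = 10 * 146                  # 1460 optimizer updates in total
WARMUP_STEPS = int(0.1 * TOTAL_STEPS)   # 146 steps of warmup

def lr_at(step: int) -> float:
    """Learning rate after `step` optimizer updates (1-based)."""
    if step <= WARMUP_STEPS:
        return PEAK_LR * step / WARMUP_STEPS          # linear ramp up
    return PEAK_LR * (TOTAL_STEPS - step) / (TOTAL_STEPS - WARMUP_STEPS)  # linear decay

print(lr_at(146))    # end of warmup: peak lr, matching the ~0.000030 logged early in epoch 2
print(lr_at(1460))   # final step: decayed to zero, matching the last logged lr
```

This is why the logged `lr` rises through epoch 1 (0.000003 → 0.000029) and then falls steadily to 0.000000 by the end of epoch 10, while `momentum` stays 0.000000 throughout (AdamW-style fine-tuning uses no classical momentum term).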
2023-10-16 18:41:00,900 epoch 1 - iter 28/146 - loss 2.77526217 - time (sec): 2.64 - samples/sec: 3044.65 - lr: 0.000006 - momentum: 0.000000
2023-10-16 18:41:02,535 epoch 1 - iter 42/146 - loss 2.37063522 - time (sec): 4.28 - samples/sec: 2990.83 - lr: 0.000008 - momentum: 0.000000
2023-10-16 18:41:04,316 epoch 1 - iter 56/146 - loss 1.94618735 - time (sec): 6.06 - samples/sec: 2861.03 - lr: 0.000011 - momentum: 0.000000
2023-10-16 18:41:05,651 epoch 1 - iter 70/146 - loss 1.73354724 - time (sec): 7.40 - samples/sec: 2856.99 - lr: 0.000014 - momentum: 0.000000
2023-10-16 18:41:07,365 epoch 1 - iter 84/146 - loss 1.56843020 - time (sec): 9.11 - samples/sec: 2831.84 - lr: 0.000017 - momentum: 0.000000
2023-10-16 18:41:08,780 epoch 1 - iter 98/146 - loss 1.41105109 - time (sec): 10.53 - samples/sec: 2856.75 - lr: 0.000020 - momentum: 0.000000
2023-10-16 18:41:10,204 epoch 1 - iter 112/146 - loss 1.27309307 - time (sec): 11.95 - samples/sec: 2885.22 - lr: 0.000023 - momentum: 0.000000
2023-10-16 18:41:11,681 epoch 1 - iter 126/146 - loss 1.16627535 - time (sec): 13.43 - samples/sec: 2901.13 - lr: 0.000026 - momentum: 0.000000
2023-10-16 18:41:12,960 epoch 1 - iter 140/146 - loss 1.08434747 - time (sec): 14.70 - samples/sec: 2924.63 - lr: 0.000029 - momentum: 0.000000
2023-10-16 18:41:13,475 ----------------------------------------------------------------------------------------------------
2023-10-16 18:41:13,475 EPOCH 1 done: loss 1.0599 - lr: 0.000029
2023-10-16 18:41:14,283 DEV : loss 0.22420375049114227 - f1-score (micro avg) 0.3689
2023-10-16 18:41:14,287 saving best model
2023-10-16 18:41:14,730 ----------------------------------------------------------------------------------------------------
2023-10-16 18:41:16,109 epoch 2 - iter 14/146 - loss 0.29776445 - time (sec): 1.38 - samples/sec: 3082.48 - lr: 0.000030 - momentum: 0.000000
2023-10-16 18:41:17,774 epoch 2 - iter 28/146 - loss 0.31816820 - time (sec): 3.04 - samples/sec: 3091.80 - lr: 0.000029 - momentum: 0.000000
2023-10-16 18:41:19,469 epoch 2 - iter 42/146 - loss 0.33197211 - time (sec): 4.74 - samples/sec: 2877.88 - lr: 0.000029 - momentum: 0.000000
2023-10-16 18:41:21,233 epoch 2 - iter 56/146 - loss 0.29603529 - time (sec): 6.50 - samples/sec: 2823.99 - lr: 0.000029 - momentum: 0.000000
2023-10-16 18:41:22,578 epoch 2 - iter 70/146 - loss 0.28356557 - time (sec): 7.85 - samples/sec: 2835.26 - lr: 0.000028 - momentum: 0.000000
2023-10-16 18:41:23,993 epoch 2 - iter 84/146 - loss 0.27591257 - time (sec): 9.26 - samples/sec: 2858.44 - lr: 0.000028 - momentum: 0.000000
2023-10-16 18:41:25,233 epoch 2 - iter 98/146 - loss 0.26858017 - time (sec): 10.50 - samples/sec: 2878.25 - lr: 0.000028 - momentum: 0.000000
2023-10-16 18:41:26,511 epoch 2 - iter 112/146 - loss 0.25364260 - time (sec): 11.78 - samples/sec: 2913.62 - lr: 0.000027 - momentum: 0.000000
2023-10-16 18:41:28,100 epoch 2 - iter 126/146 - loss 0.24500735 - time (sec): 13.37 - samples/sec: 2896.51 - lr: 0.000027 - momentum: 0.000000
2023-10-16 18:41:29,365 epoch 2 - iter 140/146 - loss 0.23870700 - time (sec): 14.63 - samples/sec: 2907.77 - lr: 0.000027 - momentum: 0.000000
2023-10-16 18:41:30,053 ----------------------------------------------------------------------------------------------------
2023-10-16 18:41:30,053 EPOCH 2 done: loss 0.2330 - lr: 0.000027
2023-10-16 18:41:31,652 DEV : loss 0.13390937447547913 - f1-score (micro avg) 0.6225
2023-10-16 18:41:31,657 saving best model
2023-10-16 18:41:32,194 ----------------------------------------------------------------------------------------------------
2023-10-16 18:41:33,744 epoch 3 - iter 14/146 - loss 0.10987697 - time (sec): 1.55 - samples/sec: 3085.26 - lr: 0.000026 - momentum: 0.000000
2023-10-16 18:41:35,595 epoch 3 - iter 28/146 - loss 0.12399653 - time (sec): 3.40 - samples/sec: 2866.15 - lr: 0.000026 - momentum: 0.000000
2023-10-16 18:41:36,764 epoch 3 - iter 42/146 - loss 0.13044732 - time (sec): 4.57 - samples/sec: 2925.68 - lr: 0.000026 - momentum: 0.000000
2023-10-16 18:41:37,927 epoch 3 - iter 56/146 - loss 0.12721141 - time (sec): 5.73 - samples/sec: 2937.70 - lr: 0.000025 - momentum: 0.000000
2023-10-16 18:41:39,222 epoch 3 - iter 70/146 - loss 0.12759325 - time (sec): 7.03 - samples/sec: 2956.17 - lr: 0.000025 - momentum: 0.000000
2023-10-16 18:41:40,773 epoch 3 - iter 84/146 - loss 0.12258261 - time (sec): 8.58 - samples/sec: 2989.78 - lr: 0.000025 - momentum: 0.000000
2023-10-16 18:41:42,438 epoch 3 - iter 98/146 - loss 0.12529590 - time (sec): 10.24 - samples/sec: 3008.63 - lr: 0.000024 - momentum: 0.000000
2023-10-16 18:41:43,950 epoch 3 - iter 112/146 - loss 0.12515723 - time (sec): 11.75 - samples/sec: 2983.18 - lr: 0.000024 - momentum: 0.000000
2023-10-16 18:41:45,318 epoch 3 - iter 126/146 - loss 0.12458492 - time (sec): 13.12 - samples/sec: 2974.87 - lr: 0.000024 - momentum: 0.000000
2023-10-16 18:41:46,842 epoch 3 - iter 140/146 - loss 0.12705098 - time (sec): 14.65 - samples/sec: 2933.52 - lr: 0.000024 - momentum: 0.000000
2023-10-16 18:41:47,343 ----------------------------------------------------------------------------------------------------
2023-10-16 18:41:47,343 EPOCH 3 done: loss 0.1269 - lr: 0.000024
2023-10-16 18:41:48,612 DEV : loss 0.1261664777994156 - f1-score (micro avg) 0.7047
2023-10-16 18:41:48,617 saving best model
2023-10-16 18:41:49,190 ----------------------------------------------------------------------------------------------------
2023-10-16 18:41:50,833 epoch 4 - iter 14/146 - loss 0.08426860 - time (sec): 1.64 - samples/sec: 3083.74 - lr: 0.000023 - momentum: 0.000000
2023-10-16 18:41:52,488 epoch 4 - iter 28/146 - loss 0.09754780 - time (sec): 3.29 - samples/sec: 2865.44 - lr: 0.000023 - momentum: 0.000000
2023-10-16 18:41:53,974 epoch 4 - iter 42/146 - loss 0.08349681 - time (sec): 4.78 - samples/sec: 2919.36 - lr: 0.000022 - momentum: 0.000000
2023-10-16 18:41:55,124 epoch 4 - iter 56/146 - loss 0.08357742 - time (sec): 5.93 - samples/sec: 2923.52 - lr: 0.000022 - momentum: 0.000000
2023-10-16 18:41:56,664 epoch 4 - iter 70/146 - loss 0.08252609 - time (sec): 7.47 - samples/sec: 2933.63 - lr: 0.000022 - momentum: 0.000000
2023-10-16 18:41:58,302 epoch 4 - iter 84/146 - loss 0.08292927 - time (sec): 9.11 - samples/sec: 2881.12 - lr: 0.000021 - momentum: 0.000000
2023-10-16 18:41:59,580 epoch 4 - iter 98/146 - loss 0.08204986 - time (sec): 10.39 - samples/sec: 2876.23 - lr: 0.000021 - momentum: 0.000000
2023-10-16 18:42:01,361 epoch 4 - iter 112/146 - loss 0.08113285 - time (sec): 12.17 - samples/sec: 2880.19 - lr: 0.000021 - momentum: 0.000000
2023-10-16 18:42:02,669 epoch 4 - iter 126/146 - loss 0.08204963 - time (sec): 13.47 - samples/sec: 2909.89 - lr: 0.000021 - momentum: 0.000000
2023-10-16 18:42:03,984 epoch 4 - iter 140/146 - loss 0.08229389 - time (sec): 14.79 - samples/sec: 2910.43 - lr: 0.000020 - momentum: 0.000000
2023-10-16 18:42:04,425 ----------------------------------------------------------------------------------------------------
2023-10-16 18:42:04,425 EPOCH 4 done: loss 0.0823 - lr: 0.000020
2023-10-16 18:42:05,739 DEV : loss 0.11765624582767487 - f1-score (micro avg) 0.7382
2023-10-16 18:42:05,746 saving best model
2023-10-16 18:42:06,266 ----------------------------------------------------------------------------------------------------
2023-10-16 18:42:07,846 epoch 5 - iter 14/146 - loss 0.05045388 - time (sec): 1.58 - samples/sec: 2745.48 - lr: 0.000020 - momentum: 0.000000
2023-10-16 18:42:09,412 epoch 5 - iter 28/146 - loss 0.04670668 - time (sec): 3.14 - samples/sec: 2646.53 - lr: 0.000019 - momentum: 0.000000
2023-10-16 18:42:10,934 epoch 5 - iter 42/146 - loss 0.04342810 - time (sec): 4.66 - samples/sec: 2726.93 - lr: 0.000019 - momentum: 0.000000
2023-10-16 18:42:12,625 epoch 5 - iter 56/146 - loss 0.05072506 - time (sec): 6.35 - samples/sec: 2831.43 - lr: 0.000019 - momentum: 0.000000
2023-10-16 18:42:14,211 epoch 5 - iter 70/146 - loss 0.04968675 - time (sec): 7.94 - samples/sec: 2863.10 - lr: 0.000018 - momentum: 0.000000
2023-10-16 18:42:15,525 epoch 5 - iter 84/146 - loss 0.05401489 - time (sec): 9.25 - samples/sec: 2881.42 - lr: 0.000018 - momentum: 0.000000
2023-10-16 18:42:17,136 epoch 5 - iter 98/146 - loss 0.05351693 - time (sec): 10.87 - samples/sec: 2916.64 - lr: 0.000018 - momentum: 0.000000
2023-10-16 18:42:18,316 epoch 5 - iter 112/146 - loss 0.05569047 - time (sec): 12.04 - samples/sec: 2930.09 - lr: 0.000018 - momentum: 0.000000
2023-10-16 18:42:19,708 epoch 5 - iter 126/146 - loss 0.05757850 - time (sec): 13.44 - samples/sec: 2926.24 - lr: 0.000017 - momentum: 0.000000
2023-10-16 18:42:20,948 epoch 5 - iter 140/146 - loss 0.05972547 - time (sec): 14.68 - samples/sec: 2944.98 - lr: 0.000017 - momentum: 0.000000
2023-10-16 18:42:21,431 ----------------------------------------------------------------------------------------------------
2023-10-16 18:42:21,431 EPOCH 5 done: loss 0.0601 - lr: 0.000017
2023-10-16 18:42:22,719 DEV : loss 0.10323068499565125 - f1-score (micro avg) 0.7639
2023-10-16 18:42:22,724 saving best model
2023-10-16 18:42:23,609 ----------------------------------------------------------------------------------------------------
2023-10-16 18:42:24,951 epoch 6 - iter 14/146 - loss 0.06420642 - time (sec): 1.34 - samples/sec: 3160.55 - lr: 0.000016 - momentum: 0.000000
2023-10-16 18:42:26,500 epoch 6 - iter 28/146 - loss 0.05228968 - time (sec): 2.89 - samples/sec: 3171.17 - lr: 0.000016 - momentum: 0.000000
2023-10-16 18:42:27,809 epoch 6 - iter 42/146 - loss 0.04666758 - time (sec): 4.20 - samples/sec: 3131.77 - lr: 0.000016 - momentum: 0.000000
2023-10-16 18:42:29,236 epoch 6 - iter 56/146 - loss 0.04726276 - time (sec): 5.62 - samples/sec: 3044.60 - lr: 0.000015 - momentum: 0.000000
2023-10-16 18:42:30,757 epoch 6 - iter 70/146 - loss 0.04494478 - time (sec): 7.14 - samples/sec: 3007.68 - lr: 0.000015 - momentum: 0.000000
2023-10-16 18:42:32,150 epoch 6 - iter 84/146 - loss 0.04396196 - time (sec): 8.54 - samples/sec: 2974.60 - lr: 0.000015 - momentum: 0.000000
2023-10-16 18:42:33,587 epoch 6 - iter 98/146 - loss 0.04190146 - time (sec): 9.97 - samples/sec: 3005.86 - lr: 0.000015 - momentum: 0.000000
2023-10-16 18:42:34,981 epoch 6 - iter 112/146 - loss 0.04097606 - time (sec): 11.37 - samples/sec: 3021.65 - lr: 0.000014 - momentum: 0.000000
2023-10-16 18:42:36,451 epoch 6 - iter 126/146 - loss 0.04174709 - time (sec): 12.84 - samples/sec: 3030.85 - lr: 0.000014 - momentum: 0.000000
2023-10-16 18:42:37,774 epoch 6 - iter 140/146 - loss 0.04259495 - time (sec): 14.16 - samples/sec: 3013.59 - lr: 0.000014 - momentum: 0.000000
2023-10-16 18:42:38,410 ----------------------------------------------------------------------------------------------------
2023-10-16 18:42:38,411 EPOCH 6 done: loss 0.0429 - lr: 0.000014
2023-10-16 18:42:39,666 DEV : loss 0.12086369842290878 - f1-score (micro avg) 0.7331
2023-10-16 18:42:39,671 ----------------------------------------------------------------------------------------------------
2023-10-16 18:42:41,060 epoch 7 - iter 14/146 - loss 0.03536041 - time (sec): 1.39 - samples/sec: 3083.88 - lr: 0.000013 - momentum: 0.000000
2023-10-16 18:42:42,338 epoch 7 - iter 28/146 - loss 0.03135766 - time (sec): 2.67 - samples/sec: 3113.18 - lr: 0.000013 - momentum: 0.000000
2023-10-16 18:42:43,772 epoch 7 - iter 42/146 - loss 0.02784114 - time (sec): 4.10 - samples/sec: 3139.52 - lr: 0.000012 - momentum: 0.000000
2023-10-16 18:42:45,298 epoch 7 - iter 56/146 - loss 0.02751308 - time (sec): 5.63 - samples/sec: 3096.78 - lr: 0.000012 - momentum: 0.000000
2023-10-16 18:42:46,664 epoch 7 - iter 70/146 - loss 0.02605524 - time (sec): 6.99 - samples/sec: 3045.10 - lr: 0.000012 - momentum: 0.000000
2023-10-16 18:42:48,465 epoch 7 - iter 84/146 - loss 0.03041736 - time (sec): 8.79 - samples/sec: 2978.93 - lr: 0.000012 - momentum: 0.000000
2023-10-16 18:42:49,866 epoch 7 - iter 98/146 - loss 0.02942324 - time (sec): 10.19 - samples/sec: 2977.37 - lr: 0.000011 - momentum: 0.000000
2023-10-16 18:42:51,429 epoch 7 - iter 112/146 - loss 0.03021149 - time (sec): 11.76 - samples/sec: 2934.42 - lr: 0.000011 - momentum: 0.000000
2023-10-16 18:42:52,967 epoch 7 - iter 126/146 - loss 0.03017723 - time (sec): 13.30 - samples/sec: 2911.90 - lr: 0.000011 - momentum: 0.000000
2023-10-16 18:42:54,512 epoch 7 - iter 140/146 - loss 0.03237412 - time (sec): 14.84 - samples/sec: 2895.37 - lr: 0.000010 - momentum: 0.000000
2023-10-16 18:42:55,004 ----------------------------------------------------------------------------------------------------
2023-10-16 18:42:55,005 EPOCH 7 done: loss 0.0319 - lr: 0.000010
2023-10-16 18:42:56,306 DEV : loss 0.12415074557065964 - f1-score (micro avg) 0.7716
2023-10-16 18:42:56,312 saving best model
2023-10-16 18:42:56,844 ----------------------------------------------------------------------------------------------------
2023-10-16 18:42:58,269 epoch 8 - iter 14/146 - loss 0.01649001 - time (sec): 1.42 - samples/sec: 3042.84 - lr: 0.000010 - momentum: 0.000000
2023-10-16 18:42:59,833 epoch 8 - iter 28/146 - loss 0.01761338 - time (sec): 2.99 - samples/sec: 2985.23 - lr: 0.000009 - momentum: 0.000000
2023-10-16 18:43:01,519 epoch 8 - iter 42/146 - loss 0.02810541 - time (sec): 4.67 - samples/sec: 2933.19 - lr: 0.000009 - momentum: 0.000000
2023-10-16 18:43:02,724 epoch 8 - iter 56/146 - loss 0.02894650 - time (sec): 5.88 - samples/sec: 2883.58 - lr: 0.000009 - momentum: 0.000000
2023-10-16 18:43:04,191 epoch 8 - iter 70/146 - loss 0.02727455 - time (sec): 7.34 - samples/sec: 2927.98 - lr: 0.000009 - momentum: 0.000000
2023-10-16 18:43:05,867 epoch 8 - iter 84/146 - loss 0.02891311 - time (sec): 9.02 - samples/sec: 2917.72 - lr: 0.000008 - momentum: 0.000000
2023-10-16 18:43:07,161 epoch 8 - iter 98/146 - loss 0.02597422 - time (sec): 10.31 - samples/sec: 2963.64 - lr: 0.000008 - momentum: 0.000000
2023-10-16 18:43:08,650 epoch 8 - iter 112/146 - loss 0.02727534 - time (sec): 11.80 - samples/sec: 2946.61 - lr: 0.000008 - momentum: 0.000000
2023-10-16 18:43:10,154 epoch 8 - iter 126/146 - loss 0.02628321 - time (sec): 13.31 - samples/sec: 2947.47 - lr: 0.000007 - momentum: 0.000000
2023-10-16 18:43:11,406 epoch 8 - iter 140/146 - loss 0.02579516 - time (sec): 14.56 - samples/sec: 2952.88 - lr: 0.000007 - momentum: 0.000000
2023-10-16 18:43:11,909 ----------------------------------------------------------------------------------------------------
2023-10-16 18:43:11,909 EPOCH 8 done: loss 0.0259 - lr: 0.000007
2023-10-16 18:43:13,196 DEV : loss 0.1355086863040924 - f1-score (micro avg) 0.7191
2023-10-16 18:43:13,202 ----------------------------------------------------------------------------------------------------
2023-10-16 18:43:14,688 epoch 9 - iter 14/146 - loss 0.05423101 - time (sec): 1.48 - samples/sec: 3174.47 - lr: 0.000006 - momentum: 0.000000
2023-10-16 18:43:16,439 epoch 9 - iter 28/146 - loss 0.04240930 - time (sec): 3.24 - samples/sec: 2894.49 - lr: 0.000006 - momentum: 0.000000
2023-10-16 18:43:17,996 epoch 9 - iter 42/146 - loss 0.03263167 - time (sec): 4.79 - samples/sec: 2719.93 - lr: 0.000006 - momentum: 0.000000
2023-10-16 18:43:19,384 epoch 9 - iter 56/146 - loss 0.02854858 - time (sec): 6.18 - samples/sec: 2772.88 - lr: 0.000006 - momentum: 0.000000
2023-10-16 18:43:20,598 epoch 9 - iter 70/146 - loss 0.02688335 - time (sec): 7.39 - samples/sec: 2826.17 - lr: 0.000005 - momentum: 0.000000
2023-10-16 18:43:21,940 epoch 9 - iter 84/146 - loss 0.02662945 - time (sec): 8.74 - samples/sec: 2861.28 - lr: 0.000005 - momentum: 0.000000
2023-10-16 18:43:23,486 epoch 9 - iter 98/146 - loss 0.02365962 - time (sec): 10.28 - samples/sec: 2875.77 - lr: 0.000005 - momentum: 0.000000
2023-10-16 18:43:24,796 epoch 9 - iter 112/146 - loss 0.02270041 - time (sec): 11.59 - samples/sec: 2868.17 - lr: 0.000004 - momentum: 0.000000
2023-10-16 18:43:26,452 epoch 9 - iter 126/146 - loss 0.02197911 - time (sec): 13.25 - samples/sec: 2846.24 - lr: 0.000004 - momentum: 0.000000
2023-10-16 18:43:28,011 epoch 9 - iter 140/146 - loss 0.02169160 - time (sec): 14.81 - samples/sec: 2863.71 - lr: 0.000004 - momentum: 0.000000
2023-10-16 18:43:28,594 ----------------------------------------------------------------------------------------------------
2023-10-16 18:43:28,595 EPOCH 9 done: loss 0.0215 - lr: 0.000004
2023-10-16 18:43:29,856 DEV : loss 0.13685813546180725 - f1-score (micro avg) 0.7484
2023-10-16 18:43:29,862 ----------------------------------------------------------------------------------------------------
2023-10-16 18:43:31,152 epoch 10 - iter 14/146 - loss 0.02494954 - time (sec): 1.29 - samples/sec: 2857.52 - lr: 0.000003 - momentum: 0.000000
2023-10-16 18:43:32,476 epoch 10 - iter 28/146 - loss 0.01602085 - time (sec): 2.61 - samples/sec: 3020.65 - lr: 0.000003 - momentum: 0.000000
2023-10-16 18:43:33,794 epoch 10 - iter 42/146 - loss 0.01621527 - time (sec): 3.93 - samples/sec: 3066.78 - lr: 0.000003 - momentum: 0.000000
2023-10-16 18:43:35,570 epoch 10 - iter 56/146 - loss 0.01678019 - time (sec): 5.71 - samples/sec: 2969.94 - lr: 0.000002 - momentum: 0.000000
2023-10-16 18:43:37,094 epoch 10 - iter 70/146 - loss 0.01639143 - time (sec): 7.23 - samples/sec: 3007.77 - lr: 0.000002 - momentum: 0.000000
2023-10-16 18:43:38,504 epoch 10 - iter 84/146 - loss 0.01946257 - time (sec): 8.64 - samples/sec: 3023.29 - lr: 0.000002 - momentum: 0.000000
2023-10-16 18:43:39,791 epoch 10 - iter 98/146 - loss 0.01874862 - time (sec): 9.93 - samples/sec: 3038.93 - lr: 0.000001 - momentum: 0.000000
2023-10-16 18:43:41,180 epoch 10 - iter 112/146 - loss 0.01865213 - time (sec): 11.32 - samples/sec: 3027.93 - lr: 0.000001 - momentum: 0.000000
2023-10-16 18:43:42,660 epoch 10 - iter 126/146 - loss 0.01993260 - time (sec): 12.80 - samples/sec: 3034.99 - lr: 0.000001 - momentum: 0.000000
2023-10-16 18:43:44,108 epoch 10 - iter 140/146 - loss 0.01869978 - time (sec): 14.25 - samples/sec: 3042.21 - lr: 0.000000 - momentum: 0.000000
2023-10-16 18:43:44,558 ----------------------------------------------------------------------------------------------------
2023-10-16 18:43:44,558 EPOCH 10 done: loss 0.0183 - lr: 0.000000
2023-10-16 18:43:45,833 DEV : loss 0.14017795026302338 - f1-score (micro avg) 0.742
2023-10-16 18:43:46,218 ----------------------------------------------------------------------------------------------------
2023-10-16 18:43:46,219 Loading model from best epoch ...
2023-10-16 18:43:47,826 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
2023-10-16 18:43:50,204 Results:
- F-score (micro) 0.7545
- F-score (macro) 0.6678
- Accuracy 0.6271

By class:
              precision    recall  f1-score   support

         PER     0.8027    0.8420    0.8219       348
         LOC     0.6707    0.8429    0.7470       261
         ORG     0.3500    0.4038    0.3750        52
   HumanProd     0.7273    0.7273    0.7273        22

   micro avg     0.7097    0.8053    0.7545       683
   macro avg     0.6377    0.7040    0.6678       683
weighted avg     0.7154    0.8053    0.7562       683

2023-10-16 18:43:50,205 ----------------------------------------------------------------------------------------------------
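The aggregate rows of the final evaluation follow directly from the per-class rows: macro avg is the unweighted mean of the class f1-scores, weighted avg weights each class by its support, and micro f1 is the harmonic mean of micro precision and recall. A quick cross-check, using only the numbers printed above:

```python
# Cross-check the aggregate rows of the final evaluation table.
# (precision, recall, f1-score, support) copied from the per-class rows above.
classes = {
    "PER":       (0.8027, 0.8420, 0.8219, 348),
    "LOC":       (0.6707, 0.8429, 0.7470, 261),
    "ORG":       (0.3500, 0.4038, 0.3750,  52),
    "HumanProd": (0.7273, 0.7273, 0.7273,  22),
}

# Macro average: unweighted mean of the per-class f1-scores.
macro_f1 = sum(f1 for _, _, f1, _ in classes.values()) / len(classes)

# Weighted average: per-class f1 weighted by support (683 entities in total).
total_support = sum(s for *_, s in classes.values())
weighted_f1 = sum(f1 * s for _, _, f1, s in classes.values()) / total_support

# Micro f1: harmonic mean of the logged micro precision and recall.
p, r = 0.7097, 0.8053
micro_f1 = 2 * p * r / (p + r)

print(round(macro_f1, 4), round(weighted_f1, 4), round(micro_f1, 4))
```

All three reproduce the logged values (0.6678, 0.7562, 0.7545). Note also that the 17-entry tag dictionary is exactly the BIOES scheme over the four entity types: O plus S/B/E/I for each of LOC, PER, ORG and HumanProd (1 + 4 x 4 = 17), matching the 17-way linear output layer in the model printout.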