2023-10-16 18:40:58,252 ----------------------------------------------------------------------------------------------------
2023-10-16 18:40:58,253 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-16 18:40:58,253 ----------------------------------------------------------------------------------------------------
2023-10-16 18:40:58,253 MultiCorpus: 1166 train + 165 dev + 415 test sentences
- NER_HIPE_2022 Corpus: 1166 train + 165 dev + 415 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fi/with_doc_seperator
2023-10-16 18:40:58,254 ----------------------------------------------------------------------------------------------------
2023-10-16 18:40:58,254 Train: 1166 sentences
2023-10-16 18:40:58,254 (train_with_dev=False, train_with_test=False)
2023-10-16 18:40:58,254 ----------------------------------------------------------------------------------------------------
2023-10-16 18:40:58,254 Training Params:
2023-10-16 18:40:58,254 - learning_rate: "3e-05"
2023-10-16 18:40:58,254 - mini_batch_size: "8"
2023-10-16 18:40:58,254 - max_epochs: "10"
2023-10-16 18:40:58,254 - shuffle: "True"
2023-10-16 18:40:58,254 ----------------------------------------------------------------------------------------------------
2023-10-16 18:40:58,254 Plugins:
2023-10-16 18:40:58,254 - LinearScheduler | warmup_fraction: '0.1'
2023-10-16 18:40:58,254 ----------------------------------------------------------------------------------------------------
2023-10-16 18:40:58,254 Final evaluation on model from best epoch (best-model.pt)
2023-10-16 18:40:58,254 - metric: "('micro avg', 'f1-score')"
2023-10-16 18:40:58,254 ----------------------------------------------------------------------------------------------------
2023-10-16 18:40:58,254 Computation:
2023-10-16 18:40:58,254 - compute on device: cuda:0
2023-10-16 18:40:58,254 - embedding storage: none
2023-10-16 18:40:58,254 ----------------------------------------------------------------------------------------------------
2023-10-16 18:40:58,254 Model training base path: "hmbench-newseye/fi-dbmdz/bert-base-historic-multilingual-cased-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-4"
2023-10-16 18:40:58,254 ----------------------------------------------------------------------------------------------------
2023-10-16 18:40:58,254 ----------------------------------------------------------------------------------------------------
2023-10-16 18:40:59,717 epoch 1 - iter 14/146 - loss 2.97390099 - time (sec): 1.46 - samples/sec: 3017.24 - lr: 0.000003 - momentum: 0.000000
2023-10-16 18:41:00,900 epoch 1 - iter 28/146 - loss 2.77526217 - time (sec): 2.64 - samples/sec: 3044.65 - lr: 0.000006 - momentum: 0.000000
2023-10-16 18:41:02,535 epoch 1 - iter 42/146 - loss 2.37063522 - time (sec): 4.28 - samples/sec: 2990.83 - lr: 0.000008 - momentum: 0.000000
2023-10-16 18:41:04,316 epoch 1 - iter 56/146 - loss 1.94618735 - time (sec): 6.06 - samples/sec: 2861.03 - lr: 0.000011 - momentum: 0.000000
2023-10-16 18:41:05,651 epoch 1 - iter 70/146 - loss 1.73354724 - time (sec): 7.40 - samples/sec: 2856.99 - lr: 0.000014 - momentum: 0.000000
2023-10-16 18:41:07,365 epoch 1 - iter 84/146 - loss 1.56843020 - time (sec): 9.11 - samples/sec: 2831.84 - lr: 0.000017 - momentum: 0.000000
2023-10-16 18:41:08,780 epoch 1 - iter 98/146 - loss 1.41105109 - time (sec): 10.53 - samples/sec: 2856.75 - lr: 0.000020 - momentum: 0.000000
2023-10-16 18:41:10,204 epoch 1 - iter 112/146 - loss 1.27309307 - time (sec): 11.95 - samples/sec: 2885.22 - lr: 0.000023 - momentum: 0.000000
2023-10-16 18:41:11,681 epoch 1 - iter 126/146 - loss 1.16627535 - time (sec): 13.43 - samples/sec: 2901.13 - lr: 0.000026 - momentum: 0.000000
2023-10-16 18:41:12,960 epoch 1 - iter 140/146 - loss 1.08434747 - time (sec): 14.70 - samples/sec: 2924.63 - lr: 0.000029 - momentum: 0.000000
2023-10-16 18:41:13,475 ----------------------------------------------------------------------------------------------------
2023-10-16 18:41:13,475 EPOCH 1 done: loss 1.0599 - lr: 0.000029
2023-10-16 18:41:14,283 DEV : loss 0.22420375049114227 - f1-score (micro avg) 0.3689
2023-10-16 18:41:14,287 saving best model
2023-10-16 18:41:14,730 ----------------------------------------------------------------------------------------------------
2023-10-16 18:41:16,109 epoch 2 - iter 14/146 - loss 0.29776445 - time (sec): 1.38 - samples/sec: 3082.48 - lr: 0.000030 - momentum: 0.000000
2023-10-16 18:41:17,774 epoch 2 - iter 28/146 - loss 0.31816820 - time (sec): 3.04 - samples/sec: 3091.80 - lr: 0.000029 - momentum: 0.000000
2023-10-16 18:41:19,469 epoch 2 - iter 42/146 - loss 0.33197211 - time (sec): 4.74 - samples/sec: 2877.88 - lr: 0.000029 - momentum: 0.000000
2023-10-16 18:41:21,233 epoch 2 - iter 56/146 - loss 0.29603529 - time (sec): 6.50 - samples/sec: 2823.99 - lr: 0.000029 - momentum: 0.000000
2023-10-16 18:41:22,578 epoch 2 - iter 70/146 - loss 0.28356557 - time (sec): 7.85 - samples/sec: 2835.26 - lr: 0.000028 - momentum: 0.000000
2023-10-16 18:41:23,993 epoch 2 - iter 84/146 - loss 0.27591257 - time (sec): 9.26 - samples/sec: 2858.44 - lr: 0.000028 - momentum: 0.000000
2023-10-16 18:41:25,233 epoch 2 - iter 98/146 - loss 0.26858017 - time (sec): 10.50 - samples/sec: 2878.25 - lr: 0.000028 - momentum: 0.000000
2023-10-16 18:41:26,511 epoch 2 - iter 112/146 - loss 0.25364260 - time (sec): 11.78 - samples/sec: 2913.62 - lr: 0.000027 - momentum: 0.000000
2023-10-16 18:41:28,100 epoch 2 - iter 126/146 - loss 0.24500735 - time (sec): 13.37 - samples/sec: 2896.51 - lr: 0.000027 - momentum: 0.000000
2023-10-16 18:41:29,365 epoch 2 - iter 140/146 - loss 0.23870700 - time (sec): 14.63 - samples/sec: 2907.77 - lr: 0.000027 - momentum: 0.000000
2023-10-16 18:41:30,053 ----------------------------------------------------------------------------------------------------
2023-10-16 18:41:30,053 EPOCH 2 done: loss 0.2330 - lr: 0.000027
2023-10-16 18:41:31,652 DEV : loss 0.13390937447547913 - f1-score (micro avg) 0.6225
2023-10-16 18:41:31,657 saving best model
2023-10-16 18:41:32,194 ----------------------------------------------------------------------------------------------------
2023-10-16 18:41:33,744 epoch 3 - iter 14/146 - loss 0.10987697 - time (sec): 1.55 - samples/sec: 3085.26 - lr: 0.000026 - momentum: 0.000000
2023-10-16 18:41:35,595 epoch 3 - iter 28/146 - loss 0.12399653 - time (sec): 3.40 - samples/sec: 2866.15 - lr: 0.000026 - momentum: 0.000000
2023-10-16 18:41:36,764 epoch 3 - iter 42/146 - loss 0.13044732 - time (sec): 4.57 - samples/sec: 2925.68 - lr: 0.000026 - momentum: 0.000000
2023-10-16 18:41:37,927 epoch 3 - iter 56/146 - loss 0.12721141 - time (sec): 5.73 - samples/sec: 2937.70 - lr: 0.000025 - momentum: 0.000000
2023-10-16 18:41:39,222 epoch 3 - iter 70/146 - loss 0.12759325 - time (sec): 7.03 - samples/sec: 2956.17 - lr: 0.000025 - momentum: 0.000000
2023-10-16 18:41:40,773 epoch 3 - iter 84/146 - loss 0.12258261 - time (sec): 8.58 - samples/sec: 2989.78 - lr: 0.000025 - momentum: 0.000000
2023-10-16 18:41:42,438 epoch 3 - iter 98/146 - loss 0.12529590 - time (sec): 10.24 - samples/sec: 3008.63 - lr: 0.000024 - momentum: 0.000000
2023-10-16 18:41:43,950 epoch 3 - iter 112/146 - loss 0.12515723 - time (sec): 11.75 - samples/sec: 2983.18 - lr: 0.000024 - momentum: 0.000000
2023-10-16 18:41:45,318 epoch 3 - iter 126/146 - loss 0.12458492 - time (sec): 13.12 - samples/sec: 2974.87 - lr: 0.000024 - momentum: 0.000000
2023-10-16 18:41:46,842 epoch 3 - iter 140/146 - loss 0.12705098 - time (sec): 14.65 - samples/sec: 2933.52 - lr: 0.000024 - momentum: 0.000000
2023-10-16 18:41:47,343 ----------------------------------------------------------------------------------------------------
2023-10-16 18:41:47,343 EPOCH 3 done: loss 0.1269 - lr: 0.000024
2023-10-16 18:41:48,612 DEV : loss 0.1261664777994156 - f1-score (micro avg) 0.7047
2023-10-16 18:41:48,617 saving best model
2023-10-16 18:41:49,190 ----------------------------------------------------------------------------------------------------
2023-10-16 18:41:50,833 epoch 4 - iter 14/146 - loss 0.08426860 - time (sec): 1.64 - samples/sec: 3083.74 - lr: 0.000023 - momentum: 0.000000
2023-10-16 18:41:52,488 epoch 4 - iter 28/146 - loss 0.09754780 - time (sec): 3.29 - samples/sec: 2865.44 - lr: 0.000023 - momentum: 0.000000
2023-10-16 18:41:53,974 epoch 4 - iter 42/146 - loss 0.08349681 - time (sec): 4.78 - samples/sec: 2919.36 - lr: 0.000022 - momentum: 0.000000
2023-10-16 18:41:55,124 epoch 4 - iter 56/146 - loss 0.08357742 - time (sec): 5.93 - samples/sec: 2923.52 - lr: 0.000022 - momentum: 0.000000
2023-10-16 18:41:56,664 epoch 4 - iter 70/146 - loss 0.08252609 - time (sec): 7.47 - samples/sec: 2933.63 - lr: 0.000022 - momentum: 0.000000
2023-10-16 18:41:58,302 epoch 4 - iter 84/146 - loss 0.08292927 - time (sec): 9.11 - samples/sec: 2881.12 - lr: 0.000021 - momentum: 0.000000
2023-10-16 18:41:59,580 epoch 4 - iter 98/146 - loss 0.08204986 - time (sec): 10.39 - samples/sec: 2876.23 - lr: 0.000021 - momentum: 0.000000
2023-10-16 18:42:01,361 epoch 4 - iter 112/146 - loss 0.08113285 - time (sec): 12.17 - samples/sec: 2880.19 - lr: 0.000021 - momentum: 0.000000
2023-10-16 18:42:02,669 epoch 4 - iter 126/146 - loss 0.08204963 - time (sec): 13.47 - samples/sec: 2909.89 - lr: 0.000021 - momentum: 0.000000
2023-10-16 18:42:03,984 epoch 4 - iter 140/146 - loss 0.08229389 - time (sec): 14.79 - samples/sec: 2910.43 - lr: 0.000020 - momentum: 0.000000
2023-10-16 18:42:04,425 ----------------------------------------------------------------------------------------------------
2023-10-16 18:42:04,425 EPOCH 4 done: loss 0.0823 - lr: 0.000020
2023-10-16 18:42:05,739 DEV : loss 0.11765624582767487 - f1-score (micro avg) 0.7382
2023-10-16 18:42:05,746 saving best model
2023-10-16 18:42:06,266 ----------------------------------------------------------------------------------------------------
2023-10-16 18:42:07,846 epoch 5 - iter 14/146 - loss 0.05045388 - time (sec): 1.58 - samples/sec: 2745.48 - lr: 0.000020 - momentum: 0.000000
2023-10-16 18:42:09,412 epoch 5 - iter 28/146 - loss 0.04670668 - time (sec): 3.14 - samples/sec: 2646.53 - lr: 0.000019 - momentum: 0.000000
2023-10-16 18:42:10,934 epoch 5 - iter 42/146 - loss 0.04342810 - time (sec): 4.66 - samples/sec: 2726.93 - lr: 0.000019 - momentum: 0.000000
2023-10-16 18:42:12,625 epoch 5 - iter 56/146 - loss 0.05072506 - time (sec): 6.35 - samples/sec: 2831.43 - lr: 0.000019 - momentum: 0.000000
2023-10-16 18:42:14,211 epoch 5 - iter 70/146 - loss 0.04968675 - time (sec): 7.94 - samples/sec: 2863.10 - lr: 0.000018 - momentum: 0.000000
2023-10-16 18:42:15,525 epoch 5 - iter 84/146 - loss 0.05401489 - time (sec): 9.25 - samples/sec: 2881.42 - lr: 0.000018 - momentum: 0.000000
2023-10-16 18:42:17,136 epoch 5 - iter 98/146 - loss 0.05351693 - time (sec): 10.87 - samples/sec: 2916.64 - lr: 0.000018 - momentum: 0.000000
2023-10-16 18:42:18,316 epoch 5 - iter 112/146 - loss 0.05569047 - time (sec): 12.04 - samples/sec: 2930.09 - lr: 0.000018 - momentum: 0.000000
2023-10-16 18:42:19,708 epoch 5 - iter 126/146 - loss 0.05757850 - time (sec): 13.44 - samples/sec: 2926.24 - lr: 0.000017 - momentum: 0.000000
2023-10-16 18:42:20,948 epoch 5 - iter 140/146 - loss 0.05972547 - time (sec): 14.68 - samples/sec: 2944.98 - lr: 0.000017 - momentum: 0.000000
2023-10-16 18:42:21,431 ----------------------------------------------------------------------------------------------------
2023-10-16 18:42:21,431 EPOCH 5 done: loss 0.0601 - lr: 0.000017
2023-10-16 18:42:22,719 DEV : loss 0.10323068499565125 - f1-score (micro avg) 0.7639
2023-10-16 18:42:22,724 saving best model
2023-10-16 18:42:23,609 ----------------------------------------------------------------------------------------------------
2023-10-16 18:42:24,951 epoch 6 - iter 14/146 - loss 0.06420642 - time (sec): 1.34 - samples/sec: 3160.55 - lr: 0.000016 - momentum: 0.000000
2023-10-16 18:42:26,500 epoch 6 - iter 28/146 - loss 0.05228968 - time (sec): 2.89 - samples/sec: 3171.17 - lr: 0.000016 - momentum: 0.000000
2023-10-16 18:42:27,809 epoch 6 - iter 42/146 - loss 0.04666758 - time (sec): 4.20 - samples/sec: 3131.77 - lr: 0.000016 - momentum: 0.000000
2023-10-16 18:42:29,236 epoch 6 - iter 56/146 - loss 0.04726276 - time (sec): 5.62 - samples/sec: 3044.60 - lr: 0.000015 - momentum: 0.000000
2023-10-16 18:42:30,757 epoch 6 - iter 70/146 - loss 0.04494478 - time (sec): 7.14 - samples/sec: 3007.68 - lr: 0.000015 - momentum: 0.000000
2023-10-16 18:42:32,150 epoch 6 - iter 84/146 - loss 0.04396196 - time (sec): 8.54 - samples/sec: 2974.60 - lr: 0.000015 - momentum: 0.000000
2023-10-16 18:42:33,587 epoch 6 - iter 98/146 - loss 0.04190146 - time (sec): 9.97 - samples/sec: 3005.86 - lr: 0.000015 - momentum: 0.000000
2023-10-16 18:42:34,981 epoch 6 - iter 112/146 - loss 0.04097606 - time (sec): 11.37 - samples/sec: 3021.65 - lr: 0.000014 - momentum: 0.000000
2023-10-16 18:42:36,451 epoch 6 - iter 126/146 - loss 0.04174709 - time (sec): 12.84 - samples/sec: 3030.85 - lr: 0.000014 - momentum: 0.000000
2023-10-16 18:42:37,774 epoch 6 - iter 140/146 - loss 0.04259495 - time (sec): 14.16 - samples/sec: 3013.59 - lr: 0.000014 - momentum: 0.000000
2023-10-16 18:42:38,410 ----------------------------------------------------------------------------------------------------
2023-10-16 18:42:38,411 EPOCH 6 done: loss 0.0429 - lr: 0.000014
2023-10-16 18:42:39,666 DEV : loss 0.12086369842290878 - f1-score (micro avg) 0.7331
2023-10-16 18:42:39,671 ----------------------------------------------------------------------------------------------------
2023-10-16 18:42:41,060 epoch 7 - iter 14/146 - loss 0.03536041 - time (sec): 1.39 - samples/sec: 3083.88 - lr: 0.000013 - momentum: 0.000000
2023-10-16 18:42:42,338 epoch 7 - iter 28/146 - loss 0.03135766 - time (sec): 2.67 - samples/sec: 3113.18 - lr: 0.000013 - momentum: 0.000000
2023-10-16 18:42:43,772 epoch 7 - iter 42/146 - loss 0.02784114 - time (sec): 4.10 - samples/sec: 3139.52 - lr: 0.000012 - momentum: 0.000000
2023-10-16 18:42:45,298 epoch 7 - iter 56/146 - loss 0.02751308 - time (sec): 5.63 - samples/sec: 3096.78 - lr: 0.000012 - momentum: 0.000000
2023-10-16 18:42:46,664 epoch 7 - iter 70/146 - loss 0.02605524 - time (sec): 6.99 - samples/sec: 3045.10 - lr: 0.000012 - momentum: 0.000000
2023-10-16 18:42:48,465 epoch 7 - iter 84/146 - loss 0.03041736 - time (sec): 8.79 - samples/sec: 2978.93 - lr: 0.000012 - momentum: 0.000000
2023-10-16 18:42:49,866 epoch 7 - iter 98/146 - loss 0.02942324 - time (sec): 10.19 - samples/sec: 2977.37 - lr: 0.000011 - momentum: 0.000000
2023-10-16 18:42:51,429 epoch 7 - iter 112/146 - loss 0.03021149 - time (sec): 11.76 - samples/sec: 2934.42 - lr: 0.000011 - momentum: 0.000000
2023-10-16 18:42:52,967 epoch 7 - iter 126/146 - loss 0.03017723 - time (sec): 13.30 - samples/sec: 2911.90 - lr: 0.000011 - momentum: 0.000000
2023-10-16 18:42:54,512 epoch 7 - iter 140/146 - loss 0.03237412 - time (sec): 14.84 - samples/sec: 2895.37 - lr: 0.000010 - momentum: 0.000000
2023-10-16 18:42:55,004 ----------------------------------------------------------------------------------------------------
2023-10-16 18:42:55,005 EPOCH 7 done: loss 0.0319 - lr: 0.000010
2023-10-16 18:42:56,306 DEV : loss 0.12415074557065964 - f1-score (micro avg) 0.7716
2023-10-16 18:42:56,312 saving best model
2023-10-16 18:42:56,844 ----------------------------------------------------------------------------------------------------
2023-10-16 18:42:58,269 epoch 8 - iter 14/146 - loss 0.01649001 - time (sec): 1.42 - samples/sec: 3042.84 - lr: 0.000010 - momentum: 0.000000
2023-10-16 18:42:59,833 epoch 8 - iter 28/146 - loss 0.01761338 - time (sec): 2.99 - samples/sec: 2985.23 - lr: 0.000009 - momentum: 0.000000
2023-10-16 18:43:01,519 epoch 8 - iter 42/146 - loss 0.02810541 - time (sec): 4.67 - samples/sec: 2933.19 - lr: 0.000009 - momentum: 0.000000
2023-10-16 18:43:02,724 epoch 8 - iter 56/146 - loss 0.02894650 - time (sec): 5.88 - samples/sec: 2883.58 - lr: 0.000009 - momentum: 0.000000
2023-10-16 18:43:04,191 epoch 8 - iter 70/146 - loss 0.02727455 - time (sec): 7.34 - samples/sec: 2927.98 - lr: 0.000009 - momentum: 0.000000
2023-10-16 18:43:05,867 epoch 8 - iter 84/146 - loss 0.02891311 - time (sec): 9.02 - samples/sec: 2917.72 - lr: 0.000008 - momentum: 0.000000
2023-10-16 18:43:07,161 epoch 8 - iter 98/146 - loss 0.02597422 - time (sec): 10.31 - samples/sec: 2963.64 - lr: 0.000008 - momentum: 0.000000
2023-10-16 18:43:08,650 epoch 8 - iter 112/146 - loss 0.02727534 - time (sec): 11.80 - samples/sec: 2946.61 - lr: 0.000008 - momentum: 0.000000
2023-10-16 18:43:10,154 epoch 8 - iter 126/146 - loss 0.02628321 - time (sec): 13.31 - samples/sec: 2947.47 - lr: 0.000007 - momentum: 0.000000
2023-10-16 18:43:11,406 epoch 8 - iter 140/146 - loss 0.02579516 - time (sec): 14.56 - samples/sec: 2952.88 - lr: 0.000007 - momentum: 0.000000
2023-10-16 18:43:11,909 ----------------------------------------------------------------------------------------------------
2023-10-16 18:43:11,909 EPOCH 8 done: loss 0.0259 - lr: 0.000007
2023-10-16 18:43:13,196 DEV : loss 0.1355086863040924 - f1-score (micro avg) 0.7191
2023-10-16 18:43:13,202 ----------------------------------------------------------------------------------------------------
2023-10-16 18:43:14,688 epoch 9 - iter 14/146 - loss 0.05423101 - time (sec): 1.48 - samples/sec: 3174.47 - lr: 0.000006 - momentum: 0.000000
2023-10-16 18:43:16,439 epoch 9 - iter 28/146 - loss 0.04240930 - time (sec): 3.24 - samples/sec: 2894.49 - lr: 0.000006 - momentum: 0.000000
2023-10-16 18:43:17,996 epoch 9 - iter 42/146 - loss 0.03263167 - time (sec): 4.79 - samples/sec: 2719.93 - lr: 0.000006 - momentum: 0.000000
2023-10-16 18:43:19,384 epoch 9 - iter 56/146 - loss 0.02854858 - time (sec): 6.18 - samples/sec: 2772.88 - lr: 0.000006 - momentum: 0.000000
2023-10-16 18:43:20,598 epoch 9 - iter 70/146 - loss 0.02688335 - time (sec): 7.39 - samples/sec: 2826.17 - lr: 0.000005 - momentum: 0.000000
2023-10-16 18:43:21,940 epoch 9 - iter 84/146 - loss 0.02662945 - time (sec): 8.74 - samples/sec: 2861.28 - lr: 0.000005 - momentum: 0.000000
2023-10-16 18:43:23,486 epoch 9 - iter 98/146 - loss 0.02365962 - time (sec): 10.28 - samples/sec: 2875.77 - lr: 0.000005 - momentum: 0.000000
2023-10-16 18:43:24,796 epoch 9 - iter 112/146 - loss 0.02270041 - time (sec): 11.59 - samples/sec: 2868.17 - lr: 0.000004 - momentum: 0.000000
2023-10-16 18:43:26,452 epoch 9 - iter 126/146 - loss 0.02197911 - time (sec): 13.25 - samples/sec: 2846.24 - lr: 0.000004 - momentum: 0.000000
2023-10-16 18:43:28,011 epoch 9 - iter 140/146 - loss 0.02169160 - time (sec): 14.81 - samples/sec: 2863.71 - lr: 0.000004 - momentum: 0.000000
2023-10-16 18:43:28,594 ----------------------------------------------------------------------------------------------------
2023-10-16 18:43:28,595 EPOCH 9 done: loss 0.0215 - lr: 0.000004
2023-10-16 18:43:29,856 DEV : loss 0.13685813546180725 - f1-score (micro avg) 0.7484
2023-10-16 18:43:29,862 ----------------------------------------------------------------------------------------------------
2023-10-16 18:43:31,152 epoch 10 - iter 14/146 - loss 0.02494954 - time (sec): 1.29 - samples/sec: 2857.52 - lr: 0.000003 - momentum: 0.000000
2023-10-16 18:43:32,476 epoch 10 - iter 28/146 - loss 0.01602085 - time (sec): 2.61 - samples/sec: 3020.65 - lr: 0.000003 - momentum: 0.000000
2023-10-16 18:43:33,794 epoch 10 - iter 42/146 - loss 0.01621527 - time (sec): 3.93 - samples/sec: 3066.78 - lr: 0.000003 - momentum: 0.000000
2023-10-16 18:43:35,570 epoch 10 - iter 56/146 - loss 0.01678019 - time (sec): 5.71 - samples/sec: 2969.94 - lr: 0.000002 - momentum: 0.000000
2023-10-16 18:43:37,094 epoch 10 - iter 70/146 - loss 0.01639143 - time (sec): 7.23 - samples/sec: 3007.77 - lr: 0.000002 - momentum: 0.000000
2023-10-16 18:43:38,504 epoch 10 - iter 84/146 - loss 0.01946257 - time (sec): 8.64 - samples/sec: 3023.29 - lr: 0.000002 - momentum: 0.000000
2023-10-16 18:43:39,791 epoch 10 - iter 98/146 - loss 0.01874862 - time (sec): 9.93 - samples/sec: 3038.93 - lr: 0.000001 - momentum: 0.000000
2023-10-16 18:43:41,180 epoch 10 - iter 112/146 - loss 0.01865213 - time (sec): 11.32 - samples/sec: 3027.93 - lr: 0.000001 - momentum: 0.000000
2023-10-16 18:43:42,660 epoch 10 - iter 126/146 - loss 0.01993260 - time (sec): 12.80 - samples/sec: 3034.99 - lr: 0.000001 - momentum: 0.000000
2023-10-16 18:43:44,108 epoch 10 - iter 140/146 - loss 0.01869978 - time (sec): 14.25 - samples/sec: 3042.21 - lr: 0.000000 - momentum: 0.000000
2023-10-16 18:43:44,558 ----------------------------------------------------------------------------------------------------
2023-10-16 18:43:44,558 EPOCH 10 done: loss 0.0183 - lr: 0.000000
2023-10-16 18:43:45,833 DEV : loss 0.14017795026302338 - f1-score (micro avg) 0.742
2023-10-16 18:43:46,218 ----------------------------------------------------------------------------------------------------
2023-10-16 18:43:46,219 Loading model from best epoch ...
2023-10-16 18:43:47,826 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
2023-10-16 18:43:50,204
Results:
- F-score (micro) 0.7545
- F-score (macro) 0.6678
- Accuracy 0.6271
By class:
              precision    recall  f1-score   support
         PER     0.8027    0.8420    0.8219       348
         LOC     0.6707    0.8429    0.7470       261
         ORG     0.3500    0.4038    0.3750        52
   HumanProd     0.7273    0.7273    0.7273        22
   micro avg     0.7097    0.8053    0.7545       683
   macro avg     0.6377    0.7040    0.6678       683
weighted avg     0.7154    0.8053    0.7562       683
2023-10-16 18:43:50,205 ----------------------------------------------------------------------------------------------------