Commit a165ff7 by stefan-it: "Upload folder using huggingface_hub"
2023-10-16 18:27:24,156 ----------------------------------------------------------------------------------------------------
2023-10-16 18:27:24,157 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-16 18:27:24,157 ----------------------------------------------------------------------------------------------------
2023-10-16 18:27:24,157 MultiCorpus: 1166 train + 165 dev + 415 test sentences
- NER_HIPE_2022 Corpus: 1166 train + 165 dev + 415 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fi/with_doc_seperator
2023-10-16 18:27:24,157 ----------------------------------------------------------------------------------------------------
2023-10-16 18:27:24,157 Train: 1166 sentences
2023-10-16 18:27:24,157 (train_with_dev=False, train_with_test=False)
2023-10-16 18:27:24,157 ----------------------------------------------------------------------------------------------------
2023-10-16 18:27:24,158 Training Params:
2023-10-16 18:27:24,158 - learning_rate: "3e-05"
2023-10-16 18:27:24,158 - mini_batch_size: "8"
2023-10-16 18:27:24,158 - max_epochs: "10"
2023-10-16 18:27:24,158 - shuffle: "True"
2023-10-16 18:27:24,158 ----------------------------------------------------------------------------------------------------
2023-10-16 18:27:24,158 Plugins:
2023-10-16 18:27:24,158 - LinearScheduler | warmup_fraction: '0.1'
2023-10-16 18:27:24,158 ----------------------------------------------------------------------------------------------------
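The LinearScheduler plugin warms the learning rate up over the first 10% of the run (warmup_fraction 0.1, i.e. 146 of the 1,460 total batches: 146 iterations x 10 epochs) and then decays it linearly to zero, which is exactly the pattern the lr column follows in the per-iteration lines below. A minimal stand-alone sketch of that schedule (hypothetical function name; peak_lr and step counts taken from this run's parameters, not from Flair's own code):

```python
def linear_schedule_lr(step, total_steps=1460, peak_lr=3e-05, warmup_fraction=0.1):
    """Linear warmup to peak_lr over the first warmup_fraction of steps,
    then linear decay to zero."""
    warmup_steps = int(total_steps * warmup_fraction)  # 146 steps here
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)

# Spot-check against the logged lr values:
print(round(linear_schedule_lr(14), 6))   # 3e-06  (epoch 1, iter 14: lr 0.000003)
print(round(linear_schedule_lr(160), 6))  # 3e-05  (epoch 2, iter 14: lr 0.000030)
print(linear_schedule_lr(1460))           # 0.0    (end of epoch 10)
```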
2023-10-16 18:27:24,158 Final evaluation on model from best epoch (best-model.pt)
2023-10-16 18:27:24,158 - metric: "('micro avg', 'f1-score')"
2023-10-16 18:27:24,158 ----------------------------------------------------------------------------------------------------
2023-10-16 18:27:24,158 Computation:
2023-10-16 18:27:24,158 - compute on device: cuda:0
2023-10-16 18:27:24,158 - embedding storage: none
2023-10-16 18:27:24,158 ----------------------------------------------------------------------------------------------------
2023-10-16 18:27:24,158 Model training base path: "hmbench-newseye/fi-dbmdz/bert-base-historic-multilingual-cased-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-3"
2023-10-16 18:27:24,158 ----------------------------------------------------------------------------------------------------
2023-10-16 18:27:24,158 ----------------------------------------------------------------------------------------------------
2023-10-16 18:27:25,317 epoch 1 - iter 14/146 - loss 2.93513404 - time (sec): 1.16 - samples/sec: 3436.39 - lr: 0.000003 - momentum: 0.000000
2023-10-16 18:27:26,505 epoch 1 - iter 28/146 - loss 2.77389228 - time (sec): 2.35 - samples/sec: 3099.80 - lr: 0.000006 - momentum: 0.000000
2023-10-16 18:27:28,131 epoch 1 - iter 42/146 - loss 2.21274536 - time (sec): 3.97 - samples/sec: 3082.74 - lr: 0.000008 - momentum: 0.000000
2023-10-16 18:27:29,576 epoch 1 - iter 56/146 - loss 1.86758775 - time (sec): 5.42 - samples/sec: 3038.81 - lr: 0.000011 - momentum: 0.000000
2023-10-16 18:27:30,809 epoch 1 - iter 70/146 - loss 1.63388705 - time (sec): 6.65 - samples/sec: 2992.94 - lr: 0.000014 - momentum: 0.000000
2023-10-16 18:27:32,063 epoch 1 - iter 84/146 - loss 1.51508448 - time (sec): 7.90 - samples/sec: 2991.87 - lr: 0.000017 - momentum: 0.000000
2023-10-16 18:27:33,978 epoch 1 - iter 98/146 - loss 1.33298731 - time (sec): 9.82 - samples/sec: 2942.53 - lr: 0.000020 - momentum: 0.000000
2023-10-16 18:27:35,323 epoch 1 - iter 112/146 - loss 1.20821560 - time (sec): 11.16 - samples/sec: 2986.86 - lr: 0.000023 - momentum: 0.000000
2023-10-16 18:27:36,996 epoch 1 - iter 126/146 - loss 1.09396532 - time (sec): 12.84 - samples/sec: 2965.34 - lr: 0.000026 - momentum: 0.000000
2023-10-16 18:27:38,420 epoch 1 - iter 140/146 - loss 1.00755548 - time (sec): 14.26 - samples/sec: 2973.03 - lr: 0.000029 - momentum: 0.000000
2023-10-16 18:27:39,093 ----------------------------------------------------------------------------------------------------
2023-10-16 18:27:39,094 EPOCH 1 done: loss 0.9748 - lr: 0.000029
2023-10-16 18:27:39,905 DEV : loss 0.21248747408390045 - f1-score (micro avg) 0.4671
2023-10-16 18:27:39,911 saving best model
2023-10-16 18:27:40,379 ----------------------------------------------------------------------------------------------------
2023-10-16 18:27:41,907 epoch 2 - iter 14/146 - loss 0.25741814 - time (sec): 1.53 - samples/sec: 3144.33 - lr: 0.000030 - momentum: 0.000000
2023-10-16 18:27:43,594 epoch 2 - iter 28/146 - loss 0.25094580 - time (sec): 3.21 - samples/sec: 2960.80 - lr: 0.000029 - momentum: 0.000000
2023-10-16 18:27:44,866 epoch 2 - iter 42/146 - loss 0.25091985 - time (sec): 4.49 - samples/sec: 2947.16 - lr: 0.000029 - momentum: 0.000000
2023-10-16 18:27:46,323 epoch 2 - iter 56/146 - loss 0.23948423 - time (sec): 5.94 - samples/sec: 2914.89 - lr: 0.000029 - momentum: 0.000000
2023-10-16 18:27:47,677 epoch 2 - iter 70/146 - loss 0.23078029 - time (sec): 7.30 - samples/sec: 2876.03 - lr: 0.000028 - momentum: 0.000000
2023-10-16 18:27:49,415 epoch 2 - iter 84/146 - loss 0.25123902 - time (sec): 9.03 - samples/sec: 2871.89 - lr: 0.000028 - momentum: 0.000000
2023-10-16 18:27:50,999 epoch 2 - iter 98/146 - loss 0.24030979 - time (sec): 10.62 - samples/sec: 2881.48 - lr: 0.000028 - momentum: 0.000000
2023-10-16 18:27:52,231 epoch 2 - iter 112/146 - loss 0.23215042 - time (sec): 11.85 - samples/sec: 2885.68 - lr: 0.000027 - momentum: 0.000000
2023-10-16 18:27:53,529 epoch 2 - iter 126/146 - loss 0.22725978 - time (sec): 13.15 - samples/sec: 2926.76 - lr: 0.000027 - momentum: 0.000000
2023-10-16 18:27:55,155 epoch 2 - iter 140/146 - loss 0.21838231 - time (sec): 14.77 - samples/sec: 2921.39 - lr: 0.000027 - momentum: 0.000000
2023-10-16 18:27:55,621 ----------------------------------------------------------------------------------------------------
2023-10-16 18:27:55,621 EPOCH 2 done: loss 0.2171 - lr: 0.000027
2023-10-16 18:27:56,858 DEV : loss 0.1434197872877121 - f1-score (micro avg) 0.6021
2023-10-16 18:27:56,863 saving best model
2023-10-16 18:27:57,673 ----------------------------------------------------------------------------------------------------
2023-10-16 18:27:59,758 epoch 3 - iter 14/146 - loss 0.18575960 - time (sec): 2.08 - samples/sec: 2489.05 - lr: 0.000026 - momentum: 0.000000
2023-10-16 18:28:01,045 epoch 3 - iter 28/146 - loss 0.18887110 - time (sec): 3.37 - samples/sec: 2772.46 - lr: 0.000026 - momentum: 0.000000
2023-10-16 18:28:02,583 epoch 3 - iter 42/146 - loss 0.17188404 - time (sec): 4.91 - samples/sec: 2876.25 - lr: 0.000026 - momentum: 0.000000
2023-10-16 18:28:04,041 epoch 3 - iter 56/146 - loss 0.15696548 - time (sec): 6.37 - samples/sec: 2941.00 - lr: 0.000025 - momentum: 0.000000
2023-10-16 18:28:05,630 epoch 3 - iter 70/146 - loss 0.14502167 - time (sec): 7.96 - samples/sec: 2928.47 - lr: 0.000025 - momentum: 0.000000
2023-10-16 18:28:06,896 epoch 3 - iter 84/146 - loss 0.14099742 - time (sec): 9.22 - samples/sec: 2934.25 - lr: 0.000025 - momentum: 0.000000
2023-10-16 18:28:08,331 epoch 3 - iter 98/146 - loss 0.13734271 - time (sec): 10.66 - samples/sec: 2924.49 - lr: 0.000024 - momentum: 0.000000
2023-10-16 18:28:09,528 epoch 3 - iter 112/146 - loss 0.13345700 - time (sec): 11.85 - samples/sec: 2950.12 - lr: 0.000024 - momentum: 0.000000
2023-10-16 18:28:11,017 epoch 3 - iter 126/146 - loss 0.13132961 - time (sec): 13.34 - samples/sec: 2945.01 - lr: 0.000024 - momentum: 0.000000
2023-10-16 18:28:12,211 epoch 3 - iter 140/146 - loss 0.12890036 - time (sec): 14.54 - samples/sec: 2958.40 - lr: 0.000024 - momentum: 0.000000
2023-10-16 18:28:12,652 ----------------------------------------------------------------------------------------------------
2023-10-16 18:28:12,652 EPOCH 3 done: loss 0.1276 - lr: 0.000024
2023-10-16 18:28:13,931 DEV : loss 0.11977185308933258 - f1-score (micro avg) 0.6652
2023-10-16 18:28:13,937 saving best model
2023-10-16 18:28:14,450 ----------------------------------------------------------------------------------------------------
2023-10-16 18:28:15,788 epoch 4 - iter 14/146 - loss 0.07996901 - time (sec): 1.34 - samples/sec: 2911.28 - lr: 0.000023 - momentum: 0.000000
2023-10-16 18:28:17,109 epoch 4 - iter 28/146 - loss 0.08203620 - time (sec): 2.66 - samples/sec: 2996.09 - lr: 0.000023 - momentum: 0.000000
2023-10-16 18:28:18,473 epoch 4 - iter 42/146 - loss 0.09414321 - time (sec): 4.02 - samples/sec: 2900.82 - lr: 0.000022 - momentum: 0.000000
2023-10-16 18:28:20,069 epoch 4 - iter 56/146 - loss 0.08273818 - time (sec): 5.62 - samples/sec: 2934.25 - lr: 0.000022 - momentum: 0.000000
2023-10-16 18:28:21,354 epoch 4 - iter 70/146 - loss 0.08429318 - time (sec): 6.90 - samples/sec: 2935.80 - lr: 0.000022 - momentum: 0.000000
2023-10-16 18:28:22,724 epoch 4 - iter 84/146 - loss 0.08662969 - time (sec): 8.27 - samples/sec: 2941.49 - lr: 0.000021 - momentum: 0.000000
2023-10-16 18:28:24,096 epoch 4 - iter 98/146 - loss 0.08750805 - time (sec): 9.64 - samples/sec: 2935.41 - lr: 0.000021 - momentum: 0.000000
2023-10-16 18:28:25,651 epoch 4 - iter 112/146 - loss 0.08853932 - time (sec): 11.20 - samples/sec: 2939.66 - lr: 0.000021 - momentum: 0.000000
2023-10-16 18:28:26,988 epoch 4 - iter 126/146 - loss 0.08671444 - time (sec): 12.54 - samples/sec: 2961.01 - lr: 0.000021 - momentum: 0.000000
2023-10-16 18:28:28,792 epoch 4 - iter 140/146 - loss 0.08282451 - time (sec): 14.34 - samples/sec: 2977.22 - lr: 0.000020 - momentum: 0.000000
2023-10-16 18:28:29,335 ----------------------------------------------------------------------------------------------------
2023-10-16 18:28:29,336 EPOCH 4 done: loss 0.0822 - lr: 0.000020
2023-10-16 18:28:30,578 DEV : loss 0.10638927668333054 - f1-score (micro avg) 0.7168
2023-10-16 18:28:30,583 saving best model
2023-10-16 18:28:31,176 ----------------------------------------------------------------------------------------------------
2023-10-16 18:28:32,716 epoch 5 - iter 14/146 - loss 0.10632934 - time (sec): 1.54 - samples/sec: 2747.36 - lr: 0.000020 - momentum: 0.000000
2023-10-16 18:28:34,221 epoch 5 - iter 28/146 - loss 0.07986467 - time (sec): 3.04 - samples/sec: 2757.96 - lr: 0.000019 - momentum: 0.000000
2023-10-16 18:28:35,865 epoch 5 - iter 42/146 - loss 0.06670757 - time (sec): 4.68 - samples/sec: 2733.59 - lr: 0.000019 - momentum: 0.000000
2023-10-16 18:28:37,137 epoch 5 - iter 56/146 - loss 0.06636050 - time (sec): 5.96 - samples/sec: 2773.06 - lr: 0.000019 - momentum: 0.000000
2023-10-16 18:28:38,628 epoch 5 - iter 70/146 - loss 0.06634048 - time (sec): 7.45 - samples/sec: 2793.51 - lr: 0.000018 - momentum: 0.000000
2023-10-16 18:28:40,017 epoch 5 - iter 84/146 - loss 0.06502579 - time (sec): 8.84 - samples/sec: 2819.90 - lr: 0.000018 - momentum: 0.000000
2023-10-16 18:28:41,496 epoch 5 - iter 98/146 - loss 0.06323217 - time (sec): 10.32 - samples/sec: 2827.88 - lr: 0.000018 - momentum: 0.000000
2023-10-16 18:28:42,923 epoch 5 - iter 112/146 - loss 0.06339836 - time (sec): 11.74 - samples/sec: 2888.38 - lr: 0.000018 - momentum: 0.000000
2023-10-16 18:28:44,351 epoch 5 - iter 126/146 - loss 0.06263725 - time (sec): 13.17 - samples/sec: 2901.58 - lr: 0.000017 - momentum: 0.000000
2023-10-16 18:28:45,678 epoch 5 - iter 140/146 - loss 0.06146587 - time (sec): 14.50 - samples/sec: 2909.62 - lr: 0.000017 - momentum: 0.000000
2023-10-16 18:28:46,393 ----------------------------------------------------------------------------------------------------
2023-10-16 18:28:46,393 EPOCH 5 done: loss 0.0599 - lr: 0.000017
2023-10-16 18:28:47,691 DEV : loss 0.11089115589857101 - f1-score (micro avg) 0.7484
2023-10-16 18:28:47,696 saving best model
2023-10-16 18:28:48,226 ----------------------------------------------------------------------------------------------------
2023-10-16 18:28:50,125 epoch 6 - iter 14/146 - loss 0.03612688 - time (sec): 1.90 - samples/sec: 2635.80 - lr: 0.000016 - momentum: 0.000000
2023-10-16 18:28:51,353 epoch 6 - iter 28/146 - loss 0.03686816 - time (sec): 3.13 - samples/sec: 2817.15 - lr: 0.000016 - momentum: 0.000000
2023-10-16 18:28:52,703 epoch 6 - iter 42/146 - loss 0.03616211 - time (sec): 4.48 - samples/sec: 2829.65 - lr: 0.000016 - momentum: 0.000000
2023-10-16 18:28:54,290 epoch 6 - iter 56/146 - loss 0.03447287 - time (sec): 6.06 - samples/sec: 2763.87 - lr: 0.000015 - momentum: 0.000000
2023-10-16 18:28:55,642 epoch 6 - iter 70/146 - loss 0.03490954 - time (sec): 7.41 - samples/sec: 2901.81 - lr: 0.000015 - momentum: 0.000000
2023-10-16 18:28:56,811 epoch 6 - iter 84/146 - loss 0.03505656 - time (sec): 8.58 - samples/sec: 2936.17 - lr: 0.000015 - momentum: 0.000000
2023-10-16 18:28:58,271 epoch 6 - iter 98/146 - loss 0.03312980 - time (sec): 10.04 - samples/sec: 2955.51 - lr: 0.000015 - momentum: 0.000000
2023-10-16 18:28:59,521 epoch 6 - iter 112/146 - loss 0.03670852 - time (sec): 11.29 - samples/sec: 2951.86 - lr: 0.000014 - momentum: 0.000000
2023-10-16 18:29:01,313 epoch 6 - iter 126/146 - loss 0.03829386 - time (sec): 13.09 - samples/sec: 2986.27 - lr: 0.000014 - momentum: 0.000000
2023-10-16 18:29:02,439 epoch 6 - iter 140/146 - loss 0.04191597 - time (sec): 14.21 - samples/sec: 2979.92 - lr: 0.000014 - momentum: 0.000000
2023-10-16 18:29:03,316 ----------------------------------------------------------------------------------------------------
2023-10-16 18:29:03,316 EPOCH 6 done: loss 0.0433 - lr: 0.000014
2023-10-16 18:29:04,619 DEV : loss 0.12725397944450378 - f1-score (micro avg) 0.7152
2023-10-16 18:29:04,626 ----------------------------------------------------------------------------------------------------
2023-10-16 18:29:06,333 epoch 7 - iter 14/146 - loss 0.03280998 - time (sec): 1.71 - samples/sec: 3099.99 - lr: 0.000013 - momentum: 0.000000
2023-10-16 18:29:07,641 epoch 7 - iter 28/146 - loss 0.02824876 - time (sec): 3.01 - samples/sec: 3047.06 - lr: 0.000013 - momentum: 0.000000
2023-10-16 18:29:09,289 epoch 7 - iter 42/146 - loss 0.02707074 - time (sec): 4.66 - samples/sec: 2956.74 - lr: 0.000012 - momentum: 0.000000
2023-10-16 18:29:10,811 epoch 7 - iter 56/146 - loss 0.02889321 - time (sec): 6.18 - samples/sec: 2871.09 - lr: 0.000012 - momentum: 0.000000
2023-10-16 18:29:12,358 epoch 7 - iter 70/146 - loss 0.03251612 - time (sec): 7.73 - samples/sec: 2848.77 - lr: 0.000012 - momentum: 0.000000
2023-10-16 18:29:13,920 epoch 7 - iter 84/146 - loss 0.02978633 - time (sec): 9.29 - samples/sec: 2857.28 - lr: 0.000012 - momentum: 0.000000
2023-10-16 18:29:15,091 epoch 7 - iter 98/146 - loss 0.02989419 - time (sec): 10.46 - samples/sec: 2884.06 - lr: 0.000011 - momentum: 0.000000
2023-10-16 18:29:16,500 epoch 7 - iter 112/146 - loss 0.02868727 - time (sec): 11.87 - samples/sec: 2879.60 - lr: 0.000011 - momentum: 0.000000
2023-10-16 18:29:17,885 epoch 7 - iter 126/146 - loss 0.03081760 - time (sec): 13.26 - samples/sec: 2928.91 - lr: 0.000011 - momentum: 0.000000
2023-10-16 18:29:19,190 epoch 7 - iter 140/146 - loss 0.03274991 - time (sec): 14.56 - samples/sec: 2931.80 - lr: 0.000010 - momentum: 0.000000
2023-10-16 18:29:19,925 ----------------------------------------------------------------------------------------------------
2023-10-16 18:29:19,926 EPOCH 7 done: loss 0.0321 - lr: 0.000010
2023-10-16 18:29:21,217 DEV : loss 0.12044133991003036 - f1-score (micro avg) 0.766
2023-10-16 18:29:21,222 saving best model
2023-10-16 18:29:21,806 ----------------------------------------------------------------------------------------------------
2023-10-16 18:29:23,128 epoch 8 - iter 14/146 - loss 0.02741513 - time (sec): 1.32 - samples/sec: 3165.44 - lr: 0.000010 - momentum: 0.000000
2023-10-16 18:29:24,529 epoch 8 - iter 28/146 - loss 0.01990358 - time (sec): 2.72 - samples/sec: 3158.95 - lr: 0.000009 - momentum: 0.000000
2023-10-16 18:29:26,092 epoch 8 - iter 42/146 - loss 0.02296771 - time (sec): 4.28 - samples/sec: 2976.98 - lr: 0.000009 - momentum: 0.000000
2023-10-16 18:29:27,603 epoch 8 - iter 56/146 - loss 0.02253618 - time (sec): 5.80 - samples/sec: 2889.22 - lr: 0.000009 - momentum: 0.000000
2023-10-16 18:29:29,125 epoch 8 - iter 70/146 - loss 0.02333529 - time (sec): 7.32 - samples/sec: 2902.57 - lr: 0.000009 - momentum: 0.000000
2023-10-16 18:29:30,335 epoch 8 - iter 84/146 - loss 0.02425044 - time (sec): 8.53 - samples/sec: 2940.76 - lr: 0.000008 - momentum: 0.000000
2023-10-16 18:29:32,104 epoch 8 - iter 98/146 - loss 0.02550230 - time (sec): 10.30 - samples/sec: 2902.86 - lr: 0.000008 - momentum: 0.000000
2023-10-16 18:29:33,562 epoch 8 - iter 112/146 - loss 0.02765254 - time (sec): 11.75 - samples/sec: 2930.06 - lr: 0.000008 - momentum: 0.000000
2023-10-16 18:29:34,824 epoch 8 - iter 126/146 - loss 0.02724778 - time (sec): 13.02 - samples/sec: 2926.68 - lr: 0.000007 - momentum: 0.000000
2023-10-16 18:29:36,217 epoch 8 - iter 140/146 - loss 0.02645279 - time (sec): 14.41 - samples/sec: 2956.07 - lr: 0.000007 - momentum: 0.000000
2023-10-16 18:29:36,861 ----------------------------------------------------------------------------------------------------
2023-10-16 18:29:36,861 EPOCH 8 done: loss 0.0261 - lr: 0.000007
2023-10-16 18:29:38,320 DEV : loss 0.12693804502487183 - f1-score (micro avg) 0.7526
2023-10-16 18:29:38,324 ----------------------------------------------------------------------------------------------------
2023-10-16 18:29:39,576 epoch 9 - iter 14/146 - loss 0.01650818 - time (sec): 1.25 - samples/sec: 3366.30 - lr: 0.000006 - momentum: 0.000000
2023-10-16 18:29:41,284 epoch 9 - iter 28/146 - loss 0.02351489 - time (sec): 2.96 - samples/sec: 2916.17 - lr: 0.000006 - momentum: 0.000000
2023-10-16 18:29:42,771 epoch 9 - iter 42/146 - loss 0.02544422 - time (sec): 4.45 - samples/sec: 2897.68 - lr: 0.000006 - momentum: 0.000000
2023-10-16 18:29:44,255 epoch 9 - iter 56/146 - loss 0.02923101 - time (sec): 5.93 - samples/sec: 2977.74 - lr: 0.000006 - momentum: 0.000000
2023-10-16 18:29:45,955 epoch 9 - iter 70/146 - loss 0.02603799 - time (sec): 7.63 - samples/sec: 2934.50 - lr: 0.000005 - momentum: 0.000000
2023-10-16 18:29:47,435 epoch 9 - iter 84/146 - loss 0.02466823 - time (sec): 9.11 - samples/sec: 2929.35 - lr: 0.000005 - momentum: 0.000000
2023-10-16 18:29:48,776 epoch 9 - iter 98/146 - loss 0.02570875 - time (sec): 10.45 - samples/sec: 2950.34 - lr: 0.000005 - momentum: 0.000000
2023-10-16 18:29:50,231 epoch 9 - iter 112/146 - loss 0.02418524 - time (sec): 11.91 - samples/sec: 2947.42 - lr: 0.000004 - momentum: 0.000000
2023-10-16 18:29:51,536 epoch 9 - iter 126/146 - loss 0.02378156 - time (sec): 13.21 - samples/sec: 2941.77 - lr: 0.000004 - momentum: 0.000000
2023-10-16 18:29:53,032 epoch 9 - iter 140/146 - loss 0.02245718 - time (sec): 14.71 - samples/sec: 2913.45 - lr: 0.000004 - momentum: 0.000000
2023-10-16 18:29:53,512 ----------------------------------------------------------------------------------------------------
2023-10-16 18:29:53,512 EPOCH 9 done: loss 0.0224 - lr: 0.000004
2023-10-16 18:29:54,818 DEV : loss 0.13801662623882294 - f1-score (micro avg) 0.7417
2023-10-16 18:29:54,822 ----------------------------------------------------------------------------------------------------
2023-10-16 18:29:56,258 epoch 10 - iter 14/146 - loss 0.00940816 - time (sec): 1.43 - samples/sec: 3070.97 - lr: 0.000003 - momentum: 0.000000
2023-10-16 18:29:57,875 epoch 10 - iter 28/146 - loss 0.01387647 - time (sec): 3.05 - samples/sec: 3120.84 - lr: 0.000003 - momentum: 0.000000
2023-10-16 18:29:59,255 epoch 10 - iter 42/146 - loss 0.02669714 - time (sec): 4.43 - samples/sec: 3025.78 - lr: 0.000003 - momentum: 0.000000
2023-10-16 18:30:00,595 epoch 10 - iter 56/146 - loss 0.02370339 - time (sec): 5.77 - samples/sec: 3054.06 - lr: 0.000002 - momentum: 0.000000
2023-10-16 18:30:02,063 epoch 10 - iter 70/146 - loss 0.02208689 - time (sec): 7.24 - samples/sec: 2976.98 - lr: 0.000002 - momentum: 0.000000
2023-10-16 18:30:03,748 epoch 10 - iter 84/146 - loss 0.02139677 - time (sec): 8.92 - samples/sec: 3007.22 - lr: 0.000002 - momentum: 0.000000
2023-10-16 18:30:05,069 epoch 10 - iter 98/146 - loss 0.01968660 - time (sec): 10.25 - samples/sec: 3022.30 - lr: 0.000001 - momentum: 0.000000
2023-10-16 18:30:06,371 epoch 10 - iter 112/146 - loss 0.01902306 - time (sec): 11.55 - samples/sec: 2995.40 - lr: 0.000001 - momentum: 0.000000
2023-10-16 18:30:07,887 epoch 10 - iter 126/146 - loss 0.01748688 - time (sec): 13.06 - samples/sec: 2969.77 - lr: 0.000001 - momentum: 0.000000
2023-10-16 18:30:09,235 epoch 10 - iter 140/146 - loss 0.01846671 - time (sec): 14.41 - samples/sec: 2990.60 - lr: 0.000000 - momentum: 0.000000
2023-10-16 18:30:09,714 ----------------------------------------------------------------------------------------------------
2023-10-16 18:30:09,714 EPOCH 10 done: loss 0.0184 - lr: 0.000000
2023-10-16 18:30:11,005 DEV : loss 0.144253209233284 - f1-score (micro avg) 0.7318
2023-10-16 18:30:11,430 ----------------------------------------------------------------------------------------------------
2023-10-16 18:30:11,431 Loading model from best epoch ...
2023-10-16 18:30:13,049 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
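The 17 tags above follow the BIOES scheme (Single, Begin, End, Inside, plus O) over the four entity types LOC, PER, ORG and HumanProd. A minimal decoder sketch for turning such a tag sequence into entity spans (a hypothetical helper for illustration, not Flair's own implementation):

```python
def bioes_spans(tags):
    """Decode a BIOES tag sequence into (label, start, end) token spans."""
    spans, start, label = [], None, None
    for i, tag in enumerate(tags):
        if tag == "O":
            start, label = None, None
            continue
        prefix, lab = tag.split("-", 1)
        if prefix == "S":                 # single-token entity
            spans.append((lab, i, i))
        elif prefix == "B":               # open a multi-token entity
            start, label = i, lab
        elif prefix == "E" and label == lab and start is not None:
            spans.append((lab, start, i))  # close it on a matching E- tag
            start, label = None, None
    return spans

print(bioes_spans(["O", "S-LOC", "B-PER", "I-PER", "E-PER", "O"]))
# [('LOC', 1, 1), ('PER', 2, 4)]
```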
2023-10-16 18:30:15,723
Results:
- F-score (micro) 0.7512
- F-score (macro) 0.6702
- Accuracy 0.6244
By class:
              precision    recall  f1-score   support

         PER     0.7962    0.8420    0.8184       348
         LOC     0.6503    0.8123    0.7223       261
         ORG     0.5000    0.4231    0.4583        52
   HumanProd     0.6818    0.6818    0.6818        22

   micro avg     0.7132    0.7936    0.7512       683
   macro avg     0.6571    0.6898    0.6702       683
weighted avg     0.7142    0.7936    0.7499       683
2023-10-16 18:30:15,723 ----------------------------------------------------------------------------------------------------
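The macro and weighted averages in the final table can be reproduced directly from the per-class rows: macro is the unweighted mean of the per-class f1 scores, weighted is the support-weighted mean. A quick check (f1 and support values copied from the table above):

```python
# Per-class (f1-score, support) from the final evaluation table.
classes = {
    "PER": (0.8184, 348),
    "LOC": (0.7223, 261),
    "ORG": (0.4583, 52),
    "HumanProd": (0.6818, 22),
}

# Macro average: unweighted mean of per-class f1.
macro_f1 = sum(f1 for f1, _ in classes.values()) / len(classes)
# Weighted average: per-class f1 weighted by support.
weighted_f1 = sum(f1 * n for f1, n in classes.values()) / sum(n for _, n in classes.values())

print(round(macro_f1, 4))     # 0.6702
print(round(weighted_f1, 4))  # 0.7499
```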