2023-10-19 23:55:49,914 ----------------------------------------------------------------------------------------------------
2023-10-19 23:55:49,915 Model: "SequenceTagger(
(embeddings): TransformerWordEmbeddings(
(model): BertModel(
(embeddings): BertEmbeddings(
(word_embeddings): Embedding(32001, 128)
(position_embeddings): Embedding(512, 128)
(token_type_embeddings): Embedding(2, 128)
(LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(encoder): BertEncoder(
(layer): ModuleList(
(0-1): 2 x BertLayer(
(attention): BertAttention(
(self): BertSelfAttention(
(query): Linear(in_features=128, out_features=128, bias=True)
(key): Linear(in_features=128, out_features=128, bias=True)
(value): Linear(in_features=128, out_features=128, bias=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(output): BertSelfOutput(
(dense): Linear(in_features=128, out_features=128, bias=True)
(LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
(intermediate): BertIntermediate(
(dense): Linear(in_features=128, out_features=512, bias=True)
(intermediate_act_fn): GELUActivation()
)
(output): BertOutput(
(dense): Linear(in_features=512, out_features=128, bias=True)
(LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
)
)
(pooler): BertPooler(
(dense): Linear(in_features=128, out_features=128, bias=True)
(activation): Tanh()
)
)
)
(locked_dropout): LockedDropout(p=0.5)
(linear): Linear(in_features=128, out_features=17, bias=True)
(loss_function): CrossEntropyLoss()
)"
2023-10-19 23:55:49,915 ----------------------------------------------------------------------------------------------------
2023-10-19 23:55:49,915 MultiCorpus: 1166 train + 165 dev + 415 test sentences
- NER_HIPE_2022 Corpus: 1166 train + 165 dev + 415 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fi/with_doc_seperator
2023-10-19 23:55:49,915 ----------------------------------------------------------------------------------------------------
2023-10-19 23:55:49,915 Train: 1166 sentences
2023-10-19 23:55:49,915 (train_with_dev=False, train_with_test=False)
2023-10-19 23:55:49,915 ----------------------------------------------------------------------------------------------------
2023-10-19 23:55:49,915 Training Params:
2023-10-19 23:55:49,915 - learning_rate: "3e-05"
2023-10-19 23:55:49,915 - mini_batch_size: "8"
2023-10-19 23:55:49,915 - max_epochs: "10"
2023-10-19 23:55:49,915 - shuffle: "True"
2023-10-19 23:55:49,915 ----------------------------------------------------------------------------------------------------
2023-10-19 23:55:49,915 Plugins:
2023-10-19 23:55:49,915 - TensorboardLogger
2023-10-19 23:55:49,915 - LinearScheduler | warmup_fraction: '0.1'
2023-10-19 23:55:49,915 ----------------------------------------------------------------------------------------------------
2023-10-19 23:55:49,915 Final evaluation on model from best epoch (best-model.pt)
2023-10-19 23:55:49,915 - metric: "('micro avg', 'f1-score')"
2023-10-19 23:55:49,915 ----------------------------------------------------------------------------------------------------
2023-10-19 23:55:49,915 Computation:
2023-10-19 23:55:49,915 - compute on device: cuda:0
2023-10-19 23:55:49,915 - embedding storage: none
2023-10-19 23:55:49,916 ----------------------------------------------------------------------------------------------------
2023-10-19 23:55:49,916 Model training base path: "hmbench-newseye/fi-dbmdz/bert-tiny-historic-multilingual-cased-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-4"
2023-10-19 23:55:49,916 ----------------------------------------------------------------------------------------------------
2023-10-19 23:55:49,916 ----------------------------------------------------------------------------------------------------
2023-10-19 23:55:49,916 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-19 23:55:50,257 epoch 1 - iter 14/146 - loss 3.62686657 - time (sec): 0.34 - samples/sec: 11462.03 - lr: 0.000003 - momentum: 0.000000
2023-10-19 23:55:50,667 epoch 1 - iter 28/146 - loss 3.56566381 - time (sec): 0.75 - samples/sec: 12179.62 - lr: 0.000006 - momentum: 0.000000
2023-10-19 23:55:51,024 epoch 1 - iter 42/146 - loss 3.54309758 - time (sec): 1.11 - samples/sec: 11595.73 - lr: 0.000008 - momentum: 0.000000
2023-10-19 23:55:51,375 epoch 1 - iter 56/146 - loss 3.49169459 - time (sec): 1.46 - samples/sec: 11651.88 - lr: 0.000011 - momentum: 0.000000
2023-10-19 23:55:51,731 epoch 1 - iter 70/146 - loss 3.41550947 - time (sec): 1.81 - samples/sec: 11684.22 - lr: 0.000014 - momentum: 0.000000
2023-10-19 23:55:52,075 epoch 1 - iter 84/146 - loss 3.30661928 - time (sec): 2.16 - samples/sec: 11659.69 - lr: 0.000017 - momentum: 0.000000
2023-10-19 23:55:52,445 epoch 1 - iter 98/146 - loss 3.16012091 - time (sec): 2.53 - samples/sec: 11732.42 - lr: 0.000020 - momentum: 0.000000
2023-10-19 23:55:52,826 epoch 1 - iter 112/146 - loss 3.01059943 - time (sec): 2.91 - samples/sec: 11857.44 - lr: 0.000023 - momentum: 0.000000
2023-10-19 23:55:53,168 epoch 1 - iter 126/146 - loss 2.85714802 - time (sec): 3.25 - samples/sec: 11958.56 - lr: 0.000026 - momentum: 0.000000
2023-10-19 23:55:53,513 epoch 1 - iter 140/146 - loss 2.72108786 - time (sec): 3.60 - samples/sec: 11862.69 - lr: 0.000029 - momentum: 0.000000
2023-10-19 23:55:53,663 ----------------------------------------------------------------------------------------------------
2023-10-19 23:55:53,663 EPOCH 1 done: loss 2.6533 - lr: 0.000029
2023-10-19 23:55:54,077 DEV : loss 0.6145911812782288 - f1-score (micro avg) 0.0
2023-10-19 23:55:54,081 ----------------------------------------------------------------------------------------------------
2023-10-19 23:55:54,384 epoch 2 - iter 14/146 - loss 1.01168374 - time (sec): 0.30 - samples/sec: 10680.08 - lr: 0.000030 - momentum: 0.000000
2023-10-19 23:55:54,732 epoch 2 - iter 28/146 - loss 0.94678590 - time (sec): 0.65 - samples/sec: 12103.77 - lr: 0.000029 - momentum: 0.000000
2023-10-19 23:55:55,052 epoch 2 - iter 42/146 - loss 0.88474481 - time (sec): 0.97 - samples/sec: 12016.96 - lr: 0.000029 - momentum: 0.000000
2023-10-19 23:55:55,410 epoch 2 - iter 56/146 - loss 0.84644859 - time (sec): 1.33 - samples/sec: 11952.94 - lr: 0.000029 - momentum: 0.000000
2023-10-19 23:55:55,772 epoch 2 - iter 70/146 - loss 0.82087110 - time (sec): 1.69 - samples/sec: 12197.15 - lr: 0.000028 - momentum: 0.000000
2023-10-19 23:55:56,176 epoch 2 - iter 84/146 - loss 0.82573188 - time (sec): 2.10 - samples/sec: 12233.44 - lr: 0.000028 - momentum: 0.000000
2023-10-19 23:55:56,548 epoch 2 - iter 98/146 - loss 0.78402906 - time (sec): 2.47 - samples/sec: 12422.83 - lr: 0.000028 - momentum: 0.000000
2023-10-19 23:55:56,896 epoch 2 - iter 112/146 - loss 0.76691151 - time (sec): 2.82 - samples/sec: 12174.71 - lr: 0.000027 - momentum: 0.000000
2023-10-19 23:55:57,260 epoch 2 - iter 126/146 - loss 0.74906536 - time (sec): 3.18 - samples/sec: 12104.22 - lr: 0.000027 - momentum: 0.000000
2023-10-19 23:55:57,628 epoch 2 - iter 140/146 - loss 0.73743452 - time (sec): 3.55 - samples/sec: 12066.27 - lr: 0.000027 - momentum: 0.000000
2023-10-19 23:55:57,770 ----------------------------------------------------------------------------------------------------
2023-10-19 23:55:57,770 EPOCH 2 done: loss 0.7358 - lr: 0.000027
2023-10-19 23:55:58,418 DEV : loss 0.4394523501396179 - f1-score (micro avg) 0.0
2023-10-19 23:55:58,422 ----------------------------------------------------------------------------------------------------
2023-10-19 23:55:58,789 epoch 3 - iter 14/146 - loss 0.55148619 - time (sec): 0.37 - samples/sec: 12383.51 - lr: 0.000026 - momentum: 0.000000
2023-10-19 23:55:59,175 epoch 3 - iter 28/146 - loss 0.59695029 - time (sec): 0.75 - samples/sec: 12007.04 - lr: 0.000026 - momentum: 0.000000
2023-10-19 23:55:59,521 epoch 3 - iter 42/146 - loss 0.61934555 - time (sec): 1.10 - samples/sec: 11516.03 - lr: 0.000026 - momentum: 0.000000
2023-10-19 23:55:59,885 epoch 3 - iter 56/146 - loss 0.60983026 - time (sec): 1.46 - samples/sec: 11617.10 - lr: 0.000025 - momentum: 0.000000
2023-10-19 23:56:00,238 epoch 3 - iter 70/146 - loss 0.62579085 - time (sec): 1.82 - samples/sec: 11513.86 - lr: 0.000025 - momentum: 0.000000
2023-10-19 23:56:00,598 epoch 3 - iter 84/146 - loss 0.62085526 - time (sec): 2.18 - samples/sec: 11495.44 - lr: 0.000025 - momentum: 0.000000
2023-10-19 23:56:00,979 epoch 3 - iter 98/146 - loss 0.60632732 - time (sec): 2.56 - samples/sec: 11717.36 - lr: 0.000024 - momentum: 0.000000
2023-10-19 23:56:01,356 epoch 3 - iter 112/146 - loss 0.62524719 - time (sec): 2.93 - samples/sec: 11811.94 - lr: 0.000024 - momentum: 0.000000
2023-10-19 23:56:01,706 epoch 3 - iter 126/146 - loss 0.61922831 - time (sec): 3.28 - samples/sec: 11741.34 - lr: 0.000024 - momentum: 0.000000
2023-10-19 23:56:02,056 epoch 3 - iter 140/146 - loss 0.61879485 - time (sec): 3.63 - samples/sec: 11646.27 - lr: 0.000024 - momentum: 0.000000
2023-10-19 23:56:02,223 ----------------------------------------------------------------------------------------------------
2023-10-19 23:56:02,223 EPOCH 3 done: loss 0.6072 - lr: 0.000024
2023-10-19 23:56:02,848 DEV : loss 0.3797164857387543 - f1-score (micro avg) 0.0
2023-10-19 23:56:02,851 ----------------------------------------------------------------------------------------------------
2023-10-19 23:56:03,218 epoch 4 - iter 14/146 - loss 0.58641619 - time (sec): 0.37 - samples/sec: 12170.12 - lr: 0.000023 - momentum: 0.000000
2023-10-19 23:56:03,592 epoch 4 - iter 28/146 - loss 0.57181105 - time (sec): 0.74 - samples/sec: 12468.22 - lr: 0.000023 - momentum: 0.000000
2023-10-19 23:56:03,978 epoch 4 - iter 42/146 - loss 0.52217062 - time (sec): 1.13 - samples/sec: 12377.06 - lr: 0.000022 - momentum: 0.000000
2023-10-19 23:56:04,330 epoch 4 - iter 56/146 - loss 0.52381805 - time (sec): 1.48 - samples/sec: 11812.83 - lr: 0.000022 - momentum: 0.000000
2023-10-19 23:56:04,680 epoch 4 - iter 70/146 - loss 0.52577098 - time (sec): 1.83 - samples/sec: 11554.17 - lr: 0.000022 - momentum: 0.000000
2023-10-19 23:56:05,045 epoch 4 - iter 84/146 - loss 0.51450474 - time (sec): 2.19 - samples/sec: 11452.25 - lr: 0.000021 - momentum: 0.000000
2023-10-19 23:56:05,385 epoch 4 - iter 98/146 - loss 0.51166184 - time (sec): 2.53 - samples/sec: 11274.24 - lr: 0.000021 - momentum: 0.000000
2023-10-19 23:56:05,773 epoch 4 - iter 112/146 - loss 0.51209906 - time (sec): 2.92 - samples/sec: 11425.38 - lr: 0.000021 - momentum: 0.000000
2023-10-19 23:56:06,157 epoch 4 - iter 126/146 - loss 0.54642243 - time (sec): 3.30 - samples/sec: 11674.51 - lr: 0.000021 - momentum: 0.000000
2023-10-19 23:56:06,509 epoch 4 - iter 140/146 - loss 0.54514031 - time (sec): 3.66 - samples/sec: 11603.60 - lr: 0.000020 - momentum: 0.000000
2023-10-19 23:56:06,668 ----------------------------------------------------------------------------------------------------
2023-10-19 23:56:06,668 EPOCH 4 done: loss 0.5436 - lr: 0.000020
2023-10-19 23:56:07,301 DEV : loss 0.34846723079681396 - f1-score (micro avg) 0.0
2023-10-19 23:56:07,305 ----------------------------------------------------------------------------------------------------
2023-10-19 23:56:07,684 epoch 5 - iter 14/146 - loss 0.48210081 - time (sec): 0.38 - samples/sec: 10721.55 - lr: 0.000020 - momentum: 0.000000
2023-10-19 23:56:08,065 epoch 5 - iter 28/146 - loss 0.48716782 - time (sec): 0.76 - samples/sec: 11473.13 - lr: 0.000019 - momentum: 0.000000
2023-10-19 23:56:08,444 epoch 5 - iter 42/146 - loss 0.51726817 - time (sec): 1.14 - samples/sec: 11747.61 - lr: 0.000019 - momentum: 0.000000
2023-10-19 23:56:08,799 epoch 5 - iter 56/146 - loss 0.52378662 - time (sec): 1.49 - samples/sec: 11395.11 - lr: 0.000019 - momentum: 0.000000
2023-10-19 23:56:09,173 epoch 5 - iter 70/146 - loss 0.52298388 - time (sec): 1.87 - samples/sec: 11618.95 - lr: 0.000018 - momentum: 0.000000
2023-10-19 23:56:09,546 epoch 5 - iter 84/146 - loss 0.52146337 - time (sec): 2.24 - samples/sec: 11490.07 - lr: 0.000018 - momentum: 0.000000
2023-10-19 23:56:09,917 epoch 5 - iter 98/146 - loss 0.52723859 - time (sec): 2.61 - samples/sec: 11397.33 - lr: 0.000018 - momentum: 0.000000
2023-10-19 23:56:10,438 epoch 5 - iter 112/146 - loss 0.51952344 - time (sec): 3.13 - samples/sec: 10819.48 - lr: 0.000018 - momentum: 0.000000
2023-10-19 23:56:10,776 epoch 5 - iter 126/146 - loss 0.51715617 - time (sec): 3.47 - samples/sec: 10972.98 - lr: 0.000017 - momentum: 0.000000
2023-10-19 23:56:11,153 epoch 5 - iter 140/146 - loss 0.49965520 - time (sec): 3.85 - samples/sec: 11217.58 - lr: 0.000017 - momentum: 0.000000
2023-10-19 23:56:11,298 ----------------------------------------------------------------------------------------------------
2023-10-19 23:56:11,298 EPOCH 5 done: loss 0.5025 - lr: 0.000017
2023-10-19 23:56:11,928 DEV : loss 0.3356049656867981 - f1-score (micro avg) 0.0
2023-10-19 23:56:11,932 ----------------------------------------------------------------------------------------------------
2023-10-19 23:56:12,289 epoch 6 - iter 14/146 - loss 0.51328999 - time (sec): 0.36 - samples/sec: 12430.14 - lr: 0.000016 - momentum: 0.000000
2023-10-19 23:56:12,646 epoch 6 - iter 28/146 - loss 0.50111882 - time (sec): 0.71 - samples/sec: 12164.80 - lr: 0.000016 - momentum: 0.000000
2023-10-19 23:56:12,997 epoch 6 - iter 42/146 - loss 0.47633665 - time (sec): 1.06 - samples/sec: 11765.26 - lr: 0.000016 - momentum: 0.000000
2023-10-19 23:56:13,368 epoch 6 - iter 56/146 - loss 0.50441129 - time (sec): 1.44 - samples/sec: 11938.40 - lr: 0.000015 - momentum: 0.000000
2023-10-19 23:56:13,736 epoch 6 - iter 70/146 - loss 0.48930849 - time (sec): 1.80 - samples/sec: 11776.73 - lr: 0.000015 - momentum: 0.000000
2023-10-19 23:56:14,150 epoch 6 - iter 84/146 - loss 0.48837595 - time (sec): 2.22 - samples/sec: 12129.85 - lr: 0.000015 - momentum: 0.000000
2023-10-19 23:56:14,527 epoch 6 - iter 98/146 - loss 0.49885434 - time (sec): 2.59 - samples/sec: 11969.84 - lr: 0.000015 - momentum: 0.000000
2023-10-19 23:56:14,889 epoch 6 - iter 112/146 - loss 0.48721693 - time (sec): 2.96 - samples/sec: 11934.58 - lr: 0.000014 - momentum: 0.000000
2023-10-19 23:56:15,237 epoch 6 - iter 126/146 - loss 0.48865106 - time (sec): 3.30 - samples/sec: 11751.03 - lr: 0.000014 - momentum: 0.000000
2023-10-19 23:56:15,599 epoch 6 - iter 140/146 - loss 0.48424717 - time (sec): 3.67 - samples/sec: 11597.07 - lr: 0.000014 - momentum: 0.000000
2023-10-19 23:56:15,749 ----------------------------------------------------------------------------------------------------
2023-10-19 23:56:15,749 EPOCH 6 done: loss 0.4803 - lr: 0.000014
2023-10-19 23:56:16,391 DEV : loss 0.33338305354118347 - f1-score (micro avg) 0.0159
2023-10-19 23:56:16,395 saving best model
2023-10-19 23:56:16,424 ----------------------------------------------------------------------------------------------------
2023-10-19 23:56:16,810 epoch 7 - iter 14/146 - loss 0.51247900 - time (sec): 0.39 - samples/sec: 10542.60 - lr: 0.000013 - momentum: 0.000000
2023-10-19 23:56:17,182 epoch 7 - iter 28/146 - loss 0.46353999 - time (sec): 0.76 - samples/sec: 11315.92 - lr: 0.000013 - momentum: 0.000000
2023-10-19 23:56:17,538 epoch 7 - iter 42/146 - loss 0.47267613 - time (sec): 1.11 - samples/sec: 12103.13 - lr: 0.000012 - momentum: 0.000000
2023-10-19 23:56:17,909 epoch 7 - iter 56/146 - loss 0.51136275 - time (sec): 1.48 - samples/sec: 11998.57 - lr: 0.000012 - momentum: 0.000000
2023-10-19 23:56:18,283 epoch 7 - iter 70/146 - loss 0.49491599 - time (sec): 1.86 - samples/sec: 11859.20 - lr: 0.000012 - momentum: 0.000000
2023-10-19 23:56:18,648 epoch 7 - iter 84/146 - loss 0.49513367 - time (sec): 2.22 - samples/sec: 11602.73 - lr: 0.000012 - momentum: 0.000000
2023-10-19 23:56:18,991 epoch 7 - iter 98/146 - loss 0.48191917 - time (sec): 2.57 - samples/sec: 11562.30 - lr: 0.000011 - momentum: 0.000000
2023-10-19 23:56:19,373 epoch 7 - iter 112/146 - loss 0.48660231 - time (sec): 2.95 - samples/sec: 11284.94 - lr: 0.000011 - momentum: 0.000000
2023-10-19 23:56:19,779 epoch 7 - iter 126/146 - loss 0.47276807 - time (sec): 3.35 - samples/sec: 11505.26 - lr: 0.000011 - momentum: 0.000000
2023-10-19 23:56:20,145 epoch 7 - iter 140/146 - loss 0.46794068 - time (sec): 3.72 - samples/sec: 11482.84 - lr: 0.000010 - momentum: 0.000000
2023-10-19 23:56:20,309 ----------------------------------------------------------------------------------------------------
2023-10-19 23:56:20,309 EPOCH 7 done: loss 0.4652 - lr: 0.000010
2023-10-19 23:56:20,961 DEV : loss 0.3205569088459015 - f1-score (micro avg) 0.0643
2023-10-19 23:56:20,965 saving best model
2023-10-19 23:56:21,012 ----------------------------------------------------------------------------------------------------
2023-10-19 23:56:21,398 epoch 8 - iter 14/146 - loss 0.40190856 - time (sec): 0.39 - samples/sec: 11719.30 - lr: 0.000010 - momentum: 0.000000
2023-10-19 23:56:21,801 epoch 8 - iter 28/146 - loss 0.44444101 - time (sec): 0.79 - samples/sec: 12183.11 - lr: 0.000009 - momentum: 0.000000
2023-10-19 23:56:22,130 epoch 8 - iter 42/146 - loss 0.44314790 - time (sec): 1.12 - samples/sec: 11777.77 - lr: 0.000009 - momentum: 0.000000
2023-10-19 23:56:22,520 epoch 8 - iter 56/146 - loss 0.43739722 - time (sec): 1.51 - samples/sec: 11786.96 - lr: 0.000009 - momentum: 0.000000
2023-10-19 23:56:22,869 epoch 8 - iter 70/146 - loss 0.44209666 - time (sec): 1.86 - samples/sec: 11339.21 - lr: 0.000009 - momentum: 0.000000
2023-10-19 23:56:23,229 epoch 8 - iter 84/146 - loss 0.44111953 - time (sec): 2.22 - samples/sec: 11001.38 - lr: 0.000008 - momentum: 0.000000
2023-10-19 23:56:23,618 epoch 8 - iter 98/146 - loss 0.45685855 - time (sec): 2.61 - samples/sec: 11135.51 - lr: 0.000008 - momentum: 0.000000
2023-10-19 23:56:23,986 epoch 8 - iter 112/146 - loss 0.44776697 - time (sec): 2.97 - samples/sec: 11397.68 - lr: 0.000008 - momentum: 0.000000
2023-10-19 23:56:24,374 epoch 8 - iter 126/146 - loss 0.44711644 - time (sec): 3.36 - samples/sec: 11425.36 - lr: 0.000007 - momentum: 0.000000
2023-10-19 23:56:24,755 epoch 8 - iter 140/146 - loss 0.44596266 - time (sec): 3.74 - samples/sec: 11451.64 - lr: 0.000007 - momentum: 0.000000
2023-10-19 23:56:24,916 ----------------------------------------------------------------------------------------------------
2023-10-19 23:56:24,916 EPOCH 8 done: loss 0.4450 - lr: 0.000007
2023-10-19 23:56:25,551 DEV : loss 0.3211706280708313 - f1-score (micro avg) 0.0707
2023-10-19 23:56:25,555 saving best model
2023-10-19 23:56:25,587 ----------------------------------------------------------------------------------------------------
2023-10-19 23:56:25,942 epoch 9 - iter 14/146 - loss 0.41113884 - time (sec): 0.35 - samples/sec: 10970.83 - lr: 0.000006 - momentum: 0.000000
2023-10-19 23:56:26,317 epoch 9 - iter 28/146 - loss 0.46709047 - time (sec): 0.73 - samples/sec: 11164.26 - lr: 0.000006 - momentum: 0.000000
2023-10-19 23:56:26,657 epoch 9 - iter 42/146 - loss 0.46730149 - time (sec): 1.07 - samples/sec: 10630.35 - lr: 0.000006 - momentum: 0.000000
2023-10-19 23:56:27,014 epoch 9 - iter 56/146 - loss 0.47270215 - time (sec): 1.43 - samples/sec: 10837.67 - lr: 0.000006 - momentum: 0.000000
2023-10-19 23:56:27,384 epoch 9 - iter 70/146 - loss 0.46801889 - time (sec): 1.80 - samples/sec: 11086.23 - lr: 0.000005 - momentum: 0.000000
2023-10-19 23:56:27,760 epoch 9 - iter 84/146 - loss 0.46293481 - time (sec): 2.17 - samples/sec: 11335.03 - lr: 0.000005 - momentum: 0.000000
2023-10-19 23:56:28,130 epoch 9 - iter 98/146 - loss 0.46472869 - time (sec): 2.54 - samples/sec: 11343.97 - lr: 0.000005 - momentum: 0.000000
2023-10-19 23:56:28,649 epoch 9 - iter 112/146 - loss 0.46885479 - time (sec): 3.06 - samples/sec: 10880.87 - lr: 0.000004 - momentum: 0.000000
2023-10-19 23:56:29,036 epoch 9 - iter 126/146 - loss 0.45905054 - time (sec): 3.45 - samples/sec: 11042.62 - lr: 0.000004 - momentum: 0.000000
2023-10-19 23:56:29,430 epoch 9 - iter 140/146 - loss 0.44994878 - time (sec): 3.84 - samples/sec: 11214.07 - lr: 0.000004 - momentum: 0.000000
2023-10-19 23:56:29,569 ----------------------------------------------------------------------------------------------------
2023-10-19 23:56:29,570 EPOCH 9 done: loss 0.4484 - lr: 0.000004
2023-10-19 23:56:30,206 DEV : loss 0.3181568682193756 - f1-score (micro avg) 0.1063
2023-10-19 23:56:30,210 saving best model
2023-10-19 23:56:30,244 ----------------------------------------------------------------------------------------------------
2023-10-19 23:56:30,637 epoch 10 - iter 14/146 - loss 0.35982159 - time (sec): 0.39 - samples/sec: 12001.18 - lr: 0.000003 - momentum: 0.000000
2023-10-19 23:56:31,027 epoch 10 - iter 28/146 - loss 0.43016882 - time (sec): 0.78 - samples/sec: 12131.26 - lr: 0.000003 - momentum: 0.000000
2023-10-19 23:56:31,382 epoch 10 - iter 42/146 - loss 0.40530873 - time (sec): 1.14 - samples/sec: 12189.99 - lr: 0.000003 - momentum: 0.000000
2023-10-19 23:56:31,738 epoch 10 - iter 56/146 - loss 0.40504319 - time (sec): 1.49 - samples/sec: 11983.54 - lr: 0.000002 - momentum: 0.000000
2023-10-19 23:56:32,072 epoch 10 - iter 70/146 - loss 0.42251256 - time (sec): 1.83 - samples/sec: 11478.97 - lr: 0.000002 - momentum: 0.000000
2023-10-19 23:56:32,444 epoch 10 - iter 84/146 - loss 0.41757393 - time (sec): 2.20 - samples/sec: 11724.88 - lr: 0.000002 - momentum: 0.000000
2023-10-19 23:56:32,800 epoch 10 - iter 98/146 - loss 0.41278875 - time (sec): 2.56 - samples/sec: 11633.49 - lr: 0.000001 - momentum: 0.000000
2023-10-19 23:56:33,178 epoch 10 - iter 112/146 - loss 0.42903464 - time (sec): 2.93 - samples/sec: 11796.43 - lr: 0.000001 - momentum: 0.000000
2023-10-19 23:56:33,536 epoch 10 - iter 126/146 - loss 0.43335473 - time (sec): 3.29 - samples/sec: 11618.36 - lr: 0.000001 - momentum: 0.000000
2023-10-19 23:56:33,910 epoch 10 - iter 140/146 - loss 0.43507462 - time (sec): 3.67 - samples/sec: 11645.96 - lr: 0.000000 - momentum: 0.000000
2023-10-19 23:56:34,057 ----------------------------------------------------------------------------------------------------
2023-10-19 23:56:34,057 EPOCH 10 done: loss 0.4355 - lr: 0.000000
2023-10-19 23:56:34,694 DEV : loss 0.31846562027931213 - f1-score (micro avg) 0.1049
2023-10-19 23:56:34,725 ----------------------------------------------------------------------------------------------------
2023-10-19 23:56:34,726 Loading model from best epoch ...
2023-10-19 23:56:34,799 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
2023-10-19 23:56:35,694
Results:
- F-score (micro) 0.172
- F-score (macro) 0.0739
- Accuracy 0.0958
By class:
              precision    recall  f1-score   support

         PER     0.2470    0.2328    0.2396       348
         LOC     0.3333    0.0307    0.0561       261
         ORG     0.0000    0.0000    0.0000        52
   HumanProd     0.0000    0.0000    0.0000        22

   micro avg     0.2528    0.1303    0.1720       683
   macro avg     0.1451    0.0659    0.0739       683
weighted avg     0.2532    0.1303    0.1436       683
2023-10-19 23:56:35,694 ----------------------------------------------------------------------------------------------------
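The averaged scores in the final report can be cross-checked from the per-class rows. The snippet below is a minimal sketch and not part of the original run: it assumes the usual definitions (F1 as the harmonic mean of precision and recall, macro average as the unweighted mean over classes, weighted average as the support-weighted mean), and the names `rows`, `f1`, `micro_f1`, `macro_f1` and `weighted_f1` are illustrative only. Because the inputs are the rounded values printed in the log, the results may differ from the reported figures in the last decimal place.

```python
# Per-class values copied from the "By class" table in the log:
# class -> (precision, recall, f1, support)
rows = {
    "PER": (0.2470, 0.2328, 0.2396, 348),
    "LOC": (0.3333, 0.0307, 0.0561, 261),
    "ORG": (0.0000, 0.0000, 0.0000, 52),
    "HumanProd": (0.0000, 0.0000, 0.0000, 22),
}


def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Micro F1 recomputed from the micro-averaged precision and recall in the log.
micro_f1 = f1(0.2528, 0.1303)

# Macro F1: unweighted mean of the per-class F1 scores.
macro_f1 = sum(r[2] for r in rows.values()) / len(rows)

# Weighted F1: per-class F1 weighted by support.
total_support = sum(r[3] for r in rows.values())
weighted_f1 = sum(r[2] * r[3] for r in rows.values()) / total_support

print(f"support total: {total_support}")
print(f"micro F1:      {micro_f1:.4f}")
print(f"macro F1:      {macro_f1:.4f}")
print(f"weighted F1:   {weighted_f1:.4f}")
```

The macro score sits far below the micro score because ORG and HumanProd contribute an F1 of zero with equal weight, while the micro score is dominated by the 348 PER mentions.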