2023-10-18 22:40:00,249 ----------------------------------------------------------------------------------------------------
2023-10-18 22:40:00,249 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 128)
        (position_embeddings): Embedding(512, 128)
        (token_type_embeddings): Embedding(2, 128)
        (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-1): 2 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=128, out_features=128, bias=True)
                (key): Linear(in_features=128, out_features=128, bias=True)
                (value): Linear(in_features=128, out_features=128, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=128, out_features=128, bias=True)
                (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=128, out_features=512, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=512, out_features=128, bias=True)
              (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=128, out_features=128, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=128, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-18 22:40:00,249 ----------------------------------------------------------------------------------------------------
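From the module shapes printed above, the size of this bert-tiny encoder can be estimated with a few lines of arithmetic. This is a sketch derived only from the logged dimensions (helper names are my own); it counts weights, biases, and LayerNorm affine parameters, not buffers:

```python
def linear(in_f, out_f):
    # weight matrix + bias vector
    return in_f * out_f + out_f

def layer_norm(dim):
    # affine weight + bias
    return 2 * dim

hidden, inter, vocab, max_pos, layers = 128, 512, 32001, 512, 2

embeddings = (vocab * hidden        # word embeddings
              + max_pos * hidden    # position embeddings
              + 2 * hidden          # token-type embeddings
              + layer_norm(hidden))

per_layer = (4 * linear(hidden, hidden)   # Q, K, V, attention output
             + layer_norm(hidden)         # attention LayerNorm
             + linear(hidden, inter)      # intermediate
             + linear(inter, hidden)      # output
             + layer_norm(hidden))        # output LayerNorm

pooler = linear(hidden, hidden)
tagger_head = linear(hidden, 13)          # the (linear) layer in the log

total = embeddings + layers * per_layer + pooler + tagger_head
print(f"{total:,} parameters")            # roughly 4.6M
```

Almost all of the budget sits in the 32001 x 128 word-embedding matrix; the two transformer layers themselves contribute under 400k parameters.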
2023-10-18 22:40:00,249 MultiCorpus: 5777 train + 722 dev + 723 test sentences
- NER_ICDAR_EUROPEANA Corpus: 5777 train + 722 dev + 723 test sentences - /root/.flair/datasets/ner_icdar_europeana/nl
2023-10-18 22:40:00,249 ----------------------------------------------------------------------------------------------------
2023-10-18 22:40:00,249 Train: 5777 sentences
2023-10-18 22:40:00,249 (train_with_dev=False, train_with_test=False)
2023-10-18 22:40:00,250 ----------------------------------------------------------------------------------------------------
2023-10-18 22:40:00,250 Training Params:
2023-10-18 22:40:00,250 - learning_rate: "3e-05"
2023-10-18 22:40:00,250 - mini_batch_size: "8"
2023-10-18 22:40:00,250 - max_epochs: "10"
2023-10-18 22:40:00,250 - shuffle: "True"
2023-10-18 22:40:00,250 ----------------------------------------------------------------------------------------------------
2023-10-18 22:40:00,250 Plugins:
2023-10-18 22:40:00,250 - TensorboardLogger
2023-10-18 22:40:00,250 - LinearScheduler | warmup_fraction: '0.1'
2023-10-18 22:40:00,250 ----------------------------------------------------------------------------------------------------
2023-10-18 22:40:00,250 Final evaluation on model from best epoch (best-model.pt)
2023-10-18 22:40:00,250 - metric: "('micro avg', 'f1-score')"
2023-10-18 22:40:00,250 ----------------------------------------------------------------------------------------------------
2023-10-18 22:40:00,250 Computation:
2023-10-18 22:40:00,250 - compute on device: cuda:0
2023-10-18 22:40:00,250 - embedding storage: none
2023-10-18 22:40:00,250 ----------------------------------------------------------------------------------------------------
2023-10-18 22:40:00,250 Model training base path: "hmbench-icdar/nl-dbmdz/bert-tiny-historic-multilingual-cased-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-3"
2023-10-18 22:40:00,250 ----------------------------------------------------------------------------------------------------
2023-10-18 22:40:00,250 ----------------------------------------------------------------------------------------------------
2023-10-18 22:40:00,250 Logging anything other than scalars to TensorBoard is currently not supported.
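The LinearScheduler plugin with warmup_fraction 0.1 explains the lr column in the lines below: the learning rate ramps linearly from 0 to the 3e-05 peak over the first 10% of the 7,230 total steps (723 steps, i.e. exactly epoch 1 here), then decays linearly back to 0. A minimal stdlib sketch of the implied schedule (the function is my own, not Flair's API):

```python
def linear_schedule_lr(step, total_steps=7230, peak_lr=3e-5, warmup_fraction=0.1):
    """Linear warmup to peak_lr, then linear decay to zero.

    A sketch of the schedule implied by the logged lr values, not
    Flair's actual LinearScheduler implementation.
    """
    warmup_steps = int(total_steps * warmup_fraction)  # 723 = one epoch here
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)
```

This matches the log: at step 72 (epoch 1, iter 72) the lr is about 3e-06, the 3e-05 peak is reached at the end of epoch 1, and the lr reaches 0 on the final step.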
2023-10-18 22:40:02,113 epoch 1 - iter 72/723 - loss 2.98841079 - time (sec): 1.86 - samples/sec: 9578.47 - lr: 0.000003 - momentum: 0.000000
2023-10-18 22:40:04,053 epoch 1 - iter 144/723 - loss 2.79139642 - time (sec): 3.80 - samples/sec: 9516.22 - lr: 0.000006 - momentum: 0.000000
2023-10-18 22:40:05,867 epoch 1 - iter 216/723 - loss 2.52716576 - time (sec): 5.62 - samples/sec: 9503.17 - lr: 0.000009 - momentum: 0.000000
2023-10-18 22:40:07,667 epoch 1 - iter 288/723 - loss 2.18597493 - time (sec): 7.42 - samples/sec: 9593.00 - lr: 0.000012 - momentum: 0.000000
2023-10-18 22:40:09,442 epoch 1 - iter 360/723 - loss 1.85823175 - time (sec): 9.19 - samples/sec: 9731.15 - lr: 0.000015 - momentum: 0.000000
2023-10-18 22:40:11,226 epoch 1 - iter 432/723 - loss 1.61794529 - time (sec): 10.98 - samples/sec: 9752.10 - lr: 0.000018 - momentum: 0.000000
2023-10-18 22:40:13,025 epoch 1 - iter 504/723 - loss 1.43043055 - time (sec): 12.77 - samples/sec: 9802.57 - lr: 0.000021 - momentum: 0.000000
2023-10-18 22:40:14,790 epoch 1 - iter 576/723 - loss 1.29820059 - time (sec): 14.54 - samples/sec: 9794.94 - lr: 0.000024 - momentum: 0.000000
2023-10-18 22:40:16,555 epoch 1 - iter 648/723 - loss 1.19724166 - time (sec): 16.30 - samples/sec: 9758.71 - lr: 0.000027 - momentum: 0.000000
2023-10-18 22:40:18,265 epoch 1 - iter 720/723 - loss 1.11187930 - time (sec): 18.01 - samples/sec: 9759.03 - lr: 0.000030 - momentum: 0.000000
2023-10-18 22:40:18,320 ----------------------------------------------------------------------------------------------------
2023-10-18 22:40:18,320 EPOCH 1 done: loss 1.1102 - lr: 0.000030
2023-10-18 22:40:19,582 DEV : loss 0.35777369141578674 - f1-score (micro avg) 0.0
2023-10-18 22:40:19,596 ----------------------------------------------------------------------------------------------------
2023-10-18 22:40:21,365 epoch 2 - iter 72/723 - loss 0.27341211 - time (sec): 1.77 - samples/sec: 9574.80 - lr: 0.000030 - momentum: 0.000000
2023-10-18 22:40:23,118 epoch 2 - iter 144/723 - loss 0.27156084 - time (sec): 3.52 - samples/sec: 9803.83 - lr: 0.000029 - momentum: 0.000000
2023-10-18 22:40:24,968 epoch 2 - iter 216/723 - loss 0.25524188 - time (sec): 5.37 - samples/sec: 9778.40 - lr: 0.000029 - momentum: 0.000000
2023-10-18 22:40:26,723 epoch 2 - iter 288/723 - loss 0.25718151 - time (sec): 7.13 - samples/sec: 9726.70 - lr: 0.000029 - momentum: 0.000000
2023-10-18 22:40:28,489 epoch 2 - iter 360/723 - loss 0.25119017 - time (sec): 8.89 - samples/sec: 9725.42 - lr: 0.000028 - momentum: 0.000000
2023-10-18 22:40:30,277 epoch 2 - iter 432/723 - loss 0.24390917 - time (sec): 10.68 - samples/sec: 9774.04 - lr: 0.000028 - momentum: 0.000000
2023-10-18 22:40:32,050 epoch 2 - iter 504/723 - loss 0.24273201 - time (sec): 12.45 - samples/sec: 9809.63 - lr: 0.000028 - momentum: 0.000000
2023-10-18 22:40:33,864 epoch 2 - iter 576/723 - loss 0.24040013 - time (sec): 14.27 - samples/sec: 9892.91 - lr: 0.000027 - momentum: 0.000000
2023-10-18 22:40:35,585 epoch 2 - iter 648/723 - loss 0.23365611 - time (sec): 15.99 - samples/sec: 9932.90 - lr: 0.000027 - momentum: 0.000000
2023-10-18 22:40:37,385 epoch 2 - iter 720/723 - loss 0.23607899 - time (sec): 17.79 - samples/sec: 9875.14 - lr: 0.000027 - momentum: 0.000000
2023-10-18 22:40:37,455 ----------------------------------------------------------------------------------------------------
2023-10-18 22:40:37,455 EPOCH 2 done: loss 0.2362 - lr: 0.000027
2023-10-18 22:40:39,578 DEV : loss 0.2550312876701355 - f1-score (micro avg) 0.2002
2023-10-18 22:40:39,592 saving best model
2023-10-18 22:40:39,622 ----------------------------------------------------------------------------------------------------
2023-10-18 22:40:41,214 epoch 3 - iter 72/723 - loss 0.22079626 - time (sec): 1.59 - samples/sec: 10781.78 - lr: 0.000026 - momentum: 0.000000
2023-10-18 22:40:42,794 epoch 3 - iter 144/723 - loss 0.20244048 - time (sec): 3.17 - samples/sec: 11109.25 - lr: 0.000026 - momentum: 0.000000
2023-10-18 22:40:44,423 epoch 3 - iter 216/723 - loss 0.19849172 - time (sec): 4.80 - samples/sec: 11002.12 - lr: 0.000026 - momentum: 0.000000
2023-10-18 22:40:46,182 epoch 3 - iter 288/723 - loss 0.19769252 - time (sec): 6.56 - samples/sec: 10779.32 - lr: 0.000025 - momentum: 0.000000
2023-10-18 22:40:47,893 epoch 3 - iter 360/723 - loss 0.19995418 - time (sec): 8.27 - samples/sec: 10451.44 - lr: 0.000025 - momentum: 0.000000
2023-10-18 22:40:49,679 epoch 3 - iter 432/723 - loss 0.20011392 - time (sec): 10.06 - samples/sec: 10377.07 - lr: 0.000025 - momentum: 0.000000
2023-10-18 22:40:51,486 epoch 3 - iter 504/723 - loss 0.19929608 - time (sec): 11.86 - samples/sec: 10340.39 - lr: 0.000024 - momentum: 0.000000
2023-10-18 22:40:53,195 epoch 3 - iter 576/723 - loss 0.20030929 - time (sec): 13.57 - samples/sec: 10252.78 - lr: 0.000024 - momentum: 0.000000
2023-10-18 22:40:54,998 epoch 3 - iter 648/723 - loss 0.20073613 - time (sec): 15.38 - samples/sec: 10270.49 - lr: 0.000024 - momentum: 0.000000
2023-10-18 22:40:56,810 epoch 3 - iter 720/723 - loss 0.19498318 - time (sec): 17.19 - samples/sec: 10222.93 - lr: 0.000023 - momentum: 0.000000
2023-10-18 22:40:56,873 ----------------------------------------------------------------------------------------------------
2023-10-18 22:40:56,873 EPOCH 3 done: loss 0.1949 - lr: 0.000023
2023-10-18 22:40:58,634 DEV : loss 0.2251613438129425 - f1-score (micro avg) 0.3257
2023-10-18 22:40:58,648 saving best model
2023-10-18 22:40:58,686 ----------------------------------------------------------------------------------------------------
2023-10-18 22:41:00,437 epoch 4 - iter 72/723 - loss 0.19403979 - time (sec): 1.75 - samples/sec: 10149.67 - lr: 0.000023 - momentum: 0.000000
2023-10-18 22:41:02,183 epoch 4 - iter 144/723 - loss 0.18784146 - time (sec): 3.50 - samples/sec: 9748.92 - lr: 0.000023 - momentum: 0.000000
2023-10-18 22:41:03,952 epoch 4 - iter 216/723 - loss 0.19203285 - time (sec): 5.26 - samples/sec: 9895.01 - lr: 0.000022 - momentum: 0.000000
2023-10-18 22:41:05,781 epoch 4 - iter 288/723 - loss 0.18324681 - time (sec): 7.09 - samples/sec: 9871.53 - lr: 0.000022 - momentum: 0.000000
2023-10-18 22:41:07,582 epoch 4 - iter 360/723 - loss 0.18156486 - time (sec): 8.89 - samples/sec: 9953.49 - lr: 0.000022 - momentum: 0.000000
2023-10-18 22:41:09,325 epoch 4 - iter 432/723 - loss 0.18116754 - time (sec): 10.64 - samples/sec: 9991.78 - lr: 0.000021 - momentum: 0.000000
2023-10-18 22:41:11,053 epoch 4 - iter 504/723 - loss 0.18093047 - time (sec): 12.37 - samples/sec: 9946.49 - lr: 0.000021 - momentum: 0.000000
2023-10-18 22:41:12,847 epoch 4 - iter 576/723 - loss 0.18201373 - time (sec): 14.16 - samples/sec: 9930.77 - lr: 0.000021 - momentum: 0.000000
2023-10-18 22:41:14,597 epoch 4 - iter 648/723 - loss 0.18085010 - time (sec): 15.91 - samples/sec: 9954.75 - lr: 0.000020 - momentum: 0.000000
2023-10-18 22:41:16,329 epoch 4 - iter 720/723 - loss 0.17999634 - time (sec): 17.64 - samples/sec: 9957.84 - lr: 0.000020 - momentum: 0.000000
2023-10-18 22:41:16,401 ----------------------------------------------------------------------------------------------------
2023-10-18 22:41:16,401 EPOCH 4 done: loss 0.1797 - lr: 0.000020
2023-10-18 22:41:18,476 DEV : loss 0.20678313076496124 - f1-score (micro avg) 0.3891
2023-10-18 22:41:18,490 saving best model
2023-10-18 22:41:18,525 ----------------------------------------------------------------------------------------------------
2023-10-18 22:41:20,277 epoch 5 - iter 72/723 - loss 0.17834339 - time (sec): 1.75 - samples/sec: 9709.97 - lr: 0.000020 - momentum: 0.000000
2023-10-18 22:41:22,062 epoch 5 - iter 144/723 - loss 0.16479979 - time (sec): 3.54 - samples/sec: 9719.79 - lr: 0.000019 - momentum: 0.000000
2023-10-18 22:41:23,813 epoch 5 - iter 216/723 - loss 0.16544296 - time (sec): 5.29 - samples/sec: 9775.51 - lr: 0.000019 - momentum: 0.000000
2023-10-18 22:41:25,599 epoch 5 - iter 288/723 - loss 0.16908757 - time (sec): 7.07 - samples/sec: 9698.46 - lr: 0.000019 - momentum: 0.000000
2023-10-18 22:41:27,411 epoch 5 - iter 360/723 - loss 0.16724889 - time (sec): 8.89 - samples/sec: 9833.64 - lr: 0.000018 - momentum: 0.000000
2023-10-18 22:41:29,173 epoch 5 - iter 432/723 - loss 0.16559838 - time (sec): 10.65 - samples/sec: 9937.96 - lr: 0.000018 - momentum: 0.000000
2023-10-18 22:41:30,855 epoch 5 - iter 504/723 - loss 0.16547999 - time (sec): 12.33 - samples/sec: 9999.51 - lr: 0.000018 - momentum: 0.000000
2023-10-18 22:41:32,671 epoch 5 - iter 576/723 - loss 0.16944376 - time (sec): 14.15 - samples/sec: 10007.08 - lr: 0.000017 - momentum: 0.000000
2023-10-18 22:41:34,392 epoch 5 - iter 648/723 - loss 0.17032973 - time (sec): 15.87 - samples/sec: 9953.67 - lr: 0.000017 - momentum: 0.000000
2023-10-18 22:41:36,124 epoch 5 - iter 720/723 - loss 0.16793137 - time (sec): 17.60 - samples/sec: 9971.25 - lr: 0.000017 - momentum: 0.000000
2023-10-18 22:41:36,187 ----------------------------------------------------------------------------------------------------
2023-10-18 22:41:36,187 EPOCH 5 done: loss 0.1682 - lr: 0.000017
2023-10-18 22:41:37,950 DEV : loss 0.21079857647418976 - f1-score (micro avg) 0.4172
2023-10-18 22:41:37,965 saving best model
2023-10-18 22:41:38,003 ----------------------------------------------------------------------------------------------------
2023-10-18 22:41:39,687 epoch 6 - iter 72/723 - loss 0.15977195 - time (sec): 1.68 - samples/sec: 9982.07 - lr: 0.000016 - momentum: 0.000000
2023-10-18 22:41:41,451 epoch 6 - iter 144/723 - loss 0.16016708 - time (sec): 3.45 - samples/sec: 10061.70 - lr: 0.000016 - momentum: 0.000000
2023-10-18 22:41:43,215 epoch 6 - iter 216/723 - loss 0.16708014 - time (sec): 5.21 - samples/sec: 10028.21 - lr: 0.000016 - momentum: 0.000000
2023-10-18 22:41:45,039 epoch 6 - iter 288/723 - loss 0.16529748 - time (sec): 7.04 - samples/sec: 10064.56 - lr: 0.000015 - momentum: 0.000000
2023-10-18 22:41:46,864 epoch 6 - iter 360/723 - loss 0.16782442 - time (sec): 8.86 - samples/sec: 10167.58 - lr: 0.000015 - momentum: 0.000000
2023-10-18 22:41:48,558 epoch 6 - iter 432/723 - loss 0.16816204 - time (sec): 10.55 - samples/sec: 10059.28 - lr: 0.000015 - momentum: 0.000000
2023-10-18 22:41:50,301 epoch 6 - iter 504/723 - loss 0.16538144 - time (sec): 12.30 - samples/sec: 10045.86 - lr: 0.000014 - momentum: 0.000000
2023-10-18 22:41:52,379 epoch 6 - iter 576/723 - loss 0.16214967 - time (sec): 14.38 - samples/sec: 9747.49 - lr: 0.000014 - momentum: 0.000000
2023-10-18 22:41:53,863 epoch 6 - iter 648/723 - loss 0.16245142 - time (sec): 15.86 - samples/sec: 9935.83 - lr: 0.000014 - momentum: 0.000000
2023-10-18 22:41:55,341 epoch 6 - iter 720/723 - loss 0.16180617 - time (sec): 17.34 - samples/sec: 10132.02 - lr: 0.000013 - momentum: 0.000000
2023-10-18 22:41:55,395 ----------------------------------------------------------------------------------------------------
2023-10-18 22:41:55,395 EPOCH 6 done: loss 0.1620 - lr: 0.000013
2023-10-18 22:41:57,177 DEV : loss 0.19605979323387146 - f1-score (micro avg) 0.4388
2023-10-18 22:41:57,192 saving best model
2023-10-18 22:41:57,229 ----------------------------------------------------------------------------------------------------
2023-10-18 22:41:59,142 epoch 7 - iter 72/723 - loss 0.15681213 - time (sec): 1.91 - samples/sec: 9891.47 - lr: 0.000013 - momentum: 0.000000
2023-10-18 22:42:01,017 epoch 7 - iter 144/723 - loss 0.16028128 - time (sec): 3.79 - samples/sec: 9988.93 - lr: 0.000013 - momentum: 0.000000
2023-10-18 22:42:02,852 epoch 7 - iter 216/723 - loss 0.15737959 - time (sec): 5.62 - samples/sec: 9661.12 - lr: 0.000012 - momentum: 0.000000
2023-10-18 22:42:04,713 epoch 7 - iter 288/723 - loss 0.15461758 - time (sec): 7.48 - samples/sec: 9652.08 - lr: 0.000012 - momentum: 0.000000
2023-10-18 22:42:06,639 epoch 7 - iter 360/723 - loss 0.15562415 - time (sec): 9.41 - samples/sec: 9633.34 - lr: 0.000012 - momentum: 0.000000
2023-10-18 22:42:08,393 epoch 7 - iter 432/723 - loss 0.15601858 - time (sec): 11.16 - samples/sec: 9547.00 - lr: 0.000011 - momentum: 0.000000
2023-10-18 22:42:10,195 epoch 7 - iter 504/723 - loss 0.15634551 - time (sec): 12.97 - samples/sec: 9604.06 - lr: 0.000011 - momentum: 0.000000
2023-10-18 22:42:11,917 epoch 7 - iter 576/723 - loss 0.15605432 - time (sec): 14.69 - samples/sec: 9593.42 - lr: 0.000011 - momentum: 0.000000
2023-10-18 22:42:13,742 epoch 7 - iter 648/723 - loss 0.15788652 - time (sec): 16.51 - samples/sec: 9581.30 - lr: 0.000010 - momentum: 0.000000
2023-10-18 22:42:15,506 epoch 7 - iter 720/723 - loss 0.15585343 - time (sec): 18.28 - samples/sec: 9595.68 - lr: 0.000010 - momentum: 0.000000
2023-10-18 22:42:15,576 ----------------------------------------------------------------------------------------------------
2023-10-18 22:42:15,577 EPOCH 7 done: loss 0.1557 - lr: 0.000010
2023-10-18 22:42:17,340 DEV : loss 0.18651294708251953 - f1-score (micro avg) 0.4956
2023-10-18 22:42:17,355 saving best model
2023-10-18 22:42:17,391 ----------------------------------------------------------------------------------------------------
2023-10-18 22:42:19,153 epoch 8 - iter 72/723 - loss 0.14374815 - time (sec): 1.76 - samples/sec: 10673.65 - lr: 0.000010 - momentum: 0.000000
2023-10-18 22:42:20,917 epoch 8 - iter 144/723 - loss 0.14794415 - time (sec): 3.53 - samples/sec: 10127.24 - lr: 0.000009 - momentum: 0.000000
2023-10-18 22:42:22,804 epoch 8 - iter 216/723 - loss 0.14360194 - time (sec): 5.41 - samples/sec: 10006.46 - lr: 0.000009 - momentum: 0.000000
2023-10-18 22:42:25,061 epoch 8 - iter 288/723 - loss 0.14897909 - time (sec): 7.67 - samples/sec: 9460.82 - lr: 0.000009 - momentum: 0.000000
2023-10-18 22:42:26,865 epoch 8 - iter 360/723 - loss 0.14900531 - time (sec): 9.47 - samples/sec: 9464.33 - lr: 0.000008 - momentum: 0.000000
2023-10-18 22:42:28,695 epoch 8 - iter 432/723 - loss 0.15163665 - time (sec): 11.30 - samples/sec: 9484.24 - lr: 0.000008 - momentum: 0.000000
2023-10-18 22:42:30,451 epoch 8 - iter 504/723 - loss 0.15049384 - time (sec): 13.06 - samples/sec: 9426.82 - lr: 0.000008 - momentum: 0.000000
2023-10-18 22:42:32,371 epoch 8 - iter 576/723 - loss 0.15391483 - time (sec): 14.98 - samples/sec: 9457.19 - lr: 0.000007 - momentum: 0.000000
2023-10-18 22:42:34,142 epoch 8 - iter 648/723 - loss 0.15242765 - time (sec): 16.75 - samples/sec: 9459.25 - lr: 0.000007 - momentum: 0.000000
2023-10-18 22:42:35,955 epoch 8 - iter 720/723 - loss 0.15035247 - time (sec): 18.56 - samples/sec: 9469.65 - lr: 0.000007 - momentum: 0.000000
2023-10-18 22:42:36,018 ----------------------------------------------------------------------------------------------------
2023-10-18 22:42:36,018 EPOCH 8 done: loss 0.1509 - lr: 0.000007
2023-10-18 22:42:37,781 DEV : loss 0.1935824304819107 - f1-score (micro avg) 0.4684
2023-10-18 22:42:37,796 ----------------------------------------------------------------------------------------------------
2023-10-18 22:42:39,582 epoch 9 - iter 72/723 - loss 0.14444690 - time (sec): 1.78 - samples/sec: 10509.51 - lr: 0.000006 - momentum: 0.000000
2023-10-18 22:42:41,335 epoch 9 - iter 144/723 - loss 0.14963003 - time (sec): 3.54 - samples/sec: 10394.85 - lr: 0.000006 - momentum: 0.000000
2023-10-18 22:42:43,151 epoch 9 - iter 216/723 - loss 0.14951345 - time (sec): 5.35 - samples/sec: 10242.28 - lr: 0.000006 - momentum: 0.000000
2023-10-18 22:42:45,037 epoch 9 - iter 288/723 - loss 0.15149133 - time (sec): 7.24 - samples/sec: 10096.19 - lr: 0.000005 - momentum: 0.000000
2023-10-18 22:42:46,793 epoch 9 - iter 360/723 - loss 0.15160429 - time (sec): 9.00 - samples/sec: 9962.62 - lr: 0.000005 - momentum: 0.000000
2023-10-18 22:42:48,624 epoch 9 - iter 432/723 - loss 0.15049137 - time (sec): 10.83 - samples/sec: 9947.91 - lr: 0.000005 - momentum: 0.000000
2023-10-18 22:42:50,372 epoch 9 - iter 504/723 - loss 0.15011278 - time (sec): 12.58 - samples/sec: 9938.17 - lr: 0.000004 - momentum: 0.000000
2023-10-18 22:42:52,084 epoch 9 - iter 576/723 - loss 0.14872092 - time (sec): 14.29 - samples/sec: 9907.26 - lr: 0.000004 - momentum: 0.000000
2023-10-18 22:42:53,983 epoch 9 - iter 648/723 - loss 0.14660520 - time (sec): 16.19 - samples/sec: 9860.50 - lr: 0.000004 - momentum: 0.000000
2023-10-18 22:42:55,746 epoch 9 - iter 720/723 - loss 0.14745168 - time (sec): 17.95 - samples/sec: 9787.18 - lr: 0.000003 - momentum: 0.000000
2023-10-18 22:42:55,805 ----------------------------------------------------------------------------------------------------
2023-10-18 22:42:55,805 EPOCH 9 done: loss 0.1474 - lr: 0.000003
2023-10-18 22:42:57,583 DEV : loss 0.18601642549037933 - f1-score (micro avg) 0.4934
2023-10-18 22:42:57,598 ----------------------------------------------------------------------------------------------------
2023-10-18 22:42:59,400 epoch 10 - iter 72/723 - loss 0.14058348 - time (sec): 1.80 - samples/sec: 9685.58 - lr: 0.000003 - momentum: 0.000000
2023-10-18 22:43:01,619 epoch 10 - iter 144/723 - loss 0.12600196 - time (sec): 4.02 - samples/sec: 8799.97 - lr: 0.000003 - momentum: 0.000000
2023-10-18 22:43:03,414 epoch 10 - iter 216/723 - loss 0.13939819 - time (sec): 5.82 - samples/sec: 9148.79 - lr: 0.000002 - momentum: 0.000000
2023-10-18 22:43:05,189 epoch 10 - iter 288/723 - loss 0.14322160 - time (sec): 7.59 - samples/sec: 9259.14 - lr: 0.000002 - momentum: 0.000000
2023-10-18 22:43:07,030 epoch 10 - iter 360/723 - loss 0.14226797 - time (sec): 9.43 - samples/sec: 9420.73 - lr: 0.000002 - momentum: 0.000000
2023-10-18 22:43:08,799 epoch 10 - iter 432/723 - loss 0.14213505 - time (sec): 11.20 - samples/sec: 9483.74 - lr: 0.000001 - momentum: 0.000000
2023-10-18 22:43:10,537 epoch 10 - iter 504/723 - loss 0.14173779 - time (sec): 12.94 - samples/sec: 9470.36 - lr: 0.000001 - momentum: 0.000000
2023-10-18 22:43:12,404 epoch 10 - iter 576/723 - loss 0.14185805 - time (sec): 14.81 - samples/sec: 9513.49 - lr: 0.000001 - momentum: 0.000000
2023-10-18 22:43:14,188 epoch 10 - iter 648/723 - loss 0.14424631 - time (sec): 16.59 - samples/sec: 9497.25 - lr: 0.000000 - momentum: 0.000000
2023-10-18 22:43:16,019 epoch 10 - iter 720/723 - loss 0.14588941 - time (sec): 18.42 - samples/sec: 9528.18 - lr: 0.000000 - momentum: 0.000000
2023-10-18 22:43:16,079 ----------------------------------------------------------------------------------------------------
2023-10-18 22:43:16,080 EPOCH 10 done: loss 0.1456 - lr: 0.000000
2023-10-18 22:43:17,855 DEV : loss 0.1870052069425583 - f1-score (micro avg) 0.4892
2023-10-18 22:43:17,901 ----------------------------------------------------------------------------------------------------
2023-10-18 22:43:17,902 Loading model from best epoch ...
2023-10-18 22:43:17,984 SequenceTagger predicts: Dictionary with 13 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG
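The 13 tags above follow the BIOES scheme (Begin/Inside/End/Single plus O) over the three entity types LOC, PER, and ORG. A simplified decoder sketch for turning such a tag sequence into entity spans (my own helper, not part of Flair; it assumes well-formed sequences):

```python
def bioes_to_spans(tags):
    """Decode BIOES tags into (start, end_inclusive, label) spans."""
    spans, start = [], None
    for i, tag in enumerate(tags):
        if tag == "O":
            continue
        prefix, label = tag.split("-", 1)
        if prefix == "S":                     # single-token entity
            spans.append((i, i, label))
        elif prefix == "B":                   # entity opens
            start = i
        elif prefix == "E" and start is not None:
            spans.append((start, i, label))   # entity closes
            start = None
    return spans

# e.g. ["B-LOC", "I-LOC", "E-LOC", "O", "S-PER"]
#      -> [(0, 2, "LOC"), (4, 4, "PER")]
```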
2023-10-18 22:43:19,327
Results:
- F-score (micro) 0.5152
- F-score (macro) 0.3532
- Accuracy 0.3613
By class:
              precision    recall  f1-score   support

         LOC     0.5805    0.5983    0.5892       458
         PER     0.6744    0.3610    0.4703       482
         ORG     0.0000    0.0000    0.0000        69

   micro avg     0.6137    0.4440    0.5152      1009
   macro avg     0.4183    0.3197    0.3532      1009
weighted avg     0.5857    0.4440    0.4921      1009
2023-10-18 22:43:19,327 ----------------------------------------------------------------------------------------------------
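The averaged rows of the final table can be reproduced from the per-class rows. A quick check with values copied from the log (the helper and dict are my own):

```python
# Per-class (precision, recall, f1, support) rows from the test report.
per_class = {
    "LOC": (0.5805, 0.5983, 0.5892, 458),
    "PER": (0.6744, 0.3610, 0.4703, 482),
    "ORG": (0.0000, 0.0000, 0.0000, 69),
}

def f1(p, r):
    return 2 * p * r / (p + r) if p + r else 0.0

# Macro: unweighted mean of per-class F1 (ORG's zero drags it down).
macro_f1 = sum(row[2] for row in per_class.values()) / len(per_class)

# Weighted: per-class F1 weighted by support.
total_support = sum(row[3] for row in per_class.values())
weighted_f1 = sum(row[2] * row[3] for row in per_class.values()) / total_support

# Micro: harmonic mean of the logged micro precision and recall.
micro_f1 = f1(0.6137, 0.4440)

print(round(macro_f1, 4), round(weighted_f1, 4), round(micro_f1, 4))
# -> 0.3532 0.4921 0.5152
```

The gap between micro (0.5152) and macro (0.3532) F1 comes from the ORG class, which has only 69 test mentions and is never predicted correctly by this tiny model.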