2023-10-19 23:57:29,912 ----------------------------------------------------------------------------------------------------
2023-10-19 23:57:29,913 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 128)
        (position_embeddings): Embedding(512, 128)
        (token_type_embeddings): Embedding(2, 128)
        (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-1): 2 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=128, out_features=128, bias=True)
                (key): Linear(in_features=128, out_features=128, bias=True)
                (value): Linear(in_features=128, out_features=128, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=128, out_features=128, bias=True)
                (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=128, out_features=512, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=512, out_features=128, bias=True)
              (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=128, out_features=128, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=128, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
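All layer shapes in the dump above are explicit, so the model size can be tallied by hand. The following is a back-of-the-envelope sketch (not an official count from the checkpoint) of the parameters in this 2-layer, 128-dimensional "BERT tiny" encoder plus the tagging head:

```python
# Parameter count read off the module dump above (a sketch, not a
# value reported by the training run itself).

def linear(n_in, n_out):
    return n_in * n_out + n_out      # weight matrix + bias vector

def layer_norm(dim):
    return 2 * dim                   # scale (gamma) + shift (beta)

d, d_ff, vocab, max_pos, n_layers, n_tags = 128, 512, 32001, 512, 2, 17

# BertEmbeddings: word + position + token-type tables, plus LayerNorm
embeddings = vocab * d + max_pos * d + 2 * d + layer_norm(d)

per_layer = (
    3 * linear(d, d)                 # query / key / value projections
    + linear(d, d)                   # attention output dense
    + layer_norm(d)                  # post-attention LayerNorm
    + linear(d, d_ff)                # intermediate (feed-forward up)
    + linear(d_ff, d)                # output (feed-forward down)
    + layer_norm(d)                  # post-feed-forward LayerNorm
)

total = (
    embeddings
    + n_layers * per_layer
    + linear(d, d)                   # BertPooler dense
    + linear(d, n_tags)              # tagging head with 17 labels
)
print(total)  # -> 4577425, i.e. roughly 4.6M parameters
```

The vocabulary table alone (32001 x 128 ≈ 4.1M) dominates the count, which is typical for such a small encoder.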
2023-10-19 23:57:29,913 ----------------------------------------------------------------------------------------------------
2023-10-19 23:57:29,913 MultiCorpus: 1166 train + 165 dev + 415 test sentences
- NER_HIPE_2022 Corpus: 1166 train + 165 dev + 415 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fi/with_doc_seperator
2023-10-19 23:57:29,913 ----------------------------------------------------------------------------------------------------
2023-10-19 23:57:29,913 Train: 1166 sentences
2023-10-19 23:57:29,913 (train_with_dev=False, train_with_test=False)
2023-10-19 23:57:29,913 ----------------------------------------------------------------------------------------------------
2023-10-19 23:57:29,913 Training Params:
2023-10-19 23:57:29,913 - learning_rate: "3e-05"
2023-10-19 23:57:29,913 - mini_batch_size: "4"
2023-10-19 23:57:29,913 - max_epochs: "10"
2023-10-19 23:57:29,913 - shuffle: "True"
2023-10-19 23:57:29,913 ----------------------------------------------------------------------------------------------------
2023-10-19 23:57:29,913 Plugins:
2023-10-19 23:57:29,913 - TensorboardLogger
2023-10-19 23:57:29,913 - LinearScheduler | warmup_fraction: '0.1'
2023-10-19 23:57:29,913 ----------------------------------------------------------------------------------------------------
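The `LinearScheduler` with `warmup_fraction: '0.1'` explains the per-iteration `lr` values logged further down: the rate ramps from 0 up to the peak 3e-05 over the first 10% of steps (292 of the 2920 total, i.e. exactly epoch 1 here), then decays linearly back to 0. A minimal sketch of such a schedule, consistent with the logged values but not Flair's exact implementation:

```python
# Linear warmup + linear decay, as a plain function of the step index.
# This is a sketch consistent with the logged lr values, not a copy of
# Flair's LinearScheduler.
def linear_schedule(step, total_steps, peak_lr=3e-05, warmup_fraction=0.1):
    warmup_steps = int(total_steps * warmup_fraction)
    if step < warmup_steps:
        return peak_lr * step / warmup_steps               # ramp up
    # after warmup: decay linearly from peak_lr down to 0
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)

total = 292 * 10   # 292 iterations per epoch x 10 epochs

# After 29 iterations of epoch 1 this gives ~3e-06, matching the
# "lr: 0.000003" in the first logged iteration below.
print(round(linear_schedule(29, total), 6))  # -> 3e-06
```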
2023-10-19 23:57:29,913 Final evaluation on model from best epoch (best-model.pt)
2023-10-19 23:57:29,913 - metric: "('micro avg', 'f1-score')"
2023-10-19 23:57:29,914 ----------------------------------------------------------------------------------------------------
2023-10-19 23:57:29,914 Computation:
2023-10-19 23:57:29,914 - compute on device: cuda:0
2023-10-19 23:57:29,914 - embedding storage: none
2023-10-19 23:57:29,914 ----------------------------------------------------------------------------------------------------
2023-10-19 23:57:29,914 Model training base path: "hmbench-newseye/fi-dbmdz/bert-tiny-historic-multilingual-cased-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-5"
2023-10-19 23:57:29,914 ----------------------------------------------------------------------------------------------------
2023-10-19 23:57:29,914 ----------------------------------------------------------------------------------------------------
2023-10-19 23:57:29,914 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-19 23:57:30,355 epoch 1 - iter 29/292 - loss 3.15858994 - time (sec): 0.44 - samples/sec: 8983.13 - lr: 0.000003 - momentum: 0.000000
2023-10-19 23:57:30,831 epoch 1 - iter 58/292 - loss 3.12098443 - time (sec): 0.92 - samples/sec: 8389.04 - lr: 0.000006 - momentum: 0.000000
2023-10-19 23:57:31,365 epoch 1 - iter 87/292 - loss 3.09355439 - time (sec): 1.45 - samples/sec: 8367.32 - lr: 0.000009 - momentum: 0.000000
2023-10-19 23:57:31,877 epoch 1 - iter 116/292 - loss 2.97640957 - time (sec): 1.96 - samples/sec: 8214.22 - lr: 0.000012 - momentum: 0.000000
2023-10-19 23:57:32,386 epoch 1 - iter 145/292 - loss 2.82557112 - time (sec): 2.47 - samples/sec: 8349.50 - lr: 0.000015 - momentum: 0.000000
2023-10-19 23:57:32,887 epoch 1 - iter 174/292 - loss 2.65406568 - time (sec): 2.97 - samples/sec: 8321.21 - lr: 0.000018 - momentum: 0.000000
2023-10-19 23:57:33,409 epoch 1 - iter 203/292 - loss 2.39656343 - time (sec): 3.50 - samples/sec: 8608.21 - lr: 0.000021 - momentum: 0.000000
2023-10-19 23:57:33,935 epoch 1 - iter 232/292 - loss 2.21048062 - time (sec): 4.02 - samples/sec: 8588.98 - lr: 0.000024 - momentum: 0.000000
2023-10-19 23:57:34,473 epoch 1 - iter 261/292 - loss 2.03861344 - time (sec): 4.56 - samples/sec: 8670.12 - lr: 0.000027 - momentum: 0.000000
2023-10-19 23:57:34,988 epoch 1 - iter 290/292 - loss 1.90732826 - time (sec): 5.07 - samples/sec: 8730.43 - lr: 0.000030 - momentum: 0.000000
2023-10-19 23:57:35,016 ----------------------------------------------------------------------------------------------------
2023-10-19 23:57:35,016 EPOCH 1 done: loss 1.9026 - lr: 0.000030
2023-10-19 23:57:35,275 DEV : loss 0.4726361334323883 - f1-score (micro avg) 0.0
2023-10-19 23:57:35,279 ----------------------------------------------------------------------------------------------------
2023-10-19 23:57:35,786 epoch 2 - iter 29/292 - loss 0.88527386 - time (sec): 0.51 - samples/sec: 9692.47 - lr: 0.000030 - momentum: 0.000000
2023-10-19 23:57:36,304 epoch 2 - iter 58/292 - loss 0.82889086 - time (sec): 1.02 - samples/sec: 9394.52 - lr: 0.000029 - momentum: 0.000000
2023-10-19 23:57:36,803 epoch 2 - iter 87/292 - loss 0.78547384 - time (sec): 1.52 - samples/sec: 9030.81 - lr: 0.000029 - momentum: 0.000000
2023-10-19 23:57:37,269 epoch 2 - iter 116/292 - loss 0.78010344 - time (sec): 1.99 - samples/sec: 8830.24 - lr: 0.000029 - momentum: 0.000000
2023-10-19 23:57:37,757 epoch 2 - iter 145/292 - loss 0.75316303 - time (sec): 2.48 - samples/sec: 8742.69 - lr: 0.000028 - momentum: 0.000000
2023-10-19 23:57:38,256 epoch 2 - iter 174/292 - loss 0.73185571 - time (sec): 2.98 - samples/sec: 8727.51 - lr: 0.000028 - momentum: 0.000000
2023-10-19 23:57:38,755 epoch 2 - iter 203/292 - loss 0.71216728 - time (sec): 3.48 - samples/sec: 8641.50 - lr: 0.000028 - momentum: 0.000000
2023-10-19 23:57:39,287 epoch 2 - iter 232/292 - loss 0.67823755 - time (sec): 4.01 - samples/sec: 8867.99 - lr: 0.000027 - momentum: 0.000000
2023-10-19 23:57:39,818 epoch 2 - iter 261/292 - loss 0.66528632 - time (sec): 4.54 - samples/sec: 8918.28 - lr: 0.000027 - momentum: 0.000000
2023-10-19 23:57:40,304 epoch 2 - iter 290/292 - loss 0.66538611 - time (sec): 5.02 - samples/sec: 8778.91 - lr: 0.000027 - momentum: 0.000000
2023-10-19 23:57:40,338 ----------------------------------------------------------------------------------------------------
2023-10-19 23:57:40,338 EPOCH 2 done: loss 0.6640 - lr: 0.000027
2023-10-19 23:57:40,964 DEV : loss 0.4068935811519623 - f1-score (micro avg) 0.0
2023-10-19 23:57:40,968 ----------------------------------------------------------------------------------------------------
2023-10-19 23:57:41,468 epoch 3 - iter 29/292 - loss 0.50362782 - time (sec): 0.50 - samples/sec: 8760.21 - lr: 0.000026 - momentum: 0.000000
2023-10-19 23:57:41,968 epoch 3 - iter 58/292 - loss 0.52872549 - time (sec): 1.00 - samples/sec: 8770.81 - lr: 0.000026 - momentum: 0.000000
2023-10-19 23:57:42,484 epoch 3 - iter 87/292 - loss 0.55048926 - time (sec): 1.51 - samples/sec: 8999.06 - lr: 0.000026 - momentum: 0.000000
2023-10-19 23:57:43,018 epoch 3 - iter 116/292 - loss 0.59262551 - time (sec): 2.05 - samples/sec: 8752.53 - lr: 0.000025 - momentum: 0.000000
2023-10-19 23:57:43,690 epoch 3 - iter 145/292 - loss 0.58792425 - time (sec): 2.72 - samples/sec: 8189.27 - lr: 0.000025 - momentum: 0.000000
2023-10-19 23:57:44,225 epoch 3 - iter 174/292 - loss 0.57780683 - time (sec): 3.26 - samples/sec: 8332.04 - lr: 0.000025 - momentum: 0.000000
2023-10-19 23:57:44,708 epoch 3 - iter 203/292 - loss 0.57106053 - time (sec): 3.74 - samples/sec: 8318.01 - lr: 0.000024 - momentum: 0.000000
2023-10-19 23:57:45,293 epoch 3 - iter 232/292 - loss 0.56074661 - time (sec): 4.32 - samples/sec: 8263.42 - lr: 0.000024 - momentum: 0.000000
2023-10-19 23:57:45,792 epoch 3 - iter 261/292 - loss 0.55453111 - time (sec): 4.82 - samples/sec: 8194.65 - lr: 0.000024 - momentum: 0.000000
2023-10-19 23:57:46,308 epoch 3 - iter 290/292 - loss 0.55053773 - time (sec): 5.34 - samples/sec: 8260.74 - lr: 0.000023 - momentum: 0.000000
2023-10-19 23:57:46,345 ----------------------------------------------------------------------------------------------------
2023-10-19 23:57:46,345 EPOCH 3 done: loss 0.5490 - lr: 0.000023
2023-10-19 23:57:46,967 DEV : loss 0.3729143738746643 - f1-score (micro avg) 0.0
2023-10-19 23:57:46,971 ----------------------------------------------------------------------------------------------------
2023-10-19 23:57:47,455 epoch 4 - iter 29/292 - loss 0.44360140 - time (sec): 0.48 - samples/sec: 8111.28 - lr: 0.000023 - momentum: 0.000000
2023-10-19 23:57:47,971 epoch 4 - iter 58/292 - loss 0.46127904 - time (sec): 1.00 - samples/sec: 8078.16 - lr: 0.000023 - momentum: 0.000000
2023-10-19 23:57:48,506 epoch 4 - iter 87/292 - loss 0.45386077 - time (sec): 1.53 - samples/sec: 8329.82 - lr: 0.000022 - momentum: 0.000000
2023-10-19 23:57:49,025 epoch 4 - iter 116/292 - loss 0.45066227 - time (sec): 2.05 - samples/sec: 8285.59 - lr: 0.000022 - momentum: 0.000000
2023-10-19 23:57:49,555 epoch 4 - iter 145/292 - loss 0.45473106 - time (sec): 2.58 - samples/sec: 8211.85 - lr: 0.000022 - momentum: 0.000000
2023-10-19 23:57:50,065 epoch 4 - iter 174/292 - loss 0.45803790 - time (sec): 3.09 - samples/sec: 8220.40 - lr: 0.000021 - momentum: 0.000000
2023-10-19 23:57:50,606 epoch 4 - iter 203/292 - loss 0.47679097 - time (sec): 3.63 - samples/sec: 8472.15 - lr: 0.000021 - momentum: 0.000000
2023-10-19 23:57:51,100 epoch 4 - iter 232/292 - loss 0.47649244 - time (sec): 4.13 - samples/sec: 8360.31 - lr: 0.000021 - momentum: 0.000000
2023-10-19 23:57:51,602 epoch 4 - iter 261/292 - loss 0.47283559 - time (sec): 4.63 - samples/sec: 8370.13 - lr: 0.000020 - momentum: 0.000000
2023-10-19 23:57:52,144 epoch 4 - iter 290/292 - loss 0.47776120 - time (sec): 5.17 - samples/sec: 8569.51 - lr: 0.000020 - momentum: 0.000000
2023-10-19 23:57:52,176 ----------------------------------------------------------------------------------------------------
2023-10-19 23:57:52,176 EPOCH 4 done: loss 0.4775 - lr: 0.000020
2023-10-19 23:57:52,799 DEV : loss 0.33535653352737427 - f1-score (micro avg) 0.0368
2023-10-19 23:57:52,803 saving best model
2023-10-19 23:57:52,832 ----------------------------------------------------------------------------------------------------
2023-10-19 23:57:53,377 epoch 5 - iter 29/292 - loss 0.45846940 - time (sec): 0.54 - samples/sec: 9610.96 - lr: 0.000020 - momentum: 0.000000
2023-10-19 23:57:53,907 epoch 5 - iter 58/292 - loss 0.51705999 - time (sec): 1.07 - samples/sec: 9124.70 - lr: 0.000019 - momentum: 0.000000
2023-10-19 23:57:54,398 epoch 5 - iter 87/292 - loss 0.48221681 - time (sec): 1.57 - samples/sec: 8759.51 - lr: 0.000019 - momentum: 0.000000
2023-10-19 23:57:54,929 epoch 5 - iter 116/292 - loss 0.46910511 - time (sec): 2.10 - samples/sec: 8518.22 - lr: 0.000019 - momentum: 0.000000
2023-10-19 23:57:55,498 epoch 5 - iter 145/292 - loss 0.46148846 - time (sec): 2.67 - samples/sec: 8533.84 - lr: 0.000018 - momentum: 0.000000
2023-10-19 23:57:56,036 epoch 5 - iter 174/292 - loss 0.44684775 - time (sec): 3.20 - samples/sec: 8422.22 - lr: 0.000018 - momentum: 0.000000
2023-10-19 23:57:56,551 epoch 5 - iter 203/292 - loss 0.44853019 - time (sec): 3.72 - samples/sec: 8311.04 - lr: 0.000018 - momentum: 0.000000
2023-10-19 23:57:57,125 epoch 5 - iter 232/292 - loss 0.46105102 - time (sec): 4.29 - samples/sec: 8270.59 - lr: 0.000017 - momentum: 0.000000
2023-10-19 23:57:57,639 epoch 5 - iter 261/292 - loss 0.45781694 - time (sec): 4.81 - samples/sec: 8194.08 - lr: 0.000017 - momentum: 0.000000
2023-10-19 23:57:58,166 epoch 5 - iter 290/292 - loss 0.44694408 - time (sec): 5.33 - samples/sec: 8314.90 - lr: 0.000017 - momentum: 0.000000
2023-10-19 23:57:58,196 ----------------------------------------------------------------------------------------------------
2023-10-19 23:57:58,196 EPOCH 5 done: loss 0.4463 - lr: 0.000017
2023-10-19 23:57:58,831 DEV : loss 0.33983153104782104 - f1-score (micro avg) 0.0687
2023-10-19 23:57:58,835 saving best model
2023-10-19 23:57:58,868 ----------------------------------------------------------------------------------------------------
2023-10-19 23:57:59,383 epoch 6 - iter 29/292 - loss 0.44689006 - time (sec): 0.51 - samples/sec: 9067.63 - lr: 0.000016 - momentum: 0.000000
2023-10-19 23:57:59,890 epoch 6 - iter 58/292 - loss 0.42022345 - time (sec): 1.02 - samples/sec: 8248.87 - lr: 0.000016 - momentum: 0.000000
2023-10-19 23:58:00,381 epoch 6 - iter 87/292 - loss 0.43547237 - time (sec): 1.51 - samples/sec: 8374.78 - lr: 0.000016 - momentum: 0.000000
2023-10-19 23:58:00,902 epoch 6 - iter 116/292 - loss 0.42767037 - time (sec): 2.03 - samples/sec: 8496.53 - lr: 0.000015 - momentum: 0.000000
2023-10-19 23:58:01,413 epoch 6 - iter 145/292 - loss 0.41360136 - time (sec): 2.54 - samples/sec: 8648.10 - lr: 0.000015 - momentum: 0.000000
2023-10-19 23:58:01,950 epoch 6 - iter 174/292 - loss 0.41434310 - time (sec): 3.08 - samples/sec: 8717.91 - lr: 0.000015 - momentum: 0.000000
2023-10-19 23:58:02,459 epoch 6 - iter 203/292 - loss 0.40041453 - time (sec): 3.59 - samples/sec: 8764.22 - lr: 0.000014 - momentum: 0.000000
2023-10-19 23:58:02,969 epoch 6 - iter 232/292 - loss 0.40130692 - time (sec): 4.10 - samples/sec: 8773.71 - lr: 0.000014 - momentum: 0.000000
2023-10-19 23:58:03,485 epoch 6 - iter 261/292 - loss 0.40438520 - time (sec): 4.62 - samples/sec: 8665.74 - lr: 0.000014 - momentum: 0.000000
2023-10-19 23:58:04,000 epoch 6 - iter 290/292 - loss 0.41311196 - time (sec): 5.13 - samples/sec: 8614.21 - lr: 0.000013 - momentum: 0.000000
2023-10-19 23:58:04,028 ----------------------------------------------------------------------------------------------------
2023-10-19 23:58:04,028 EPOCH 6 done: loss 0.4149 - lr: 0.000013
2023-10-19 23:58:04,664 DEV : loss 0.327802449464798 - f1-score (micro avg) 0.1477
2023-10-19 23:58:04,668 saving best model
2023-10-19 23:58:04,701 ----------------------------------------------------------------------------------------------------
2023-10-19 23:58:05,249 epoch 7 - iter 29/292 - loss 0.32104118 - time (sec): 0.55 - samples/sec: 10152.81 - lr: 0.000013 - momentum: 0.000000
2023-10-19 23:58:05,738 epoch 7 - iter 58/292 - loss 0.39113470 - time (sec): 1.04 - samples/sec: 9059.74 - lr: 0.000013 - momentum: 0.000000
2023-10-19 23:58:06,250 epoch 7 - iter 87/292 - loss 0.40749106 - time (sec): 1.55 - samples/sec: 8684.18 - lr: 0.000012 - momentum: 0.000000
2023-10-19 23:58:06,752 epoch 7 - iter 116/292 - loss 0.38645042 - time (sec): 2.05 - samples/sec: 8722.49 - lr: 0.000012 - momentum: 0.000000
2023-10-19 23:58:07,237 epoch 7 - iter 145/292 - loss 0.39391474 - time (sec): 2.53 - samples/sec: 8539.01 - lr: 0.000012 - momentum: 0.000000
2023-10-19 23:58:07,756 epoch 7 - iter 174/292 - loss 0.40980659 - time (sec): 3.05 - samples/sec: 8761.86 - lr: 0.000011 - momentum: 0.000000
2023-10-19 23:58:08,271 epoch 7 - iter 203/292 - loss 0.40209976 - time (sec): 3.57 - samples/sec: 8856.92 - lr: 0.000011 - momentum: 0.000000
2023-10-19 23:58:08,783 epoch 7 - iter 232/292 - loss 0.40744235 - time (sec): 4.08 - samples/sec: 8804.31 - lr: 0.000011 - momentum: 0.000000
2023-10-19 23:58:09,293 epoch 7 - iter 261/292 - loss 0.39712786 - time (sec): 4.59 - samples/sec: 8721.50 - lr: 0.000010 - momentum: 0.000000
2023-10-19 23:58:09,787 epoch 7 - iter 290/292 - loss 0.39302474 - time (sec): 5.09 - samples/sec: 8678.71 - lr: 0.000010 - momentum: 0.000000
2023-10-19 23:58:09,819 ----------------------------------------------------------------------------------------------------
2023-10-19 23:58:09,819 EPOCH 7 done: loss 0.3938 - lr: 0.000010
2023-10-19 23:58:10,451 DEV : loss 0.31045615673065186 - f1-score (micro avg) 0.1753
2023-10-19 23:58:10,455 saving best model
2023-10-19 23:58:10,487 ----------------------------------------------------------------------------------------------------
2023-10-19 23:58:10,978 epoch 8 - iter 29/292 - loss 0.36499034 - time (sec): 0.49 - samples/sec: 8858.41 - lr: 0.000010 - momentum: 0.000000
2023-10-19 23:58:11,501 epoch 8 - iter 58/292 - loss 0.39731528 - time (sec): 1.01 - samples/sec: 9008.12 - lr: 0.000009 - momentum: 0.000000
2023-10-19 23:58:12,045 epoch 8 - iter 87/292 - loss 0.35849483 - time (sec): 1.56 - samples/sec: 9379.00 - lr: 0.000009 - momentum: 0.000000
2023-10-19 23:58:12,536 epoch 8 - iter 116/292 - loss 0.36888550 - time (sec): 2.05 - samples/sec: 8977.23 - lr: 0.000009 - momentum: 0.000000
2023-10-19 23:58:13,008 epoch 8 - iter 145/292 - loss 0.37562375 - time (sec): 2.52 - samples/sec: 8673.85 - lr: 0.000008 - momentum: 0.000000
2023-10-19 23:58:13,513 epoch 8 - iter 174/292 - loss 0.38262840 - time (sec): 3.03 - samples/sec: 8619.98 - lr: 0.000008 - momentum: 0.000000
2023-10-19 23:58:14,029 epoch 8 - iter 203/292 - loss 0.37658732 - time (sec): 3.54 - samples/sec: 8551.23 - lr: 0.000008 - momentum: 0.000000
2023-10-19 23:58:14,533 epoch 8 - iter 232/292 - loss 0.38303644 - time (sec): 4.05 - samples/sec: 8498.09 - lr: 0.000007 - momentum: 0.000000
2023-10-19 23:58:15,048 epoch 8 - iter 261/292 - loss 0.37867777 - time (sec): 4.56 - samples/sec: 8501.41 - lr: 0.000007 - momentum: 0.000000
2023-10-19 23:58:15,578 epoch 8 - iter 290/292 - loss 0.39456462 - time (sec): 5.09 - samples/sec: 8667.21 - lr: 0.000007 - momentum: 0.000000
2023-10-19 23:58:15,611 ----------------------------------------------------------------------------------------------------
2023-10-19 23:58:15,611 EPOCH 8 done: loss 0.3925 - lr: 0.000007
2023-10-19 23:58:16,250 DEV : loss 0.31813567876815796 - f1-score (micro avg) 0.1717
2023-10-19 23:58:16,254 ----------------------------------------------------------------------------------------------------
2023-10-19 23:58:16,758 epoch 9 - iter 29/292 - loss 0.40428918 - time (sec): 0.50 - samples/sec: 7980.78 - lr: 0.000006 - momentum: 0.000000
2023-10-19 23:58:17,287 epoch 9 - iter 58/292 - loss 0.36836772 - time (sec): 1.03 - samples/sec: 7894.99 - lr: 0.000006 - momentum: 0.000000
2023-10-19 23:58:17,850 epoch 9 - iter 87/292 - loss 0.37138841 - time (sec): 1.60 - samples/sec: 8060.88 - lr: 0.000006 - momentum: 0.000000
2023-10-19 23:58:18,362 epoch 9 - iter 116/292 - loss 0.35584352 - time (sec): 2.11 - samples/sec: 8099.67 - lr: 0.000005 - momentum: 0.000000
2023-10-19 23:58:18,884 epoch 9 - iter 145/292 - loss 0.36577051 - time (sec): 2.63 - samples/sec: 7966.11 - lr: 0.000005 - momentum: 0.000000
2023-10-19 23:58:19,552 epoch 9 - iter 174/292 - loss 0.36762424 - time (sec): 3.30 - samples/sec: 7731.22 - lr: 0.000005 - momentum: 0.000000
2023-10-19 23:58:20,069 epoch 9 - iter 203/292 - loss 0.36914510 - time (sec): 3.82 - samples/sec: 7877.82 - lr: 0.000004 - momentum: 0.000000
2023-10-19 23:58:20,623 epoch 9 - iter 232/292 - loss 0.38112081 - time (sec): 4.37 - samples/sec: 8101.40 - lr: 0.000004 - momentum: 0.000000
2023-10-19 23:58:21,126 epoch 9 - iter 261/292 - loss 0.37727984 - time (sec): 4.87 - samples/sec: 8146.24 - lr: 0.000004 - momentum: 0.000000
2023-10-19 23:58:21,648 epoch 9 - iter 290/292 - loss 0.38320637 - time (sec): 5.39 - samples/sec: 8211.37 - lr: 0.000003 - momentum: 0.000000
2023-10-19 23:58:21,676 ----------------------------------------------------------------------------------------------------
2023-10-19 23:58:21,676 EPOCH 9 done: loss 0.3842 - lr: 0.000003
2023-10-19 23:58:22,306 DEV : loss 0.31648480892181396 - f1-score (micro avg) 0.1803
2023-10-19 23:58:22,310 saving best model
2023-10-19 23:58:22,343 ----------------------------------------------------------------------------------------------------
2023-10-19 23:58:22,869 epoch 10 - iter 29/292 - loss 0.31764849 - time (sec): 0.53 - samples/sec: 9812.80 - lr: 0.000003 - momentum: 0.000000
2023-10-19 23:58:23,401 epoch 10 - iter 58/292 - loss 0.35718997 - time (sec): 1.06 - samples/sec: 10149.09 - lr: 0.000003 - momentum: 0.000000
2023-10-19 23:58:23,875 epoch 10 - iter 87/292 - loss 0.37047487 - time (sec): 1.53 - samples/sec: 9426.05 - lr: 0.000002 - momentum: 0.000000
2023-10-19 23:58:24,360 epoch 10 - iter 116/292 - loss 0.36595183 - time (sec): 2.02 - samples/sec: 9236.35 - lr: 0.000002 - momentum: 0.000000
2023-10-19 23:58:24,860 epoch 10 - iter 145/292 - loss 0.36830394 - time (sec): 2.52 - samples/sec: 8935.26 - lr: 0.000002 - momentum: 0.000000
2023-10-19 23:58:25,373 epoch 10 - iter 174/292 - loss 0.37044997 - time (sec): 3.03 - samples/sec: 8789.73 - lr: 0.000001 - momentum: 0.000000
2023-10-19 23:58:25,857 epoch 10 - iter 203/292 - loss 0.36942048 - time (sec): 3.51 - samples/sec: 8685.67 - lr: 0.000001 - momentum: 0.000000
2023-10-19 23:58:26,366 epoch 10 - iter 232/292 - loss 0.37678363 - time (sec): 4.02 - samples/sec: 8658.91 - lr: 0.000001 - momentum: 0.000000
2023-10-19 23:58:26,886 epoch 10 - iter 261/292 - loss 0.37899069 - time (sec): 4.54 - samples/sec: 8531.89 - lr: 0.000000 - momentum: 0.000000
2023-10-19 23:58:27,446 epoch 10 - iter 290/292 - loss 0.37973279 - time (sec): 5.10 - samples/sec: 8676.83 - lr: 0.000000 - momentum: 0.000000
2023-10-19 23:58:27,478 ----------------------------------------------------------------------------------------------------
2023-10-19 23:58:27,478 EPOCH 10 done: loss 0.3791 - lr: 0.000000
2023-10-19 23:58:28,125 DEV : loss 0.31487029790878296 - f1-score (micro avg) 0.1848
2023-10-19 23:58:28,129 saving best model
2023-10-19 23:58:28,189 ----------------------------------------------------------------------------------------------------
2023-10-19 23:58:28,190 Loading model from best epoch ...
2023-10-19 23:58:28,270 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
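The 17-entry tag dictionary printed above is the BIOES scheme over the four NewsEye entity types plus the outside tag `O`, which is also why the model's linear head has `out_features=17`. A small sketch reconstructing it:

```python
# BIOES tagging scheme: S(ingle), B(egin), E(nd), I(nside) per entity
# type, plus the outside tag "O". Reconstructs the dictionary above.
entity_types = ["LOC", "PER", "ORG", "HumanProd"]
tags = ["O"] + [f"{p}-{t}" for t in entity_types for p in "SBEI"]
print(len(tags))  # -> 17, matching out_features=17 in the linear head
```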
2023-10-19 23:58:29,173
Results:
- F-score (micro) 0.2928
- F-score (macro) 0.1505
- Accuracy 0.1777
By class:
              precision    recall  f1-score   support

         PER     0.3594    0.3563    0.3579       348
         LOC     0.2658    0.2261    0.2443       261
         ORG     0.0000    0.0000    0.0000        52
   HumanProd     0.0000    0.0000    0.0000        22

   micro avg     0.3228    0.2679    0.2928       683
   macro avg     0.1563    0.1456    0.1505       683
weighted avg     0.2847    0.2679    0.2757       683
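The aggregate rows of the report can be re-derived from the per-class rows, which is a useful sanity check when reading such tables. A sketch using the printed (rounded) values:

```python
# Re-deriving the macro, weighted, and micro F1 rows from the per-class
# numbers as printed above (rounded to four decimals).
rows = {  # class: (precision, recall, f1-score, support)
    "PER":       (0.3594, 0.3563, 0.3579, 348),
    "LOC":       (0.2658, 0.2261, 0.2443, 261),
    "ORG":       (0.0000, 0.0000, 0.0000,  52),
    "HumanProd": (0.0000, 0.0000, 0.0000,  22),
}
n = sum(s for *_, s in rows.values())            # 683 gold entities

# macro: unweighted mean over classes; weighted: mean weighted by support
macro_f1    = sum(f1 for _, _, f1, _ in rows.values()) / len(rows)
weighted_f1 = sum(f1 * s for _, _, f1, s in rows.values()) / n

# micro F1 is the harmonic mean of the reported micro precision/recall
micro_p, micro_r = 0.3228, 0.2679
micro_f1 = 2 * micro_p * micro_r / (micro_p + micro_r)

# All three land on the reported 0.1505 / 0.2757 / 0.2928 (up to rounding).
print(macro_f1, weighted_f1, micro_f1)
```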
2023-10-19 23:58:29,173 ----------------------------------------------------------------------------------------------------