2023-10-19 23:43:12,500 ----------------------------------------------------------------------------------------------------
2023-10-19 23:43:12,501 Model: "SequenceTagger(
(embeddings): TransformerWordEmbeddings(
(model): BertModel(
(embeddings): BertEmbeddings(
(word_embeddings): Embedding(32001, 128)
(position_embeddings): Embedding(512, 128)
(token_type_embeddings): Embedding(2, 128)
(LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(encoder): BertEncoder(
(layer): ModuleList(
(0-1): 2 x BertLayer(
(attention): BertAttention(
(self): BertSelfAttention(
(query): Linear(in_features=128, out_features=128, bias=True)
(key): Linear(in_features=128, out_features=128, bias=True)
(value): Linear(in_features=128, out_features=128, bias=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(output): BertSelfOutput(
(dense): Linear(in_features=128, out_features=128, bias=True)
(LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
(intermediate): BertIntermediate(
(dense): Linear(in_features=128, out_features=512, bias=True)
(intermediate_act_fn): GELUActivation()
)
(output): BertOutput(
(dense): Linear(in_features=512, out_features=128, bias=True)
(LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
)
)
(pooler): BertPooler(
(dense): Linear(in_features=128, out_features=128, bias=True)
(activation): Tanh()
)
)
)
(locked_dropout): LockedDropout(p=0.5)
(linear): Linear(in_features=128, out_features=17, bias=True)
(loss_function): CrossEntropyLoss()
)"
2023-10-19 23:43:12,501 ----------------------------------------------------------------------------------------------------
2023-10-19 23:43:12,501 MultiCorpus: 1166 train + 165 dev + 415 test sentences
- NER_HIPE_2022 Corpus: 1166 train + 165 dev + 415 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fi/with_doc_seperator
2023-10-19 23:43:12,501 ----------------------------------------------------------------------------------------------------
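For orientation, here is a minimal sketch of how the corpus and the tagger printed above could be reconstructed with Flair's Python API. Checkpoint name and architecture details (first-subtoken pooling, last layer only, no CRF) are read from the model dump and the model base path further down; the NER_HIPE_2022 loader options and some constructor argument values are assumptions and may differ between Flair versions.

# Reconstruction sketch only -- argument names are assumptions, not a verbatim copy of the original script.
from flair.datasets import NER_HIPE_2022
from flair.embeddings import TransformerWordEmbeddings
from flair.models import SequenceTagger

# HIPE-2022 NewsEye corpus, Finnish subset (1166 train / 165 dev / 415 test sentences)
corpus = NER_HIPE_2022(dataset_name="newseye", language="fi")
label_dict = corpus.make_label_dictionary(label_type="ner")  # 17 BIOES tags

# hmBERT tiny: 2 transformer layers, hidden size 128, first-subtoken pooling, last layer only
embeddings = TransformerWordEmbeddings(
    model="dbmdz/bert-tiny-historic-multilingual-cased",
    layers="-1",
    subtoken_pooling="first",
    fine_tune=True,
)

# Plain linear classification head on top of the embeddings: no CRF, no RNN
tagger = SequenceTagger(
    hidden_size=256,  # ignored when use_rnn=False (value is an assumption)
    embeddings=embeddings,
    tag_dictionary=label_dict,
    tag_type="ner",
    use_crf=False,
    use_rnn=False,
    reproject_embeddings=False,
)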
2023-10-19 23:43:12,501 Train: 1166 sentences
2023-10-19 23:43:12,502 (train_with_dev=False, train_with_test=False)
2023-10-19 23:43:12,502 ----------------------------------------------------------------------------------------------------
2023-10-19 23:43:12,502 Training Params:
2023-10-19 23:43:12,502 - learning_rate: "5e-05"
2023-10-19 23:43:12,502 - mini_batch_size: "4"
2023-10-19 23:43:12,502 - max_epochs: "10"
2023-10-19 23:43:12,502 - shuffle: "True"
2023-10-19 23:43:12,502 ----------------------------------------------------------------------------------------------------
2023-10-19 23:43:12,502 Plugins:
2023-10-19 23:43:12,502 - TensorboardLogger
2023-10-19 23:43:12,502 - LinearScheduler | warmup_fraction: '0.1'
2023-10-19 23:43:12,502 ----------------------------------------------------------------------------------------------------
2023-10-19 23:43:12,502 Final evaluation on model from best epoch (best-model.pt)
2023-10-19 23:43:12,502 - metric: "('micro avg', 'f1-score')"
2023-10-19 23:43:12,502 ----------------------------------------------------------------------------------------------------
2023-10-19 23:43:12,502 Computation:
2023-10-19 23:43:12,502 - compute on device: cuda:0
2023-10-19 23:43:12,502 - embedding storage: none
2023-10-19 23:43:12,502 ----------------------------------------------------------------------------------------------------
2023-10-19 23:43:12,502 Model training base path: "hmbench-newseye/fi-dbmdz/bert-tiny-historic-multilingual-cased-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-1"
2023-10-19 23:43:12,502 ----------------------------------------------------------------------------------------------------
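A hedged sketch of the fine-tuning call matching the parameters above (learning rate 5e-05, mini-batch size 4, 10 epochs, linear warmup over 10% of the steps). The TensorBoard plugin wiring is omitted because its exact API differs between Flair releases; the warmup keyword is likewise an assumption.

from flair.trainers import ModelTrainer

trainer = ModelTrainer(tagger, corpus)

# fine_tune() uses AdamW with a linear learning-rate schedule; the run above
# additionally registered a TensorboardLogger plugin (wiring not shown here).
trainer.fine_tune(
    "hmbench-newseye/fi-dbmdz/bert-tiny-historic-multilingual-cased-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-1",
    learning_rate=5e-5,
    mini_batch_size=4,
    max_epochs=10,
    # warmup_fraction=0.1,  # assumption: keyword name may vary by Flair version
)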
2023-10-19 23:43:12,502 ----------------------------------------------------------------------------------------------------
2023-10-19 23:43:12,503 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-19 23:43:12,963 epoch 1 - iter 29/292 - loss 3.07409706 - time (sec): 0.46 - samples/sec: 10330.84 - lr: 0.000005 - momentum: 0.000000
2023-10-19 23:43:13,451 epoch 1 - iter 58/292 - loss 3.03705751 - time (sec): 0.95 - samples/sec: 9437.03 - lr: 0.000010 - momentum: 0.000000
2023-10-19 23:43:13,902 epoch 1 - iter 87/292 - loss 2.94951085 - time (sec): 1.40 - samples/sec: 9793.60 - lr: 0.000015 - momentum: 0.000000
2023-10-19 23:43:14,481 epoch 1 - iter 116/292 - loss 2.84219173 - time (sec): 1.98 - samples/sec: 9562.88 - lr: 0.000020 - momentum: 0.000000
2023-10-19 23:43:15,004 epoch 1 - iter 145/292 - loss 2.58749868 - time (sec): 2.50 - samples/sec: 9267.09 - lr: 0.000025 - momentum: 0.000000
2023-10-19 23:43:15,501 epoch 1 - iter 174/292 - loss 2.38570226 - time (sec): 3.00 - samples/sec: 8972.13 - lr: 0.000030 - momentum: 0.000000
2023-10-19 23:43:15,978 epoch 1 - iter 203/292 - loss 2.22219622 - time (sec): 3.47 - samples/sec: 8692.25 - lr: 0.000035 - momentum: 0.000000
2023-10-19 23:43:16,494 epoch 1 - iter 232/292 - loss 2.00089252 - time (sec): 3.99 - samples/sec: 8828.25 - lr: 0.000040 - momentum: 0.000000
2023-10-19 23:43:17,020 epoch 1 - iter 261/292 - loss 1.85422614 - time (sec): 4.52 - samples/sec: 8745.36 - lr: 0.000045 - momentum: 0.000000
2023-10-19 23:43:17,538 epoch 1 - iter 290/292 - loss 1.72578060 - time (sec): 5.03 - samples/sec: 8805.58 - lr: 0.000049 - momentum: 0.000000
2023-10-19 23:43:17,570 ----------------------------------------------------------------------------------------------------
2023-10-19 23:43:17,570 EPOCH 1 done: loss 1.7241 - lr: 0.000049
2023-10-19 23:43:17,835 DEV : loss 0.4558045566082001 - f1-score (micro avg) 0.0
2023-10-19 23:43:17,838 ----------------------------------------------------------------------------------------------------
2023-10-19 23:43:18,351 epoch 2 - iter 29/292 - loss 0.70043765 - time (sec): 0.51 - samples/sec: 10228.74 - lr: 0.000049 - momentum: 0.000000
2023-10-19 23:43:18,881 epoch 2 - iter 58/292 - loss 0.75391994 - time (sec): 1.04 - samples/sec: 9507.76 - lr: 0.000049 - momentum: 0.000000
2023-10-19 23:43:19,408 epoch 2 - iter 87/292 - loss 0.72665306 - time (sec): 1.57 - samples/sec: 8976.10 - lr: 0.000048 - momentum: 0.000000
2023-10-19 23:43:19,973 epoch 2 - iter 116/292 - loss 0.68840114 - time (sec): 2.13 - samples/sec: 8885.13 - lr: 0.000048 - momentum: 0.000000
2023-10-19 23:43:20,452 epoch 2 - iter 145/292 - loss 0.68621072 - time (sec): 2.61 - samples/sec: 8709.27 - lr: 0.000047 - momentum: 0.000000
2023-10-19 23:43:20,959 epoch 2 - iter 174/292 - loss 0.65696092 - time (sec): 3.12 - samples/sec: 8612.91 - lr: 0.000047 - momentum: 0.000000
2023-10-19 23:43:21,469 epoch 2 - iter 203/292 - loss 0.64245263 - time (sec): 3.63 - samples/sec: 8646.67 - lr: 0.000046 - momentum: 0.000000
2023-10-19 23:43:21,972 epoch 2 - iter 232/292 - loss 0.63067680 - time (sec): 4.13 - samples/sec: 8695.67 - lr: 0.000046 - momentum: 0.000000
2023-10-19 23:43:22,456 epoch 2 - iter 261/292 - loss 0.61228065 - time (sec): 4.62 - samples/sec: 8698.69 - lr: 0.000045 - momentum: 0.000000
2023-10-19 23:43:22,965 epoch 2 - iter 290/292 - loss 0.59628468 - time (sec): 5.13 - samples/sec: 8645.76 - lr: 0.000045 - momentum: 0.000000
2023-10-19 23:43:22,994 ----------------------------------------------------------------------------------------------------
2023-10-19 23:43:22,994 EPOCH 2 done: loss 0.5963 - lr: 0.000045
2023-10-19 23:43:23,621 DEV : loss 0.36393484473228455 - f1-score (micro avg) 0.0
2023-10-19 23:43:23,624 ----------------------------------------------------------------------------------------------------
2023-10-19 23:43:24,125 epoch 3 - iter 29/292 - loss 0.42006601 - time (sec): 0.50 - samples/sec: 8608.40 - lr: 0.000044 - momentum: 0.000000
2023-10-19 23:43:24,629 epoch 3 - iter 58/292 - loss 0.60259921 - time (sec): 1.00 - samples/sec: 8567.90 - lr: 0.000043 - momentum: 0.000000
2023-10-19 23:43:25,153 epoch 3 - iter 87/292 - loss 0.56103013 - time (sec): 1.53 - samples/sec: 8460.68 - lr: 0.000043 - momentum: 0.000000
2023-10-19 23:43:25,667 epoch 3 - iter 116/292 - loss 0.52110974 - time (sec): 2.04 - samples/sec: 8655.47 - lr: 0.000042 - momentum: 0.000000
2023-10-19 23:43:26,153 epoch 3 - iter 145/292 - loss 0.51867745 - time (sec): 2.53 - samples/sec: 8450.82 - lr: 0.000042 - momentum: 0.000000
2023-10-19 23:43:26,680 epoch 3 - iter 174/292 - loss 0.51950404 - time (sec): 3.05 - samples/sec: 8737.22 - lr: 0.000041 - momentum: 0.000000
2023-10-19 23:43:27,230 epoch 3 - iter 203/292 - loss 0.49933801 - time (sec): 3.61 - samples/sec: 8709.81 - lr: 0.000041 - momentum: 0.000000
2023-10-19 23:43:27,747 epoch 3 - iter 232/292 - loss 0.49192529 - time (sec): 4.12 - samples/sec: 8550.14 - lr: 0.000040 - momentum: 0.000000
2023-10-19 23:43:28,405 epoch 3 - iter 261/292 - loss 0.48732688 - time (sec): 4.78 - samples/sec: 8306.02 - lr: 0.000040 - momentum: 0.000000
2023-10-19 23:43:28,884 epoch 3 - iter 290/292 - loss 0.48097519 - time (sec): 5.26 - samples/sec: 8415.89 - lr: 0.000039 - momentum: 0.000000
2023-10-19 23:43:28,918 ----------------------------------------------------------------------------------------------------
2023-10-19 23:43:28,918 EPOCH 3 done: loss 0.4799 - lr: 0.000039
2023-10-19 23:43:29,554 DEV : loss 0.30687329173088074 - f1-score (micro avg) 0.1411
2023-10-19 23:43:29,558 saving best model
2023-10-19 23:43:29,585 ----------------------------------------------------------------------------------------------------
2023-10-19 23:43:30,106 epoch 4 - iter 29/292 - loss 0.40558778 - time (sec): 0.52 - samples/sec: 8449.83 - lr: 0.000038 - momentum: 0.000000
2023-10-19 23:43:30,632 epoch 4 - iter 58/292 - loss 0.40595574 - time (sec): 1.05 - samples/sec: 8411.38 - lr: 0.000038 - momentum: 0.000000
2023-10-19 23:43:31,153 epoch 4 - iter 87/292 - loss 0.39747205 - time (sec): 1.57 - samples/sec: 8586.41 - lr: 0.000037 - momentum: 0.000000
2023-10-19 23:43:31,648 epoch 4 - iter 116/292 - loss 0.39387173 - time (sec): 2.06 - samples/sec: 8558.51 - lr: 0.000037 - momentum: 0.000000
2023-10-19 23:43:32,172 epoch 4 - iter 145/292 - loss 0.39381093 - time (sec): 2.59 - samples/sec: 8536.17 - lr: 0.000036 - momentum: 0.000000
2023-10-19 23:43:32,717 epoch 4 - iter 174/292 - loss 0.40560545 - time (sec): 3.13 - samples/sec: 8748.46 - lr: 0.000036 - momentum: 0.000000
2023-10-19 23:43:33,242 epoch 4 - iter 203/292 - loss 0.41292067 - time (sec): 3.66 - samples/sec: 8567.43 - lr: 0.000035 - momentum: 0.000000
2023-10-19 23:43:33,741 epoch 4 - iter 232/292 - loss 0.41164578 - time (sec): 4.16 - samples/sec: 8577.30 - lr: 0.000035 - momentum: 0.000000
2023-10-19 23:43:34,261 epoch 4 - iter 261/292 - loss 0.41094464 - time (sec): 4.68 - samples/sec: 8642.29 - lr: 0.000034 - momentum: 0.000000
2023-10-19 23:43:34,748 epoch 4 - iter 290/292 - loss 0.40702589 - time (sec): 5.16 - samples/sec: 8532.69 - lr: 0.000033 - momentum: 0.000000
2023-10-19 23:43:34,787 ----------------------------------------------------------------------------------------------------
2023-10-19 23:43:34,787 EPOCH 4 done: loss 0.4052 - lr: 0.000033
2023-10-19 23:43:35,428 DEV : loss 0.3078775703907013 - f1-score (micro avg) 0.1785
2023-10-19 23:43:35,432 saving best model
2023-10-19 23:43:35,464 ----------------------------------------------------------------------------------------------------
2023-10-19 23:43:35,984 epoch 5 - iter 29/292 - loss 0.39458319 - time (sec): 0.52 - samples/sec: 8373.87 - lr: 0.000033 - momentum: 0.000000
2023-10-19 23:43:36,501 epoch 5 - iter 58/292 - loss 0.37791867 - time (sec): 1.04 - samples/sec: 8911.74 - lr: 0.000032 - momentum: 0.000000
2023-10-19 23:43:37,090 epoch 5 - iter 87/292 - loss 0.41175438 - time (sec): 1.63 - samples/sec: 8990.57 - lr: 0.000032 - momentum: 0.000000
2023-10-19 23:43:37,656 epoch 5 - iter 116/292 - loss 0.41745175 - time (sec): 2.19 - samples/sec: 8617.37 - lr: 0.000031 - momentum: 0.000000
2023-10-19 23:43:38,154 epoch 5 - iter 145/292 - loss 0.40362718 - time (sec): 2.69 - samples/sec: 8507.94 - lr: 0.000031 - momentum: 0.000000
2023-10-19 23:43:38,669 epoch 5 - iter 174/292 - loss 0.38856766 - time (sec): 3.20 - samples/sec: 8423.13 - lr: 0.000030 - momentum: 0.000000
2023-10-19 23:43:39,173 epoch 5 - iter 203/292 - loss 0.39176537 - time (sec): 3.71 - samples/sec: 8451.86 - lr: 0.000030 - momentum: 0.000000
2023-10-19 23:43:39,665 epoch 5 - iter 232/292 - loss 0.38688994 - time (sec): 4.20 - samples/sec: 8488.81 - lr: 0.000029 - momentum: 0.000000
2023-10-19 23:43:40,175 epoch 5 - iter 261/292 - loss 0.37929994 - time (sec): 4.71 - samples/sec: 8411.79 - lr: 0.000028 - momentum: 0.000000
2023-10-19 23:43:40,695 epoch 5 - iter 290/292 - loss 0.37114615 - time (sec): 5.23 - samples/sec: 8448.93 - lr: 0.000028 - momentum: 0.000000
2023-10-19 23:43:40,724 ----------------------------------------------------------------------------------------------------
2023-10-19 23:43:40,724 EPOCH 5 done: loss 0.3711 - lr: 0.000028
2023-10-19 23:43:41,364 DEV : loss 0.30972999334335327 - f1-score (micro avg) 0.19
2023-10-19 23:43:41,368 saving best model
2023-10-19 23:43:41,398 ----------------------------------------------------------------------------------------------------
2023-10-19 23:43:41,947 epoch 6 - iter 29/292 - loss 0.37287899 - time (sec): 0.55 - samples/sec: 9227.43 - lr: 0.000027 - momentum: 0.000000
2023-10-19 23:43:42,433 epoch 6 - iter 58/292 - loss 0.34716342 - time (sec): 1.03 - samples/sec: 8909.79 - lr: 0.000027 - momentum: 0.000000
2023-10-19 23:43:42,986 epoch 6 - iter 87/292 - loss 0.38873255 - time (sec): 1.59 - samples/sec: 9117.96 - lr: 0.000026 - momentum: 0.000000
2023-10-19 23:43:43,496 epoch 6 - iter 116/292 - loss 0.38158312 - time (sec): 2.10 - samples/sec: 8828.62 - lr: 0.000026 - momentum: 0.000000
2023-10-19 23:43:44,004 epoch 6 - iter 145/292 - loss 0.38197586 - time (sec): 2.61 - samples/sec: 8792.59 - lr: 0.000025 - momentum: 0.000000
2023-10-19 23:43:44,542 epoch 6 - iter 174/292 - loss 0.36548885 - time (sec): 3.14 - samples/sec: 8816.96 - lr: 0.000025 - momentum: 0.000000
2023-10-19 23:43:45,056 epoch 6 - iter 203/292 - loss 0.35118741 - time (sec): 3.66 - samples/sec: 8679.41 - lr: 0.000024 - momentum: 0.000000
2023-10-19 23:43:45,589 epoch 6 - iter 232/292 - loss 0.35101908 - time (sec): 4.19 - samples/sec: 8508.10 - lr: 0.000023 - momentum: 0.000000
2023-10-19 23:43:46,145 epoch 6 - iter 261/292 - loss 0.35347404 - time (sec): 4.75 - samples/sec: 8468.76 - lr: 0.000023 - momentum: 0.000000
2023-10-19 23:43:46,662 epoch 6 - iter 290/292 - loss 0.35128884 - time (sec): 5.26 - samples/sec: 8425.23 - lr: 0.000022 - momentum: 0.000000
2023-10-19 23:43:46,689 ----------------------------------------------------------------------------------------------------
2023-10-19 23:43:46,689 EPOCH 6 done: loss 0.3514 - lr: 0.000022
2023-10-19 23:43:47,337 DEV : loss 0.30355098843574524 - f1-score (micro avg) 0.227
2023-10-19 23:43:47,341 saving best model
2023-10-19 23:43:47,373 ----------------------------------------------------------------------------------------------------
2023-10-19 23:43:47,886 epoch 7 - iter 29/292 - loss 0.30998691 - time (sec): 0.51 - samples/sec: 10436.77 - lr: 0.000022 - momentum: 0.000000
2023-10-19 23:43:48,370 epoch 7 - iter 58/292 - loss 0.32974135 - time (sec): 1.00 - samples/sec: 9324.77 - lr: 0.000021 - momentum: 0.000000
2023-10-19 23:43:48,863 epoch 7 - iter 87/292 - loss 0.33209891 - time (sec): 1.49 - samples/sec: 8769.78 - lr: 0.000021 - momentum: 0.000000
2023-10-19 23:43:49,392 epoch 7 - iter 116/292 - loss 0.31542787 - time (sec): 2.02 - samples/sec: 8823.99 - lr: 0.000020 - momentum: 0.000000
2023-10-19 23:43:49,913 epoch 7 - iter 145/292 - loss 0.31288929 - time (sec): 2.54 - samples/sec: 8784.33 - lr: 0.000020 - momentum: 0.000000
2023-10-19 23:43:50,406 epoch 7 - iter 174/292 - loss 0.32458005 - time (sec): 3.03 - samples/sec: 8624.77 - lr: 0.000019 - momentum: 0.000000
2023-10-19 23:43:50,868 epoch 7 - iter 203/292 - loss 0.32310007 - time (sec): 3.49 - samples/sec: 8510.83 - lr: 0.000018 - momentum: 0.000000
2023-10-19 23:43:51,332 epoch 7 - iter 232/292 - loss 0.34671277 - time (sec): 3.96 - samples/sec: 8724.77 - lr: 0.000018 - momentum: 0.000000
2023-10-19 23:43:51,779 epoch 7 - iter 261/292 - loss 0.33592061 - time (sec): 4.41 - samples/sec: 8940.65 - lr: 0.000017 - momentum: 0.000000
2023-10-19 23:43:52,238 epoch 7 - iter 290/292 - loss 0.33539636 - time (sec): 4.86 - samples/sec: 9086.71 - lr: 0.000017 - momentum: 0.000000
2023-10-19 23:43:52,265 ----------------------------------------------------------------------------------------------------
2023-10-19 23:43:52,265 EPOCH 7 done: loss 0.3355 - lr: 0.000017
2023-10-19 23:43:52,901 DEV : loss 0.29472851753234863 - f1-score (micro avg) 0.2633
2023-10-19 23:43:52,904 saving best model
2023-10-19 23:43:52,935 ----------------------------------------------------------------------------------------------------
2023-10-19 23:43:53,394 epoch 8 - iter 29/292 - loss 0.31109226 - time (sec): 0.46 - samples/sec: 9513.15 - lr: 0.000016 - momentum: 0.000000
2023-10-19 23:43:53,831 epoch 8 - iter 58/292 - loss 0.32511565 - time (sec): 0.89 - samples/sec: 9542.18 - lr: 0.000016 - momentum: 0.000000
2023-10-19 23:43:54,280 epoch 8 - iter 87/292 - loss 0.30650012 - time (sec): 1.34 - samples/sec: 9604.75 - lr: 0.000015 - momentum: 0.000000
2023-10-19 23:43:54,693 epoch 8 - iter 116/292 - loss 0.30110761 - time (sec): 1.76 - samples/sec: 9746.12 - lr: 0.000015 - momentum: 0.000000
2023-10-19 23:43:55,184 epoch 8 - iter 145/292 - loss 0.31099699 - time (sec): 2.25 - samples/sec: 9782.19 - lr: 0.000014 - momentum: 0.000000
2023-10-19 23:43:55,683 epoch 8 - iter 174/292 - loss 0.30267235 - time (sec): 2.75 - samples/sec: 9391.31 - lr: 0.000013 - momentum: 0.000000
2023-10-19 23:43:56,181 epoch 8 - iter 203/292 - loss 0.31451102 - time (sec): 3.24 - samples/sec: 9379.35 - lr: 0.000013 - momentum: 0.000000
2023-10-19 23:43:56,743 epoch 8 - iter 232/292 - loss 0.32954413 - time (sec): 3.81 - samples/sec: 9378.59 - lr: 0.000012 - momentum: 0.000000
2023-10-19 23:43:57,244 epoch 8 - iter 261/292 - loss 0.32898065 - time (sec): 4.31 - samples/sec: 9255.20 - lr: 0.000012 - momentum: 0.000000
2023-10-19 23:43:57,781 epoch 8 - iter 290/292 - loss 0.32544017 - time (sec): 4.85 - samples/sec: 9119.53 - lr: 0.000011 - momentum: 0.000000
2023-10-19 23:43:57,810 ----------------------------------------------------------------------------------------------------
2023-10-19 23:43:57,810 EPOCH 8 done: loss 0.3255 - lr: 0.000011
2023-10-19 23:43:58,448 DEV : loss 0.29057008028030396 - f1-score (micro avg) 0.276
2023-10-19 23:43:58,451 saving best model
2023-10-19 23:43:58,485 ----------------------------------------------------------------------------------------------------
2023-10-19 23:43:59,002 epoch 9 - iter 29/292 - loss 0.24069591 - time (sec): 0.52 - samples/sec: 9113.36 - lr: 0.000011 - momentum: 0.000000
2023-10-19 23:43:59,509 epoch 9 - iter 58/292 - loss 0.28968391 - time (sec): 1.02 - samples/sec: 8707.45 - lr: 0.000010 - momentum: 0.000000
2023-10-19 23:44:00,032 epoch 9 - iter 87/292 - loss 0.28979454 - time (sec): 1.55 - samples/sec: 8802.28 - lr: 0.000010 - momentum: 0.000000
2023-10-19 23:44:00,536 epoch 9 - iter 116/292 - loss 0.29644815 - time (sec): 2.05 - samples/sec: 8488.33 - lr: 0.000009 - momentum: 0.000000
2023-10-19 23:44:01,186 epoch 9 - iter 145/292 - loss 0.29585821 - time (sec): 2.70 - samples/sec: 8068.97 - lr: 0.000008 - momentum: 0.000000
2023-10-19 23:44:01,701 epoch 9 - iter 174/292 - loss 0.29293220 - time (sec): 3.22 - samples/sec: 8304.34 - lr: 0.000008 - momentum: 0.000000
2023-10-19 23:44:02,246 epoch 9 - iter 203/292 - loss 0.30033879 - time (sec): 3.76 - samples/sec: 8413.12 - lr: 0.000007 - momentum: 0.000000
2023-10-19 23:44:02,738 epoch 9 - iter 232/292 - loss 0.30375698 - time (sec): 4.25 - samples/sec: 8450.86 - lr: 0.000007 - momentum: 0.000000
2023-10-19 23:44:03,250 epoch 9 - iter 261/292 - loss 0.31294749 - time (sec): 4.76 - samples/sec: 8536.53 - lr: 0.000006 - momentum: 0.000000
2023-10-19 23:44:03,737 epoch 9 - iter 290/292 - loss 0.31643018 - time (sec): 5.25 - samples/sec: 8438.37 - lr: 0.000006 - momentum: 0.000000
2023-10-19 23:44:03,769 ----------------------------------------------------------------------------------------------------
2023-10-19 23:44:03,769 EPOCH 9 done: loss 0.3170 - lr: 0.000006
2023-10-19 23:44:04,410 DEV : loss 0.2984329164028168 - f1-score (micro avg) 0.2593
2023-10-19 23:44:04,413 ----------------------------------------------------------------------------------------------------
2023-10-19 23:44:04,936 epoch 10 - iter 29/292 - loss 0.22234356 - time (sec): 0.52 - samples/sec: 8410.17 - lr: 0.000005 - momentum: 0.000000
2023-10-19 23:44:05,451 epoch 10 - iter 58/292 - loss 0.26495283 - time (sec): 1.04 - samples/sec: 8656.76 - lr: 0.000005 - momentum: 0.000000
2023-10-19 23:44:05,960 epoch 10 - iter 87/292 - loss 0.28908115 - time (sec): 1.55 - samples/sec: 8288.55 - lr: 0.000004 - momentum: 0.000000
2023-10-19 23:44:06,504 epoch 10 - iter 116/292 - loss 0.30910542 - time (sec): 2.09 - samples/sec: 8108.06 - lr: 0.000003 - momentum: 0.000000
2023-10-19 23:44:07,024 epoch 10 - iter 145/292 - loss 0.32482006 - time (sec): 2.61 - samples/sec: 8037.99 - lr: 0.000003 - momentum: 0.000000
2023-10-19 23:44:07,513 epoch 10 - iter 174/292 - loss 0.31126506 - time (sec): 3.10 - samples/sec: 8178.22 - lr: 0.000002 - momentum: 0.000000
2023-10-19 23:44:08,024 epoch 10 - iter 203/292 - loss 0.30298861 - time (sec): 3.61 - samples/sec: 8233.68 - lr: 0.000002 - momentum: 0.000000
2023-10-19 23:44:08,565 epoch 10 - iter 232/292 - loss 0.30263054 - time (sec): 4.15 - samples/sec: 8385.24 - lr: 0.000001 - momentum: 0.000000
2023-10-19 23:44:09,066 epoch 10 - iter 261/292 - loss 0.29752653 - time (sec): 4.65 - samples/sec: 8414.36 - lr: 0.000001 - momentum: 0.000000
2023-10-19 23:44:09,589 epoch 10 - iter 290/292 - loss 0.31684602 - time (sec): 5.17 - samples/sec: 8559.07 - lr: 0.000000 - momentum: 0.000000
2023-10-19 23:44:09,617 ----------------------------------------------------------------------------------------------------
2023-10-19 23:44:09,617 EPOCH 10 done: loss 0.3165 - lr: 0.000000
2023-10-19 23:44:10,261 DEV : loss 0.2949487566947937 - f1-score (micro avg) 0.287
2023-10-19 23:44:10,265 saving best model
2023-10-19 23:44:10,323 ----------------------------------------------------------------------------------------------------
2023-10-19 23:44:10,324 Loading model from best epoch ...
2023-10-19 23:44:10,397 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
2023-10-19 23:44:11,306
Results:
- F-score (micro) 0.3268
- F-score (macro) 0.1729
- Accuracy 0.2036
By class:
              precision    recall  f1-score   support

         PER     0.3365    0.3046    0.3198       348
         LOC     0.3140    0.4559    0.3719       261
         ORG     0.0000    0.0000    0.0000        52
   HumanProd     0.0000    0.0000    0.0000        22

   micro avg     0.3242    0.3294    0.3268       683
   macro avg     0.1626    0.1901    0.1729       683
weighted avg     0.2914    0.3294    0.3050       683
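(The micro-averaged F1 follows directly from the micro precision and recall in the table: F1 = 2 * 0.3242 * 0.3294 / (0.3242 + 0.3294) ≈ 0.3268.)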
2023-10-19 23:44:11,306 ----------------------------------------------------------------------------------------------------
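The 17 predicted tags follow the BIOES scheme over the four entity types LOC, PER, ORG and HumanProd plus O. A minimal usage sketch for the exported checkpoint; the load path is relative to the model base path above and the Finnish example sentence is illustrative only.

from flair.data import Sentence
from flair.models import SequenceTagger

# load the best checkpoint written during training (path relative to the base path above)
tagger = SequenceTagger.load("best-model.pt")

sentence = Sentence("Helsingin yliopisto sijaitsee Suomessa .")  # illustrative example
tagger.predict(sentence)

# print the decoded entity spans with their labels and confidence scores
for span in sentence.get_spans("ner"):
    print(span.text, span.tag, span.score)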