2023-10-18 16:04:45,006 ----------------------------------------------------------------------------------------------------
2023-10-18 16:04:45,007 Model: "SequenceTagger(
(embeddings): TransformerWordEmbeddings(
(model): BertModel(
(embeddings): BertEmbeddings(
(word_embeddings): Embedding(32001, 128)
(position_embeddings): Embedding(512, 128)
(token_type_embeddings): Embedding(2, 128)
(LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(encoder): BertEncoder(
(layer): ModuleList(
(0-1): 2 x BertLayer(
(attention): BertAttention(
(self): BertSelfAttention(
(query): Linear(in_features=128, out_features=128, bias=True)
(key): Linear(in_features=128, out_features=128, bias=True)
(value): Linear(in_features=128, out_features=128, bias=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(output): BertSelfOutput(
(dense): Linear(in_features=128, out_features=128, bias=True)
(LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
(intermediate): BertIntermediate(
(dense): Linear(in_features=128, out_features=512, bias=True)
(intermediate_act_fn): GELUActivation()
)
(output): BertOutput(
(dense): Linear(in_features=512, out_features=128, bias=True)
(LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
)
)
(pooler): BertPooler(
(dense): Linear(in_features=128, out_features=128, bias=True)
(activation): Tanh()
)
)
)
(locked_dropout): LockedDropout(p=0.5)
(linear): Linear(in_features=128, out_features=25, bias=True)
(loss_function): CrossEntropyLoss()
)"
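The printed module shapes above fully determine the model's size. As a sanity check (a sketch from the printed dimensions only; the exact total could differ slightly if Flair adds parameters not shown in this printout), the parameter count works out to roughly 4.6M:

```python
# Rough parameter count reconstructed from the module shapes printed above.

def linear(n_in, n_out):
    """Parameters of a Linear layer with bias."""
    return n_in * n_out + n_out

# word/position/token-type embeddings + LayerNorm (weight + bias, 128 each)
emb = 32001 * 128 + 512 * 128 + 2 * 128 + 2 * 128

per_layer = (
    3 * linear(128, 128)           # query / key / value
    + linear(128, 128) + 2 * 128   # attention output dense + LayerNorm
    + linear(128, 512)             # intermediate
    + linear(512, 128) + 2 * 128   # output dense + LayerNorm
)
encoder = 2 * per_layer            # "(0-1): 2 x BertLayer"
pooler = linear(128, 128)
head = linear(128, 25)             # tagger projection to the 25 tags

total = emb + encoder + pooler + head
print(total)  # 4578457, i.e. ~4.6M parameters
```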
2023-10-18 16:04:45,007 ----------------------------------------------------------------------------------------------------
2023-10-18 16:04:45,007 MultiCorpus: 1214 train + 266 dev + 251 test sentences
- NER_HIPE_2022 Corpus: 1214 train + 266 dev + 251 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/ajmc/en/with_doc_seperator
2023-10-18 16:04:45,007 ----------------------------------------------------------------------------------------------------
2023-10-18 16:04:45,007 Train: 1214 sentences
2023-10-18 16:04:45,007 (train_with_dev=False, train_with_test=False)
2023-10-18 16:04:45,007 ----------------------------------------------------------------------------------------------------
2023-10-18 16:04:45,007 Training Params:
2023-10-18 16:04:45,007 - learning_rate: "5e-05"
2023-10-18 16:04:45,007 - mini_batch_size: "8"
2023-10-18 16:04:45,007 - max_epochs: "10"
2023-10-18 16:04:45,007 - shuffle: "True"
2023-10-18 16:04:45,007 ----------------------------------------------------------------------------------------------------
2023-10-18 16:04:45,007 Plugins:
2023-10-18 16:04:45,007 - TensorboardLogger
2023-10-18 16:04:45,007 - LinearScheduler | warmup_fraction: '0.1'
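The learning rates logged per iteration below are consistent with a linear warmup over the first `warmup_fraction` of all steps followed by a linear decay to zero. A minimal sketch of that schedule (an assumption about Flair's `LinearScheduler`; the real implementation may differ by a one-step offset):

```python
# Linear warmup + linear decay schedule implied by the logged lr values.
MAX_EPOCHS = 10
STEPS_PER_EPOCH = 152        # "iter .../152" in the log below
PEAK_LR = 5e-05
WARMUP_FRACTION = 0.1

total_steps = MAX_EPOCHS * STEPS_PER_EPOCH           # 1520
warmup_steps = int(WARMUP_FRACTION * total_steps)    # 152

def lr_at(step):
    """Learning rate at a given global optimizer step."""
    if step < warmup_steps:
        return PEAK_LR * step / warmup_steps
    return PEAK_LR * (total_steps - step) / (total_steps - warmup_steps)
```

For example, epoch 1 / iter 15 is global step 15, giving `lr_at(15) ≈ 0.000005`, and epoch 2 / iter 150 is global step 302, giving `lr_at(302) ≈ 0.000045`, both matching the log.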
2023-10-18 16:04:45,007 ----------------------------------------------------------------------------------------------------
2023-10-18 16:04:45,007 Final evaluation on model from best epoch (best-model.pt)
2023-10-18 16:04:45,008 - metric: "('micro avg', 'f1-score')"
2023-10-18 16:04:45,008 ----------------------------------------------------------------------------------------------------
2023-10-18 16:04:45,008 Computation:
2023-10-18 16:04:45,008 - compute on device: cuda:0
2023-10-18 16:04:45,008 - embedding storage: none
2023-10-18 16:04:45,008 ----------------------------------------------------------------------------------------------------
2023-10-18 16:04:45,008 Model training base path: "hmbench-ajmc/en-dbmdz/bert-tiny-historic-multilingual-cased-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-2"
2023-10-18 16:04:45,008 ----------------------------------------------------------------------------------------------------
2023-10-18 16:04:45,008 ----------------------------------------------------------------------------------------------------
2023-10-18 16:04:45,008 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-18 16:04:45,328 epoch 1 - iter 15/152 - loss 3.55491273 - time (sec): 0.32 - samples/sec: 8438.20 - lr: 0.000005 - momentum: 0.000000
2023-10-18 16:04:45,643 epoch 1 - iter 30/152 - loss 3.47558306 - time (sec): 0.63 - samples/sec: 9227.83 - lr: 0.000010 - momentum: 0.000000
2023-10-18 16:04:45,954 epoch 1 - iter 45/152 - loss 3.37625327 - time (sec): 0.95 - samples/sec: 9191.87 - lr: 0.000014 - momentum: 0.000000
2023-10-18 16:04:46,276 epoch 1 - iter 60/152 - loss 3.21161944 - time (sec): 1.27 - samples/sec: 9416.38 - lr: 0.000019 - momentum: 0.000000
2023-10-18 16:04:46,601 epoch 1 - iter 75/152 - loss 3.03719629 - time (sec): 1.59 - samples/sec: 9404.01 - lr: 0.000024 - momentum: 0.000000
2023-10-18 16:04:46,933 epoch 1 - iter 90/152 - loss 2.84316742 - time (sec): 1.92 - samples/sec: 9482.16 - lr: 0.000029 - momentum: 0.000000
2023-10-18 16:04:47,264 epoch 1 - iter 105/152 - loss 2.63926958 - time (sec): 2.26 - samples/sec: 9379.69 - lr: 0.000034 - momentum: 0.000000
2023-10-18 16:04:47,584 epoch 1 - iter 120/152 - loss 2.42786078 - time (sec): 2.58 - samples/sec: 9506.82 - lr: 0.000039 - momentum: 0.000000
2023-10-18 16:04:47,890 epoch 1 - iter 135/152 - loss 2.26021411 - time (sec): 2.88 - samples/sec: 9482.46 - lr: 0.000044 - momentum: 0.000000
2023-10-18 16:04:48,212 epoch 1 - iter 150/152 - loss 2.11391684 - time (sec): 3.20 - samples/sec: 9555.77 - lr: 0.000049 - momentum: 0.000000
2023-10-18 16:04:48,253 ----------------------------------------------------------------------------------------------------
2023-10-18 16:04:48,253 EPOCH 1 done: loss 2.1041 - lr: 0.000049
2023-10-18 16:04:48,589 DEV : loss 0.8210452198982239 - f1-score (micro avg) 0.0
2023-10-18 16:04:48,595 ----------------------------------------------------------------------------------------------------
2023-10-18 16:04:48,900 epoch 2 - iter 15/152 - loss 0.82480804 - time (sec): 0.30 - samples/sec: 10411.99 - lr: 0.000049 - momentum: 0.000000
2023-10-18 16:04:49,230 epoch 2 - iter 30/152 - loss 0.79322320 - time (sec): 0.63 - samples/sec: 9916.20 - lr: 0.000049 - momentum: 0.000000
2023-10-18 16:04:49,558 epoch 2 - iter 45/152 - loss 0.78621530 - time (sec): 0.96 - samples/sec: 9763.38 - lr: 0.000048 - momentum: 0.000000
2023-10-18 16:04:49,881 epoch 2 - iter 60/152 - loss 0.75039960 - time (sec): 1.29 - samples/sec: 9766.47 - lr: 0.000048 - momentum: 0.000000
2023-10-18 16:04:50,169 epoch 2 - iter 75/152 - loss 0.74715332 - time (sec): 1.57 - samples/sec: 9860.97 - lr: 0.000047 - momentum: 0.000000
2023-10-18 16:04:50,474 epoch 2 - iter 90/152 - loss 0.72694581 - time (sec): 1.88 - samples/sec: 9997.73 - lr: 0.000047 - momentum: 0.000000
2023-10-18 16:04:50,801 epoch 2 - iter 105/152 - loss 0.68749465 - time (sec): 2.21 - samples/sec: 9811.87 - lr: 0.000046 - momentum: 0.000000
2023-10-18 16:04:51,111 epoch 2 - iter 120/152 - loss 0.67904757 - time (sec): 2.52 - samples/sec: 9910.22 - lr: 0.000046 - momentum: 0.000000
2023-10-18 16:04:51,437 epoch 2 - iter 135/152 - loss 0.66227355 - time (sec): 2.84 - samples/sec: 9818.68 - lr: 0.000045 - momentum: 0.000000
2023-10-18 16:04:51,749 epoch 2 - iter 150/152 - loss 0.64609491 - time (sec): 3.15 - samples/sec: 9712.60 - lr: 0.000045 - momentum: 0.000000
2023-10-18 16:04:51,791 ----------------------------------------------------------------------------------------------------
2023-10-18 16:04:51,791 EPOCH 2 done: loss 0.6437 - lr: 0.000045
2023-10-18 16:04:52,294 DEV : loss 0.46711266040802 - f1-score (micro avg) 0.201
2023-10-18 16:04:52,303 saving best model
2023-10-18 16:04:52,335 ----------------------------------------------------------------------------------------------------
2023-10-18 16:04:52,674 epoch 3 - iter 15/152 - loss 0.55280524 - time (sec): 0.34 - samples/sec: 8723.64 - lr: 0.000044 - momentum: 0.000000
2023-10-18 16:04:52,990 epoch 3 - iter 30/152 - loss 0.52600948 - time (sec): 0.65 - samples/sec: 9306.17 - lr: 0.000043 - momentum: 0.000000
2023-10-18 16:04:53,313 epoch 3 - iter 45/152 - loss 0.53658685 - time (sec): 0.98 - samples/sec: 9598.63 - lr: 0.000043 - momentum: 0.000000
2023-10-18 16:04:53,630 epoch 3 - iter 60/152 - loss 0.50973011 - time (sec): 1.29 - samples/sec: 9669.50 - lr: 0.000042 - momentum: 0.000000
2023-10-18 16:04:53,951 epoch 3 - iter 75/152 - loss 0.49269238 - time (sec): 1.61 - samples/sec: 9621.88 - lr: 0.000042 - momentum: 0.000000
2023-10-18 16:04:54,284 epoch 3 - iter 90/152 - loss 0.48607402 - time (sec): 1.95 - samples/sec: 9418.81 - lr: 0.000041 - momentum: 0.000000
2023-10-18 16:04:54,608 epoch 3 - iter 105/152 - loss 0.46661200 - time (sec): 2.27 - samples/sec: 9387.20 - lr: 0.000041 - momentum: 0.000000
2023-10-18 16:04:55,100 epoch 3 - iter 120/152 - loss 0.45393692 - time (sec): 2.76 - samples/sec: 8717.62 - lr: 0.000040 - momentum: 0.000000
2023-10-18 16:04:55,437 epoch 3 - iter 135/152 - loss 0.46267037 - time (sec): 3.10 - samples/sec: 8805.49 - lr: 0.000040 - momentum: 0.000000
2023-10-18 16:04:55,780 epoch 3 - iter 150/152 - loss 0.45957840 - time (sec): 3.44 - samples/sec: 8881.86 - lr: 0.000039 - momentum: 0.000000
2023-10-18 16:04:55,827 ----------------------------------------------------------------------------------------------------
2023-10-18 16:04:55,827 EPOCH 3 done: loss 0.4618 - lr: 0.000039
2023-10-18 16:04:56,332 DEV : loss 0.3763718903064728 - f1-score (micro avg) 0.3079
2023-10-18 16:04:56,338 saving best model
2023-10-18 16:04:56,369 ----------------------------------------------------------------------------------------------------
2023-10-18 16:04:56,698 epoch 4 - iter 15/152 - loss 0.52573925 - time (sec): 0.33 - samples/sec: 8778.20 - lr: 0.000038 - momentum: 0.000000
2023-10-18 16:04:57,043 epoch 4 - iter 30/152 - loss 0.45798899 - time (sec): 0.67 - samples/sec: 8851.09 - lr: 0.000038 - momentum: 0.000000
2023-10-18 16:04:57,383 epoch 4 - iter 45/152 - loss 0.43859763 - time (sec): 1.01 - samples/sec: 9031.46 - lr: 0.000037 - momentum: 0.000000
2023-10-18 16:04:57,709 epoch 4 - iter 60/152 - loss 0.43302646 - time (sec): 1.34 - samples/sec: 8975.68 - lr: 0.000037 - momentum: 0.000000
2023-10-18 16:04:58,029 epoch 4 - iter 75/152 - loss 0.41824130 - time (sec): 1.66 - samples/sec: 9106.04 - lr: 0.000036 - momentum: 0.000000
2023-10-18 16:04:58,349 epoch 4 - iter 90/152 - loss 0.41263334 - time (sec): 1.98 - samples/sec: 9119.30 - lr: 0.000036 - momentum: 0.000000
2023-10-18 16:04:58,674 epoch 4 - iter 105/152 - loss 0.40907522 - time (sec): 2.30 - samples/sec: 9127.41 - lr: 0.000035 - momentum: 0.000000
2023-10-18 16:04:58,993 epoch 4 - iter 120/152 - loss 0.39940132 - time (sec): 2.62 - samples/sec: 9096.85 - lr: 0.000035 - momentum: 0.000000
2023-10-18 16:04:59,314 epoch 4 - iter 135/152 - loss 0.40404518 - time (sec): 2.94 - samples/sec: 9168.53 - lr: 0.000034 - momentum: 0.000000
2023-10-18 16:04:59,626 epoch 4 - iter 150/152 - loss 0.39664101 - time (sec): 3.26 - samples/sec: 9396.38 - lr: 0.000034 - momentum: 0.000000
2023-10-18 16:04:59,656 ----------------------------------------------------------------------------------------------------
2023-10-18 16:04:59,656 EPOCH 4 done: loss 0.3953 - lr: 0.000034
2023-10-18 16:05:00,176 DEV : loss 0.3355642557144165 - f1-score (micro avg) 0.3672
2023-10-18 16:05:00,181 saving best model
2023-10-18 16:05:00,216 ----------------------------------------------------------------------------------------------------
2023-10-18 16:05:00,545 epoch 5 - iter 15/152 - loss 0.36220235 - time (sec): 0.33 - samples/sec: 9174.40 - lr: 0.000033 - momentum: 0.000000
2023-10-18 16:05:00,885 epoch 5 - iter 30/152 - loss 0.41524884 - time (sec): 0.67 - samples/sec: 9345.90 - lr: 0.000032 - momentum: 0.000000
2023-10-18 16:05:01,257 epoch 5 - iter 45/152 - loss 0.36981838 - time (sec): 1.04 - samples/sec: 8962.68 - lr: 0.000032 - momentum: 0.000000
2023-10-18 16:05:01,618 epoch 5 - iter 60/152 - loss 0.36763614 - time (sec): 1.40 - samples/sec: 8836.21 - lr: 0.000031 - momentum: 0.000000
2023-10-18 16:05:01,956 epoch 5 - iter 75/152 - loss 0.36971291 - time (sec): 1.74 - samples/sec: 8895.40 - lr: 0.000031 - momentum: 0.000000
2023-10-18 16:05:02,290 epoch 5 - iter 90/152 - loss 0.37106258 - time (sec): 2.07 - samples/sec: 8842.41 - lr: 0.000030 - momentum: 0.000000
2023-10-18 16:05:02,638 epoch 5 - iter 105/152 - loss 0.35989213 - time (sec): 2.42 - samples/sec: 8892.32 - lr: 0.000030 - momentum: 0.000000
2023-10-18 16:05:02,989 epoch 5 - iter 120/152 - loss 0.36121116 - time (sec): 2.77 - samples/sec: 8914.91 - lr: 0.000029 - momentum: 0.000000
2023-10-18 16:05:03,321 epoch 5 - iter 135/152 - loss 0.35095211 - time (sec): 3.10 - samples/sec: 8932.51 - lr: 0.000029 - momentum: 0.000000
2023-10-18 16:05:03,641 epoch 5 - iter 150/152 - loss 0.34848002 - time (sec): 3.43 - samples/sec: 8936.06 - lr: 0.000028 - momentum: 0.000000
2023-10-18 16:05:03,680 ----------------------------------------------------------------------------------------------------
2023-10-18 16:05:03,680 EPOCH 5 done: loss 0.3479 - lr: 0.000028
2023-10-18 16:05:04,193 DEV : loss 0.31115421652793884 - f1-score (micro avg) 0.3844
2023-10-18 16:05:04,198 saving best model
2023-10-18 16:05:04,231 ----------------------------------------------------------------------------------------------------
2023-10-18 16:05:04,575 epoch 6 - iter 15/152 - loss 0.32690293 - time (sec): 0.34 - samples/sec: 8592.24 - lr: 0.000027 - momentum: 0.000000
2023-10-18 16:05:04,901 epoch 6 - iter 30/152 - loss 0.32630321 - time (sec): 0.67 - samples/sec: 8807.67 - lr: 0.000027 - momentum: 0.000000
2023-10-18 16:05:05,226 epoch 6 - iter 45/152 - loss 0.31329981 - time (sec): 0.99 - samples/sec: 8860.44 - lr: 0.000026 - momentum: 0.000000
2023-10-18 16:05:05,558 epoch 6 - iter 60/152 - loss 0.32232580 - time (sec): 1.33 - samples/sec: 8830.30 - lr: 0.000026 - momentum: 0.000000
2023-10-18 16:05:05,884 epoch 6 - iter 75/152 - loss 0.31910210 - time (sec): 1.65 - samples/sec: 8932.52 - lr: 0.000025 - momentum: 0.000000
2023-10-18 16:05:06,204 epoch 6 - iter 90/152 - loss 0.32736766 - time (sec): 1.97 - samples/sec: 9052.21 - lr: 0.000025 - momentum: 0.000000
2023-10-18 16:05:06,525 epoch 6 - iter 105/152 - loss 0.33292481 - time (sec): 2.29 - samples/sec: 9162.50 - lr: 0.000024 - momentum: 0.000000
2023-10-18 16:05:06,838 epoch 6 - iter 120/152 - loss 0.32538367 - time (sec): 2.61 - samples/sec: 9174.32 - lr: 0.000024 - momentum: 0.000000
2023-10-18 16:05:07,169 epoch 6 - iter 135/152 - loss 0.32694796 - time (sec): 2.94 - samples/sec: 9220.95 - lr: 0.000023 - momentum: 0.000000
2023-10-18 16:05:07,494 epoch 6 - iter 150/152 - loss 0.32681647 - time (sec): 3.26 - samples/sec: 9372.20 - lr: 0.000022 - momentum: 0.000000
2023-10-18 16:05:07,533 ----------------------------------------------------------------------------------------------------
2023-10-18 16:05:07,533 EPOCH 6 done: loss 0.3252 - lr: 0.000022
2023-10-18 16:05:08,059 DEV : loss 0.2971973717212677 - f1-score (micro avg) 0.407
2023-10-18 16:05:08,064 saving best model
2023-10-18 16:05:08,097 ----------------------------------------------------------------------------------------------------
2023-10-18 16:05:08,422 epoch 7 - iter 15/152 - loss 0.31865948 - time (sec): 0.32 - samples/sec: 8928.80 - lr: 0.000022 - momentum: 0.000000
2023-10-18 16:05:08,764 epoch 7 - iter 30/152 - loss 0.30993718 - time (sec): 0.67 - samples/sec: 8897.69 - lr: 0.000021 - momentum: 0.000000
2023-10-18 16:05:09,096 epoch 7 - iter 45/152 - loss 0.31161449 - time (sec): 1.00 - samples/sec: 9253.60 - lr: 0.000021 - momentum: 0.000000
2023-10-18 16:05:09,418 epoch 7 - iter 60/152 - loss 0.31404095 - time (sec): 1.32 - samples/sec: 9292.49 - lr: 0.000020 - momentum: 0.000000
2023-10-18 16:05:09,734 epoch 7 - iter 75/152 - loss 0.32224128 - time (sec): 1.64 - samples/sec: 9309.62 - lr: 0.000020 - momentum: 0.000000
2023-10-18 16:05:10,081 epoch 7 - iter 90/152 - loss 0.31368658 - time (sec): 1.98 - samples/sec: 9337.93 - lr: 0.000019 - momentum: 0.000000
2023-10-18 16:05:10,405 epoch 7 - iter 105/152 - loss 0.31187673 - time (sec): 2.31 - samples/sec: 9366.22 - lr: 0.000019 - momentum: 0.000000
2023-10-18 16:05:10,721 epoch 7 - iter 120/152 - loss 0.31348426 - time (sec): 2.62 - samples/sec: 9355.10 - lr: 0.000018 - momentum: 0.000000
2023-10-18 16:05:11,054 epoch 7 - iter 135/152 - loss 0.31235664 - time (sec): 2.96 - samples/sec: 9386.33 - lr: 0.000017 - momentum: 0.000000
2023-10-18 16:05:11,385 epoch 7 - iter 150/152 - loss 0.30337842 - time (sec): 3.29 - samples/sec: 9315.33 - lr: 0.000017 - momentum: 0.000000
2023-10-18 16:05:11,433 ----------------------------------------------------------------------------------------------------
2023-10-18 16:05:11,433 EPOCH 7 done: loss 0.3019 - lr: 0.000017
2023-10-18 16:05:11,950 DEV : loss 0.2880619764328003 - f1-score (micro avg) 0.4264
2023-10-18 16:05:11,956 saving best model
2023-10-18 16:05:11,989 ----------------------------------------------------------------------------------------------------
2023-10-18 16:05:12,310 epoch 8 - iter 15/152 - loss 0.24299427 - time (sec): 0.32 - samples/sec: 9043.99 - lr: 0.000016 - momentum: 0.000000
2023-10-18 16:05:12,640 epoch 8 - iter 30/152 - loss 0.27954609 - time (sec): 0.65 - samples/sec: 8961.96 - lr: 0.000016 - momentum: 0.000000
2023-10-18 16:05:12,962 epoch 8 - iter 45/152 - loss 0.29235058 - time (sec): 0.97 - samples/sec: 9173.19 - lr: 0.000015 - momentum: 0.000000
2023-10-18 16:05:13,286 epoch 8 - iter 60/152 - loss 0.29197778 - time (sec): 1.30 - samples/sec: 9199.29 - lr: 0.000015 - momentum: 0.000000
2023-10-18 16:05:13,624 epoch 8 - iter 75/152 - loss 0.29671397 - time (sec): 1.63 - samples/sec: 9290.52 - lr: 0.000014 - momentum: 0.000000
2023-10-18 16:05:13,977 epoch 8 - iter 90/152 - loss 0.29113199 - time (sec): 1.99 - samples/sec: 9295.14 - lr: 0.000014 - momentum: 0.000000
2023-10-18 16:05:14,302 epoch 8 - iter 105/152 - loss 0.29128096 - time (sec): 2.31 - samples/sec: 9121.49 - lr: 0.000013 - momentum: 0.000000
2023-10-18 16:05:14,630 epoch 8 - iter 120/152 - loss 0.28758044 - time (sec): 2.64 - samples/sec: 9218.74 - lr: 0.000012 - momentum: 0.000000
2023-10-18 16:05:14,954 epoch 8 - iter 135/152 - loss 0.29023127 - time (sec): 2.96 - samples/sec: 9310.80 - lr: 0.000012 - momentum: 0.000000
2023-10-18 16:05:15,285 epoch 8 - iter 150/152 - loss 0.28969709 - time (sec): 3.29 - samples/sec: 9296.86 - lr: 0.000011 - momentum: 0.000000
2023-10-18 16:05:15,328 ----------------------------------------------------------------------------------------------------
2023-10-18 16:05:15,329 EPOCH 8 done: loss 0.2890 - lr: 0.000011
2023-10-18 16:05:15,865 DEV : loss 0.2770352065563202 - f1-score (micro avg) 0.4598
2023-10-18 16:05:15,871 saving best model
2023-10-18 16:05:15,904 ----------------------------------------------------------------------------------------------------
2023-10-18 16:05:16,236 epoch 9 - iter 15/152 - loss 0.25596487 - time (sec): 0.33 - samples/sec: 9464.90 - lr: 0.000011 - momentum: 0.000000
2023-10-18 16:05:16,581 epoch 9 - iter 30/152 - loss 0.25922984 - time (sec): 0.68 - samples/sec: 9566.27 - lr: 0.000010 - momentum: 0.000000
2023-10-18 16:05:16,913 epoch 9 - iter 45/152 - loss 0.28068483 - time (sec): 1.01 - samples/sec: 9388.86 - lr: 0.000010 - momentum: 0.000000
2023-10-18 16:05:17,245 epoch 9 - iter 60/152 - loss 0.27771263 - time (sec): 1.34 - samples/sec: 9165.93 - lr: 0.000009 - momentum: 0.000000
2023-10-18 16:05:17,569 epoch 9 - iter 75/152 - loss 0.28699937 - time (sec): 1.66 - samples/sec: 9347.91 - lr: 0.000009 - momentum: 0.000000
2023-10-18 16:05:17,900 epoch 9 - iter 90/152 - loss 0.28319719 - time (sec): 2.00 - samples/sec: 9308.42 - lr: 0.000008 - momentum: 0.000000
2023-10-18 16:05:18,224 epoch 9 - iter 105/152 - loss 0.28380563 - time (sec): 2.32 - samples/sec: 9356.64 - lr: 0.000007 - momentum: 0.000000
2023-10-18 16:05:18,549 epoch 9 - iter 120/152 - loss 0.29008897 - time (sec): 2.64 - samples/sec: 9346.69 - lr: 0.000007 - momentum: 0.000000
2023-10-18 16:05:18,879 epoch 9 - iter 135/152 - loss 0.28838066 - time (sec): 2.97 - samples/sec: 9323.34 - lr: 0.000006 - momentum: 0.000000
2023-10-18 16:05:19,198 epoch 9 - iter 150/152 - loss 0.28526994 - time (sec): 3.29 - samples/sec: 9302.36 - lr: 0.000006 - momentum: 0.000000
2023-10-18 16:05:19,237 ----------------------------------------------------------------------------------------------------
2023-10-18 16:05:19,237 EPOCH 9 done: loss 0.2867 - lr: 0.000006
2023-10-18 16:05:19,751 DEV : loss 0.2709275782108307 - f1-score (micro avg) 0.468
2023-10-18 16:05:19,757 saving best model
2023-10-18 16:05:19,794 ----------------------------------------------------------------------------------------------------
2023-10-18 16:05:20,119 epoch 10 - iter 15/152 - loss 0.23999848 - time (sec): 0.33 - samples/sec: 9586.53 - lr: 0.000005 - momentum: 0.000000
2023-10-18 16:05:20,465 epoch 10 - iter 30/152 - loss 0.25875882 - time (sec): 0.67 - samples/sec: 9130.45 - lr: 0.000005 - momentum: 0.000000
2023-10-18 16:05:20,801 epoch 10 - iter 45/152 - loss 0.27044887 - time (sec): 1.01 - samples/sec: 9143.72 - lr: 0.000004 - momentum: 0.000000
2023-10-18 16:05:21,135 epoch 10 - iter 60/152 - loss 0.27241444 - time (sec): 1.34 - samples/sec: 9258.75 - lr: 0.000004 - momentum: 0.000000
2023-10-18 16:05:21,454 epoch 10 - iter 75/152 - loss 0.26243794 - time (sec): 1.66 - samples/sec: 9414.71 - lr: 0.000003 - momentum: 0.000000
2023-10-18 16:05:21,778 epoch 10 - iter 90/152 - loss 0.25711397 - time (sec): 1.98 - samples/sec: 9355.29 - lr: 0.000003 - momentum: 0.000000
2023-10-18 16:05:22,115 epoch 10 - iter 105/152 - loss 0.26696933 - time (sec): 2.32 - samples/sec: 9356.35 - lr: 0.000002 - momentum: 0.000000
2023-10-18 16:05:22,444 epoch 10 - iter 120/152 - loss 0.27120824 - time (sec): 2.65 - samples/sec: 9365.82 - lr: 0.000001 - momentum: 0.000000
2023-10-18 16:05:22,770 epoch 10 - iter 135/152 - loss 0.27912026 - time (sec): 2.98 - samples/sec: 9288.59 - lr: 0.000001 - momentum: 0.000000
2023-10-18 16:05:23,084 epoch 10 - iter 150/152 - loss 0.27645946 - time (sec): 3.29 - samples/sec: 9290.86 - lr: 0.000000 - momentum: 0.000000
2023-10-18 16:05:23,125 ----------------------------------------------------------------------------------------------------
2023-10-18 16:05:23,125 EPOCH 10 done: loss 0.2746 - lr: 0.000000
2023-10-18 16:05:23,645 DEV : loss 0.26880595088005066 - f1-score (micro avg) 0.4642
2023-10-18 16:05:23,681 ----------------------------------------------------------------------------------------------------
2023-10-18 16:05:23,681 Loading model from best epoch ...
2023-10-18 16:05:23,761 SequenceTagger predicts: Dictionary with 25 tags: O, S-scope, B-scope, E-scope, I-scope, S-pers, B-pers, E-pers, I-pers, S-work, B-work, E-work, I-work, S-loc, B-loc, E-loc, I-loc, S-date, B-date, E-date, I-date, S-object, B-object, E-object, I-object
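The 25-tag dictionary above is the BIOES tagging scheme over the corpus's six entity types plus the outside tag `O`; a quick reconstruction:

```python
# Rebuild the 25-tag BIOES dictionary printed in the log.
types = ["scope", "pers", "work", "loc", "date", "object"]
tags = ["O"] + [f"{prefix}-{t}" for t in types for prefix in "SBEI"]
print(len(tags))  # 25
```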
2023-10-18 16:05:24,238
Results:
- F-score (micro) 0.4738
- F-score (macro) 0.2912
- Accuracy 0.3246
By class:
              precision    recall  f1-score   support

       scope     0.4526    0.5695    0.5044       151
        work     0.2919    0.4947    0.3672        95
        pers     0.6341    0.5417    0.5843        96
         loc     0.0000    0.0000    0.0000         3
        date     0.0000    0.0000    0.0000         3

   micro avg     0.4273    0.5316    0.4738       348
   macro avg     0.2757    0.3212    0.2912       348
weighted avg     0.4510    0.5316    0.4803       348
2023-10-18 16:05:24,238 ----------------------------------------------------------------------------------------------------
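The micro-averaged test scores can be recovered from the per-class rows alone: `recall * support` gives each class's true positives, and `TP / precision` gives its predicted span count (a stdlib sketch; rounding to the nearest integer because the report prints four decimals):

```python
# Sanity check of the micro avg row from the per-class rows above.
per_class = {                       # class: (precision, recall, support)
    "scope": (0.4526, 0.5695, 151),
    "work":  (0.2919, 0.4947, 95),
    "pers":  (0.6341, 0.5417, 96),
    "loc":   (0.0000, 0.0000, 3),
    "date":  (0.0000, 0.0000, 3),
}

tp = sum(round(r * s) for p, r, s in per_class.values())
pred = sum(round(r * s / p) for p, r, s in per_class.values() if p > 0)
support = sum(s for _, _, s in per_class.values())

precision = tp / pred               # ≈ 0.4273
recall = tp / support               # ≈ 0.5316
f1 = 2 * precision * recall / (precision + recall)   # ≈ 0.4738
```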