2023-10-18 16:08:07,153 ----------------------------------------------------------------------------------------------------
2023-10-18 16:08:07,153 Model: "SequenceTagger(
(embeddings): TransformerWordEmbeddings(
(model): BertModel(
(embeddings): BertEmbeddings(
(word_embeddings): Embedding(32001, 128)
(position_embeddings): Embedding(512, 128)
(token_type_embeddings): Embedding(2, 128)
(LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(encoder): BertEncoder(
(layer): ModuleList(
(0-1): 2 x BertLayer(
(attention): BertAttention(
(self): BertSelfAttention(
(query): Linear(in_features=128, out_features=128, bias=True)
(key): Linear(in_features=128, out_features=128, bias=True)
(value): Linear(in_features=128, out_features=128, bias=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(output): BertSelfOutput(
(dense): Linear(in_features=128, out_features=128, bias=True)
(LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
(intermediate): BertIntermediate(
(dense): Linear(in_features=128, out_features=512, bias=True)
(intermediate_act_fn): GELUActivation()
)
(output): BertOutput(
(dense): Linear(in_features=512, out_features=128, bias=True)
(LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
)
)
(pooler): BertPooler(
(dense): Linear(in_features=128, out_features=128, bias=True)
(activation): Tanh()
)
)
)
(locked_dropout): LockedDropout(p=0.5)
(linear): Linear(in_features=128, out_features=25, bias=True)
(loss_function): CrossEntropyLoss()
)"
2023-10-18 16:08:07,153 ----------------------------------------------------------------------------------------------------
2023-10-18 16:08:07,153 MultiCorpus: 1214 train + 266 dev + 251 test sentences
- NER_HIPE_2022 Corpus: 1214 train + 266 dev + 251 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/ajmc/en/with_doc_seperator
2023-10-18 16:08:07,153 ----------------------------------------------------------------------------------------------------
2023-10-18 16:08:07,153 Train: 1214 sentences
2023-10-18 16:08:07,153 (train_with_dev=False, train_with_test=False)
2023-10-18 16:08:07,153 ----------------------------------------------------------------------------------------------------
2023-10-18 16:08:07,153 Training Params:
2023-10-18 16:08:07,154 - learning_rate: "5e-05"
2023-10-18 16:08:07,154 - mini_batch_size: "8"
2023-10-18 16:08:07,154 - max_epochs: "10"
2023-10-18 16:08:07,154 - shuffle: "True"
2023-10-18 16:08:07,154 ----------------------------------------------------------------------------------------------------
2023-10-18 16:08:07,154 Plugins:
2023-10-18 16:08:07,154 - TensorboardLogger
2023-10-18 16:08:07,154 - LinearScheduler | warmup_fraction: '0.1'
2023-10-18 16:08:07,154 ----------------------------------------------------------------------------------------------------
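The LinearScheduler plugin's behaviour can be sketched as linear warmup followed by linear decay to zero. A minimal sketch consistent with the lr values logged below, assuming 152 iterations/epoch × 10 epochs and warmup_fraction 0.1 (the helper `lr_at` is hypothetical, not Flair's API):

```python
# Linear warmup then linear decay, matching the logged schedule shape.
BASE_LR = 5e-05
TOTAL_STEPS = 152 * 10                  # 1520 optimizer steps overall
WARMUP_STEPS = int(0.1 * TOTAL_STEPS)   # 152 warmup steps (warmup_fraction 0.1)

def lr_at(step: int) -> float:
    """Learning rate after `step` optimizer steps (illustrative helper)."""
    if step < WARMUP_STEPS:
        # ramp up linearly toward the peak during epoch 1
        return BASE_LR * step / WARMUP_STEPS
    # then decay linearly to 0 over the remaining steps
    return BASE_LR * (TOTAL_STEPS - step) / (TOTAL_STEPS - WARMUP_STEPS)
```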
2023-10-18 16:08:07,154 Final evaluation on model from best epoch (best-model.pt)
2023-10-18 16:08:07,154 - metric: "('micro avg', 'f1-score')"
2023-10-18 16:08:07,154 ----------------------------------------------------------------------------------------------------
2023-10-18 16:08:07,154 Computation:
2023-10-18 16:08:07,154 - compute on device: cuda:0
2023-10-18 16:08:07,154 - embedding storage: none
2023-10-18 16:08:07,154 ----------------------------------------------------------------------------------------------------
2023-10-18 16:08:07,154 Model training base path: "hmbench-ajmc/en-dbmdz/bert-tiny-historic-multilingual-cased-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-3"
2023-10-18 16:08:07,154 ----------------------------------------------------------------------------------------------------
2023-10-18 16:08:07,154 ----------------------------------------------------------------------------------------------------
2023-10-18 16:08:07,154 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-18 16:08:07,454 epoch 1 - iter 15/152 - loss 3.62707823 - time (sec): 0.30 - samples/sec: 9676.84 - lr: 0.000005 - momentum: 0.000000
2023-10-18 16:08:07,789 epoch 1 - iter 30/152 - loss 3.64460924 - time (sec): 0.63 - samples/sec: 9170.30 - lr: 0.000010 - momentum: 0.000000
2023-10-18 16:08:08,137 epoch 1 - iter 45/152 - loss 3.57389121 - time (sec): 0.98 - samples/sec: 9275.78 - lr: 0.000014 - momentum: 0.000000
2023-10-18 16:08:08,476 epoch 1 - iter 60/152 - loss 3.41963611 - time (sec): 1.32 - samples/sec: 9338.86 - lr: 0.000019 - momentum: 0.000000
2023-10-18 16:08:08,806 epoch 1 - iter 75/152 - loss 3.23473728 - time (sec): 1.65 - samples/sec: 9270.17 - lr: 0.000024 - momentum: 0.000000
2023-10-18 16:08:09,158 epoch 1 - iter 90/152 - loss 3.01336743 - time (sec): 2.00 - samples/sec: 9158.44 - lr: 0.000029 - momentum: 0.000000
2023-10-18 16:08:09,500 epoch 1 - iter 105/152 - loss 2.77192523 - time (sec): 2.35 - samples/sec: 9213.57 - lr: 0.000034 - momentum: 0.000000
2023-10-18 16:08:09,831 epoch 1 - iter 120/152 - loss 2.56102448 - time (sec): 2.68 - samples/sec: 9202.44 - lr: 0.000039 - momentum: 0.000000
2023-10-18 16:08:10,175 epoch 1 - iter 135/152 - loss 2.36857399 - time (sec): 3.02 - samples/sec: 9225.63 - lr: 0.000044 - momentum: 0.000000
2023-10-18 16:08:10,507 epoch 1 - iter 150/152 - loss 2.24365079 - time (sec): 3.35 - samples/sec: 9150.70 - lr: 0.000049 - momentum: 0.000000
2023-10-18 16:08:10,547 ----------------------------------------------------------------------------------------------------
2023-10-18 16:08:10,547 EPOCH 1 done: loss 2.2353 - lr: 0.000049
2023-10-18 16:08:11,150 DEV : loss 0.7518473267555237 - f1-score (micro avg) 0.0
2023-10-18 16:08:11,155 ----------------------------------------------------------------------------------------------------
2023-10-18 16:08:11,481 epoch 2 - iter 15/152 - loss 0.69165701 - time (sec): 0.32 - samples/sec: 9495.06 - lr: 0.000049 - momentum: 0.000000
2023-10-18 16:08:11,824 epoch 2 - iter 30/152 - loss 0.73770337 - time (sec): 0.67 - samples/sec: 9576.65 - lr: 0.000049 - momentum: 0.000000
2023-10-18 16:08:12,169 epoch 2 - iter 45/152 - loss 0.70083391 - time (sec): 1.01 - samples/sec: 9196.54 - lr: 0.000048 - momentum: 0.000000
2023-10-18 16:08:12,495 epoch 2 - iter 60/152 - loss 0.70506531 - time (sec): 1.34 - samples/sec: 8966.33 - lr: 0.000048 - momentum: 0.000000
2023-10-18 16:08:12,815 epoch 2 - iter 75/152 - loss 0.69758281 - time (sec): 1.66 - samples/sec: 9079.58 - lr: 0.000047 - momentum: 0.000000
2023-10-18 16:08:13,151 epoch 2 - iter 90/152 - loss 0.68454078 - time (sec): 2.00 - samples/sec: 9092.53 - lr: 0.000047 - momentum: 0.000000
2023-10-18 16:08:13,490 epoch 2 - iter 105/152 - loss 0.67998012 - time (sec): 2.33 - samples/sec: 9080.72 - lr: 0.000046 - momentum: 0.000000
2023-10-18 16:08:13,840 epoch 2 - iter 120/152 - loss 0.66276630 - time (sec): 2.68 - samples/sec: 9111.10 - lr: 0.000046 - momentum: 0.000000
2023-10-18 16:08:14,173 epoch 2 - iter 135/152 - loss 0.64648337 - time (sec): 3.02 - samples/sec: 9174.60 - lr: 0.000045 - momentum: 0.000000
2023-10-18 16:08:14,505 epoch 2 - iter 150/152 - loss 0.63479222 - time (sec): 3.35 - samples/sec: 9179.69 - lr: 0.000045 - momentum: 0.000000
2023-10-18 16:08:14,543 ----------------------------------------------------------------------------------------------------
2023-10-18 16:08:14,543 EPOCH 2 done: loss 0.6321 - lr: 0.000045
2023-10-18 16:08:15,060 DEV : loss 0.4407750070095062 - f1-score (micro avg) 0.153
2023-10-18 16:08:15,067 saving best model
2023-10-18 16:08:15,095 ----------------------------------------------------------------------------------------------------
2023-10-18 16:08:15,421 epoch 3 - iter 15/152 - loss 0.43673414 - time (sec): 0.33 - samples/sec: 9589.93 - lr: 0.000044 - momentum: 0.000000
2023-10-18 16:08:15,759 epoch 3 - iter 30/152 - loss 0.47781286 - time (sec): 0.66 - samples/sec: 9483.53 - lr: 0.000043 - momentum: 0.000000
2023-10-18 16:08:16,107 epoch 3 - iter 45/152 - loss 0.49762388 - time (sec): 1.01 - samples/sec: 9193.32 - lr: 0.000043 - momentum: 0.000000
2023-10-18 16:08:16,463 epoch 3 - iter 60/152 - loss 0.47653350 - time (sec): 1.37 - samples/sec: 9108.11 - lr: 0.000042 - momentum: 0.000000
2023-10-18 16:08:16,813 epoch 3 - iter 75/152 - loss 0.45774352 - time (sec): 1.72 - samples/sec: 9184.31 - lr: 0.000042 - momentum: 0.000000
2023-10-18 16:08:17,147 epoch 3 - iter 90/152 - loss 0.46438960 - time (sec): 2.05 - samples/sec: 9228.99 - lr: 0.000041 - momentum: 0.000000
2023-10-18 16:08:17,497 epoch 3 - iter 105/152 - loss 0.45619656 - time (sec): 2.40 - samples/sec: 9220.20 - lr: 0.000041 - momentum: 0.000000
2023-10-18 16:08:17,874 epoch 3 - iter 120/152 - loss 0.46731334 - time (sec): 2.78 - samples/sec: 9057.15 - lr: 0.000040 - momentum: 0.000000
2023-10-18 16:08:18,231 epoch 3 - iter 135/152 - loss 0.45898427 - time (sec): 3.14 - samples/sec: 8909.82 - lr: 0.000040 - momentum: 0.000000
2023-10-18 16:08:18,570 epoch 3 - iter 150/152 - loss 0.45468700 - time (sec): 3.47 - samples/sec: 8831.59 - lr: 0.000039 - momentum: 0.000000
2023-10-18 16:08:18,611 ----------------------------------------------------------------------------------------------------
2023-10-18 16:08:18,611 EPOCH 3 done: loss 0.4551 - lr: 0.000039
2023-10-18 16:08:19,136 DEV : loss 0.3566431701183319 - f1-score (micro avg) 0.3407
2023-10-18 16:08:19,141 saving best model
2023-10-18 16:08:19,176 ----------------------------------------------------------------------------------------------------
2023-10-18 16:08:19,529 epoch 4 - iter 15/152 - loss 0.36679282 - time (sec): 0.35 - samples/sec: 8511.69 - lr: 0.000038 - momentum: 0.000000
2023-10-18 16:08:19,868 epoch 4 - iter 30/152 - loss 0.42536314 - time (sec): 0.69 - samples/sec: 8854.59 - lr: 0.000038 - momentum: 0.000000
2023-10-18 16:08:20,213 epoch 4 - iter 45/152 - loss 0.40536877 - time (sec): 1.04 - samples/sec: 8867.38 - lr: 0.000037 - momentum: 0.000000
2023-10-18 16:08:20,558 epoch 4 - iter 60/152 - loss 0.41303659 - time (sec): 1.38 - samples/sec: 8904.74 - lr: 0.000037 - momentum: 0.000000
2023-10-18 16:08:20,889 epoch 4 - iter 75/152 - loss 0.40912747 - time (sec): 1.71 - samples/sec: 9064.00 - lr: 0.000036 - momentum: 0.000000
2023-10-18 16:08:21,230 epoch 4 - iter 90/152 - loss 0.40599711 - time (sec): 2.05 - samples/sec: 9093.04 - lr: 0.000036 - momentum: 0.000000
2023-10-18 16:08:21,561 epoch 4 - iter 105/152 - loss 0.40046678 - time (sec): 2.38 - samples/sec: 9032.82 - lr: 0.000035 - momentum: 0.000000
2023-10-18 16:08:21,892 epoch 4 - iter 120/152 - loss 0.39779267 - time (sec): 2.72 - samples/sec: 9021.02 - lr: 0.000035 - momentum: 0.000000
2023-10-18 16:08:22,221 epoch 4 - iter 135/152 - loss 0.39931684 - time (sec): 3.04 - samples/sec: 8986.05 - lr: 0.000034 - momentum: 0.000000
2023-10-18 16:08:22,552 epoch 4 - iter 150/152 - loss 0.38436845 - time (sec): 3.38 - samples/sec: 9073.29 - lr: 0.000034 - momentum: 0.000000
2023-10-18 16:08:22,593 ----------------------------------------------------------------------------------------------------
2023-10-18 16:08:22,593 EPOCH 4 done: loss 0.3831 - lr: 0.000034
2023-10-18 16:08:23,116 DEV : loss 0.32094815373420715 - f1-score (micro avg) 0.3789
2023-10-18 16:08:23,121 saving best model
2023-10-18 16:08:23,153 ----------------------------------------------------------------------------------------------------
2023-10-18 16:08:23,485 epoch 5 - iter 15/152 - loss 0.31681598 - time (sec): 0.33 - samples/sec: 9191.84 - lr: 0.000033 - momentum: 0.000000
2023-10-18 16:08:23,817 epoch 5 - iter 30/152 - loss 0.31552534 - time (sec): 0.66 - samples/sec: 9102.87 - lr: 0.000032 - momentum: 0.000000
2023-10-18 16:08:24,144 epoch 5 - iter 45/152 - loss 0.30686019 - time (sec): 0.99 - samples/sec: 9021.32 - lr: 0.000032 - momentum: 0.000000
2023-10-18 16:08:24,492 epoch 5 - iter 60/152 - loss 0.31107674 - time (sec): 1.34 - samples/sec: 8811.20 - lr: 0.000031 - momentum: 0.000000
2023-10-18 16:08:24,837 epoch 5 - iter 75/152 - loss 0.32270256 - time (sec): 1.68 - samples/sec: 8925.20 - lr: 0.000031 - momentum: 0.000000
2023-10-18 16:08:25,160 epoch 5 - iter 90/152 - loss 0.33129467 - time (sec): 2.01 - samples/sec: 9060.25 - lr: 0.000030 - momentum: 0.000000
2023-10-18 16:08:25,509 epoch 5 - iter 105/152 - loss 0.33382968 - time (sec): 2.36 - samples/sec: 9080.75 - lr: 0.000030 - momentum: 0.000000
2023-10-18 16:08:25,852 epoch 5 - iter 120/152 - loss 0.34241757 - time (sec): 2.70 - samples/sec: 9137.28 - lr: 0.000029 - momentum: 0.000000
2023-10-18 16:08:26,175 epoch 5 - iter 135/152 - loss 0.34621140 - time (sec): 3.02 - samples/sec: 9129.97 - lr: 0.000029 - momentum: 0.000000
2023-10-18 16:08:26,492 epoch 5 - iter 150/152 - loss 0.34638964 - time (sec): 3.34 - samples/sec: 9178.00 - lr: 0.000028 - momentum: 0.000000
2023-10-18 16:08:26,535 ----------------------------------------------------------------------------------------------------
2023-10-18 16:08:26,535 EPOCH 5 done: loss 0.3452 - lr: 0.000028
2023-10-18 16:08:27,064 DEV : loss 0.29360231757164 - f1-score (micro avg) 0.4571
2023-10-18 16:08:27,069 saving best model
2023-10-18 16:08:27,101 ----------------------------------------------------------------------------------------------------
2023-10-18 16:08:27,431 epoch 6 - iter 15/152 - loss 0.24967879 - time (sec): 0.33 - samples/sec: 8384.89 - lr: 0.000027 - momentum: 0.000000
2023-10-18 16:08:27,761 epoch 6 - iter 30/152 - loss 0.30903542 - time (sec): 0.66 - samples/sec: 9026.26 - lr: 0.000027 - momentum: 0.000000
2023-10-18 16:08:28,113 epoch 6 - iter 45/152 - loss 0.32159215 - time (sec): 1.01 - samples/sec: 9179.19 - lr: 0.000026 - momentum: 0.000000
2023-10-18 16:08:28,445 epoch 6 - iter 60/152 - loss 0.31080077 - time (sec): 1.34 - samples/sec: 9063.10 - lr: 0.000026 - momentum: 0.000000
2023-10-18 16:08:28,775 epoch 6 - iter 75/152 - loss 0.30371423 - time (sec): 1.67 - samples/sec: 9032.50 - lr: 0.000025 - momentum: 0.000000
2023-10-18 16:08:29,109 epoch 6 - iter 90/152 - loss 0.30581694 - time (sec): 2.01 - samples/sec: 9092.69 - lr: 0.000025 - momentum: 0.000000
2023-10-18 16:08:29,429 epoch 6 - iter 105/152 - loss 0.29761890 - time (sec): 2.33 - samples/sec: 9019.39 - lr: 0.000024 - momentum: 0.000000
2023-10-18 16:08:29,774 epoch 6 - iter 120/152 - loss 0.30472801 - time (sec): 2.67 - samples/sec: 9088.83 - lr: 0.000024 - momentum: 0.000000
2023-10-18 16:08:30,100 epoch 6 - iter 135/152 - loss 0.31332687 - time (sec): 3.00 - samples/sec: 9170.33 - lr: 0.000023 - momentum: 0.000000
2023-10-18 16:08:30,430 epoch 6 - iter 150/152 - loss 0.31454660 - time (sec): 3.33 - samples/sec: 9223.26 - lr: 0.000022 - momentum: 0.000000
2023-10-18 16:08:30,470 ----------------------------------------------------------------------------------------------------
2023-10-18 16:08:30,470 EPOCH 6 done: loss 0.3152 - lr: 0.000022
2023-10-18 16:08:31,127 DEV : loss 0.281490296125412 - f1-score (micro avg) 0.4792
2023-10-18 16:08:31,132 saving best model
2023-10-18 16:08:31,165 ----------------------------------------------------------------------------------------------------
2023-10-18 16:08:31,501 epoch 7 - iter 15/152 - loss 0.30629822 - time (sec): 0.34 - samples/sec: 8975.06 - lr: 0.000022 - momentum: 0.000000
2023-10-18 16:08:31,837 epoch 7 - iter 30/152 - loss 0.31139948 - time (sec): 0.67 - samples/sec: 9341.76 - lr: 0.000021 - momentum: 0.000000
2023-10-18 16:08:32,155 epoch 7 - iter 45/152 - loss 0.29338805 - time (sec): 0.99 - samples/sec: 9666.42 - lr: 0.000021 - momentum: 0.000000
2023-10-18 16:08:32,471 epoch 7 - iter 60/152 - loss 0.28946763 - time (sec): 1.31 - samples/sec: 9633.38 - lr: 0.000020 - momentum: 0.000000
2023-10-18 16:08:32,821 epoch 7 - iter 75/152 - loss 0.30011249 - time (sec): 1.66 - samples/sec: 9328.06 - lr: 0.000020 - momentum: 0.000000
2023-10-18 16:08:33,134 epoch 7 - iter 90/152 - loss 0.29653268 - time (sec): 1.97 - samples/sec: 9284.55 - lr: 0.000019 - momentum: 0.000000
2023-10-18 16:08:33,468 epoch 7 - iter 105/152 - loss 0.29512415 - time (sec): 2.30 - samples/sec: 9184.81 - lr: 0.000019 - momentum: 0.000000
2023-10-18 16:08:33,808 epoch 7 - iter 120/152 - loss 0.28708901 - time (sec): 2.64 - samples/sec: 9128.86 - lr: 0.000018 - momentum: 0.000000
2023-10-18 16:08:34,138 epoch 7 - iter 135/152 - loss 0.29878818 - time (sec): 2.97 - samples/sec: 9192.29 - lr: 0.000017 - momentum: 0.000000
2023-10-18 16:08:34,472 epoch 7 - iter 150/152 - loss 0.29704267 - time (sec): 3.31 - samples/sec: 9255.40 - lr: 0.000017 - momentum: 0.000000
2023-10-18 16:08:34,514 ----------------------------------------------------------------------------------------------------
2023-10-18 16:08:34,514 EPOCH 7 done: loss 0.2985 - lr: 0.000017
2023-10-18 16:08:35,035 DEV : loss 0.2697525918483734 - f1-score (micro avg) 0.5053
2023-10-18 16:08:35,040 saving best model
2023-10-18 16:08:35,070 ----------------------------------------------------------------------------------------------------
2023-10-18 16:08:35,400 epoch 8 - iter 15/152 - loss 0.26163246 - time (sec): 0.33 - samples/sec: 8811.91 - lr: 0.000016 - momentum: 0.000000
2023-10-18 16:08:35,733 epoch 8 - iter 30/152 - loss 0.27301989 - time (sec): 0.66 - samples/sec: 8905.79 - lr: 0.000016 - momentum: 0.000000
2023-10-18 16:08:36,075 epoch 8 - iter 45/152 - loss 0.27716224 - time (sec): 1.00 - samples/sec: 8798.61 - lr: 0.000015 - momentum: 0.000000
2023-10-18 16:08:36,429 epoch 8 - iter 60/152 - loss 0.28085642 - time (sec): 1.36 - samples/sec: 8796.25 - lr: 0.000015 - momentum: 0.000000
2023-10-18 16:08:36,786 epoch 8 - iter 75/152 - loss 0.27666616 - time (sec): 1.72 - samples/sec: 8831.23 - lr: 0.000014 - momentum: 0.000000
2023-10-18 16:08:37,129 epoch 8 - iter 90/152 - loss 0.27229703 - time (sec): 2.06 - samples/sec: 8921.19 - lr: 0.000014 - momentum: 0.000000
2023-10-18 16:08:37,470 epoch 8 - iter 105/152 - loss 0.27327002 - time (sec): 2.40 - samples/sec: 8957.79 - lr: 0.000013 - momentum: 0.000000
2023-10-18 16:08:37,804 epoch 8 - iter 120/152 - loss 0.27460704 - time (sec): 2.73 - samples/sec: 8975.96 - lr: 0.000012 - momentum: 0.000000
2023-10-18 16:08:38,137 epoch 8 - iter 135/152 - loss 0.27673887 - time (sec): 3.07 - samples/sec: 9013.61 - lr: 0.000012 - momentum: 0.000000
2023-10-18 16:08:38,461 epoch 8 - iter 150/152 - loss 0.27952486 - time (sec): 3.39 - samples/sec: 9010.00 - lr: 0.000011 - momentum: 0.000000
2023-10-18 16:08:38,501 ----------------------------------------------------------------------------------------------------
2023-10-18 16:08:38,501 EPOCH 8 done: loss 0.2793 - lr: 0.000011
2023-10-18 16:08:39,022 DEV : loss 0.26368987560272217 - f1-score (micro avg) 0.5146
2023-10-18 16:08:39,028 saving best model
2023-10-18 16:08:39,061 ----------------------------------------------------------------------------------------------------
2023-10-18 16:08:39,390 epoch 9 - iter 15/152 - loss 0.28345351 - time (sec): 0.33 - samples/sec: 9676.85 - lr: 0.000011 - momentum: 0.000000
2023-10-18 16:08:39,722 epoch 9 - iter 30/152 - loss 0.25552662 - time (sec): 0.66 - samples/sec: 9293.39 - lr: 0.000010 - momentum: 0.000000
2023-10-18 16:08:40,073 epoch 9 - iter 45/152 - loss 0.29194379 - time (sec): 1.01 - samples/sec: 9301.13 - lr: 0.000010 - momentum: 0.000000
2023-10-18 16:08:40,409 epoch 9 - iter 60/152 - loss 0.27661519 - time (sec): 1.35 - samples/sec: 9134.23 - lr: 0.000009 - momentum: 0.000000
2023-10-18 16:08:40,751 epoch 9 - iter 75/152 - loss 0.28457449 - time (sec): 1.69 - samples/sec: 9077.37 - lr: 0.000009 - momentum: 0.000000
2023-10-18 16:08:41,083 epoch 9 - iter 90/152 - loss 0.27852315 - time (sec): 2.02 - samples/sec: 9072.51 - lr: 0.000008 - momentum: 0.000000
2023-10-18 16:08:41,412 epoch 9 - iter 105/152 - loss 0.28200134 - time (sec): 2.35 - samples/sec: 9116.18 - lr: 0.000007 - momentum: 0.000000
2023-10-18 16:08:41,738 epoch 9 - iter 120/152 - loss 0.28023949 - time (sec): 2.68 - samples/sec: 9169.45 - lr: 0.000007 - momentum: 0.000000
2023-10-18 16:08:42,062 epoch 9 - iter 135/152 - loss 0.27602297 - time (sec): 3.00 - samples/sec: 9158.84 - lr: 0.000006 - momentum: 0.000000
2023-10-18 16:08:42,401 epoch 9 - iter 150/152 - loss 0.27588255 - time (sec): 3.34 - samples/sec: 9163.83 - lr: 0.000006 - momentum: 0.000000
2023-10-18 16:08:42,445 ----------------------------------------------------------------------------------------------------
2023-10-18 16:08:42,445 EPOCH 9 done: loss 0.2760 - lr: 0.000006
2023-10-18 16:08:42,964 DEV : loss 0.2618214190006256 - f1-score (micro avg) 0.5198
2023-10-18 16:08:42,969 saving best model
2023-10-18 16:08:43,001 ----------------------------------------------------------------------------------------------------
2023-10-18 16:08:43,334 epoch 10 - iter 15/152 - loss 0.25856362 - time (sec): 0.33 - samples/sec: 9491.85 - lr: 0.000005 - momentum: 0.000000
2023-10-18 16:08:43,673 epoch 10 - iter 30/152 - loss 0.26320870 - time (sec): 0.67 - samples/sec: 9651.31 - lr: 0.000005 - momentum: 0.000000
2023-10-18 16:08:43,993 epoch 10 - iter 45/152 - loss 0.25442539 - time (sec): 0.99 - samples/sec: 9364.32 - lr: 0.000004 - momentum: 0.000000
2023-10-18 16:08:44,331 epoch 10 - iter 60/152 - loss 0.24983670 - time (sec): 1.33 - samples/sec: 9176.87 - lr: 0.000004 - momentum: 0.000000
2023-10-18 16:08:44,657 epoch 10 - iter 75/152 - loss 0.26041399 - time (sec): 1.66 - samples/sec: 9235.25 - lr: 0.000003 - momentum: 0.000000
2023-10-18 16:08:45,014 epoch 10 - iter 90/152 - loss 0.25817779 - time (sec): 2.01 - samples/sec: 9144.66 - lr: 0.000003 - momentum: 0.000000
2023-10-18 16:08:45,342 epoch 10 - iter 105/152 - loss 0.25587514 - time (sec): 2.34 - samples/sec: 9193.97 - lr: 0.000002 - momentum: 0.000000
2023-10-18 16:08:45,671 epoch 10 - iter 120/152 - loss 0.26708617 - time (sec): 2.67 - samples/sec: 9235.88 - lr: 0.000001 - momentum: 0.000000
2023-10-18 16:08:46,017 epoch 10 - iter 135/152 - loss 0.27557433 - time (sec): 3.02 - samples/sec: 9212.35 - lr: 0.000001 - momentum: 0.000000
2023-10-18 16:08:46,369 epoch 10 - iter 150/152 - loss 0.27131633 - time (sec): 3.37 - samples/sec: 9101.41 - lr: 0.000000 - momentum: 0.000000
2023-10-18 16:08:46,409 ----------------------------------------------------------------------------------------------------
2023-10-18 16:08:46,409 EPOCH 10 done: loss 0.2698 - lr: 0.000000
2023-10-18 16:08:46,926 DEV : loss 0.2598583996295929 - f1-score (micro avg) 0.525
2023-10-18 16:08:46,931 saving best model
2023-10-18 16:08:46,990 ----------------------------------------------------------------------------------------------------
2023-10-18 16:08:46,991 Loading model from best epoch ...
2023-10-18 16:08:47,059 SequenceTagger predicts: Dictionary with 25 tags: O, S-scope, B-scope, E-scope, I-scope, S-pers, B-pers, E-pers, I-pers, S-work, B-work, E-work, I-work, S-loc, B-loc, E-loc, I-loc, S-date, B-date, E-date, I-date, S-object, B-object, E-object, I-object
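The 25 tags above follow the BIOES scheme (Single, Begin, Inside, End, plus O) over five entity types. A minimal decoder sketch for turning such a tag sequence into entity spans (the function name and example tags are illustrative; this is not Flair's actual decoder):

```python
def bioes_spans(tags):
    """Decode a BIOES tag sequence into (label, start, end) spans, end inclusive."""
    spans, start, label = [], None, None
    for i, tag in enumerate(tags):
        if tag == "O":
            start = label = None
            continue
        prefix, lab = tag.split("-", 1)
        if prefix == "S":                     # single-token entity
            spans.append((lab, i, i))
            start = label = None
        elif prefix == "B":                   # entity begins
            start, label = i, lab
        elif prefix == "E" and label == lab:  # entity ends
            spans.append((lab, start, i))
            start = label = None
        # "I" continues an open span; malformed sequences are simply dropped
    return spans
```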
2023-10-18 16:08:47,396
Results:
- F-score (micro) 0.541
- F-score (macro) 0.3289
- Accuracy 0.3905
By class:
              precision    recall  f1-score   support

       scope     0.5669    0.5894    0.5779       151
        work     0.3767    0.5789    0.4564        95
        pers     0.6667    0.5625    0.6102        96
         loc     0.0000    0.0000    0.0000         3
        date     0.0000    0.0000    0.0000         3

   micro avg     0.5156    0.5690    0.5410       348
   macro avg     0.3221    0.3462    0.3289       348
weighted avg     0.5327    0.5690    0.5437       348
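The macro and weighted averages in the table can be reproduced from the per-class rows (a quick arithmetic check; the micro average additionally needs raw TP/FP/FN counts, which the log does not print):

```python
# Per-class (f1, support) rows from the final evaluation table.
rows = {"scope": (0.5779, 151), "work": (0.4564, 95),
        "pers": (0.6102, 96), "loc": (0.0, 3), "date": (0.0, 3)}

# macro avg: unweighted mean of per-class F1
macro_f1 = sum(f1 for f1, _ in rows.values()) / len(rows)

# weighted avg: support-weighted mean of per-class F1
total = sum(n for _, n in rows.values())
weighted_f1 = sum(f1 * n for f1, n in rows.values()) / total
```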
2023-10-18 16:08:47,396 ----------------------------------------------------------------------------------------------------