2023-10-18 16:04:45,006 ----------------------------------------------------------------------------------------------------
2023-10-18 16:04:45,007 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 128)
        (position_embeddings): Embedding(512, 128)
        (token_type_embeddings): Embedding(2, 128)
        (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-1): 2 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=128, out_features=128, bias=True)
                (key): Linear(in_features=128, out_features=128, bias=True)
                (value): Linear(in_features=128, out_features=128, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=128, out_features=128, bias=True)
                (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=128, out_features=512, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=512, out_features=128, bias=True)
              (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=128, out_features=128, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=128, out_features=25, bias=True)
  (loss_function): CrossEntropyLoss()
)"
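The shapes printed above fully determine the tagger's parameter count. As a quick sanity check, the following sketch (my own arithmetic from the printed dimensions, not part of the Flair log) tallies them:

```python
# Parameter count implied by the printed module shapes: 2-layer BERT-tiny,
# hidden size 128, intermediate size 512, vocab 32001, plus the 25-tag head.
def linear(in_f, out_f):
    return in_f * out_f + out_f  # weight + bias

embeddings = 32001 * 128 + 512 * 128 + 2 * 128 + 2 * 128  # word + pos + type + LayerNorm
per_layer = (
    3 * linear(128, 128)   # query / key / value
    + linear(128, 128)     # attention output dense
    + 2 * 128              # attention LayerNorm
    + linear(128, 512)     # intermediate dense
    + linear(512, 128)     # output dense
    + 2 * 128              # output LayerNorm
)
encoder = 2 * per_layer    # (0-1): 2 x BertLayer
pooler = linear(128, 128)
head = linear(128, 25)     # (linear): Linear(in_features=128, out_features=25)

total = embeddings + encoder + pooler + head
print(total)  # 4578457
```

The ~4.6M total is consistent with the "bert-tiny" label in the training base path.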
2023-10-18 16:04:45,007 ----------------------------------------------------------------------------------------------------
2023-10-18 16:04:45,007 MultiCorpus: 1214 train + 266 dev + 251 test sentences
 - NER_HIPE_2022 Corpus: 1214 train + 266 dev + 251 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/ajmc/en/with_doc_seperator
2023-10-18 16:04:45,007 ----------------------------------------------------------------------------------------------------
2023-10-18 16:04:45,007 Train: 1214 sentences
2023-10-18 16:04:45,007 (train_with_dev=False, train_with_test=False)
2023-10-18 16:04:45,007 ----------------------------------------------------------------------------------------------------
2023-10-18 16:04:45,007 Training Params:
2023-10-18 16:04:45,007 - learning_rate: "5e-05"
2023-10-18 16:04:45,007 - mini_batch_size: "8"
2023-10-18 16:04:45,007 - max_epochs: "10"
2023-10-18 16:04:45,007 - shuffle: "True"
2023-10-18 16:04:45,007 ----------------------------------------------------------------------------------------------------
2023-10-18 16:04:45,007 Plugins:
2023-10-18 16:04:45,007 - TensorboardLogger
2023-10-18 16:04:45,007 - LinearScheduler | warmup_fraction: '0.1'
2023-10-18 16:04:45,007 ----------------------------------------------------------------------------------------------------
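The LinearScheduler plugin with warmup_fraction 0.1 implies a specific learning-rate trajectory over the run's 10 epochs x 152 iterations = 1520 steps. A sketch of the standard linear-warmup schedule (my own reconstruction; the exact step at which the trainer samples the lr it logs may differ by one):

```python
# lr ramps linearly from 0 to the peak of 5e-05 over the first 10% of steps,
# then decays linearly back to 0 over the remaining 90%.
PEAK_LR = 5e-05
TOTAL_STEPS = 10 * 152
WARMUP_STEPS = int(0.1 * TOTAL_STEPS)  # 152 steps, i.e. exactly epoch 1

def lr_at(step):
    """Learning rate after `step` optimizer steps."""
    if step < WARMUP_STEPS:
        return PEAK_LR * step / WARMUP_STEPS
    return PEAK_LR * (TOTAL_STEPS - step) / (TOTAL_STEPS - WARMUP_STEPS)
```

This reproduces the lr column in the iteration lines: ~0.000005 at epoch 1, iter 15, rising to ~0.000049 by the end of epoch 1, then decaying toward 0 in epoch 10.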
2023-10-18 16:04:45,007 Final evaluation on model from best epoch (best-model.pt)
2023-10-18 16:04:45,008 - metric: "('micro avg', 'f1-score')"
2023-10-18 16:04:45,008 ----------------------------------------------------------------------------------------------------
2023-10-18 16:04:45,008 Computation:
2023-10-18 16:04:45,008 - compute on device: cuda:0
2023-10-18 16:04:45,008 - embedding storage: none
2023-10-18 16:04:45,008 ----------------------------------------------------------------------------------------------------
2023-10-18 16:04:45,008 Model training base path: "hmbench-ajmc/en-dbmdz/bert-tiny-historic-multilingual-cased-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-2"
2023-10-18 16:04:45,008 ----------------------------------------------------------------------------------------------------
2023-10-18 16:04:45,008 ----------------------------------------------------------------------------------------------------
2023-10-18 16:04:45,008 Logging anything other than scalars to TensorBoard is currently not supported.
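The per-iteration lines that follow share a fixed format, so they are easy to post-process. A minimal parser sketch (the regex and field names are my own, not part of Flair):

```python
import re

# Matches lines like:
# "... epoch 1 - iter 15/152 - loss 3.55491273 - ... - lr: 0.000005 - momentum: 0.000000"
ITER_RE = re.compile(
    r"epoch (?P<epoch>\d+) - iter (?P<it>\d+)/(?P<total>\d+) - loss (?P<loss>[\d.]+)"
    r".*? - lr: (?P<lr>[\d.]+)"
)

def parse_iter_line(line):
    """Extract epoch, iteration, loss, and lr from one log line (None if no match)."""
    m = ITER_RE.search(line)
    return None if m is None else {
        "epoch": int(m["epoch"]), "iter": int(m["it"]),
        "loss": float(m["loss"]), "lr": float(m["lr"]),
    }

sample = ("2023-10-18 16:04:45,328 epoch 1 - iter 15/152 - loss 3.55491273 "
          "- time (sec): 0.32 - samples/sec: 8438.20 - lr: 0.000005 - momentum: 0.000000")
```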
2023-10-18 16:04:45,328 epoch 1 - iter 15/152 - loss 3.55491273 - time (sec): 0.32 - samples/sec: 8438.20 - lr: 0.000005 - momentum: 0.000000
2023-10-18 16:04:45,643 epoch 1 - iter 30/152 - loss 3.47558306 - time (sec): 0.63 - samples/sec: 9227.83 - lr: 0.000010 - momentum: 0.000000
2023-10-18 16:04:45,954 epoch 1 - iter 45/152 - loss 3.37625327 - time (sec): 0.95 - samples/sec: 9191.87 - lr: 0.000014 - momentum: 0.000000
2023-10-18 16:04:46,276 epoch 1 - iter 60/152 - loss 3.21161944 - time (sec): 1.27 - samples/sec: 9416.38 - lr: 0.000019 - momentum: 0.000000
2023-10-18 16:04:46,601 epoch 1 - iter 75/152 - loss 3.03719629 - time (sec): 1.59 - samples/sec: 9404.01 - lr: 0.000024 - momentum: 0.000000
2023-10-18 16:04:46,933 epoch 1 - iter 90/152 - loss 2.84316742 - time (sec): 1.92 - samples/sec: 9482.16 - lr: 0.000029 - momentum: 0.000000
2023-10-18 16:04:47,264 epoch 1 - iter 105/152 - loss 2.63926958 - time (sec): 2.26 - samples/sec: 9379.69 - lr: 0.000034 - momentum: 0.000000
2023-10-18 16:04:47,584 epoch 1 - iter 120/152 - loss 2.42786078 - time (sec): 2.58 - samples/sec: 9506.82 - lr: 0.000039 - momentum: 0.000000
2023-10-18 16:04:47,890 epoch 1 - iter 135/152 - loss 2.26021411 - time (sec): 2.88 - samples/sec: 9482.46 - lr: 0.000044 - momentum: 0.000000
2023-10-18 16:04:48,212 epoch 1 - iter 150/152 - loss 2.11391684 - time (sec): 3.20 - samples/sec: 9555.77 - lr: 0.000049 - momentum: 0.000000
2023-10-18 16:04:48,253 ----------------------------------------------------------------------------------------------------
2023-10-18 16:04:48,253 EPOCH 1 done: loss 2.1041 - lr: 0.000049
2023-10-18 16:04:48,589 DEV : loss 0.8210452198982239 - f1-score (micro avg) 0.0
2023-10-18 16:04:48,595 ----------------------------------------------------------------------------------------------------
2023-10-18 16:04:48,900 epoch 2 - iter 15/152 - loss 0.82480804 - time (sec): 0.30 - samples/sec: 10411.99 - lr: 0.000049 - momentum: 0.000000
2023-10-18 16:04:49,230 epoch 2 - iter 30/152 - loss 0.79322320 - time (sec): 0.63 - samples/sec: 9916.20 - lr: 0.000049 - momentum: 0.000000
2023-10-18 16:04:49,558 epoch 2 - iter 45/152 - loss 0.78621530 - time (sec): 0.96 - samples/sec: 9763.38 - lr: 0.000048 - momentum: 0.000000
2023-10-18 16:04:49,881 epoch 2 - iter 60/152 - loss 0.75039960 - time (sec): 1.29 - samples/sec: 9766.47 - lr: 0.000048 - momentum: 0.000000
2023-10-18 16:04:50,169 epoch 2 - iter 75/152 - loss 0.74715332 - time (sec): 1.57 - samples/sec: 9860.97 - lr: 0.000047 - momentum: 0.000000
2023-10-18 16:04:50,474 epoch 2 - iter 90/152 - loss 0.72694581 - time (sec): 1.88 - samples/sec: 9997.73 - lr: 0.000047 - momentum: 0.000000
2023-10-18 16:04:50,801 epoch 2 - iter 105/152 - loss 0.68749465 - time (sec): 2.21 - samples/sec: 9811.87 - lr: 0.000046 - momentum: 0.000000
2023-10-18 16:04:51,111 epoch 2 - iter 120/152 - loss 0.67904757 - time (sec): 2.52 - samples/sec: 9910.22 - lr: 0.000046 - momentum: 0.000000
2023-10-18 16:04:51,437 epoch 2 - iter 135/152 - loss 0.66227355 - time (sec): 2.84 - samples/sec: 9818.68 - lr: 0.000045 - momentum: 0.000000
2023-10-18 16:04:51,749 epoch 2 - iter 150/152 - loss 0.64609491 - time (sec): 3.15 - samples/sec: 9712.60 - lr: 0.000045 - momentum: 0.000000
2023-10-18 16:04:51,791 ----------------------------------------------------------------------------------------------------
2023-10-18 16:04:51,791 EPOCH 2 done: loss 0.6437 - lr: 0.000045
2023-10-18 16:04:52,294 DEV : loss 0.46711266040802 - f1-score (micro avg) 0.201
2023-10-18 16:04:52,303 saving best model
2023-10-18 16:04:52,335 ----------------------------------------------------------------------------------------------------
2023-10-18 16:04:52,674 epoch 3 - iter 15/152 - loss 0.55280524 - time (sec): 0.34 - samples/sec: 8723.64 - lr: 0.000044 - momentum: 0.000000
2023-10-18 16:04:52,990 epoch 3 - iter 30/152 - loss 0.52600948 - time (sec): 0.65 - samples/sec: 9306.17 - lr: 0.000043 - momentum: 0.000000
2023-10-18 16:04:53,313 epoch 3 - iter 45/152 - loss 0.53658685 - time (sec): 0.98 - samples/sec: 9598.63 - lr: 0.000043 - momentum: 0.000000
2023-10-18 16:04:53,630 epoch 3 - iter 60/152 - loss 0.50973011 - time (sec): 1.29 - samples/sec: 9669.50 - lr: 0.000042 - momentum: 0.000000
2023-10-18 16:04:53,951 epoch 3 - iter 75/152 - loss 0.49269238 - time (sec): 1.61 - samples/sec: 9621.88 - lr: 0.000042 - momentum: 0.000000
2023-10-18 16:04:54,284 epoch 3 - iter 90/152 - loss 0.48607402 - time (sec): 1.95 - samples/sec: 9418.81 - lr: 0.000041 - momentum: 0.000000
2023-10-18 16:04:54,608 epoch 3 - iter 105/152 - loss 0.46661200 - time (sec): 2.27 - samples/sec: 9387.20 - lr: 0.000041 - momentum: 0.000000
2023-10-18 16:04:55,100 epoch 3 - iter 120/152 - loss 0.45393692 - time (sec): 2.76 - samples/sec: 8717.62 - lr: 0.000040 - momentum: 0.000000
2023-10-18 16:04:55,437 epoch 3 - iter 135/152 - loss 0.46267037 - time (sec): 3.10 - samples/sec: 8805.49 - lr: 0.000040 - momentum: 0.000000
2023-10-18 16:04:55,780 epoch 3 - iter 150/152 - loss 0.45957840 - time (sec): 3.44 - samples/sec: 8881.86 - lr: 0.000039 - momentum: 0.000000
2023-10-18 16:04:55,827 ----------------------------------------------------------------------------------------------------
2023-10-18 16:04:55,827 EPOCH 3 done: loss 0.4618 - lr: 0.000039
2023-10-18 16:04:56,332 DEV : loss 0.3763718903064728 - f1-score (micro avg) 0.3079
2023-10-18 16:04:56,338 saving best model
2023-10-18 16:04:56,369 ----------------------------------------------------------------------------------------------------
2023-10-18 16:04:56,698 epoch 4 - iter 15/152 - loss 0.52573925 - time (sec): 0.33 - samples/sec: 8778.20 - lr: 0.000038 - momentum: 0.000000
2023-10-18 16:04:57,043 epoch 4 - iter 30/152 - loss 0.45798899 - time (sec): 0.67 - samples/sec: 8851.09 - lr: 0.000038 - momentum: 0.000000
2023-10-18 16:04:57,383 epoch 4 - iter 45/152 - loss 0.43859763 - time (sec): 1.01 - samples/sec: 9031.46 - lr: 0.000037 - momentum: 0.000000
2023-10-18 16:04:57,709 epoch 4 - iter 60/152 - loss 0.43302646 - time (sec): 1.34 - samples/sec: 8975.68 - lr: 0.000037 - momentum: 0.000000
2023-10-18 16:04:58,029 epoch 4 - iter 75/152 - loss 0.41824130 - time (sec): 1.66 - samples/sec: 9106.04 - lr: 0.000036 - momentum: 0.000000
2023-10-18 16:04:58,349 epoch 4 - iter 90/152 - loss 0.41263334 - time (sec): 1.98 - samples/sec: 9119.30 - lr: 0.000036 - momentum: 0.000000
2023-10-18 16:04:58,674 epoch 4 - iter 105/152 - loss 0.40907522 - time (sec): 2.30 - samples/sec: 9127.41 - lr: 0.000035 - momentum: 0.000000
2023-10-18 16:04:58,993 epoch 4 - iter 120/152 - loss 0.39940132 - time (sec): 2.62 - samples/sec: 9096.85 - lr: 0.000035 - momentum: 0.000000
2023-10-18 16:04:59,314 epoch 4 - iter 135/152 - loss 0.40404518 - time (sec): 2.94 - samples/sec: 9168.53 - lr: 0.000034 - momentum: 0.000000
2023-10-18 16:04:59,626 epoch 4 - iter 150/152 - loss 0.39664101 - time (sec): 3.26 - samples/sec: 9396.38 - lr: 0.000034 - momentum: 0.000000
2023-10-18 16:04:59,656 ----------------------------------------------------------------------------------------------------
2023-10-18 16:04:59,656 EPOCH 4 done: loss 0.3953 - lr: 0.000034
2023-10-18 16:05:00,176 DEV : loss 0.3355642557144165 - f1-score (micro avg) 0.3672
2023-10-18 16:05:00,181 saving best model
2023-10-18 16:05:00,216 ----------------------------------------------------------------------------------------------------
2023-10-18 16:05:00,545 epoch 5 - iter 15/152 - loss 0.36220235 - time (sec): 0.33 - samples/sec: 9174.40 - lr: 0.000033 - momentum: 0.000000
2023-10-18 16:05:00,885 epoch 5 - iter 30/152 - loss 0.41524884 - time (sec): 0.67 - samples/sec: 9345.90 - lr: 0.000032 - momentum: 0.000000
2023-10-18 16:05:01,257 epoch 5 - iter 45/152 - loss 0.36981838 - time (sec): 1.04 - samples/sec: 8962.68 - lr: 0.000032 - momentum: 0.000000
2023-10-18 16:05:01,618 epoch 5 - iter 60/152 - loss 0.36763614 - time (sec): 1.40 - samples/sec: 8836.21 - lr: 0.000031 - momentum: 0.000000
2023-10-18 16:05:01,956 epoch 5 - iter 75/152 - loss 0.36971291 - time (sec): 1.74 - samples/sec: 8895.40 - lr: 0.000031 - momentum: 0.000000
2023-10-18 16:05:02,290 epoch 5 - iter 90/152 - loss 0.37106258 - time (sec): 2.07 - samples/sec: 8842.41 - lr: 0.000030 - momentum: 0.000000
2023-10-18 16:05:02,638 epoch 5 - iter 105/152 - loss 0.35989213 - time (sec): 2.42 - samples/sec: 8892.32 - lr: 0.000030 - momentum: 0.000000
2023-10-18 16:05:02,989 epoch 5 - iter 120/152 - loss 0.36121116 - time (sec): 2.77 - samples/sec: 8914.91 - lr: 0.000029 - momentum: 0.000000
2023-10-18 16:05:03,321 epoch 5 - iter 135/152 - loss 0.35095211 - time (sec): 3.10 - samples/sec: 8932.51 - lr: 0.000029 - momentum: 0.000000
2023-10-18 16:05:03,641 epoch 5 - iter 150/152 - loss 0.34848002 - time (sec): 3.43 - samples/sec: 8936.06 - lr: 0.000028 - momentum: 0.000000
2023-10-18 16:05:03,680 ----------------------------------------------------------------------------------------------------
2023-10-18 16:05:03,680 EPOCH 5 done: loss 0.3479 - lr: 0.000028
2023-10-18 16:05:04,193 DEV : loss 0.31115421652793884 - f1-score (micro avg) 0.3844
2023-10-18 16:05:04,198 saving best model
2023-10-18 16:05:04,231 ----------------------------------------------------------------------------------------------------
2023-10-18 16:05:04,575 epoch 6 - iter 15/152 - loss 0.32690293 - time (sec): 0.34 - samples/sec: 8592.24 - lr: 0.000027 - momentum: 0.000000
2023-10-18 16:05:04,901 epoch 6 - iter 30/152 - loss 0.32630321 - time (sec): 0.67 - samples/sec: 8807.67 - lr: 0.000027 - momentum: 0.000000
2023-10-18 16:05:05,226 epoch 6 - iter 45/152 - loss 0.31329981 - time (sec): 0.99 - samples/sec: 8860.44 - lr: 0.000026 - momentum: 0.000000
2023-10-18 16:05:05,558 epoch 6 - iter 60/152 - loss 0.32232580 - time (sec): 1.33 - samples/sec: 8830.30 - lr: 0.000026 - momentum: 0.000000
2023-10-18 16:05:05,884 epoch 6 - iter 75/152 - loss 0.31910210 - time (sec): 1.65 - samples/sec: 8932.52 - lr: 0.000025 - momentum: 0.000000
2023-10-18 16:05:06,204 epoch 6 - iter 90/152 - loss 0.32736766 - time (sec): 1.97 - samples/sec: 9052.21 - lr: 0.000025 - momentum: 0.000000
2023-10-18 16:05:06,525 epoch 6 - iter 105/152 - loss 0.33292481 - time (sec): 2.29 - samples/sec: 9162.50 - lr: 0.000024 - momentum: 0.000000
2023-10-18 16:05:06,838 epoch 6 - iter 120/152 - loss 0.32538367 - time (sec): 2.61 - samples/sec: 9174.32 - lr: 0.000024 - momentum: 0.000000
2023-10-18 16:05:07,169 epoch 6 - iter 135/152 - loss 0.32694796 - time (sec): 2.94 - samples/sec: 9220.95 - lr: 0.000023 - momentum: 0.000000
2023-10-18 16:05:07,494 epoch 6 - iter 150/152 - loss 0.32681647 - time (sec): 3.26 - samples/sec: 9372.20 - lr: 0.000022 - momentum: 0.000000
2023-10-18 16:05:07,533 ----------------------------------------------------------------------------------------------------
2023-10-18 16:05:07,533 EPOCH 6 done: loss 0.3252 - lr: 0.000022
2023-10-18 16:05:08,059 DEV : loss 0.2971973717212677 - f1-score (micro avg) 0.407
2023-10-18 16:05:08,064 saving best model
2023-10-18 16:05:08,097 ----------------------------------------------------------------------------------------------------
2023-10-18 16:05:08,422 epoch 7 - iter 15/152 - loss 0.31865948 - time (sec): 0.32 - samples/sec: 8928.80 - lr: 0.000022 - momentum: 0.000000
2023-10-18 16:05:08,764 epoch 7 - iter 30/152 - loss 0.30993718 - time (sec): 0.67 - samples/sec: 8897.69 - lr: 0.000021 - momentum: 0.000000
2023-10-18 16:05:09,096 epoch 7 - iter 45/152 - loss 0.31161449 - time (sec): 1.00 - samples/sec: 9253.60 - lr: 0.000021 - momentum: 0.000000
2023-10-18 16:05:09,418 epoch 7 - iter 60/152 - loss 0.31404095 - time (sec): 1.32 - samples/sec: 9292.49 - lr: 0.000020 - momentum: 0.000000
2023-10-18 16:05:09,734 epoch 7 - iter 75/152 - loss 0.32224128 - time (sec): 1.64 - samples/sec: 9309.62 - lr: 0.000020 - momentum: 0.000000
2023-10-18 16:05:10,081 epoch 7 - iter 90/152 - loss 0.31368658 - time (sec): 1.98 - samples/sec: 9337.93 - lr: 0.000019 - momentum: 0.000000
2023-10-18 16:05:10,405 epoch 7 - iter 105/152 - loss 0.31187673 - time (sec): 2.31 - samples/sec: 9366.22 - lr: 0.000019 - momentum: 0.000000
2023-10-18 16:05:10,721 epoch 7 - iter 120/152 - loss 0.31348426 - time (sec): 2.62 - samples/sec: 9355.10 - lr: 0.000018 - momentum: 0.000000
2023-10-18 16:05:11,054 epoch 7 - iter 135/152 - loss 0.31235664 - time (sec): 2.96 - samples/sec: 9386.33 - lr: 0.000017 - momentum: 0.000000
2023-10-18 16:05:11,385 epoch 7 - iter 150/152 - loss 0.30337842 - time (sec): 3.29 - samples/sec: 9315.33 - lr: 0.000017 - momentum: 0.000000
2023-10-18 16:05:11,433 ----------------------------------------------------------------------------------------------------
2023-10-18 16:05:11,433 EPOCH 7 done: loss 0.3019 - lr: 0.000017
2023-10-18 16:05:11,950 DEV : loss 0.2880619764328003 - f1-score (micro avg) 0.4264
2023-10-18 16:05:11,956 saving best model
2023-10-18 16:05:11,989 ----------------------------------------------------------------------------------------------------
2023-10-18 16:05:12,310 epoch 8 - iter 15/152 - loss 0.24299427 - time (sec): 0.32 - samples/sec: 9043.99 - lr: 0.000016 - momentum: 0.000000
2023-10-18 16:05:12,640 epoch 8 - iter 30/152 - loss 0.27954609 - time (sec): 0.65 - samples/sec: 8961.96 - lr: 0.000016 - momentum: 0.000000
2023-10-18 16:05:12,962 epoch 8 - iter 45/152 - loss 0.29235058 - time (sec): 0.97 - samples/sec: 9173.19 - lr: 0.000015 - momentum: 0.000000
2023-10-18 16:05:13,286 epoch 8 - iter 60/152 - loss 0.29197778 - time (sec): 1.30 - samples/sec: 9199.29 - lr: 0.000015 - momentum: 0.000000
2023-10-18 16:05:13,624 epoch 8 - iter 75/152 - loss 0.29671397 - time (sec): 1.63 - samples/sec: 9290.52 - lr: 0.000014 - momentum: 0.000000
2023-10-18 16:05:13,977 epoch 8 - iter 90/152 - loss 0.29113199 - time (sec): 1.99 - samples/sec: 9295.14 - lr: 0.000014 - momentum: 0.000000
2023-10-18 16:05:14,302 epoch 8 - iter 105/152 - loss 0.29128096 - time (sec): 2.31 - samples/sec: 9121.49 - lr: 0.000013 - momentum: 0.000000
2023-10-18 16:05:14,630 epoch 8 - iter 120/152 - loss 0.28758044 - time (sec): 2.64 - samples/sec: 9218.74 - lr: 0.000012 - momentum: 0.000000
2023-10-18 16:05:14,954 epoch 8 - iter 135/152 - loss 0.29023127 - time (sec): 2.96 - samples/sec: 9310.80 - lr: 0.000012 - momentum: 0.000000
2023-10-18 16:05:15,285 epoch 8 - iter 150/152 - loss 0.28969709 - time (sec): 3.29 - samples/sec: 9296.86 - lr: 0.000011 - momentum: 0.000000
2023-10-18 16:05:15,328 ----------------------------------------------------------------------------------------------------
2023-10-18 16:05:15,329 EPOCH 8 done: loss 0.2890 - lr: 0.000011
2023-10-18 16:05:15,865 DEV : loss 0.2770352065563202 - f1-score (micro avg) 0.4598
2023-10-18 16:05:15,871 saving best model
2023-10-18 16:05:15,904 ----------------------------------------------------------------------------------------------------
2023-10-18 16:05:16,236 epoch 9 - iter 15/152 - loss 0.25596487 - time (sec): 0.33 - samples/sec: 9464.90 - lr: 0.000011 - momentum: 0.000000
2023-10-18 16:05:16,581 epoch 9 - iter 30/152 - loss 0.25922984 - time (sec): 0.68 - samples/sec: 9566.27 - lr: 0.000010 - momentum: 0.000000
2023-10-18 16:05:16,913 epoch 9 - iter 45/152 - loss 0.28068483 - time (sec): 1.01 - samples/sec: 9388.86 - lr: 0.000010 - momentum: 0.000000
2023-10-18 16:05:17,245 epoch 9 - iter 60/152 - loss 0.27771263 - time (sec): 1.34 - samples/sec: 9165.93 - lr: 0.000009 - momentum: 0.000000
2023-10-18 16:05:17,569 epoch 9 - iter 75/152 - loss 0.28699937 - time (sec): 1.66 - samples/sec: 9347.91 - lr: 0.000009 - momentum: 0.000000
2023-10-18 16:05:17,900 epoch 9 - iter 90/152 - loss 0.28319719 - time (sec): 2.00 - samples/sec: 9308.42 - lr: 0.000008 - momentum: 0.000000
2023-10-18 16:05:18,224 epoch 9 - iter 105/152 - loss 0.28380563 - time (sec): 2.32 - samples/sec: 9356.64 - lr: 0.000007 - momentum: 0.000000
2023-10-18 16:05:18,549 epoch 9 - iter 120/152 - loss 0.29008897 - time (sec): 2.64 - samples/sec: 9346.69 - lr: 0.000007 - momentum: 0.000000
2023-10-18 16:05:18,879 epoch 9 - iter 135/152 - loss 0.28838066 - time (sec): 2.97 - samples/sec: 9323.34 - lr: 0.000006 - momentum: 0.000000
2023-10-18 16:05:19,198 epoch 9 - iter 150/152 - loss 0.28526994 - time (sec): 3.29 - samples/sec: 9302.36 - lr: 0.000006 - momentum: 0.000000
2023-10-18 16:05:19,237 ----------------------------------------------------------------------------------------------------
2023-10-18 16:05:19,237 EPOCH 9 done: loss 0.2867 - lr: 0.000006
2023-10-18 16:05:19,751 DEV : loss 0.2709275782108307 - f1-score (micro avg) 0.468
2023-10-18 16:05:19,757 saving best model
2023-10-18 16:05:19,794 ----------------------------------------------------------------------------------------------------
2023-10-18 16:05:20,119 epoch 10 - iter 15/152 - loss 0.23999848 - time (sec): 0.33 - samples/sec: 9586.53 - lr: 0.000005 - momentum: 0.000000
2023-10-18 16:05:20,465 epoch 10 - iter 30/152 - loss 0.25875882 - time (sec): 0.67 - samples/sec: 9130.45 - lr: 0.000005 - momentum: 0.000000
2023-10-18 16:05:20,801 epoch 10 - iter 45/152 - loss 0.27044887 - time (sec): 1.01 - samples/sec: 9143.72 - lr: 0.000004 - momentum: 0.000000
2023-10-18 16:05:21,135 epoch 10 - iter 60/152 - loss 0.27241444 - time (sec): 1.34 - samples/sec: 9258.75 - lr: 0.000004 - momentum: 0.000000
2023-10-18 16:05:21,454 epoch 10 - iter 75/152 - loss 0.26243794 - time (sec): 1.66 - samples/sec: 9414.71 - lr: 0.000003 - momentum: 0.000000
2023-10-18 16:05:21,778 epoch 10 - iter 90/152 - loss 0.25711397 - time (sec): 1.98 - samples/sec: 9355.29 - lr: 0.000003 - momentum: 0.000000
2023-10-18 16:05:22,115 epoch 10 - iter 105/152 - loss 0.26696933 - time (sec): 2.32 - samples/sec: 9356.35 - lr: 0.000002 - momentum: 0.000000
2023-10-18 16:05:22,444 epoch 10 - iter 120/152 - loss 0.27120824 - time (sec): 2.65 - samples/sec: 9365.82 - lr: 0.000001 - momentum: 0.000000
2023-10-18 16:05:22,770 epoch 10 - iter 135/152 - loss 0.27912026 - time (sec): 2.98 - samples/sec: 9288.59 - lr: 0.000001 - momentum: 0.000000
2023-10-18 16:05:23,084 epoch 10 - iter 150/152 - loss 0.27645946 - time (sec): 3.29 - samples/sec: 9290.86 - lr: 0.000000 - momentum: 0.000000
2023-10-18 16:05:23,125 ----------------------------------------------------------------------------------------------------
2023-10-18 16:05:23,125 EPOCH 10 done: loss 0.2746 - lr: 0.000000
2023-10-18 16:05:23,645 DEV : loss 0.26880595088005066 - f1-score (micro avg) 0.4642
2023-10-18 16:05:23,681 ----------------------------------------------------------------------------------------------------
2023-10-18 16:05:23,681 Loading model from best epoch ...
2023-10-18 16:05:23,761 SequenceTagger predicts: Dictionary with 25 tags: O, S-scope, B-scope, E-scope, I-scope, S-pers, B-pers, E-pers, I-pers, S-work, B-work, E-work, I-work, S-loc, B-loc, E-loc, I-loc, S-date, B-date, E-date, I-date, S-object, B-object, E-object, I-object
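The 25-tag dictionary above follows the BIOES tagging scheme over six entity types. A one-line reconstruction (my own sketch, not Flair code) shows where the count comes from:

```python
# BIOES: each entity type gets S(ingle), B(egin), E(nd), I(nside) variants,
# plus the shared O(utside) tag: 6 * 4 + 1 = 25 tags.
ENTITY_TYPES = ["scope", "pers", "work", "loc", "date", "object"]
tags = ["O"] + [f"{p}-{t}" for t in ENTITY_TYPES for p in ("S", "B", "E", "I")]
print(len(tags))  # 25, matching the tagger's output layer (out_features=25)
```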
2023-10-18 16:05:24,238
Results:
- F-score (micro) 0.4738
- F-score (macro) 0.2912
- Accuracy 0.3246

By class:
              precision    recall  f1-score   support

       scope     0.4526    0.5695    0.5044       151
        work     0.2919    0.4947    0.3672        95
        pers     0.6341    0.5417    0.5843        96
         loc     0.0000    0.0000    0.0000         3
        date     0.0000    0.0000    0.0000         3

   micro avg     0.4273    0.5316    0.4738       348
   macro avg     0.2757    0.3212    0.2912       348
weighted avg     0.4510    0.5316    0.4803       348

2023-10-18 16:05:24,238 ----------------------------------------------------------------------------------------------------
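The summary rows of the report follow from the per-class table by the standard averaging definitions. A sanity-check sketch (my own recomputation, not Flair's evaluation code):

```python
# Per-class rows copied from the report: class -> (precision, recall, f1, support).
per_class = {
    "scope": (0.4526, 0.5695, 0.5044, 151),
    "work":  (0.2919, 0.4947, 0.3672, 95),
    "pers":  (0.6341, 0.5417, 0.5843, 96),
    "loc":   (0.0000, 0.0000, 0.0000, 3),
    "date":  (0.0000, 0.0000, 0.0000, 3),
}
total_support = sum(s for *_, s in per_class.values())  # 348

# Macro: unweighted mean of per-class F1; weighted: support-weighted mean.
macro_f1 = sum(f1 for _, _, f1, _ in per_class.values()) / len(per_class)
weighted_f1 = sum(f1 * s for _, _, f1, s in per_class.values()) / total_support

# Micro F1 is the harmonic mean of the micro-averaged precision and recall.
p_micro, r_micro = 0.4273, 0.5316  # from the "micro avg" row
micro_f1 = 2 * p_micro * r_micro / (p_micro + r_micro)

print(round(macro_f1, 4), round(weighted_f1, 4), round(micro_f1, 4))
# 0.2912 0.4803 0.4738 -- matching the report
```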