2023-10-19 23:43:12,500 ----------------------------------------------------------------------------------------------------
2023-10-19 23:43:12,501 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 128)
        (position_embeddings): Embedding(512, 128)
        (token_type_embeddings): Embedding(2, 128)
        (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-1): 2 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=128, out_features=128, bias=True)
                (key): Linear(in_features=128, out_features=128, bias=True)
                (value): Linear(in_features=128, out_features=128, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=128, out_features=128, bias=True)
                (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=128, out_features=512, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=512, out_features=128, bias=True)
              (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=128, out_features=128, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=128, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
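The dimensions printed in the module dump above can be sanity-checked with a little arithmetic. The sketch below recomputes the parameter count of this BERT-tiny configuration (hidden size 128, FFN size 512, 2 layers) purely from the printed shapes; the variable names are illustrative labels, not the torch/flair API:

```python
def linear_params(n_in, n_out, bias=True):
    """Parameters of a Linear layer: weight matrix plus optional bias vector."""
    return n_in * n_out + (n_out if bias else 0)

# Shapes read off the module dump above
word_emb = 32001 * 128              # (word_embeddings): Embedding(32001, 128)
pos_emb = 512 * 128                 # (position_embeddings): Embedding(512, 128)
type_emb = 2 * 128                  # (token_type_embeddings): Embedding(2, 128)
layer_norm = 2 * 128                # LayerNorm((128,)): gamma + beta

# One BertLayer: Q/K/V + attention output projection, FFN up/down, two LayerNorms
bert_layer = (
    4 * linear_params(128, 128)     # query, key, value, BertSelfOutput.dense
    + linear_params(128, 512)       # BertIntermediate.dense
    + linear_params(512, 128)       # BertOutput.dense
    + 2 * layer_norm
)

encoder = 2 * bert_layer            # (0-1): 2 x BertLayer
embeddings = word_emb + pos_emb + type_emb + layer_norm
pooler = linear_params(128, 128)    # (pooler): Linear(128, 128)
tagger_head = linear_params(128, 17)  # (linear): 128 -> 17 tags

total = embeddings + encoder + pooler + tagger_head
```

Most of the ~4.6M parameters sit in the word-embedding table; the two transformer layers contribute under 400k.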
|
2023-10-19 23:43:12,501 ----------------------------------------------------------------------------------------------------
2023-10-19 23:43:12,501 MultiCorpus: 1166 train + 165 dev + 415 test sentences
 - NER_HIPE_2022 Corpus: 1166 train + 165 dev + 415 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fi/with_doc_seperator
2023-10-19 23:43:12,501 ----------------------------------------------------------------------------------------------------
2023-10-19 23:43:12,501 Train: 1166 sentences
2023-10-19 23:43:12,502 (train_with_dev=False, train_with_test=False)
2023-10-19 23:43:12,502 ----------------------------------------------------------------------------------------------------
2023-10-19 23:43:12,502 Training Params:
2023-10-19 23:43:12,502 - learning_rate: "5e-05"
2023-10-19 23:43:12,502 - mini_batch_size: "4"
2023-10-19 23:43:12,502 - max_epochs: "10"
2023-10-19 23:43:12,502 - shuffle: "True"
2023-10-19 23:43:12,502 ----------------------------------------------------------------------------------------------------
2023-10-19 23:43:12,502 Plugins:
2023-10-19 23:43:12,502 - TensorboardLogger
2023-10-19 23:43:12,502 - LinearScheduler | warmup_fraction: '0.1'
2023-10-19 23:43:12,502 ----------------------------------------------------------------------------------------------------
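The LinearScheduler with `warmup_fraction: '0.1'` explains the `lr:` column in the per-iteration lines below: the learning rate ramps linearly from 0 to 5e-05 over the first 10% of the 2,920 total steps (10 epochs x 292 batches), then decays linearly back to 0. A minimal reimplementation of that schedule for illustration (not the Flair plugin itself, but consistent with the lr values logged below):

```python
def linear_warmup_lr(step, total_steps, base_lr=5e-05, warmup_fraction=0.1):
    """Linear warmup from 0 to base_lr, then linear decay back to 0."""
    warmup_steps = int(total_steps * warmup_fraction)
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    return base_lr * (total_steps - step) / (total_steps - warmup_steps)

total_steps = 10 * 292  # max_epochs x iterations per epoch = 2920
```

For example, `linear_warmup_lr(29, 2920)` gives ~5e-06, matching the `lr: 0.000005` logged at epoch 1, iter 29/292; by epoch 10, iter 290 the rate has decayed to effectively zero, matching `lr: 0.000000`.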
|
2023-10-19 23:43:12,502 Final evaluation on model from best epoch (best-model.pt)
2023-10-19 23:43:12,502 - metric: "('micro avg', 'f1-score')"
2023-10-19 23:43:12,502 ----------------------------------------------------------------------------------------------------
2023-10-19 23:43:12,502 Computation:
2023-10-19 23:43:12,502 - compute on device: cuda:0
2023-10-19 23:43:12,502 - embedding storage: none
2023-10-19 23:43:12,502 ----------------------------------------------------------------------------------------------------
2023-10-19 23:43:12,502 Model training base path: "hmbench-newseye/fi-dbmdz/bert-tiny-historic-multilingual-cased-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-1"
2023-10-19 23:43:12,502 ----------------------------------------------------------------------------------------------------
2023-10-19 23:43:12,502 ----------------------------------------------------------------------------------------------------
2023-10-19 23:43:12,503 Logging anything other than scalars to TensorBoard is currently not supported.
|
2023-10-19 23:43:12,963 epoch 1 - iter 29/292 - loss 3.07409706 - time (sec): 0.46 - samples/sec: 10330.84 - lr: 0.000005 - momentum: 0.000000
2023-10-19 23:43:13,451 epoch 1 - iter 58/292 - loss 3.03705751 - time (sec): 0.95 - samples/sec: 9437.03 - lr: 0.000010 - momentum: 0.000000
2023-10-19 23:43:13,902 epoch 1 - iter 87/292 - loss 2.94951085 - time (sec): 1.40 - samples/sec: 9793.60 - lr: 0.000015 - momentum: 0.000000
2023-10-19 23:43:14,481 epoch 1 - iter 116/292 - loss 2.84219173 - time (sec): 1.98 - samples/sec: 9562.88 - lr: 0.000020 - momentum: 0.000000
2023-10-19 23:43:15,004 epoch 1 - iter 145/292 - loss 2.58749868 - time (sec): 2.50 - samples/sec: 9267.09 - lr: 0.000025 - momentum: 0.000000
2023-10-19 23:43:15,501 epoch 1 - iter 174/292 - loss 2.38570226 - time (sec): 3.00 - samples/sec: 8972.13 - lr: 0.000030 - momentum: 0.000000
2023-10-19 23:43:15,978 epoch 1 - iter 203/292 - loss 2.22219622 - time (sec): 3.47 - samples/sec: 8692.25 - lr: 0.000035 - momentum: 0.000000
2023-10-19 23:43:16,494 epoch 1 - iter 232/292 - loss 2.00089252 - time (sec): 3.99 - samples/sec: 8828.25 - lr: 0.000040 - momentum: 0.000000
2023-10-19 23:43:17,020 epoch 1 - iter 261/292 - loss 1.85422614 - time (sec): 4.52 - samples/sec: 8745.36 - lr: 0.000045 - momentum: 0.000000
2023-10-19 23:43:17,538 epoch 1 - iter 290/292 - loss 1.72578060 - time (sec): 5.03 - samples/sec: 8805.58 - lr: 0.000049 - momentum: 0.000000
2023-10-19 23:43:17,570 ----------------------------------------------------------------------------------------------------
2023-10-19 23:43:17,570 EPOCH 1 done: loss 1.7241 - lr: 0.000049
2023-10-19 23:43:17,835 DEV : loss 0.4558045566082001 - f1-score (micro avg)  0.0
2023-10-19 23:43:17,838 ----------------------------------------------------------------------------------------------------
2023-10-19 23:43:18,351 epoch 2 - iter 29/292 - loss 0.70043765 - time (sec): 0.51 - samples/sec: 10228.74 - lr: 0.000049 - momentum: 0.000000
2023-10-19 23:43:18,881 epoch 2 - iter 58/292 - loss 0.75391994 - time (sec): 1.04 - samples/sec: 9507.76 - lr: 0.000049 - momentum: 0.000000
2023-10-19 23:43:19,408 epoch 2 - iter 87/292 - loss 0.72665306 - time (sec): 1.57 - samples/sec: 8976.10 - lr: 0.000048 - momentum: 0.000000
2023-10-19 23:43:19,973 epoch 2 - iter 116/292 - loss 0.68840114 - time (sec): 2.13 - samples/sec: 8885.13 - lr: 0.000048 - momentum: 0.000000
2023-10-19 23:43:20,452 epoch 2 - iter 145/292 - loss 0.68621072 - time (sec): 2.61 - samples/sec: 8709.27 - lr: 0.000047 - momentum: 0.000000
2023-10-19 23:43:20,959 epoch 2 - iter 174/292 - loss 0.65696092 - time (sec): 3.12 - samples/sec: 8612.91 - lr: 0.000047 - momentum: 0.000000
2023-10-19 23:43:21,469 epoch 2 - iter 203/292 - loss 0.64245263 - time (sec): 3.63 - samples/sec: 8646.67 - lr: 0.000046 - momentum: 0.000000
2023-10-19 23:43:21,972 epoch 2 - iter 232/292 - loss 0.63067680 - time (sec): 4.13 - samples/sec: 8695.67 - lr: 0.000046 - momentum: 0.000000
2023-10-19 23:43:22,456 epoch 2 - iter 261/292 - loss 0.61228065 - time (sec): 4.62 - samples/sec: 8698.69 - lr: 0.000045 - momentum: 0.000000
2023-10-19 23:43:22,965 epoch 2 - iter 290/292 - loss 0.59628468 - time (sec): 5.13 - samples/sec: 8645.76 - lr: 0.000045 - momentum: 0.000000
2023-10-19 23:43:22,994 ----------------------------------------------------------------------------------------------------
2023-10-19 23:43:22,994 EPOCH 2 done: loss 0.5963 - lr: 0.000045
2023-10-19 23:43:23,621 DEV : loss 0.36393484473228455 - f1-score (micro avg)  0.0
2023-10-19 23:43:23,624 ----------------------------------------------------------------------------------------------------
2023-10-19 23:43:24,125 epoch 3 - iter 29/292 - loss 0.42006601 - time (sec): 0.50 - samples/sec: 8608.40 - lr: 0.000044 - momentum: 0.000000
2023-10-19 23:43:24,629 epoch 3 - iter 58/292 - loss 0.60259921 - time (sec): 1.00 - samples/sec: 8567.90 - lr: 0.000043 - momentum: 0.000000
2023-10-19 23:43:25,153 epoch 3 - iter 87/292 - loss 0.56103013 - time (sec): 1.53 - samples/sec: 8460.68 - lr: 0.000043 - momentum: 0.000000
2023-10-19 23:43:25,667 epoch 3 - iter 116/292 - loss 0.52110974 - time (sec): 2.04 - samples/sec: 8655.47 - lr: 0.000042 - momentum: 0.000000
2023-10-19 23:43:26,153 epoch 3 - iter 145/292 - loss 0.51867745 - time (sec): 2.53 - samples/sec: 8450.82 - lr: 0.000042 - momentum: 0.000000
2023-10-19 23:43:26,680 epoch 3 - iter 174/292 - loss 0.51950404 - time (sec): 3.05 - samples/sec: 8737.22 - lr: 0.000041 - momentum: 0.000000
2023-10-19 23:43:27,230 epoch 3 - iter 203/292 - loss 0.49933801 - time (sec): 3.61 - samples/sec: 8709.81 - lr: 0.000041 - momentum: 0.000000
2023-10-19 23:43:27,747 epoch 3 - iter 232/292 - loss 0.49192529 - time (sec): 4.12 - samples/sec: 8550.14 - lr: 0.000040 - momentum: 0.000000
2023-10-19 23:43:28,405 epoch 3 - iter 261/292 - loss 0.48732688 - time (sec): 4.78 - samples/sec: 8306.02 - lr: 0.000040 - momentum: 0.000000
2023-10-19 23:43:28,884 epoch 3 - iter 290/292 - loss 0.48097519 - time (sec): 5.26 - samples/sec: 8415.89 - lr: 0.000039 - momentum: 0.000000
2023-10-19 23:43:28,918 ----------------------------------------------------------------------------------------------------
2023-10-19 23:43:28,918 EPOCH 3 done: loss 0.4799 - lr: 0.000039
2023-10-19 23:43:29,554 DEV : loss 0.30687329173088074 - f1-score (micro avg)  0.1411
2023-10-19 23:43:29,558 saving best model
2023-10-19 23:43:29,585 ----------------------------------------------------------------------------------------------------
2023-10-19 23:43:30,106 epoch 4 - iter 29/292 - loss 0.40558778 - time (sec): 0.52 - samples/sec: 8449.83 - lr: 0.000038 - momentum: 0.000000
2023-10-19 23:43:30,632 epoch 4 - iter 58/292 - loss 0.40595574 - time (sec): 1.05 - samples/sec: 8411.38 - lr: 0.000038 - momentum: 0.000000
2023-10-19 23:43:31,153 epoch 4 - iter 87/292 - loss 0.39747205 - time (sec): 1.57 - samples/sec: 8586.41 - lr: 0.000037 - momentum: 0.000000
2023-10-19 23:43:31,648 epoch 4 - iter 116/292 - loss 0.39387173 - time (sec): 2.06 - samples/sec: 8558.51 - lr: 0.000037 - momentum: 0.000000
2023-10-19 23:43:32,172 epoch 4 - iter 145/292 - loss 0.39381093 - time (sec): 2.59 - samples/sec: 8536.17 - lr: 0.000036 - momentum: 0.000000
2023-10-19 23:43:32,717 epoch 4 - iter 174/292 - loss 0.40560545 - time (sec): 3.13 - samples/sec: 8748.46 - lr: 0.000036 - momentum: 0.000000
2023-10-19 23:43:33,242 epoch 4 - iter 203/292 - loss 0.41292067 - time (sec): 3.66 - samples/sec: 8567.43 - lr: 0.000035 - momentum: 0.000000
2023-10-19 23:43:33,741 epoch 4 - iter 232/292 - loss 0.41164578 - time (sec): 4.16 - samples/sec: 8577.30 - lr: 0.000035 - momentum: 0.000000
2023-10-19 23:43:34,261 epoch 4 - iter 261/292 - loss 0.41094464 - time (sec): 4.68 - samples/sec: 8642.29 - lr: 0.000034 - momentum: 0.000000
2023-10-19 23:43:34,748 epoch 4 - iter 290/292 - loss 0.40702589 - time (sec): 5.16 - samples/sec: 8532.69 - lr: 0.000033 - momentum: 0.000000
2023-10-19 23:43:34,787 ----------------------------------------------------------------------------------------------------
2023-10-19 23:43:34,787 EPOCH 4 done: loss 0.4052 - lr: 0.000033
2023-10-19 23:43:35,428 DEV : loss 0.3078775703907013 - f1-score (micro avg)  0.1785
2023-10-19 23:43:35,432 saving best model
2023-10-19 23:43:35,464 ----------------------------------------------------------------------------------------------------
2023-10-19 23:43:35,984 epoch 5 - iter 29/292 - loss 0.39458319 - time (sec): 0.52 - samples/sec: 8373.87 - lr: 0.000033 - momentum: 0.000000
2023-10-19 23:43:36,501 epoch 5 - iter 58/292 - loss 0.37791867 - time (sec): 1.04 - samples/sec: 8911.74 - lr: 0.000032 - momentum: 0.000000
2023-10-19 23:43:37,090 epoch 5 - iter 87/292 - loss 0.41175438 - time (sec): 1.63 - samples/sec: 8990.57 - lr: 0.000032 - momentum: 0.000000
2023-10-19 23:43:37,656 epoch 5 - iter 116/292 - loss 0.41745175 - time (sec): 2.19 - samples/sec: 8617.37 - lr: 0.000031 - momentum: 0.000000
2023-10-19 23:43:38,154 epoch 5 - iter 145/292 - loss 0.40362718 - time (sec): 2.69 - samples/sec: 8507.94 - lr: 0.000031 - momentum: 0.000000
2023-10-19 23:43:38,669 epoch 5 - iter 174/292 - loss 0.38856766 - time (sec): 3.20 - samples/sec: 8423.13 - lr: 0.000030 - momentum: 0.000000
2023-10-19 23:43:39,173 epoch 5 - iter 203/292 - loss 0.39176537 - time (sec): 3.71 - samples/sec: 8451.86 - lr: 0.000030 - momentum: 0.000000
2023-10-19 23:43:39,665 epoch 5 - iter 232/292 - loss 0.38688994 - time (sec): 4.20 - samples/sec: 8488.81 - lr: 0.000029 - momentum: 0.000000
2023-10-19 23:43:40,175 epoch 5 - iter 261/292 - loss 0.37929994 - time (sec): 4.71 - samples/sec: 8411.79 - lr: 0.000028 - momentum: 0.000000
2023-10-19 23:43:40,695 epoch 5 - iter 290/292 - loss 0.37114615 - time (sec): 5.23 - samples/sec: 8448.93 - lr: 0.000028 - momentum: 0.000000
2023-10-19 23:43:40,724 ----------------------------------------------------------------------------------------------------
2023-10-19 23:43:40,724 EPOCH 5 done: loss 0.3711 - lr: 0.000028
2023-10-19 23:43:41,364 DEV : loss 0.30972999334335327 - f1-score (micro avg)  0.19
2023-10-19 23:43:41,368 saving best model
2023-10-19 23:43:41,398 ----------------------------------------------------------------------------------------------------
2023-10-19 23:43:41,947 epoch 6 - iter 29/292 - loss 0.37287899 - time (sec): 0.55 - samples/sec: 9227.43 - lr: 0.000027 - momentum: 0.000000
2023-10-19 23:43:42,433 epoch 6 - iter 58/292 - loss 0.34716342 - time (sec): 1.03 - samples/sec: 8909.79 - lr: 0.000027 - momentum: 0.000000
2023-10-19 23:43:42,986 epoch 6 - iter 87/292 - loss 0.38873255 - time (sec): 1.59 - samples/sec: 9117.96 - lr: 0.000026 - momentum: 0.000000
2023-10-19 23:43:43,496 epoch 6 - iter 116/292 - loss 0.38158312 - time (sec): 2.10 - samples/sec: 8828.62 - lr: 0.000026 - momentum: 0.000000
2023-10-19 23:43:44,004 epoch 6 - iter 145/292 - loss 0.38197586 - time (sec): 2.61 - samples/sec: 8792.59 - lr: 0.000025 - momentum: 0.000000
2023-10-19 23:43:44,542 epoch 6 - iter 174/292 - loss 0.36548885 - time (sec): 3.14 - samples/sec: 8816.96 - lr: 0.000025 - momentum: 0.000000
2023-10-19 23:43:45,056 epoch 6 - iter 203/292 - loss 0.35118741 - time (sec): 3.66 - samples/sec: 8679.41 - lr: 0.000024 - momentum: 0.000000
2023-10-19 23:43:45,589 epoch 6 - iter 232/292 - loss 0.35101908 - time (sec): 4.19 - samples/sec: 8508.10 - lr: 0.000023 - momentum: 0.000000
2023-10-19 23:43:46,145 epoch 6 - iter 261/292 - loss 0.35347404 - time (sec): 4.75 - samples/sec: 8468.76 - lr: 0.000023 - momentum: 0.000000
2023-10-19 23:43:46,662 epoch 6 - iter 290/292 - loss 0.35128884 - time (sec): 5.26 - samples/sec: 8425.23 - lr: 0.000022 - momentum: 0.000000
2023-10-19 23:43:46,689 ----------------------------------------------------------------------------------------------------
2023-10-19 23:43:46,689 EPOCH 6 done: loss 0.3514 - lr: 0.000022
2023-10-19 23:43:47,337 DEV : loss 0.30355098843574524 - f1-score (micro avg)  0.227
2023-10-19 23:43:47,341 saving best model
2023-10-19 23:43:47,373 ----------------------------------------------------------------------------------------------------
2023-10-19 23:43:47,886 epoch 7 - iter 29/292 - loss 0.30998691 - time (sec): 0.51 - samples/sec: 10436.77 - lr: 0.000022 - momentum: 0.000000
2023-10-19 23:43:48,370 epoch 7 - iter 58/292 - loss 0.32974135 - time (sec): 1.00 - samples/sec: 9324.77 - lr: 0.000021 - momentum: 0.000000
2023-10-19 23:43:48,863 epoch 7 - iter 87/292 - loss 0.33209891 - time (sec): 1.49 - samples/sec: 8769.78 - lr: 0.000021 - momentum: 0.000000
2023-10-19 23:43:49,392 epoch 7 - iter 116/292 - loss 0.31542787 - time (sec): 2.02 - samples/sec: 8823.99 - lr: 0.000020 - momentum: 0.000000
2023-10-19 23:43:49,913 epoch 7 - iter 145/292 - loss 0.31288929 - time (sec): 2.54 - samples/sec: 8784.33 - lr: 0.000020 - momentum: 0.000000
2023-10-19 23:43:50,406 epoch 7 - iter 174/292 - loss 0.32458005 - time (sec): 3.03 - samples/sec: 8624.77 - lr: 0.000019 - momentum: 0.000000
2023-10-19 23:43:50,868 epoch 7 - iter 203/292 - loss 0.32310007 - time (sec): 3.49 - samples/sec: 8510.83 - lr: 0.000018 - momentum: 0.000000
2023-10-19 23:43:51,332 epoch 7 - iter 232/292 - loss 0.34671277 - time (sec): 3.96 - samples/sec: 8724.77 - lr: 0.000018 - momentum: 0.000000
2023-10-19 23:43:51,779 epoch 7 - iter 261/292 - loss 0.33592061 - time (sec): 4.41 - samples/sec: 8940.65 - lr: 0.000017 - momentum: 0.000000
2023-10-19 23:43:52,238 epoch 7 - iter 290/292 - loss 0.33539636 - time (sec): 4.86 - samples/sec: 9086.71 - lr: 0.000017 - momentum: 0.000000
2023-10-19 23:43:52,265 ----------------------------------------------------------------------------------------------------
2023-10-19 23:43:52,265 EPOCH 7 done: loss 0.3355 - lr: 0.000017
2023-10-19 23:43:52,901 DEV : loss 0.29472851753234863 - f1-score (micro avg)  0.2633
2023-10-19 23:43:52,904 saving best model
2023-10-19 23:43:52,935 ----------------------------------------------------------------------------------------------------
2023-10-19 23:43:53,394 epoch 8 - iter 29/292 - loss 0.31109226 - time (sec): 0.46 - samples/sec: 9513.15 - lr: 0.000016 - momentum: 0.000000
2023-10-19 23:43:53,831 epoch 8 - iter 58/292 - loss 0.32511565 - time (sec): 0.89 - samples/sec: 9542.18 - lr: 0.000016 - momentum: 0.000000
2023-10-19 23:43:54,280 epoch 8 - iter 87/292 - loss 0.30650012 - time (sec): 1.34 - samples/sec: 9604.75 - lr: 0.000015 - momentum: 0.000000
2023-10-19 23:43:54,693 epoch 8 - iter 116/292 - loss 0.30110761 - time (sec): 1.76 - samples/sec: 9746.12 - lr: 0.000015 - momentum: 0.000000
2023-10-19 23:43:55,184 epoch 8 - iter 145/292 - loss 0.31099699 - time (sec): 2.25 - samples/sec: 9782.19 - lr: 0.000014 - momentum: 0.000000
2023-10-19 23:43:55,683 epoch 8 - iter 174/292 - loss 0.30267235 - time (sec): 2.75 - samples/sec: 9391.31 - lr: 0.000013 - momentum: 0.000000
2023-10-19 23:43:56,181 epoch 8 - iter 203/292 - loss 0.31451102 - time (sec): 3.24 - samples/sec: 9379.35 - lr: 0.000013 - momentum: 0.000000
2023-10-19 23:43:56,743 epoch 8 - iter 232/292 - loss 0.32954413 - time (sec): 3.81 - samples/sec: 9378.59 - lr: 0.000012 - momentum: 0.000000
2023-10-19 23:43:57,244 epoch 8 - iter 261/292 - loss 0.32898065 - time (sec): 4.31 - samples/sec: 9255.20 - lr: 0.000012 - momentum: 0.000000
2023-10-19 23:43:57,781 epoch 8 - iter 290/292 - loss 0.32544017 - time (sec): 4.85 - samples/sec: 9119.53 - lr: 0.000011 - momentum: 0.000000
2023-10-19 23:43:57,810 ----------------------------------------------------------------------------------------------------
2023-10-19 23:43:57,810 EPOCH 8 done: loss 0.3255 - lr: 0.000011
2023-10-19 23:43:58,448 DEV : loss 0.29057008028030396 - f1-score (micro avg)  0.276
2023-10-19 23:43:58,451 saving best model
2023-10-19 23:43:58,485 ----------------------------------------------------------------------------------------------------
2023-10-19 23:43:59,002 epoch 9 - iter 29/292 - loss 0.24069591 - time (sec): 0.52 - samples/sec: 9113.36 - lr: 0.000011 - momentum: 0.000000
2023-10-19 23:43:59,509 epoch 9 - iter 58/292 - loss 0.28968391 - time (sec): 1.02 - samples/sec: 8707.45 - lr: 0.000010 - momentum: 0.000000
2023-10-19 23:44:00,032 epoch 9 - iter 87/292 - loss 0.28979454 - time (sec): 1.55 - samples/sec: 8802.28 - lr: 0.000010 - momentum: 0.000000
2023-10-19 23:44:00,536 epoch 9 - iter 116/292 - loss 0.29644815 - time (sec): 2.05 - samples/sec: 8488.33 - lr: 0.000009 - momentum: 0.000000
2023-10-19 23:44:01,186 epoch 9 - iter 145/292 - loss 0.29585821 - time (sec): 2.70 - samples/sec: 8068.97 - lr: 0.000008 - momentum: 0.000000
2023-10-19 23:44:01,701 epoch 9 - iter 174/292 - loss 0.29293220 - time (sec): 3.22 - samples/sec: 8304.34 - lr: 0.000008 - momentum: 0.000000
2023-10-19 23:44:02,246 epoch 9 - iter 203/292 - loss 0.30033879 - time (sec): 3.76 - samples/sec: 8413.12 - lr: 0.000007 - momentum: 0.000000
2023-10-19 23:44:02,738 epoch 9 - iter 232/292 - loss 0.30375698 - time (sec): 4.25 - samples/sec: 8450.86 - lr: 0.000007 - momentum: 0.000000
2023-10-19 23:44:03,250 epoch 9 - iter 261/292 - loss 0.31294749 - time (sec): 4.76 - samples/sec: 8536.53 - lr: 0.000006 - momentum: 0.000000
2023-10-19 23:44:03,737 epoch 9 - iter 290/292 - loss 0.31643018 - time (sec): 5.25 - samples/sec: 8438.37 - lr: 0.000006 - momentum: 0.000000
2023-10-19 23:44:03,769 ----------------------------------------------------------------------------------------------------
2023-10-19 23:44:03,769 EPOCH 9 done: loss 0.3170 - lr: 0.000006
2023-10-19 23:44:04,410 DEV : loss 0.2984329164028168 - f1-score (micro avg)  0.2593
2023-10-19 23:44:04,413 ----------------------------------------------------------------------------------------------------
2023-10-19 23:44:04,936 epoch 10 - iter 29/292 - loss 0.22234356 - time (sec): 0.52 - samples/sec: 8410.17 - lr: 0.000005 - momentum: 0.000000
2023-10-19 23:44:05,451 epoch 10 - iter 58/292 - loss 0.26495283 - time (sec): 1.04 - samples/sec: 8656.76 - lr: 0.000005 - momentum: 0.000000
2023-10-19 23:44:05,960 epoch 10 - iter 87/292 - loss 0.28908115 - time (sec): 1.55 - samples/sec: 8288.55 - lr: 0.000004 - momentum: 0.000000
2023-10-19 23:44:06,504 epoch 10 - iter 116/292 - loss 0.30910542 - time (sec): 2.09 - samples/sec: 8108.06 - lr: 0.000003 - momentum: 0.000000
2023-10-19 23:44:07,024 epoch 10 - iter 145/292 - loss 0.32482006 - time (sec): 2.61 - samples/sec: 8037.99 - lr: 0.000003 - momentum: 0.000000
2023-10-19 23:44:07,513 epoch 10 - iter 174/292 - loss 0.31126506 - time (sec): 3.10 - samples/sec: 8178.22 - lr: 0.000002 - momentum: 0.000000
2023-10-19 23:44:08,024 epoch 10 - iter 203/292 - loss 0.30298861 - time (sec): 3.61 - samples/sec: 8233.68 - lr: 0.000002 - momentum: 0.000000
2023-10-19 23:44:08,565 epoch 10 - iter 232/292 - loss 0.30263054 - time (sec): 4.15 - samples/sec: 8385.24 - lr: 0.000001 - momentum: 0.000000
2023-10-19 23:44:09,066 epoch 10 - iter 261/292 - loss 0.29752653 - time (sec): 4.65 - samples/sec: 8414.36 - lr: 0.000001 - momentum: 0.000000
2023-10-19 23:44:09,589 epoch 10 - iter 290/292 - loss 0.31684602 - time (sec): 5.17 - samples/sec: 8559.07 - lr: 0.000000 - momentum: 0.000000
2023-10-19 23:44:09,617 ----------------------------------------------------------------------------------------------------
2023-10-19 23:44:09,617 EPOCH 10 done: loss 0.3165 - lr: 0.000000
2023-10-19 23:44:10,261 DEV : loss 0.2949487566947937 - f1-score (micro avg)  0.287
2023-10-19 23:44:10,265 saving best model
2023-10-19 23:44:10,323 ----------------------------------------------------------------------------------------------------
|
2023-10-19 23:44:10,324 Loading model from best epoch ...
2023-10-19 23:44:10,397 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
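The 17-tag dictionary is the BIOES encoding of the four entity types in this corpus (LOC, PER, ORG, HumanProd): one O tag plus S-/B-/E-/I- variants per type, i.e. 1 + 4 x 4 = 17, which matches the tagger's `Linear(in_features=128, out_features=17)` output layer. A minimal sketch of how entity spans map to BIOES tags (an illustrative helper, not part of Flair's API):

```python
def spans_to_bioes(n_tokens, spans):
    """Encode entity spans as BIOES tags.

    spans: list of (start, end_exclusive, entity_type) over token indices.
    """
    tags = ["O"] * n_tokens
    for start, end, etype in spans:
        if end - start == 1:
            tags[start] = f"S-{etype}"        # Single-token entity
        else:
            tags[start] = f"B-{etype}"        # Begin
            for i in range(start + 1, end - 1):
                tags[i] = f"I-{etype}"        # Inside
            tags[end - 1] = f"E-{etype}"      # End
    return tags
```

For example, a one-token location followed by a two-token person would be tagged `S-LOC B-PER E-PER`.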
|
2023-10-19 23:44:11,306 
Results:
- F-score (micro) 0.3268
- F-score (macro) 0.1729
- Accuracy 0.2036

By class:
              precision    recall  f1-score   support

         PER     0.3365    0.3046    0.3198       348
         LOC     0.3140    0.4559    0.3719       261
         ORG     0.0000    0.0000    0.0000        52
   HumanProd     0.0000    0.0000    0.0000        22

   micro avg     0.3242    0.3294    0.3268       683
   macro avg     0.1626    0.1901    0.1729       683
weighted avg     0.2914    0.3294    0.3050       683

2023-10-19 23:44:11,306 ----------------------------------------------------------------------------------------------------
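The averages in the final report follow directly from the per-class rows: macro is the unweighted mean over classes (so the zero-scoring ORG and HumanProd classes drag it well below micro), weighted is the support-weighted mean, and micro is the harmonic mean of the pooled precision and recall. A quick check in plain Python, with numbers copied from the report above:

```python
# Per-class (precision, recall, f1, support) from the final test report
report = {
    "PER":       (0.3365, 0.3046, 0.3198, 348),
    "LOC":       (0.3140, 0.4559, 0.3719, 261),
    "ORG":       (0.0000, 0.0000, 0.0000, 52),
    "HumanProd": (0.0000, 0.0000, 0.0000, 22),
}

# Macro average: unweighted mean over classes (rare classes count equally)
macro_f1 = sum(f1 for _, _, f1, _ in report.values()) / len(report)

# Weighted average: mean weighted by class support
total_support = sum(s for *_, s in report.values())
weighted_f1 = sum(f1 * s for _, _, f1, s in report.values()) / total_support

# Micro average: harmonic mean of the pooled precision and recall
micro_p, micro_r = 0.3242, 0.3294
micro_f1 = 2 * micro_p * micro_r / (micro_p + micro_r)
```

Recomputing from the (rounded) table values reproduces the reported macro 0.1729, weighted 0.3050, and micro 0.3268 F-scores to within rounding error.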