|
2023-10-10 23:17:02,388 ---------------------------------------------------------------------------------------------------- |
|
2023-10-10 23:17:02,390 Model: "SequenceTagger(
  (embeddings): ByT5Embeddings(
    (model): T5EncoderModel(
      (shared): Embedding(384, 1472)
      (encoder): T5Stack(
        (embed_tokens): Embedding(384, 1472)
        (block): ModuleList(
          (0): T5Block(
            (layer): ModuleList(
              (0): T5LayerSelfAttention(
                (SelfAttention): T5Attention(
                  (q): Linear(in_features=1472, out_features=384, bias=False)
                  (k): Linear(in_features=1472, out_features=384, bias=False)
                  (v): Linear(in_features=1472, out_features=384, bias=False)
                  (o): Linear(in_features=384, out_features=1472, bias=False)
                  (relative_attention_bias): Embedding(32, 6)
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (1): T5LayerFF(
                (DenseReluDense): T5DenseGatedActDense(
                  (wi_0): Linear(in_features=1472, out_features=3584, bias=False)
                  (wi_1): Linear(in_features=1472, out_features=3584, bias=False)
                  (wo): Linear(in_features=3584, out_features=1472, bias=False)
                  (dropout): Dropout(p=0.1, inplace=False)
                  (act): NewGELUActivation()
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
          )
          (1-11): 11 x T5Block(
            (layer): ModuleList(
              (0): T5LayerSelfAttention(
                (SelfAttention): T5Attention(
                  (q): Linear(in_features=1472, out_features=384, bias=False)
                  (k): Linear(in_features=1472, out_features=384, bias=False)
                  (v): Linear(in_features=1472, out_features=384, bias=False)
                  (o): Linear(in_features=384, out_features=1472, bias=False)
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (1): T5LayerFF(
                (DenseReluDense): T5DenseGatedActDense(
                  (wi_0): Linear(in_features=1472, out_features=3584, bias=False)
                  (wi_1): Linear(in_features=1472, out_features=3584, bias=False)
                  (wo): Linear(in_features=3584, out_features=1472, bias=False)
                  (dropout): Dropout(p=0.1, inplace=False)
                  (act): NewGELUActivation()
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
          )
        )
        (final_layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=1472, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
|
2023-10-10 23:17:02,390 ---------------------------------------------------------------------------------------------------- |
|
2023-10-10 23:17:02,390 MultiCorpus: 1166 train + 165 dev + 415 test sentences |
|
- NER_HIPE_2022 Corpus: 1166 train + 165 dev + 415 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fi/with_doc_seperator |
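Note: the corpus and the tagger printed above can be assembled with Flair's standard API. The following is a minimal, hypothetical sketch only: the actual run uses a custom ByT5Embeddings wrapper (TransformerWordEmbeddings stands in for it here), and the checkpoint id, pooling, layer and document-separator settings are assumptions inferred from the cache path above and the base path logged further below.

from flair.datasets import NER_HIPE_2022
from flair.embeddings import TransformerWordEmbeddings
from flair.models import SequenceTagger

# HIPE-2022 NewsEye Finnish split (1166 train / 165 dev / 415 test sentences);
# keyword names are assumptions based on the cache path above.
corpus = NER_HIPE_2022(dataset_name="newseye", language="fi", version="v2.1",
                       add_document_separator=True)
label_dict = corpus.make_label_dictionary(label_type="ner")  # 17 tags, cf. the final Linear(1472, 17)

# Stand-in for the ByT5Embeddings module in the model dump (1472-dim encoder states).
embeddings = TransformerWordEmbeddings(
    model="hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax",  # assumed checkpoint id
    layers="-1",
    subtoken_pooling="first",
    fine_tune=True,
)

tagger = SequenceTagger(
    hidden_size=256,             # not used: no RNN sits between the embeddings and the linear layer
    embeddings=embeddings,
    tag_dictionary=label_dict,
    tag_type="ner",
    use_crf=False,
    use_rnn=False,
    reproject_embeddings=False,  # the linear layer maps 1472 -> 17 directly, as in the dump above
)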
|
2023-10-10 23:17:02,390 ---------------------------------------------------------------------------------------------------- |
|
2023-10-10 23:17:02,390 Train: 1166 sentences |
|
2023-10-10 23:17:02,391 (train_with_dev=False, train_with_test=False) |
|
2023-10-10 23:17:02,391 ---------------------------------------------------------------------------------------------------- |
|
2023-10-10 23:17:02,391 Training Params: |
|
2023-10-10 23:17:02,391 - learning_rate: "0.00016" |
|
2023-10-10 23:17:02,391 - mini_batch_size: "8" |
|
2023-10-10 23:17:02,391 - max_epochs: "10" |
|
2023-10-10 23:17:02,391 - shuffle: "True" |
|
2023-10-10 23:17:02,391 ---------------------------------------------------------------------------------------------------- |
|
2023-10-10 23:17:02,391 Plugins: |
|
2023-10-10 23:17:02,391 - TensorboardLogger |
|
2023-10-10 23:17:02,391 - LinearScheduler | warmup_fraction: '0.1' |
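Note: the learning-rate column in the iteration lines below follows this plugin: the lr ramps linearly from 0 to the configured peak over the first 10% of all updates, then decays linearly back towards 0. A plain-Python sketch (step counts taken from this log; the scheduler's exact step boundaries may differ by one, so the values match the log only approximately):

peak_lr = 0.00016
steps_per_epoch = 146                   # "iter .../146" below
total_steps = steps_per_epoch * 10      # 10 epochs -> 1460 updates
warmup_steps = int(0.1 * total_steps)   # warmup_fraction 0.1 -> 146 updates

def lr_at(step: int) -> float:
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)

# lr_at(140) ~ 0.000153  (log: 0.000152 at epoch 1, iter 140/146)
# lr_at(292) ~ 0.000142  (log: 0.000143 at the end of epoch 2)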
|
2023-10-10 23:17:02,391 ---------------------------------------------------------------------------------------------------- |
|
2023-10-10 23:17:02,391 Final evaluation on model from best epoch (best-model.pt) |
|
2023-10-10 23:17:02,391 - metric: "('micro avg', 'f1-score')" |
|
2023-10-10 23:17:02,391 ---------------------------------------------------------------------------------------------------- |
|
2023-10-10 23:17:02,392 Computation: |
|
2023-10-10 23:17:02,392 - compute on device: cuda:0 |
|
2023-10-10 23:17:02,392 - embedding storage: none |
|
2023-10-10 23:17:02,392 ---------------------------------------------------------------------------------------------------- |
|
2023-10-10 23:17:02,392 Model training base path: "hmbench-newseye/fi-hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax-bs8-wsFalse-e10-lr0.00016-poolingfirst-layers-1-crfFalse-2" |
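Note: continuing the hypothetical sketch from above, the training call corresponding to these parameters would look roughly as follows; fine_tune applies AdamW with the linear warmup/decay schedule listed under Plugins. TensorBoard logging was enabled in the actual run via a trainer plugin and is omitted here.

from flair.trainers import ModelTrainer

trainer = ModelTrainer(tagger, corpus)   # tagger/corpus from the earlier sketch
trainer.fine_tune(
    "hmbench-newseye/fi-hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax-bs8-wsFalse-e10-lr0.00016-poolingfirst-layers-1-crfFalse-2",
    learning_rate=0.00016,
    mini_batch_size=8,
    max_epochs=10,
    main_evaluation_metric=("micro avg", "f1-score"),
)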
|
2023-10-10 23:17:02,392 ---------------------------------------------------------------------------------------------------- |
|
2023-10-10 23:17:02,392 ---------------------------------------------------------------------------------------------------- |
|
2023-10-10 23:17:02,392 Logging anything other than scalars to TensorBoard is currently not supported. |
|
2023-10-10 23:17:12,249 epoch 1 - iter 14/146 - loss 2.85068931 - time (sec): 9.85 - samples/sec: 507.67 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-10 23:17:21,019 epoch 1 - iter 28/146 - loss 2.84691647 - time (sec): 18.63 - samples/sec: 469.95 - lr: 0.000030 - momentum: 0.000000 |
|
2023-10-10 23:17:30,537 epoch 1 - iter 42/146 - loss 2.83473964 - time (sec): 28.14 - samples/sec: 472.52 - lr: 0.000045 - momentum: 0.000000 |
|
2023-10-10 23:17:39,867 epoch 1 - iter 56/146 - loss 2.81949073 - time (sec): 37.47 - samples/sec: 468.02 - lr: 0.000060 - momentum: 0.000000 |
|
2023-10-10 23:17:48,437 epoch 1 - iter 70/146 - loss 2.78568641 - time (sec): 46.04 - samples/sec: 463.24 - lr: 0.000076 - momentum: 0.000000 |
|
2023-10-10 23:17:56,794 epoch 1 - iter 84/146 - loss 2.72961033 - time (sec): 54.40 - samples/sec: 460.97 - lr: 0.000091 - momentum: 0.000000 |
|
2023-10-10 23:18:06,346 epoch 1 - iter 98/146 - loss 2.65786696 - time (sec): 63.95 - samples/sec: 455.80 - lr: 0.000106 - momentum: 0.000000 |
|
2023-10-10 23:18:16,236 epoch 1 - iter 112/146 - loss 2.57487198 - time (sec): 73.84 - samples/sec: 452.26 - lr: 0.000122 - momentum: 0.000000 |
|
2023-10-10 23:18:26,359 epoch 1 - iter 126/146 - loss 2.47816169 - time (sec): 83.97 - samples/sec: 452.64 - lr: 0.000137 - momentum: 0.000000 |
|
2023-10-10 23:18:36,630 epoch 1 - iter 140/146 - loss 2.38111585 - time (sec): 94.24 - samples/sec: 453.04 - lr: 0.000152 - momentum: 0.000000 |
|
2023-10-10 23:18:40,373 ---------------------------------------------------------------------------------------------------- |
|
2023-10-10 23:18:40,374 EPOCH 1 done: loss 2.3434 - lr: 0.000152 |
|
2023-10-10 23:18:46,381 DEV : loss 1.2865359783172607 - f1-score (micro avg) 0.0 |
|
2023-10-10 23:18:46,390 ---------------------------------------------------------------------------------------------------- |
|
2023-10-10 23:18:55,313 epoch 2 - iter 14/146 - loss 1.28415238 - time (sec): 8.92 - samples/sec: 472.05 - lr: 0.000158 - momentum: 0.000000 |
|
2023-10-10 23:19:05,700 epoch 2 - iter 28/146 - loss 1.19956105 - time (sec): 19.31 - samples/sec: 464.48 - lr: 0.000157 - momentum: 0.000000 |
|
2023-10-10 23:19:16,430 epoch 2 - iter 42/146 - loss 1.13288131 - time (sec): 30.04 - samples/sec: 458.33 - lr: 0.000155 - momentum: 0.000000 |
|
2023-10-10 23:19:26,012 epoch 2 - iter 56/146 - loss 1.07129114 - time (sec): 39.62 - samples/sec: 437.61 - lr: 0.000153 - momentum: 0.000000 |
|
2023-10-10 23:19:35,741 epoch 2 - iter 70/146 - loss 0.99550129 - time (sec): 49.35 - samples/sec: 438.17 - lr: 0.000152 - momentum: 0.000000 |
|
2023-10-10 23:19:44,108 epoch 2 - iter 84/146 - loss 0.96876458 - time (sec): 57.72 - samples/sec: 427.25 - lr: 0.000150 - momentum: 0.000000 |
|
2023-10-10 23:19:53,864 epoch 2 - iter 98/146 - loss 0.91377890 - time (sec): 67.47 - samples/sec: 432.59 - lr: 0.000148 - momentum: 0.000000 |
|
2023-10-10 23:20:04,816 epoch 2 - iter 112/146 - loss 0.85770767 - time (sec): 78.42 - samples/sec: 432.27 - lr: 0.000147 - momentum: 0.000000 |
|
2023-10-10 23:20:15,157 epoch 2 - iter 126/146 - loss 0.81157139 - time (sec): 88.76 - samples/sec: 430.30 - lr: 0.000145 - momentum: 0.000000 |
|
2023-10-10 23:20:25,002 epoch 2 - iter 140/146 - loss 0.78208038 - time (sec): 98.61 - samples/sec: 427.37 - lr: 0.000143 - momentum: 0.000000 |
|
2023-10-10 23:20:29,385 ---------------------------------------------------------------------------------------------------- |
|
2023-10-10 23:20:29,385 EPOCH 2 done: loss 0.8100 - lr: 0.000143 |
|
2023-10-10 23:20:36,275 DEV : loss 0.44310665130615234 - f1-score (micro avg) 0.0 |
|
2023-10-10 23:20:36,286 ---------------------------------------------------------------------------------------------------- |
|
2023-10-10 23:20:45,315 epoch 3 - iter 14/146 - loss 0.54953200 - time (sec): 9.03 - samples/sec: 403.67 - lr: 0.000141 - momentum: 0.000000 |
|
2023-10-10 23:20:55,048 epoch 3 - iter 28/146 - loss 0.47992717 - time (sec): 18.76 - samples/sec: 414.34 - lr: 0.000139 - momentum: 0.000000 |
|
2023-10-10 23:21:05,647 epoch 3 - iter 42/146 - loss 0.56595579 - time (sec): 29.36 - samples/sec: 427.02 - lr: 0.000137 - momentum: 0.000000 |
|
2023-10-10 23:21:15,050 epoch 3 - iter 56/146 - loss 0.54451042 - time (sec): 38.76 - samples/sec: 422.88 - lr: 0.000136 - momentum: 0.000000 |
|
2023-10-10 23:21:24,347 epoch 3 - iter 70/146 - loss 0.51838610 - time (sec): 48.06 - samples/sec: 429.97 - lr: 0.000134 - momentum: 0.000000 |
|
2023-10-10 23:21:33,558 epoch 3 - iter 84/146 - loss 0.50225077 - time (sec): 57.27 - samples/sec: 432.25 - lr: 0.000132 - momentum: 0.000000 |
|
2023-10-10 23:21:42,473 epoch 3 - iter 98/146 - loss 0.48025404 - time (sec): 66.19 - samples/sec: 441.08 - lr: 0.000131 - momentum: 0.000000 |
|
2023-10-10 23:21:52,350 epoch 3 - iter 112/146 - loss 0.46014641 - time (sec): 76.06 - samples/sec: 445.61 - lr: 0.000129 - momentum: 0.000000 |
|
2023-10-10 23:22:02,256 epoch 3 - iter 126/146 - loss 0.44598830 - time (sec): 85.97 - samples/sec: 445.28 - lr: 0.000127 - momentum: 0.000000 |
|
2023-10-10 23:22:12,230 epoch 3 - iter 140/146 - loss 0.43394233 - time (sec): 95.94 - samples/sec: 446.51 - lr: 0.000125 - momentum: 0.000000 |
|
2023-10-10 23:22:16,255 ---------------------------------------------------------------------------------------------------- |
|
2023-10-10 23:22:16,255 EPOCH 3 done: loss 0.4381 - lr: 0.000125 |
|
2023-10-10 23:22:22,564 DEV : loss 0.3388194143772125 - f1-score (micro avg) 0.233 |
|
2023-10-10 23:22:22,574 saving best model |
|
2023-10-10 23:22:23,549 ---------------------------------------------------------------------------------------------------- |
|
2023-10-10 23:22:32,339 epoch 4 - iter 14/146 - loss 0.35134012 - time (sec): 8.79 - samples/sec: 464.07 - lr: 0.000123 - momentum: 0.000000 |
|
2023-10-10 23:22:41,623 epoch 4 - iter 28/146 - loss 0.42932021 - time (sec): 18.07 - samples/sec: 467.42 - lr: 0.000121 - momentum: 0.000000 |
|
2023-10-10 23:22:50,468 epoch 4 - iter 42/146 - loss 0.35560185 - time (sec): 26.92 - samples/sec: 473.02 - lr: 0.000120 - momentum: 0.000000 |
|
2023-10-10 23:22:58,955 epoch 4 - iter 56/146 - loss 0.35249099 - time (sec): 35.40 - samples/sec: 469.36 - lr: 0.000118 - momentum: 0.000000 |
|
2023-10-10 23:23:07,930 epoch 4 - iter 70/146 - loss 0.35507799 - time (sec): 44.38 - samples/sec: 464.26 - lr: 0.000116 - momentum: 0.000000 |
|
2023-10-10 23:23:16,896 epoch 4 - iter 84/146 - loss 0.35517374 - time (sec): 53.34 - samples/sec: 462.94 - lr: 0.000115 - momentum: 0.000000 |
|
2023-10-10 23:23:26,090 epoch 4 - iter 98/146 - loss 0.33861094 - time (sec): 62.54 - samples/sec: 465.58 - lr: 0.000113 - momentum: 0.000000 |
|
2023-10-10 23:23:35,066 epoch 4 - iter 112/146 - loss 0.33627771 - time (sec): 71.51 - samples/sec: 462.76 - lr: 0.000111 - momentum: 0.000000 |
|
2023-10-10 23:23:44,314 epoch 4 - iter 126/146 - loss 0.33904883 - time (sec): 80.76 - samples/sec: 465.33 - lr: 0.000109 - momentum: 0.000000 |
|
2023-10-10 23:23:53,980 epoch 4 - iter 140/146 - loss 0.34205571 - time (sec): 90.43 - samples/sec: 469.00 - lr: 0.000108 - momentum: 0.000000 |
|
2023-10-10 23:23:58,178 ---------------------------------------------------------------------------------------------------- |
|
2023-10-10 23:23:58,178 EPOCH 4 done: loss 0.3373 - lr: 0.000108 |
|
2023-10-10 23:24:04,577 DEV : loss 0.25445958971977234 - f1-score (micro avg) 0.2262 |
|
2023-10-10 23:24:04,587 ---------------------------------------------------------------------------------------------------- |
|
2023-10-10 23:24:13,816 epoch 5 - iter 14/146 - loss 0.32229392 - time (sec): 9.23 - samples/sec: 439.47 - lr: 0.000105 - momentum: 0.000000 |
|
2023-10-10 23:24:23,678 epoch 5 - iter 28/146 - loss 0.26837025 - time (sec): 19.09 - samples/sec: 464.72 - lr: 0.000104 - momentum: 0.000000 |
|
2023-10-10 23:24:32,600 epoch 5 - iter 42/146 - loss 0.26326562 - time (sec): 28.01 - samples/sec: 458.17 - lr: 0.000102 - momentum: 0.000000 |
|
2023-10-10 23:24:41,405 epoch 5 - iter 56/146 - loss 0.25728661 - time (sec): 36.82 - samples/sec: 461.08 - lr: 0.000100 - momentum: 0.000000 |
|
2023-10-10 23:24:50,289 epoch 5 - iter 70/146 - loss 0.26829976 - time (sec): 45.70 - samples/sec: 456.49 - lr: 0.000099 - momentum: 0.000000 |
|
2023-10-10 23:25:00,492 epoch 5 - iter 84/146 - loss 0.29246429 - time (sec): 55.90 - samples/sec: 467.11 - lr: 0.000097 - momentum: 0.000000 |
|
2023-10-10 23:25:10,684 epoch 5 - iter 98/146 - loss 0.29542982 - time (sec): 66.10 - samples/sec: 467.01 - lr: 0.000095 - momentum: 0.000000 |
|
2023-10-10 23:25:20,273 epoch 5 - iter 112/146 - loss 0.28924318 - time (sec): 75.68 - samples/sec: 467.48 - lr: 0.000093 - momentum: 0.000000 |
|
2023-10-10 23:25:28,956 epoch 5 - iter 126/146 - loss 0.28713756 - time (sec): 84.37 - samples/sec: 463.52 - lr: 0.000092 - momentum: 0.000000 |
|
2023-10-10 23:25:37,924 epoch 5 - iter 140/146 - loss 0.28490061 - time (sec): 93.34 - samples/sec: 457.23 - lr: 0.000090 - momentum: 0.000000 |
|
2023-10-10 23:25:41,869 ---------------------------------------------------------------------------------------------------- |
|
2023-10-10 23:25:41,869 EPOCH 5 done: loss 0.2832 - lr: 0.000090 |
|
2023-10-10 23:25:48,099 DEV : loss 0.22307763993740082 - f1-score (micro avg) 0.2994 |
|
2023-10-10 23:25:48,108 saving best model |
|
2023-10-10 23:25:56,000 ---------------------------------------------------------------------------------------------------- |
|
2023-10-10 23:26:06,665 epoch 6 - iter 14/146 - loss 0.20640681 - time (sec): 10.66 - samples/sec: 436.38 - lr: 0.000088 - momentum: 0.000000 |
|
2023-10-10 23:26:15,474 epoch 6 - iter 28/146 - loss 0.23046142 - time (sec): 19.47 - samples/sec: 443.97 - lr: 0.000086 - momentum: 0.000000 |
|
2023-10-10 23:26:24,366 epoch 6 - iter 42/146 - loss 0.21524711 - time (sec): 28.36 - samples/sec: 452.86 - lr: 0.000084 - momentum: 0.000000 |
|
2023-10-10 23:26:33,110 epoch 6 - iter 56/146 - loss 0.22748057 - time (sec): 37.11 - samples/sec: 457.24 - lr: 0.000083 - momentum: 0.000000 |
|
2023-10-10 23:26:41,853 epoch 6 - iter 70/146 - loss 0.23191428 - time (sec): 45.85 - samples/sec: 462.58 - lr: 0.000081 - momentum: 0.000000 |
|
2023-10-10 23:26:50,581 epoch 6 - iter 84/146 - loss 0.23502256 - time (sec): 54.58 - samples/sec: 460.47 - lr: 0.000079 - momentum: 0.000000 |
|
2023-10-10 23:27:00,694 epoch 6 - iter 98/146 - loss 0.24896975 - time (sec): 64.69 - samples/sec: 467.42 - lr: 0.000077 - momentum: 0.000000 |
|
2023-10-10 23:27:09,453 epoch 6 - iter 112/146 - loss 0.24733184 - time (sec): 73.45 - samples/sec: 465.81 - lr: 0.000076 - momentum: 0.000000 |
|
2023-10-10 23:27:17,982 epoch 6 - iter 126/146 - loss 0.24239142 - time (sec): 81.98 - samples/sec: 466.69 - lr: 0.000074 - momentum: 0.000000 |
|
2023-10-10 23:27:26,637 epoch 6 - iter 140/146 - loss 0.23759475 - time (sec): 90.64 - samples/sec: 468.61 - lr: 0.000072 - momentum: 0.000000 |
|
2023-10-10 23:27:30,382 ---------------------------------------------------------------------------------------------------- |
|
2023-10-10 23:27:30,382 EPOCH 6 done: loss 0.2349 - lr: 0.000072 |
|
2023-10-10 23:27:36,129 DEV : loss 0.19796797633171082 - f1-score (micro avg) 0.4681 |
|
2023-10-10 23:27:36,139 saving best model |
|
2023-10-10 23:27:43,828 ---------------------------------------------------------------------------------------------------- |
|
2023-10-10 23:27:52,247 epoch 7 - iter 14/146 - loss 0.19250517 - time (sec): 8.41 - samples/sec: 495.70 - lr: 0.000070 - momentum: 0.000000 |
|
2023-10-10 23:28:01,366 epoch 7 - iter 28/146 - loss 0.18333791 - time (sec): 17.53 - samples/sec: 525.61 - lr: 0.000068 - momentum: 0.000000 |
|
2023-10-10 23:28:09,139 epoch 7 - iter 42/146 - loss 0.18322523 - time (sec): 25.31 - samples/sec: 500.87 - lr: 0.000067 - momentum: 0.000000 |
|
2023-10-10 23:28:17,468 epoch 7 - iter 56/146 - loss 0.19443327 - time (sec): 33.64 - samples/sec: 499.47 - lr: 0.000065 - momentum: 0.000000 |
|
2023-10-10 23:28:24,988 epoch 7 - iter 70/146 - loss 0.18783057 - time (sec): 41.16 - samples/sec: 487.71 - lr: 0.000063 - momentum: 0.000000 |
|
2023-10-10 23:28:33,260 epoch 7 - iter 84/146 - loss 0.19033179 - time (sec): 49.43 - samples/sec: 490.52 - lr: 0.000061 - momentum: 0.000000 |
|
2023-10-10 23:28:43,265 epoch 7 - iter 98/146 - loss 0.19173980 - time (sec): 59.43 - samples/sec: 494.50 - lr: 0.000060 - momentum: 0.000000 |
|
2023-10-10 23:28:52,163 epoch 7 - iter 112/146 - loss 0.18999195 - time (sec): 68.33 - samples/sec: 491.39 - lr: 0.000058 - momentum: 0.000000 |
|
2023-10-10 23:29:01,255 epoch 7 - iter 126/146 - loss 0.19723376 - time (sec): 77.42 - samples/sec: 488.30 - lr: 0.000056 - momentum: 0.000000 |
|
2023-10-10 23:29:11,917 epoch 7 - iter 140/146 - loss 0.19244750 - time (sec): 88.08 - samples/sec: 485.24 - lr: 0.000055 - momentum: 0.000000 |
|
2023-10-10 23:29:16,231 ---------------------------------------------------------------------------------------------------- |
|
2023-10-10 23:29:16,232 EPOCH 7 done: loss 0.1924 - lr: 0.000055 |
|
2023-10-10 23:29:22,237 DEV : loss 0.17689262330532074 - f1-score (micro avg) 0.5087 |
|
2023-10-10 23:29:22,248 saving best model |
|
2023-10-10 23:29:30,065 ---------------------------------------------------------------------------------------------------- |
|
2023-10-10 23:29:40,474 epoch 8 - iter 14/146 - loss 0.16935796 - time (sec): 10.40 - samples/sec: 406.27 - lr: 0.000052 - momentum: 0.000000 |
|
2023-10-10 23:29:50,068 epoch 8 - iter 28/146 - loss 0.18290286 - time (sec): 20.00 - samples/sec: 424.49 - lr: 0.000051 - momentum: 0.000000 |
|
2023-10-10 23:29:59,390 epoch 8 - iter 42/146 - loss 0.16793125 - time (sec): 29.32 - samples/sec: 434.61 - lr: 0.000049 - momentum: 0.000000 |
|
2023-10-10 23:30:07,662 epoch 8 - iter 56/146 - loss 0.17320389 - time (sec): 37.59 - samples/sec: 431.94 - lr: 0.000047 - momentum: 0.000000 |
|
2023-10-10 23:30:16,446 epoch 8 - iter 70/146 - loss 0.18227748 - time (sec): 46.38 - samples/sec: 452.40 - lr: 0.000045 - momentum: 0.000000 |
|
2023-10-10 23:30:24,610 epoch 8 - iter 84/146 - loss 0.18108910 - time (sec): 54.54 - samples/sec: 454.08 - lr: 0.000044 - momentum: 0.000000 |
|
2023-10-10 23:30:33,732 epoch 8 - iter 98/146 - loss 0.17092159 - time (sec): 63.66 - samples/sec: 464.89 - lr: 0.000042 - momentum: 0.000000 |
|
2023-10-10 23:30:41,860 epoch 8 - iter 112/146 - loss 0.16964323 - time (sec): 71.79 - samples/sec: 465.62 - lr: 0.000040 - momentum: 0.000000 |
|
2023-10-10 23:30:51,139 epoch 8 - iter 126/146 - loss 0.16593660 - time (sec): 81.07 - samples/sec: 471.56 - lr: 0.000039 - momentum: 0.000000 |
|
2023-10-10 23:31:00,500 epoch 8 - iter 140/146 - loss 0.16434625 - time (sec): 90.43 - samples/sec: 477.80 - lr: 0.000037 - momentum: 0.000000 |
|
2023-10-10 23:31:03,477 ---------------------------------------------------------------------------------------------------- |
|
2023-10-10 23:31:03,477 EPOCH 8 done: loss 0.1624 - lr: 0.000037 |
|
2023-10-10 23:31:09,101 DEV : loss 0.16704627871513367 - f1-score (micro avg) 0.5245 |
|
2023-10-10 23:31:09,111 saving best model |
|
2023-10-10 23:31:16,953 ---------------------------------------------------------------------------------------------------- |
|
2023-10-10 23:31:25,648 epoch 9 - iter 14/146 - loss 0.15024636 - time (sec): 8.69 - samples/sec: 495.48 - lr: 0.000035 - momentum: 0.000000 |
|
2023-10-10 23:31:33,764 epoch 9 - iter 28/146 - loss 0.14496549 - time (sec): 16.81 - samples/sec: 490.96 - lr: 0.000033 - momentum: 0.000000 |
|
2023-10-10 23:31:41,355 epoch 9 - iter 42/146 - loss 0.16254726 - time (sec): 24.40 - samples/sec: 478.09 - lr: 0.000031 - momentum: 0.000000 |
|
2023-10-10 23:31:49,685 epoch 9 - iter 56/146 - loss 0.15667159 - time (sec): 32.73 - samples/sec: 485.05 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-10 23:31:59,645 epoch 9 - iter 70/146 - loss 0.16060893 - time (sec): 42.69 - samples/sec: 507.84 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-10 23:32:07,172 epoch 9 - iter 84/146 - loss 0.15235930 - time (sec): 50.21 - samples/sec: 496.36 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-10 23:32:16,069 epoch 9 - iter 98/146 - loss 0.15236847 - time (sec): 59.11 - samples/sec: 502.54 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-10 23:32:24,523 epoch 9 - iter 112/146 - loss 0.15005669 - time (sec): 67.57 - samples/sec: 502.42 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-10 23:32:32,980 epoch 9 - iter 126/146 - loss 0.14889199 - time (sec): 76.02 - samples/sec: 501.27 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-10 23:32:42,174 epoch 9 - iter 140/146 - loss 0.14689391 - time (sec): 85.22 - samples/sec: 504.07 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-10 23:32:45,520 ---------------------------------------------------------------------------------------------------- |
|
2023-10-10 23:32:45,520 EPOCH 9 done: loss 0.1449 - lr: 0.000019 |
|
2023-10-10 23:32:51,520 DEV : loss 0.16200371086597443 - f1-score (micro avg) 0.5683 |
|
2023-10-10 23:32:51,529 saving best model |
|
2023-10-10 23:32:55,320 ---------------------------------------------------------------------------------------------------- |
|
2023-10-10 23:33:03,708 epoch 10 - iter 14/146 - loss 0.15671761 - time (sec): 8.38 - samples/sec: 484.05 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-10 23:33:12,911 epoch 10 - iter 28/146 - loss 0.14834250 - time (sec): 17.59 - samples/sec: 499.26 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-10 23:33:21,096 epoch 10 - iter 42/146 - loss 0.14012051 - time (sec): 25.77 - samples/sec: 483.92 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-10 23:33:29,718 epoch 10 - iter 56/146 - loss 0.13354190 - time (sec): 34.39 - samples/sec: 485.93 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-10 23:33:38,610 epoch 10 - iter 70/146 - loss 0.12759908 - time (sec): 43.29 - samples/sec: 486.33 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-10 23:33:46,661 epoch 10 - iter 84/146 - loss 0.12647792 - time (sec): 51.34 - samples/sec: 483.20 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-10 23:33:56,176 epoch 10 - iter 98/146 - loss 0.12934689 - time (sec): 60.85 - samples/sec: 487.90 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-10 23:34:05,223 epoch 10 - iter 112/146 - loss 0.13516861 - time (sec): 69.90 - samples/sec: 490.17 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-10 23:34:14,184 epoch 10 - iter 126/146 - loss 0.13301443 - time (sec): 78.86 - samples/sec: 486.70 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-10 23:34:23,644 epoch 10 - iter 140/146 - loss 0.13565286 - time (sec): 88.32 - samples/sec: 486.78 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-10 23:34:27,061 ---------------------------------------------------------------------------------------------------- |
|
2023-10-10 23:34:27,061 EPOCH 10 done: loss 0.1343 - lr: 0.000002 |
|
2023-10-10 23:34:33,022 DEV : loss 0.16090121865272522 - f1-score (micro avg) 0.5875 |
|
2023-10-10 23:34:33,032 saving best model |
|
2023-10-10 23:34:41,145 ---------------------------------------------------------------------------------------------------- |
|
2023-10-10 23:34:41,147 Loading model from best epoch ... |
|
2023-10-10 23:34:44,867 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd |
|
2023-10-10 23:34:58,663 |
|
Results: |
|
- F-score (micro) 0.6529 |
|
- F-score (macro) 0.4086 |
|
- Accuracy 0.5355 |
|
|
|
By class:

              precision    recall  f1-score   support

         PER     0.7455    0.7069    0.7257       348
         LOC     0.6192    0.7663    0.6849       261
         ORG     0.1918    0.2692    0.2240        52
   HumanProd     0.0000    0.0000    0.0000        22

   micro avg     0.6336    0.6735    0.6529       683
   macro avg     0.3891    0.4356    0.4086       683
weighted avg     0.6310    0.6735    0.6485       683
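Note: as a quick arithmetic check, the micro-average F-score follows directly from the micro precision and recall in the table above:

p, r = 0.6336, 0.6735
f1 = 2 * p * r / (p + r)   # ~0.6529, matching "F-score (micro) 0.6529"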
|
|
|
2023-10-10 23:34:58,663 ---------------------------------------------------------------------------------------------------- |
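Note: a hypothetical usage sketch for the checkpoint saved above (best-model.pt under the base path); the "ner" label type is an assumption, and the example sentence is made up.

from flair.data import Sentence
from flair.models import SequenceTagger

tagger = SequenceTagger.load(
    "hmbench-newseye/fi-hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax-bs8-wsFalse-e10-lr0.00016-poolingfirst-layers-1-crfFalse-2/best-model.pt"
)
sentence = Sentence("Helsingin Sanomat kertoi eilen Turussa pidetystä kokouksesta.")
tagger.predict(sentence)
for span in sentence.get_spans("ner"):
    print(span.text, span.tag, span.score)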
|
|