|
2023-10-11 01:45:47,177 ----------------------------------------------------------------------------------------------------
2023-10-11 01:45:47,179 Model: "SequenceTagger(
  (embeddings): ByT5Embeddings(
    (model): T5EncoderModel(
      (shared): Embedding(384, 1472)
      (encoder): T5Stack(
        (embed_tokens): Embedding(384, 1472)
        (block): ModuleList(
          (0): T5Block(
            (layer): ModuleList(
              (0): T5LayerSelfAttention(
                (SelfAttention): T5Attention(
                  (q): Linear(in_features=1472, out_features=384, bias=False)
                  (k): Linear(in_features=1472, out_features=384, bias=False)
                  (v): Linear(in_features=1472, out_features=384, bias=False)
                  (o): Linear(in_features=384, out_features=1472, bias=False)
                  (relative_attention_bias): Embedding(32, 6)
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (1): T5LayerFF(
                (DenseReluDense): T5DenseGatedActDense(
                  (wi_0): Linear(in_features=1472, out_features=3584, bias=False)
                  (wi_1): Linear(in_features=1472, out_features=3584, bias=False)
                  (wo): Linear(in_features=3584, out_features=1472, bias=False)
                  (dropout): Dropout(p=0.1, inplace=False)
                  (act): NewGELUActivation()
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
          )
          (1-11): 11 x T5Block(
            (layer): ModuleList(
              (0): T5LayerSelfAttention(
                (SelfAttention): T5Attention(
                  (q): Linear(in_features=1472, out_features=384, bias=False)
                  (k): Linear(in_features=1472, out_features=384, bias=False)
                  (v): Linear(in_features=1472, out_features=384, bias=False)
                  (o): Linear(in_features=384, out_features=1472, bias=False)
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (1): T5LayerFF(
                (DenseReluDense): T5DenseGatedActDense(
                  (wi_0): Linear(in_features=1472, out_features=3584, bias=False)
                  (wi_1): Linear(in_features=1472, out_features=3584, bias=False)
                  (wo): Linear(in_features=3584, out_features=1472, bias=False)
                  (dropout): Dropout(p=0.1, inplace=False)
                  (act): NewGELUActivation()
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
          )
        )
        (final_layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=1472, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-11 01:45:47,179 ----------------------------------------------------------------------------------------------------
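The module tree above is a stock ByT5-small encoder (12 T5 blocks, d_model=1472, d_ff=3584, byte-level vocabulary of 384) wrapped by Flair's locked dropout and a 17-way linear tag head. As a minimal, hedged sketch, the same encoder repr can be inspected directly with Hugging Face transformers; the checkpoint id below is inferred from the training base path logged further down and is an assumption, not something recorded at this point in the log.

from transformers import T5EncoderModel

# Assumed checkpoint id (taken from the "Model training base path" line below);
# substitute the checkpoint actually used if it differs.
encoder = T5EncoderModel.from_pretrained(
    "hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax"
)
print(encoder)  # prints the same T5Stack of 12 blocks shown above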
|
2023-10-11 01:45:47,179 MultiCorpus: 1166 train + 165 dev + 415 test sentences
 - NER_HIPE_2022 Corpus: 1166 train + 165 dev + 415 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fi/with_doc_seperator
2023-10-11 01:45:47,179 ----------------------------------------------------------------------------------------------------
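A minimal sketch of loading this split with Flair's built-in HIPE-2022 loader. The keyword names (dataset_name, language, version, add_document_separator) are assumptions about the stock flair.datasets.NER_HIPE_2022 API rather than something recorded in this log; the cache path above suggests the document-separator variant was used.

from flair.datasets import NER_HIPE_2022

# newseye/fi split of HIPE-2022, v2.1, keeping document separators
# (matching the ".../newseye/fi/with_doc_seperator" cache path above).
corpus = NER_HIPE_2022(
    dataset_name="newseye",
    language="fi",
    version="v2.1",
    add_document_separator=True,
)
print(corpus)  # should report the same 1166 train / 165 dev / 415 test sentences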
|
2023-10-11 01:45:47,180 Train: 1166 sentences
2023-10-11 01:45:47,180 (train_with_dev=False, train_with_test=False)
2023-10-11 01:45:47,180 ----------------------------------------------------------------------------------------------------
2023-10-11 01:45:47,180 Training Params:
2023-10-11 01:45:47,180 - learning_rate: "0.00016"
2023-10-11 01:45:47,180 - mini_batch_size: "8"
2023-10-11 01:45:47,180 - max_epochs: "10"
2023-10-11 01:45:47,180 - shuffle: "True"
2023-10-11 01:45:47,180 ----------------------------------------------------------------------------------------------------
2023-10-11 01:45:47,180 Plugins:
2023-10-11 01:45:47,180 - TensorboardLogger
2023-10-11 01:45:47,180 - LinearScheduler | warmup_fraction: '0.1'
2023-10-11 01:45:47,180 ----------------------------------------------------------------------------------------------------
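Putting the configuration above together, here is a hedged sketch of a roughly equivalent Flair fine-tuning run. It assumes the corpus object from the loader sketch above, reads the tagger options (first-subtoken pooling, last layer only, no CRF) off the base path logged below, and uses stock TransformerWordEmbeddings as a stand-in for the ByT5Embeddings wrapper named in the model dump; the TensorboardLogger/LinearScheduler plugin wiring is left implicit because fine_tune already applies a linear warmup/decay schedule.

from flair.embeddings import TransformerWordEmbeddings
from flair.models import SequenceTagger
from flair.trainers import ModelTrainer

# Byte-level ByT5 embeddings (checkpoint id inferred from the base path below).
embeddings = TransformerWordEmbeddings(
    model="hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax",
    layers="-1",                  # "layers-1" in the base path: last layer only
    subtoken_pooling="first",     # "poolingfirst" in the base path
    fine_tune=True,
)

label_dict = corpus.make_label_dictionary(label_type="ner")  # the 17 BIOES tags listed at the end of the log

tagger = SequenceTagger(
    embeddings=embeddings,
    tag_dictionary=label_dict,
    tag_type="ner",
    hidden_size=256,
    use_crf=False,                # "crfFalse" in the base path
    use_rnn=False,
    reproject_embeddings=False,
)

trainer = ModelTrainer(tagger, corpus)
trainer.fine_tune(
    "hmbench-newseye/fi-hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax-bs8-wsFalse-e10-lr0.00016-poolingfirst-layers-1-crfFalse-4",
    learning_rate=0.00016,
    mini_batch_size=8,
    max_epochs=10,
    # warmup_fraction=0.1,  # the LinearScheduler setting shown above; recent Flair
    #                       # versions expose it as a fine_tune argument (assumption)
)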
|
2023-10-11 01:45:47,180 Final evaluation on model from best epoch (best-model.pt)
2023-10-11 01:45:47,180 - metric: "('micro avg', 'f1-score')"
2023-10-11 01:45:47,181 ----------------------------------------------------------------------------------------------------
2023-10-11 01:45:47,181 Computation:
2023-10-11 01:45:47,181 - compute on device: cuda:0
2023-10-11 01:45:47,181 - embedding storage: none
2023-10-11 01:45:47,181 ----------------------------------------------------------------------------------------------------
2023-10-11 01:45:47,181 Model training base path: "hmbench-newseye/fi-hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax-bs8-wsFalse-e10-lr0.00016-poolingfirst-layers-1-crfFalse-4"
2023-10-11 01:45:47,181 ----------------------------------------------------------------------------------------------------
2023-10-11 01:45:47,181 ----------------------------------------------------------------------------------------------------
2023-10-11 01:45:47,181 Logging anything other than scalars to TensorBoard is currently not supported.
|
2023-10-11 01:45:55,978 epoch 1 - iter 14/146 - loss 2.84799591 - time (sec): 8.80 - samples/sec: 511.07 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-11 01:46:04,528 epoch 1 - iter 28/146 - loss 2.84070737 - time (sec): 17.35 - samples/sec: 495.81 - lr: 0.000030 - momentum: 0.000000 |
|
2023-10-11 01:46:12,675 epoch 1 - iter 42/146 - loss 2.82989666 - time (sec): 25.49 - samples/sec: 486.19 - lr: 0.000045 - momentum: 0.000000 |
|
2023-10-11 01:46:21,414 epoch 1 - iter 56/146 - loss 2.80652957 - time (sec): 34.23 - samples/sec: 492.38 - lr: 0.000060 - momentum: 0.000000 |
|
2023-10-11 01:46:29,846 epoch 1 - iter 70/146 - loss 2.76874544 - time (sec): 42.66 - samples/sec: 483.58 - lr: 0.000076 - momentum: 0.000000 |
|
2023-10-11 01:46:38,429 epoch 1 - iter 84/146 - loss 2.70903645 - time (sec): 51.25 - samples/sec: 482.63 - lr: 0.000091 - momentum: 0.000000 |
|
2023-10-11 01:46:47,869 epoch 1 - iter 98/146 - loss 2.62745353 - time (sec): 60.69 - samples/sec: 494.94 - lr: 0.000106 - momentum: 0.000000 |
|
2023-10-11 01:46:57,104 epoch 1 - iter 112/146 - loss 2.54807138 - time (sec): 69.92 - samples/sec: 494.47 - lr: 0.000122 - momentum: 0.000000 |
|
2023-10-11 01:47:05,555 epoch 1 - iter 126/146 - loss 2.47027428 - time (sec): 78.37 - samples/sec: 493.29 - lr: 0.000137 - momentum: 0.000000 |
|
2023-10-11 01:47:13,913 epoch 1 - iter 140/146 - loss 2.39191033 - time (sec): 86.73 - samples/sec: 489.96 - lr: 0.000152 - momentum: 0.000000 |
|
2023-10-11 01:47:17,745 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 01:47:17,745 EPOCH 1 done: loss 2.3510 - lr: 0.000152 |
|
2023-10-11 01:47:22,883 DEV : loss 1.284703254699707 - f1-score (micro avg) 0.0 |
|
2023-10-11 01:47:22,892 ---------------------------------------------------------------------------------------------------- |
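For reference, the lr column above ramps up linearly during epoch 1 and then decays: with warmup_fraction 0.1 over 10 x 146 = 1460 total steps, the first 146 steps warm up to the peak of 0.00016. A small sketch of that schedule follows; the exact off-by-one conventions of Flair's LinearScheduler may differ, so treat it as an approximation.

def linear_warmup_decay_lr(step, peak=0.00016, total_steps=1460, warmup_steps=146):
    """Approximate per-step learning rate implied by warmup_fraction=0.1."""
    if step < warmup_steps:
        return peak * step / warmup_steps  # warmup: linear ramp during epoch 1
    return peak * (total_steps - step) / (total_steps - warmup_steps)  # linear decay

# e.g. linear_warmup_decay_lr(140) ~ 0.000153 and linear_warmup_decay_lr(286) ~ 0.000143,
# close to the values logged near the end of epochs 1 and 2 (up to when the value is logged).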
|
2023-10-11 01:47:31,469 epoch 2 - iter 14/146 - loss 1.28576135 - time (sec): 8.57 - samples/sec: 484.20 - lr: 0.000158 - momentum: 0.000000 |
|
2023-10-11 01:47:40,053 epoch 2 - iter 28/146 - loss 1.19350508 - time (sec): 17.16 - samples/sec: 484.75 - lr: 0.000157 - momentum: 0.000000 |
|
2023-10-11 01:47:48,140 epoch 2 - iter 42/146 - loss 1.11461490 - time (sec): 25.25 - samples/sec: 487.08 - lr: 0.000155 - momentum: 0.000000 |
|
2023-10-11 01:47:56,860 epoch 2 - iter 56/146 - loss 1.01345969 - time (sec): 33.97 - samples/sec: 500.64 - lr: 0.000153 - momentum: 0.000000 |
|
2023-10-11 01:48:05,311 epoch 2 - iter 70/146 - loss 0.95460535 - time (sec): 42.42 - samples/sec: 501.66 - lr: 0.000152 - momentum: 0.000000 |
|
2023-10-11 01:48:13,908 epoch 2 - iter 84/146 - loss 0.90405139 - time (sec): 51.01 - samples/sec: 503.12 - lr: 0.000150 - momentum: 0.000000 |
|
2023-10-11 01:48:22,825 epoch 2 - iter 98/146 - loss 0.90347909 - time (sec): 59.93 - samples/sec: 508.28 - lr: 0.000148 - momentum: 0.000000 |
|
2023-10-11 01:48:31,635 epoch 2 - iter 112/146 - loss 0.87398082 - time (sec): 68.74 - samples/sec: 508.77 - lr: 0.000147 - momentum: 0.000000 |
|
2023-10-11 01:48:40,026 epoch 2 - iter 126/146 - loss 0.85172335 - time (sec): 77.13 - samples/sec: 505.74 - lr: 0.000145 - momentum: 0.000000 |
|
2023-10-11 01:48:48,193 epoch 2 - iter 140/146 - loss 0.81628531 - time (sec): 85.30 - samples/sec: 501.54 - lr: 0.000143 - momentum: 0.000000 |
|
2023-10-11 01:48:51,600 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 01:48:51,600 EPOCH 2 done: loss 0.8079 - lr: 0.000143 |
|
2023-10-11 01:48:57,050 DEV : loss 0.40683862566947937 - f1-score (micro avg) 0.0 |
|
2023-10-11 01:48:57,058 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 01:49:05,067 epoch 3 - iter 14/146 - loss 0.52447831 - time (sec): 8.01 - samples/sec: 448.66 - lr: 0.000141 - momentum: 0.000000 |
|
2023-10-11 01:49:13,729 epoch 3 - iter 28/146 - loss 0.46662481 - time (sec): 16.67 - samples/sec: 490.31 - lr: 0.000139 - momentum: 0.000000 |
|
2023-10-11 01:49:21,835 epoch 3 - iter 42/146 - loss 0.44349831 - time (sec): 24.77 - samples/sec: 491.32 - lr: 0.000137 - momentum: 0.000000 |
|
2023-10-11 01:49:30,221 epoch 3 - iter 56/146 - loss 0.48544884 - time (sec): 33.16 - samples/sec: 494.60 - lr: 0.000136 - momentum: 0.000000 |
|
2023-10-11 01:49:39,305 epoch 3 - iter 70/146 - loss 0.45625358 - time (sec): 42.24 - samples/sec: 505.20 - lr: 0.000134 - momentum: 0.000000 |
|
2023-10-11 01:49:48,487 epoch 3 - iter 84/146 - loss 0.44065904 - time (sec): 51.43 - samples/sec: 508.94 - lr: 0.000132 - momentum: 0.000000 |
|
2023-10-11 01:49:56,651 epoch 3 - iter 98/146 - loss 0.42778850 - time (sec): 59.59 - samples/sec: 504.64 - lr: 0.000131 - momentum: 0.000000 |
|
2023-10-11 01:50:04,334 epoch 3 - iter 112/146 - loss 0.41962827 - time (sec): 67.27 - samples/sec: 496.11 - lr: 0.000129 - momentum: 0.000000 |
|
2023-10-11 01:50:13,000 epoch 3 - iter 126/146 - loss 0.40808767 - time (sec): 75.94 - samples/sec: 496.87 - lr: 0.000127 - momentum: 0.000000 |
|
2023-10-11 01:50:22,043 epoch 3 - iter 140/146 - loss 0.40388098 - time (sec): 84.98 - samples/sec: 499.65 - lr: 0.000125 - momentum: 0.000000 |
|
2023-10-11 01:50:25,855 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 01:50:25,855 EPOCH 3 done: loss 0.4001 - lr: 0.000125 |
|
2023-10-11 01:50:31,482 DEV : loss 0.2588852643966675 - f1-score (micro avg) 0.276 |
|
2023-10-11 01:50:31,490 saving best model |
|
2023-10-11 01:50:32,587 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 01:50:41,007 epoch 4 - iter 14/146 - loss 0.24164107 - time (sec): 8.42 - samples/sec: 488.85 - lr: 0.000123 - momentum: 0.000000 |
|
2023-10-11 01:50:49,321 epoch 4 - iter 28/146 - loss 0.29146948 - time (sec): 16.73 - samples/sec: 500.12 - lr: 0.000121 - momentum: 0.000000 |
|
2023-10-11 01:50:57,496 epoch 4 - iter 42/146 - loss 0.28541708 - time (sec): 24.91 - samples/sec: 500.91 - lr: 0.000120 - momentum: 0.000000 |
|
2023-10-11 01:51:05,765 epoch 4 - iter 56/146 - loss 0.27322829 - time (sec): 33.18 - samples/sec: 502.12 - lr: 0.000118 - momentum: 0.000000 |
|
2023-10-11 01:51:13,936 epoch 4 - iter 70/146 - loss 0.27476808 - time (sec): 41.35 - samples/sec: 499.65 - lr: 0.000116 - momentum: 0.000000 |
|
2023-10-11 01:51:22,618 epoch 4 - iter 84/146 - loss 0.29740996 - time (sec): 50.03 - samples/sec: 502.25 - lr: 0.000115 - momentum: 0.000000 |
|
2023-10-11 01:51:31,373 epoch 4 - iter 98/146 - loss 0.30483075 - time (sec): 58.78 - samples/sec: 504.22 - lr: 0.000113 - momentum: 0.000000 |
|
2023-10-11 01:51:39,388 epoch 4 - iter 112/146 - loss 0.30252875 - time (sec): 66.80 - samples/sec: 498.73 - lr: 0.000111 - momentum: 0.000000 |
|
2023-10-11 01:51:47,732 epoch 4 - iter 126/146 - loss 0.29906065 - time (sec): 75.14 - samples/sec: 495.80 - lr: 0.000109 - momentum: 0.000000 |
|
2023-10-11 01:51:56,969 epoch 4 - iter 140/146 - loss 0.29067698 - time (sec): 84.38 - samples/sec: 501.64 - lr: 0.000108 - momentum: 0.000000 |
|
2023-10-11 01:52:00,782 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 01:52:00,782 EPOCH 4 done: loss 0.2895 - lr: 0.000108 |
|
2023-10-11 01:52:06,509 DEV : loss 0.2072305679321289 - f1-score (micro avg) 0.4454 |
|
2023-10-11 01:52:06,517 saving best model |
|
2023-10-11 01:52:09,093 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 01:52:18,749 epoch 5 - iter 14/146 - loss 0.30683651 - time (sec): 9.65 - samples/sec: 572.55 - lr: 0.000105 - momentum: 0.000000 |
|
2023-10-11 01:52:27,752 epoch 5 - iter 28/146 - loss 0.25854309 - time (sec): 18.65 - samples/sec: 545.82 - lr: 0.000104 - momentum: 0.000000 |
|
2023-10-11 01:52:35,868 epoch 5 - iter 42/146 - loss 0.25470375 - time (sec): 26.77 - samples/sec: 517.15 - lr: 0.000102 - momentum: 0.000000 |
|
2023-10-11 01:52:44,048 epoch 5 - iter 56/146 - loss 0.24566226 - time (sec): 34.95 - samples/sec: 511.09 - lr: 0.000100 - momentum: 0.000000 |
|
2023-10-11 01:52:52,786 epoch 5 - iter 70/146 - loss 0.24130166 - time (sec): 43.69 - samples/sec: 514.10 - lr: 0.000099 - momentum: 0.000000 |
|
2023-10-11 01:53:00,635 epoch 5 - iter 84/146 - loss 0.24670391 - time (sec): 51.54 - samples/sec: 507.09 - lr: 0.000097 - momentum: 0.000000 |
|
2023-10-11 01:53:09,102 epoch 5 - iter 98/146 - loss 0.23499454 - time (sec): 60.00 - samples/sec: 505.81 - lr: 0.000095 - momentum: 0.000000 |
|
2023-10-11 01:53:17,232 epoch 5 - iter 112/146 - loss 0.22971880 - time (sec): 68.13 - samples/sec: 503.34 - lr: 0.000093 - momentum: 0.000000 |
|
2023-10-11 01:53:25,052 epoch 5 - iter 126/146 - loss 0.22515418 - time (sec): 75.95 - samples/sec: 498.05 - lr: 0.000092 - momentum: 0.000000 |
|
2023-10-11 01:53:33,478 epoch 5 - iter 140/146 - loss 0.22370291 - time (sec): 84.38 - samples/sec: 498.10 - lr: 0.000090 - momentum: 0.000000 |
|
2023-10-11 01:53:37,472 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 01:53:37,472 EPOCH 5 done: loss 0.2231 - lr: 0.000090 |
|
2023-10-11 01:53:43,406 DEV : loss 0.18223470449447632 - f1-score (micro avg) 0.5156 |
|
2023-10-11 01:53:43,416 saving best model |
|
2023-10-11 01:53:45,986 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 01:53:55,576 epoch 6 - iter 14/146 - loss 0.21654807 - time (sec): 9.59 - samples/sec: 550.21 - lr: 0.000088 - momentum: 0.000000 |
|
2023-10-11 01:54:04,225 epoch 6 - iter 28/146 - loss 0.17901612 - time (sec): 18.23 - samples/sec: 535.03 - lr: 0.000086 - momentum: 0.000000 |
|
2023-10-11 01:54:12,266 epoch 6 - iter 42/146 - loss 0.17103159 - time (sec): 26.28 - samples/sec: 511.31 - lr: 0.000084 - momentum: 0.000000 |
|
2023-10-11 01:54:21,012 epoch 6 - iter 56/146 - loss 0.15995008 - time (sec): 35.02 - samples/sec: 508.91 - lr: 0.000083 - momentum: 0.000000 |
|
2023-10-11 01:54:29,434 epoch 6 - iter 70/146 - loss 0.16068347 - time (sec): 43.44 - samples/sec: 494.68 - lr: 0.000081 - momentum: 0.000000 |
|
2023-10-11 01:54:38,141 epoch 6 - iter 84/146 - loss 0.17578810 - time (sec): 52.15 - samples/sec: 494.74 - lr: 0.000079 - momentum: 0.000000 |
|
2023-10-11 01:54:48,217 epoch 6 - iter 98/146 - loss 0.16726653 - time (sec): 62.23 - samples/sec: 502.60 - lr: 0.000077 - momentum: 0.000000 |
|
2023-10-11 01:54:57,061 epoch 6 - iter 112/146 - loss 0.16948194 - time (sec): 71.07 - samples/sec: 491.93 - lr: 0.000076 - momentum: 0.000000 |
|
2023-10-11 01:55:06,565 epoch 6 - iter 126/146 - loss 0.16941279 - time (sec): 80.57 - samples/sec: 480.60 - lr: 0.000074 - momentum: 0.000000 |
|
2023-10-11 01:55:16,251 epoch 6 - iter 140/146 - loss 0.16809766 - time (sec): 90.26 - samples/sec: 472.65 - lr: 0.000072 - momentum: 0.000000 |
|
2023-10-11 01:55:20,370 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 01:55:20,371 EPOCH 6 done: loss 0.1675 - lr: 0.000072 |
|
2023-10-11 01:55:26,683 DEV : loss 0.15008610486984253 - f1-score (micro avg) 0.63 |
|
2023-10-11 01:55:26,693 saving best model |
|
2023-10-11 01:55:34,594 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 01:55:43,486 epoch 7 - iter 14/146 - loss 0.15174053 - time (sec): 8.89 - samples/sec: 406.96 - lr: 0.000070 - momentum: 0.000000 |
|
2023-10-11 01:55:52,482 epoch 7 - iter 28/146 - loss 0.13072503 - time (sec): 17.88 - samples/sec: 450.28 - lr: 0.000068 - momentum: 0.000000 |
|
2023-10-11 01:56:01,097 epoch 7 - iter 42/146 - loss 0.13179409 - time (sec): 26.50 - samples/sec: 442.24 - lr: 0.000067 - momentum: 0.000000 |
|
2023-10-11 01:56:11,670 epoch 7 - iter 56/146 - loss 0.12405555 - time (sec): 37.07 - samples/sec: 442.31 - lr: 0.000065 - momentum: 0.000000 |
|
2023-10-11 01:56:21,134 epoch 7 - iter 70/146 - loss 0.12815382 - time (sec): 46.54 - samples/sec: 461.77 - lr: 0.000063 - momentum: 0.000000 |
|
2023-10-11 01:56:29,304 epoch 7 - iter 84/146 - loss 0.13476612 - time (sec): 54.71 - samples/sec: 460.73 - lr: 0.000061 - momentum: 0.000000 |
|
2023-10-11 01:56:38,680 epoch 7 - iter 98/146 - loss 0.13398473 - time (sec): 64.08 - samples/sec: 469.26 - lr: 0.000060 - momentum: 0.000000 |
|
2023-10-11 01:56:47,734 epoch 7 - iter 112/146 - loss 0.13242540 - time (sec): 73.14 - samples/sec: 468.51 - lr: 0.000058 - momentum: 0.000000 |
|
2023-10-11 01:56:56,401 epoch 7 - iter 126/146 - loss 0.12813846 - time (sec): 81.80 - samples/sec: 469.65 - lr: 0.000056 - momentum: 0.000000 |
|
2023-10-11 01:57:05,171 epoch 7 - iter 140/146 - loss 0.13180592 - time (sec): 90.57 - samples/sec: 471.51 - lr: 0.000055 - momentum: 0.000000 |
|
2023-10-11 01:57:08,863 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 01:57:08,863 EPOCH 7 done: loss 0.1298 - lr: 0.000055 |
|
2023-10-11 01:57:14,752 DEV : loss 0.14380821585655212 - f1-score (micro avg) 0.6953 |
|
2023-10-11 01:57:14,762 saving best model |
|
2023-10-11 01:57:24,219 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 01:57:33,086 epoch 8 - iter 14/146 - loss 0.12831157 - time (sec): 8.86 - samples/sec: 474.89 - lr: 0.000052 - momentum: 0.000000 |
|
2023-10-11 01:57:42,240 epoch 8 - iter 28/146 - loss 0.10696556 - time (sec): 18.02 - samples/sec: 495.38 - lr: 0.000051 - momentum: 0.000000 |
|
2023-10-11 01:57:50,350 epoch 8 - iter 42/146 - loss 0.10681912 - time (sec): 26.13 - samples/sec: 473.89 - lr: 0.000049 - momentum: 0.000000 |
|
2023-10-11 01:57:59,658 epoch 8 - iter 56/146 - loss 0.11273648 - time (sec): 35.43 - samples/sec: 471.03 - lr: 0.000047 - momentum: 0.000000 |
|
2023-10-11 01:58:09,418 epoch 8 - iter 70/146 - loss 0.10310746 - time (sec): 45.19 - samples/sec: 478.00 - lr: 0.000045 - momentum: 0.000000 |
|
2023-10-11 01:58:18,799 epoch 8 - iter 84/146 - loss 0.10691084 - time (sec): 54.58 - samples/sec: 480.45 - lr: 0.000044 - momentum: 0.000000 |
|
2023-10-11 01:58:27,931 epoch 8 - iter 98/146 - loss 0.10879791 - time (sec): 63.71 - samples/sec: 482.22 - lr: 0.000042 - momentum: 0.000000 |
|
2023-10-11 01:58:36,616 epoch 8 - iter 112/146 - loss 0.10888580 - time (sec): 72.39 - samples/sec: 478.46 - lr: 0.000040 - momentum: 0.000000 |
|
2023-10-11 01:58:45,450 epoch 8 - iter 126/146 - loss 0.10711927 - time (sec): 81.23 - samples/sec: 471.58 - lr: 0.000039 - momentum: 0.000000 |
|
2023-10-11 01:58:55,017 epoch 8 - iter 140/146 - loss 0.10900138 - time (sec): 90.79 - samples/sec: 470.50 - lr: 0.000037 - momentum: 0.000000 |
|
2023-10-11 01:58:58,974 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 01:58:58,975 EPOCH 8 done: loss 0.1099 - lr: 0.000037 |
|
2023-10-11 01:59:04,887 DEV : loss 0.13584169745445251 - f1-score (micro avg) 0.7712 |
|
2023-10-11 01:59:04,897 saving best model |
|
2023-10-11 01:59:11,021 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 01:59:19,814 epoch 9 - iter 14/146 - loss 0.08415165 - time (sec): 8.79 - samples/sec: 475.62 - lr: 0.000035 - momentum: 0.000000 |
|
2023-10-11 01:59:28,488 epoch 9 - iter 28/146 - loss 0.09301484 - time (sec): 17.46 - samples/sec: 472.65 - lr: 0.000033 - momentum: 0.000000 |
|
2023-10-11 01:59:37,554 epoch 9 - iter 42/146 - loss 0.08023038 - time (sec): 26.53 - samples/sec: 483.47 - lr: 0.000031 - momentum: 0.000000 |
|
2023-10-11 01:59:46,510 epoch 9 - iter 56/146 - loss 0.08357733 - time (sec): 35.48 - samples/sec: 480.43 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-11 01:59:55,072 epoch 9 - iter 70/146 - loss 0.08569746 - time (sec): 44.05 - samples/sec: 485.46 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-11 02:00:03,593 epoch 9 - iter 84/146 - loss 0.09152495 - time (sec): 52.57 - samples/sec: 487.52 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-11 02:00:12,273 epoch 9 - iter 98/146 - loss 0.09293580 - time (sec): 61.25 - samples/sec: 487.05 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-11 02:00:21,641 epoch 9 - iter 112/146 - loss 0.09202586 - time (sec): 70.62 - samples/sec: 494.33 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-11 02:00:30,382 epoch 9 - iter 126/146 - loss 0.09556408 - time (sec): 79.36 - samples/sec: 489.70 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-11 02:00:39,304 epoch 9 - iter 140/146 - loss 0.09594021 - time (sec): 88.28 - samples/sec: 488.90 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-11 02:00:42,560 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 02:00:42,560 EPOCH 9 done: loss 0.0978 - lr: 0.000019 |
|
2023-10-11 02:00:48,591 DEV : loss 0.13022395968437195 - f1-score (micro avg) 0.7804 |
|
2023-10-11 02:00:48,600 saving best model |
|
2023-10-11 02:00:57,702 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 02:01:06,932 epoch 10 - iter 14/146 - loss 0.09334435 - time (sec): 9.23 - samples/sec: 547.15 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-11 02:01:15,693 epoch 10 - iter 28/146 - loss 0.09187192 - time (sec): 17.99 - samples/sec: 518.99 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-11 02:01:24,775 epoch 10 - iter 42/146 - loss 0.08406622 - time (sec): 27.07 - samples/sec: 515.09 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-11 02:01:32,761 epoch 10 - iter 56/146 - loss 0.09189053 - time (sec): 35.05 - samples/sec: 499.56 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-11 02:01:41,775 epoch 10 - iter 70/146 - loss 0.09213464 - time (sec): 44.07 - samples/sec: 501.69 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-11 02:01:51,096 epoch 10 - iter 84/146 - loss 0.08675532 - time (sec): 53.39 - samples/sec: 503.47 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-11 02:01:59,298 epoch 10 - iter 98/146 - loss 0.08770465 - time (sec): 61.59 - samples/sec: 496.31 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-11 02:02:07,665 epoch 10 - iter 112/146 - loss 0.08846977 - time (sec): 69.96 - samples/sec: 489.99 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-11 02:02:16,158 epoch 10 - iter 126/146 - loss 0.08936363 - time (sec): 78.45 - samples/sec: 488.05 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-11 02:02:24,963 epoch 10 - iter 140/146 - loss 0.09235372 - time (sec): 87.26 - samples/sec: 490.59 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-11 02:02:28,301 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 02:02:28,301 EPOCH 10 done: loss 0.0909 - lr: 0.000002 |
|
2023-10-11 02:02:34,184 DEV : loss 0.1293526142835617 - f1-score (micro avg) 0.7702 |
|
2023-10-11 02:02:35,042 ----------------------------------------------------------------------------------------------------
2023-10-11 02:02:35,044 Loading model from best epoch ...
2023-10-11 02:02:39,160 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
2023-10-11 02:02:52,020
Results:
- F-score (micro) 0.7274
- F-score (macro) 0.6818
- Accuracy 0.5901

By class:
              precision    recall  f1-score   support

         PER     0.7745    0.8190    0.7961       348
         LOC     0.6258    0.7816    0.6951       261
         ORG     0.3800    0.3654    0.3725        52
   HumanProd     0.8636    0.8636    0.8636        22

   micro avg     0.6880    0.7716    0.7274       683
   macro avg     0.6610    0.7074    0.6818       683
weighted avg     0.6905    0.7716    0.7274       683

2023-10-11 02:02:52,020 ----------------------------------------------------------------------------------------------------
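The report above comes from evaluating best-model.pt on the 415-sentence test split. A hedged sketch of reproducing it with stock Flair calls; it assumes the corpus object from the loader sketch near the top of the log, and the checkpoint path is the training base path recorded in the setup section.

from flair.models import SequenceTagger

# best-model.pt is written into the training base path whenever a new best
# dev micro-F1 is reached (the "saving best model" lines above).
tagger = SequenceTagger.load(
    "hmbench-newseye/fi-hmbyt5-preliminary/"
    "byt5-small-historic-multilingual-span20-flax-bs8-wsFalse-e10-lr0.00016-poolingfirst-layers-1-crfFalse-4/"
    "best-model.pt"
)

result = tagger.evaluate(corpus.test, gold_label_type="ner", mini_batch_size=8)
print(result.detailed_results)  # per-class precision/recall/F1 table as above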
|
|