2023-10-11 00:14:02,384 ----------------------------------------------------------------------------------------------------
2023-10-11 00:14:02,386 Model: "SequenceTagger(
  (embeddings): ByT5Embeddings(
    (model): T5EncoderModel(
      (shared): Embedding(384, 1472)
      (encoder): T5Stack(
        (embed_tokens): Embedding(384, 1472)
        (block): ModuleList(
          (0): T5Block(
            (layer): ModuleList(
              (0): T5LayerSelfAttention(
                (SelfAttention): T5Attention(
                  (q): Linear(in_features=1472, out_features=384, bias=False)
                  (k): Linear(in_features=1472, out_features=384, bias=False)
                  (v): Linear(in_features=1472, out_features=384, bias=False)
                  (o): Linear(in_features=384, out_features=1472, bias=False)
                  (relative_attention_bias): Embedding(32, 6)
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (1): T5LayerFF(
                (DenseReluDense): T5DenseGatedActDense(
                  (wi_0): Linear(in_features=1472, out_features=3584, bias=False)
                  (wi_1): Linear(in_features=1472, out_features=3584, bias=False)
                  (wo): Linear(in_features=3584, out_features=1472, bias=False)
                  (dropout): Dropout(p=0.1, inplace=False)
                  (act): NewGELUActivation()
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
          )
          (1-11): 11 x T5Block(
            (layer): ModuleList(
              (0): T5LayerSelfAttention(
                (SelfAttention): T5Attention(
                  (q): Linear(in_features=1472, out_features=384, bias=False)
                  (k): Linear(in_features=1472, out_features=384, bias=False)
                  (v): Linear(in_features=1472, out_features=384, bias=False)
                  (o): Linear(in_features=384, out_features=1472, bias=False)
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (1): T5LayerFF(
                (DenseReluDense): T5DenseGatedActDense(
                  (wi_0): Linear(in_features=1472, out_features=3584, bias=False)
                  (wi_1): Linear(in_features=1472, out_features=3584, bias=False)
                  (wo): Linear(in_features=3584, out_features=1472, bias=False)
                  (dropout): Dropout(p=0.1, inplace=False)
                  (act): NewGELUActivation()
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
          )
        )
        (final_layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=1472, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
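The shapes printed above are enough to estimate the encoder's size: byte vocabulary 384, model width 1472, attention inner dimension 384, gated FFN width 3584, 12 blocks. A back-of-the-envelope count derived only from those shapes (weights only; it ignores the small 32x6 relative_attention_bias table in block 0):

```python
# Parameter estimate derived purely from the layer shapes in the log above.
D, INNER, FF, VOCAB, BLOCKS = 1472, 384, 3584, 384, 12

attn = 3 * D * INNER + INNER * D   # q, k, v projections plus output projection o
ffn = 2 * D * FF + FF * D          # wi_0 and wi_1 (gated) plus wo
norms = 2 * D                      # one RMSNorm weight vector per sub-layer
per_block = attn + ffn + norms

# shared byte embedding + 12 blocks + final_layer_norm
total = VOCAB * D + BLOCKS * per_block + D
print(f"{per_block:,} params/block, ~{total / 1e6:.0f}M encoder params")
```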
2023-10-11 00:14:02,386 ----------------------------------------------------------------------------------------------------
2023-10-11 00:14:02,386 MultiCorpus: 1166 train + 165 dev + 415 test sentences
 - NER_HIPE_2022 Corpus: 1166 train + 165 dev + 415 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fi/with_doc_seperator
2023-10-11 00:14:02,387 ----------------------------------------------------------------------------------------------------
2023-10-11 00:14:02,387 Train:  1166 sentences
2023-10-11 00:14:02,387         (train_with_dev=False, train_with_test=False)
2023-10-11 00:14:02,387 ----------------------------------------------------------------------------------------------------
2023-10-11 00:14:02,387 Training Params:
2023-10-11 00:14:02,387  - learning_rate: "0.00015"
2023-10-11 00:14:02,387  - mini_batch_size: "8"
2023-10-11 00:14:02,387  - max_epochs: "10"
2023-10-11 00:14:02,387  - shuffle: "True"
2023-10-11 00:14:02,387 ----------------------------------------------------------------------------------------------------
2023-10-11 00:14:02,387 Plugins:
2023-10-11 00:14:02,387  - TensorboardLogger
2023-10-11 00:14:02,387  - LinearScheduler | warmup_fraction: '0.1'
2023-10-11 00:14:02,388 ----------------------------------------------------------------------------------------------------
2023-10-11 00:14:02,388 Final evaluation on model from best epoch (best-model.pt)
2023-10-11 00:14:02,388  - metric: "('micro avg', 'f1-score')"
2023-10-11 00:14:02,388 ----------------------------------------------------------------------------------------------------
2023-10-11 00:14:02,388 Computation:
2023-10-11 00:14:02,388  - compute on device: cuda:0
2023-10-11 00:14:02,388  - embedding storage: none
2023-10-11 00:14:02,388 ----------------------------------------------------------------------------------------------------
2023-10-11 00:14:02,388 Model training base path: "hmbench-newseye/fi-hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax-bs8-wsFalse-e10-lr0.00015-poolingfirst-layers-1-crfFalse-3"
2023-10-11 00:14:02,388 ----------------------------------------------------------------------------------------------------
2023-10-11 00:14:02,388 ----------------------------------------------------------------------------------------------------
2023-10-11 00:14:02,388 Logging anything other than scalars to TensorBoard is currently not supported.
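The LinearScheduler plugin with warmup_fraction '0.1' ramps the learning rate from zero to the configured peak (0.00015) over the first tenth of all updates and then decays it linearly back to zero, which is the shape traced by the lr column in the iteration lines below. A minimal generic sketch of such a schedule (not Flair's own implementation; the step count is taken from the log's 146 mini-batches x 10 epochs):

```python
def linear_warmup_decay_lr(step, total_steps, peak_lr, warmup_fraction=0.1):
    """Linear warmup to peak_lr over the first fraction of steps, then linear decay to zero."""
    warmup_steps = int(total_steps * warmup_fraction)
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)

total = 146 * 10  # 146 mini-batches per epoch, 10 epochs (from the log)
print(linear_warmup_decay_lr(146, total, 0.00015))   # peak, at end of warmup
print(linear_warmup_decay_lr(total, total, 0.00015)) # decayed to zero
```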
2023-10-11 00:14:11,164 epoch 1 - iter 14/146 - loss 2.82817866 - time (sec): 8.77 - samples/sec: 427.52 - lr: 0.000013 - momentum: 0.000000
2023-10-11 00:14:20,453 epoch 1 - iter 28/146 - loss 2.81986952 - time (sec): 18.06 - samples/sec: 450.86 - lr: 0.000028 - momentum: 0.000000
2023-10-11 00:14:29,670 epoch 1 - iter 42/146 - loss 2.81010839 - time (sec): 27.28 - samples/sec: 448.27 - lr: 0.000042 - momentum: 0.000000
2023-10-11 00:14:38,407 epoch 1 - iter 56/146 - loss 2.79282156 - time (sec): 36.02 - samples/sec: 439.82 - lr: 0.000057 - momentum: 0.000000
2023-10-11 00:14:48,463 epoch 1 - iter 70/146 - loss 2.75564153 - time (sec): 46.07 - samples/sec: 449.23 - lr: 0.000071 - momentum: 0.000000
2023-10-11 00:14:58,486 epoch 1 - iter 84/146 - loss 2.70141670 - time (sec): 56.10 - samples/sec: 458.13 - lr: 0.000085 - momentum: 0.000000
2023-10-11 00:15:07,903 epoch 1 - iter 98/146 - loss 2.63744532 - time (sec): 65.51 - samples/sec: 457.86 - lr: 0.000100 - momentum: 0.000000
2023-10-11 00:15:16,446 epoch 1 - iter 112/146 - loss 2.57069765 - time (sec): 74.06 - samples/sec: 459.84 - lr: 0.000114 - momentum: 0.000000
2023-10-11 00:15:25,138 epoch 1 - iter 126/146 - loss 2.48819700 - time (sec): 82.75 - samples/sec: 464.09 - lr: 0.000128 - momentum: 0.000000
2023-10-11 00:15:33,976 epoch 1 - iter 140/146 - loss 2.40859330 - time (sec): 91.59 - samples/sec: 464.91 - lr: 0.000143 - momentum: 0.000000
2023-10-11 00:15:37,658 ----------------------------------------------------------------------------------------------------
2023-10-11 00:15:37,659 EPOCH 1 done: loss 2.3727 - lr: 0.000143
2023-10-11 00:15:42,937 DEV : loss 1.3521078824996948 - f1-score (micro avg)  0.0
2023-10-11 00:15:42,946 ----------------------------------------------------------------------------------------------------
2023-10-11 00:15:50,957 epoch 2 - iter 14/146 - loss 1.37293642 - time (sec): 8.01 - samples/sec: 471.08 - lr: 0.000149 - momentum: 0.000000
2023-10-11 00:15:59,218 epoch 2 - iter 28/146 - loss 1.27588533 - time (sec): 16.27 - samples/sec: 479.57 - lr: 0.000147 - momentum: 0.000000
2023-10-11 00:16:07,875 epoch 2 - iter 42/146 - loss 1.19578947 - time (sec): 24.93 - samples/sec: 484.01 - lr: 0.000145 - momentum: 0.000000
2023-10-11 00:16:16,004 epoch 2 - iter 56/146 - loss 1.12937665 - time (sec): 33.06 - samples/sec: 480.81 - lr: 0.000144 - momentum: 0.000000
2023-10-11 00:16:24,974 epoch 2 - iter 70/146 - loss 1.04224254 - time (sec): 42.03 - samples/sec: 486.91 - lr: 0.000142 - momentum: 0.000000
2023-10-11 00:16:34,067 epoch 2 - iter 84/146 - loss 1.00774700 - time (sec): 51.12 - samples/sec: 489.84 - lr: 0.000141 - momentum: 0.000000
2023-10-11 00:16:42,519 epoch 2 - iter 98/146 - loss 0.96099641 - time (sec): 59.57 - samples/sec: 487.23 - lr: 0.000139 - momentum: 0.000000
2023-10-11 00:16:51,251 epoch 2 - iter 112/146 - loss 0.90901669 - time (sec): 68.30 - samples/sec: 489.49 - lr: 0.000137 - momentum: 0.000000
2023-10-11 00:17:00,072 epoch 2 - iter 126/146 - loss 0.86569250 - time (sec): 77.12 - samples/sec: 491.52 - lr: 0.000136 - momentum: 0.000000
2023-10-11 00:17:09,029 epoch 2 - iter 140/146 - loss 0.83180630 - time (sec): 86.08 - samples/sec: 492.45 - lr: 0.000134 - momentum: 0.000000
2023-10-11 00:17:12,929 ----------------------------------------------------------------------------------------------------
2023-10-11 00:17:12,930 EPOCH 2 done: loss 0.8265 - lr: 0.000134
2023-10-11 00:17:18,512 DEV : loss 0.45962727069854736 - f1-score (micro avg)  0.0
2023-10-11 00:17:18,522 ----------------------------------------------------------------------------------------------------
2023-10-11 00:17:27,515 epoch 3 - iter 14/146 - loss 0.56664628 - time (sec): 8.99 - samples/sec: 550.74 - lr: 0.000132 - momentum: 0.000000
2023-10-11 00:17:36,702 epoch 3 - iter 28/146 - loss 0.51000469 - time (sec): 18.18 - samples/sec: 553.40 - lr: 0.000130 - momentum: 0.000000
2023-10-11 00:17:45,427 epoch 3 - iter 42/146 - loss 0.55411592 - time (sec): 26.90 - samples/sec: 537.41 - lr: 0.000129 - momentum: 0.000000
2023-10-11 00:17:53,675 epoch 3 - iter 56/146 - loss 0.52261842 - time (sec): 35.15 - samples/sec: 527.12 - lr: 0.000127 - momentum: 0.000000
2023-10-11 00:18:02,091 epoch 3 - iter 70/146 - loss 0.51648209 - time (sec): 43.57 - samples/sec: 523.93 - lr: 0.000126 - momentum: 0.000000
2023-10-11 00:18:10,925 epoch 3 - iter 84/146 - loss 0.49728401 - time (sec): 52.40 - samples/sec: 518.27 - lr: 0.000124 - momentum: 0.000000
2023-10-11 00:18:19,329 epoch 3 - iter 98/146 - loss 0.47812146 - time (sec): 60.81 - samples/sec: 512.91 - lr: 0.000122 - momentum: 0.000000
2023-10-11 00:18:27,211 epoch 3 - iter 112/146 - loss 0.47088239 - time (sec): 68.69 - samples/sec: 505.81 - lr: 0.000121 - momentum: 0.000000
2023-10-11 00:18:34,844 epoch 3 - iter 126/146 - loss 0.46014170 - time (sec): 76.32 - samples/sec: 498.40 - lr: 0.000119 - momentum: 0.000000
2023-10-11 00:18:43,390 epoch 3 - iter 140/146 - loss 0.45282216 - time (sec): 84.87 - samples/sec: 496.35 - lr: 0.000118 - momentum: 0.000000
2023-10-11 00:18:47,362 ----------------------------------------------------------------------------------------------------
2023-10-11 00:18:47,362 EPOCH 3 done: loss 0.4440 - lr: 0.000118
2023-10-11 00:18:53,034 DEV : loss 0.28692546486854553 - f1-score (micro avg)  0.1634
2023-10-11 00:18:53,043 saving best model
2023-10-11 00:18:53,929 ----------------------------------------------------------------------------------------------------
2023-10-11 00:19:02,136 epoch 4 - iter 14/146 - loss 0.33858093 - time (sec): 8.21 - samples/sec: 468.00 - lr: 0.000115 - momentum: 0.000000
2023-10-11 00:19:11,252 epoch 4 - iter 28/146 - loss 0.33897577 - time (sec): 17.32 - samples/sec: 482.48 - lr: 0.000114 - momentum: 0.000000
2023-10-11 00:19:19,557 epoch 4 - iter 42/146 - loss 0.32510505 - time (sec): 25.63 - samples/sec: 480.07 - lr: 0.000112 - momentum: 0.000000
2023-10-11 00:19:27,999 epoch 4 - iter 56/146 - loss 0.33741625 - time (sec): 34.07 - samples/sec: 484.30 - lr: 0.000111 - momentum: 0.000000
2023-10-11 00:19:36,717 epoch 4 - iter 70/146 - loss 0.32330073 - time (sec): 42.79 - samples/sec: 494.60 - lr: 0.000109 - momentum: 0.000000
2023-10-11 00:19:45,262 epoch 4 - iter 84/146 - loss 0.35066962 - time (sec): 51.33 - samples/sec: 491.95 - lr: 0.000107 - momentum: 0.000000
2023-10-11 00:19:53,579 epoch 4 - iter 98/146 - loss 0.34254804 - time (sec): 59.65 - samples/sec: 491.65 - lr: 0.000106 - momentum: 0.000000
2023-10-11 00:20:02,300 epoch 4 - iter 112/146 - loss 0.33425876 - time (sec): 68.37 - samples/sec: 495.96 - lr: 0.000104 - momentum: 0.000000
2023-10-11 00:20:10,709 epoch 4 - iter 126/146 - loss 0.33494481 - time (sec): 76.78 - samples/sec: 495.05 - lr: 0.000103 - momentum: 0.000000
2023-10-11 00:20:19,699 epoch 4 - iter 140/146 - loss 0.32750282 - time (sec): 85.77 - samples/sec: 494.85 - lr: 0.000101 - momentum: 0.000000
2023-10-11 00:20:23,418 ----------------------------------------------------------------------------------------------------
2023-10-11 00:20:23,418 EPOCH 4 done: loss 0.3225 - lr: 0.000101
2023-10-11 00:20:29,017 DEV : loss 0.23322905600070953 - f1-score (micro avg)  0.332
2023-10-11 00:20:29,025 saving best model
2023-10-11 00:20:35,031 ----------------------------------------------------------------------------------------------------
2023-10-11 00:20:43,857 epoch 5 - iter 14/146 - loss 0.27735107 - time (sec): 8.82 - samples/sec: 510.33 - lr: 0.000099 - momentum: 0.000000
2023-10-11 00:20:52,350 epoch 5 - iter 28/146 - loss 0.25431500 - time (sec): 17.31 - samples/sec: 499.40 - lr: 0.000097 - momentum: 0.000000
2023-10-11 00:21:00,691 epoch 5 - iter 42/146 - loss 0.29245784 - time (sec): 25.66 - samples/sec: 492.51 - lr: 0.000096 - momentum: 0.000000
2023-10-11 00:21:08,893 epoch 5 - iter 56/146 - loss 0.30867369 - time (sec): 33.86 - samples/sec: 484.17 - lr: 0.000094 - momentum: 0.000000
2023-10-11 00:21:17,431 epoch 5 - iter 70/146 - loss 0.28826282 - time (sec): 42.40 - samples/sec: 484.22 - lr: 0.000092 - momentum: 0.000000
2023-10-11 00:21:26,668 epoch 5 - iter 84/146 - loss 0.27456335 - time (sec): 51.63 - samples/sec: 487.77 - lr: 0.000091 - momentum: 0.000000
2023-10-11 00:21:36,019 epoch 5 - iter 98/146 - loss 0.26911782 - time (sec): 60.98 - samples/sec: 497.02 - lr: 0.000089 - momentum: 0.000000
2023-10-11 00:21:44,734 epoch 5 - iter 112/146 - loss 0.25803376 - time (sec): 69.70 - samples/sec: 498.10 - lr: 0.000088 - momentum: 0.000000
2023-10-11 00:21:53,421 epoch 5 - iter 126/146 - loss 0.25520050 - time (sec): 78.39 - samples/sec: 498.15 - lr: 0.000086 - momentum: 0.000000
2023-10-11 00:22:01,791 epoch 5 - iter 140/146 - loss 0.25148287 - time (sec): 86.76 - samples/sec: 496.66 - lr: 0.000084 - momentum: 0.000000
2023-10-11 00:22:05,100 ----------------------------------------------------------------------------------------------------
2023-10-11 00:22:05,100 EPOCH 5 done: loss 0.2521 - lr: 0.000084
2023-10-11 00:22:10,781 DEV : loss 0.19501639902591705 - f1-score (micro avg)  0.473
2023-10-11 00:22:10,790 saving best model
2023-10-11 00:22:16,955 ----------------------------------------------------------------------------------------------------
2023-10-11 00:22:26,475 epoch 6 - iter 14/146 - loss 0.16514820 - time (sec): 9.52 - samples/sec: 514.75 - lr: 0.000082 - momentum: 0.000000
2023-10-11 00:22:34,815 epoch 6 - iter 28/146 - loss 0.17522830 - time (sec): 17.86 - samples/sec: 477.67 - lr: 0.000081 - momentum: 0.000000
2023-10-11 00:22:43,531 epoch 6 - iter 42/146 - loss 0.17690455 - time (sec): 26.57 - samples/sec: 477.73 - lr: 0.000079 - momentum: 0.000000
2023-10-11 00:22:52,461 epoch 6 - iter 56/146 - loss 0.16628079 - time (sec): 35.50 - samples/sec: 484.40 - lr: 0.000077 - momentum: 0.000000
2023-10-11 00:23:00,737 epoch 6 - iter 70/146 - loss 0.18071160 - time (sec): 43.78 - samples/sec: 483.63 - lr: 0.000076 - momentum: 0.000000
2023-10-11 00:23:10,600 epoch 6 - iter 84/146 - loss 0.20187792 - time (sec): 53.64 - samples/sec: 497.31 - lr: 0.000074 - momentum: 0.000000
2023-10-11 00:23:19,008 epoch 6 - iter 98/146 - loss 0.20080362 - time (sec): 62.05 - samples/sec: 494.99 - lr: 0.000073 - momentum: 0.000000
2023-10-11 00:23:27,506 epoch 6 - iter 112/146 - loss 0.19888829 - time (sec): 70.55 - samples/sec: 493.73 - lr: 0.000071 - momentum: 0.000000
2023-10-11 00:23:35,994 epoch 6 - iter 126/146 - loss 0.19539473 - time (sec): 79.03 - samples/sec: 493.95 - lr: 0.000069 - momentum: 0.000000
2023-10-11 00:23:43,994 epoch 6 - iter 140/146 - loss 0.19529054 - time (sec): 87.03 - samples/sec: 491.39 - lr: 0.000068 - momentum: 0.000000
2023-10-11 00:23:47,387 ----------------------------------------------------------------------------------------------------
2023-10-11 00:23:47,387 EPOCH 6 done: loss 0.1923 - lr: 0.000068
2023-10-11 00:23:52,889 DEV : loss 0.1738743782043457 - f1-score (micro avg)  0.5498
2023-10-11 00:23:52,897 saving best model
2023-10-11 00:23:59,052 ----------------------------------------------------------------------------------------------------
2023-10-11 00:24:08,020 epoch 7 - iter 14/146 - loss 0.15115652 - time (sec): 8.96 - samples/sec: 516.15 - lr: 0.000066 - momentum: 0.000000
2023-10-11 00:24:16,989 epoch 7 - iter 28/146 - loss 0.15169848 - time (sec): 17.93 - samples/sec: 529.01 - lr: 0.000064 - momentum: 0.000000
2023-10-11 00:24:25,559 epoch 7 - iter 42/146 - loss 0.15112913 - time (sec): 26.50 - samples/sec: 514.72 - lr: 0.000062 - momentum: 0.000000
2023-10-11 00:24:33,559 epoch 7 - iter 56/146 - loss 0.14375947 - time (sec): 34.50 - samples/sec: 505.96 - lr: 0.000061 - momentum: 0.000000
2023-10-11 00:24:41,851 epoch 7 - iter 70/146 - loss 0.14191662 - time (sec): 42.80 - samples/sec: 502.65 - lr: 0.000059 - momentum: 0.000000
2023-10-11 00:24:49,764 epoch 7 - iter 84/146 - loss 0.14733674 - time (sec): 50.71 - samples/sec: 499.78 - lr: 0.000058 - momentum: 0.000000
2023-10-11 00:24:58,428 epoch 7 - iter 98/146 - loss 0.15209724 - time (sec): 59.37 - samples/sec: 503.09 - lr: 0.000056 - momentum: 0.000000
2023-10-11 00:25:06,268 epoch 7 - iter 112/146 - loss 0.15110304 - time (sec): 67.21 - samples/sec: 493.92 - lr: 0.000054 - momentum: 0.000000
2023-10-11 00:25:15,356 epoch 7 - iter 126/146 - loss 0.15315807 - time (sec): 76.30 - samples/sec: 498.24 - lr: 0.000053 - momentum: 0.000000
2023-10-11 00:25:24,258 epoch 7 - iter 140/146 - loss 0.15339801 - time (sec): 85.20 - samples/sec: 504.10 - lr: 0.000051 - momentum: 0.000000
2023-10-11 00:25:27,449 ----------------------------------------------------------------------------------------------------
2023-10-11 00:25:27,450 EPOCH 7 done: loss 0.1525 - lr: 0.000051
2023-10-11 00:25:33,160 DEV : loss 0.1568579375743866 - f1-score (micro avg)  0.6026
2023-10-11 00:25:33,170 saving best model
2023-10-11 00:25:39,402 ----------------------------------------------------------------------------------------------------
2023-10-11 00:25:48,695 epoch 8 - iter 14/146 - loss 0.14470409 - time (sec): 9.29 - samples/sec: 565.87 - lr: 0.000049 - momentum: 0.000000
2023-10-11 00:25:56,856 epoch 8 - iter 28/146 - loss 0.15466879 - time (sec): 17.45 - samples/sec: 513.07 - lr: 0.000047 - momentum: 0.000000
2023-10-11 00:26:05,090 epoch 8 - iter 42/146 - loss 0.14556756 - time (sec): 25.68 - samples/sec: 500.74 - lr: 0.000046 - momentum: 0.000000
2023-10-11 00:26:13,757 epoch 8 - iter 56/146 - loss 0.14562604 - time (sec): 34.35 - samples/sec: 497.81 - lr: 0.000044 - momentum: 0.000000
2023-10-11 00:26:22,674 epoch 8 - iter 70/146 - loss 0.14744312 - time (sec): 43.27 - samples/sec: 498.29 - lr: 0.000043 - momentum: 0.000000
2023-10-11 00:26:31,283 epoch 8 - iter 84/146 - loss 0.14623450 - time (sec): 51.88 - samples/sec: 486.66 - lr: 0.000041 - momentum: 0.000000
2023-10-11 00:26:40,567 epoch 8 - iter 98/146 - loss 0.13910467 - time (sec): 61.16 - samples/sec: 479.51 - lr: 0.000039 - momentum: 0.000000
2023-10-11 00:26:50,125 epoch 8 - iter 112/146 - loss 0.13334599 - time (sec): 70.72 - samples/sec: 476.83 - lr: 0.000038 - momentum: 0.000000
2023-10-11 00:26:59,926 epoch 8 - iter 126/146 - loss 0.12965286 - time (sec): 80.52 - samples/sec: 473.91 - lr: 0.000036 - momentum: 0.000000
2023-10-11 00:27:09,581 epoch 8 - iter 140/146 - loss 0.12939577 - time (sec): 90.18 - samples/sec: 471.19 - lr: 0.000035 - momentum: 0.000000
2023-10-11 00:27:13,663 ----------------------------------------------------------------------------------------------------
2023-10-11 00:27:13,664 EPOCH 8 done: loss 0.1293 - lr: 0.000035
2023-10-11 00:27:20,336 DEV : loss 0.14915454387664795 - f1-score (micro avg)  0.6711
2023-10-11 00:27:20,346 saving best model
2023-10-11 00:27:26,639 ----------------------------------------------------------------------------------------------------
2023-10-11 00:27:35,810 epoch 9 - iter 14/146 - loss 0.14480468 - time (sec): 9.17 - samples/sec: 512.79 - lr: 0.000032 - momentum: 0.000000
2023-10-11 00:27:44,972 epoch 9 - iter 28/146 - loss 0.12111733 - time (sec): 18.33 - samples/sec: 508.59 - lr: 0.000031 - momentum: 0.000000
2023-10-11 00:27:53,282 epoch 9 - iter 42/146 - loss 0.11551154 - time (sec): 26.64 - samples/sec: 494.12 - lr: 0.000029 - momentum: 0.000000
2023-10-11 00:28:02,178 epoch 9 - iter 56/146 - loss 0.11450421 - time (sec): 35.54 - samples/sec: 496.91 - lr: 0.000028 - momentum: 0.000000
2023-10-11 00:28:11,315 epoch 9 - iter 70/146 - loss 0.11627392 - time (sec): 44.67 - samples/sec: 491.31 - lr: 0.000026 - momentum: 0.000000
2023-10-11 00:28:20,239 epoch 9 - iter 84/146 - loss 0.11633930 - time (sec): 53.60 - samples/sec: 491.19 - lr: 0.000024 - momentum: 0.000000
2023-10-11 00:28:28,971 epoch 9 - iter 98/146 - loss 0.11323542 - time (sec): 62.33 - samples/sec: 488.62 - lr: 0.000023 - momentum: 0.000000
2023-10-11 00:28:37,804 epoch 9 - iter 112/146 - loss 0.10890718 - time (sec): 71.16 - samples/sec: 488.50 - lr: 0.000021 - momentum: 0.000000
2023-10-11 00:28:46,797 epoch 9 - iter 126/146 - loss 0.11265525 - time (sec): 80.15 - samples/sec: 487.63 - lr: 0.000020 - momentum: 0.000000
2023-10-11 00:28:55,411 epoch 9 - iter 140/146 - loss 0.11540303 - time (sec): 88.77 - samples/sec: 485.35 - lr: 0.000018 - momentum: 0.000000
2023-10-11 00:28:58,607 ----------------------------------------------------------------------------------------------------
2023-10-11 00:28:58,607 EPOCH 9 done: loss 0.1148 - lr: 0.000018
2023-10-11 00:29:04,628 DEV : loss 0.15014490485191345 - f1-score (micro avg)  0.7097
2023-10-11 00:29:04,638 saving best model
2023-10-11 00:29:10,636 ----------------------------------------------------------------------------------------------------
2023-10-11 00:29:19,540 epoch 10 - iter 14/146 - loss 0.11532110 - time (sec): 8.90 - samples/sec: 515.52 - lr: 0.000016 - momentum: 0.000000
2023-10-11 00:29:28,674 epoch 10 - iter 28/146 - loss 0.11738693 - time (sec): 18.03 - samples/sec: 506.38 - lr: 0.000014 - momentum: 0.000000
2023-10-11 00:29:37,812 epoch 10 - iter 42/146 - loss 0.11927845 - time (sec): 27.17 - samples/sec: 512.69 - lr: 0.000013 - momentum: 0.000000
2023-10-11 00:29:47,596 epoch 10 - iter 56/146 - loss 0.11248663 - time (sec): 36.96 - samples/sec: 504.22 - lr: 0.000011 - momentum: 0.000000
2023-10-11 00:29:56,837 epoch 10 - iter 70/146 - loss 0.11378977 - time (sec): 46.20 - samples/sec: 489.68 - lr: 0.000009 - momentum: 0.000000
2023-10-11 00:30:06,496 epoch 10 - iter 84/146 - loss 0.10860430 - time (sec): 55.86 - samples/sec: 483.19 - lr: 0.000008 - momentum: 0.000000
2023-10-11 00:30:15,171 epoch 10 - iter 98/146 - loss 0.10596910 - time (sec): 64.53 - samples/sec: 467.25 - lr: 0.000006 - momentum: 0.000000
2023-10-11 00:30:25,050 epoch 10 - iter 112/146 - loss 0.10933142 - time (sec): 74.41 - samples/sec: 465.68 - lr: 0.000005 - momentum: 0.000000
2023-10-11 00:30:34,564 epoch 10 - iter 126/146 - loss 0.10650302 - time (sec): 83.92 - samples/sec: 460.29 - lr: 0.000003 - momentum: 0.000000
2023-10-11 00:30:44,129 epoch 10 - iter 140/146 - loss 0.10893081 - time (sec): 93.49 - samples/sec: 456.67 - lr: 0.000001 - momentum: 0.000000
2023-10-11 00:30:48,031 ----------------------------------------------------------------------------------------------------
2023-10-11 00:30:48,032 EPOCH 10 done: loss 0.1087 - lr: 0.000001
2023-10-11 00:30:53,773 DEV : loss 0.15238186717033386 - f1-score (micro avg)  0.7229
2023-10-11 00:30:53,782 saving best model
2023-10-11 00:31:00,790 ----------------------------------------------------------------------------------------------------
2023-10-11 00:31:00,792 Loading model from best epoch ...
2023-10-11 00:31:04,651 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
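These 17 tags are a BIOES encoding (Begin/Inside/End/Single plus O for outside) of the four entity types LOC, PER, ORG and HumanProd. For illustration, a minimal decoder from such a tag sequence to entity spans (a generic sketch, not Flair's internal decoding):

```python
def bioes_spans(tags):
    """Decode a BIOES tag sequence into (label, start, end_exclusive) spans."""
    spans, start = [], None
    for i, tag in enumerate(tags):
        if tag == "O":
            start = None
            continue
        prefix, label = tag.split("-", 1)
        if prefix == "S":                       # single-token entity
            spans.append((label, i, i + 1))
        elif prefix == "B":                     # open a multi-token entity
            start = i
        elif prefix == "E" and start is not None:  # close it
            spans.append((label, start, i + 1))
            start = None
        # "I" just continues an open span
    return spans

print(bioes_spans(["B-PER", "I-PER", "E-PER", "O", "S-LOC"]))
# [('PER', 0, 3), ('LOC', 4, 5)]
```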
2023-10-11 00:31:16,872 
Results:
- F-score (micro) 0.7015
- F-score (macro) 0.6099
- Accuracy 0.5632

By class:
              precision    recall  f1-score   support

         PER     0.7821    0.8046    0.7932       348
         LOC     0.5766    0.7931    0.6677       261
         ORG     0.2982    0.3269    0.3119        52
   HumanProd     0.7647    0.5909    0.6667        22

   micro avg     0.6536    0.7570    0.7015       683
   macro avg     0.6054    0.6289    0.6099       683
weighted avg     0.6662    0.7570    0.7045       683
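As a sanity check, the micro average follows from the pooled precision and recall via F1 = 2PR/(P+R), and the macro average is the unweighted mean of the four per-class F1 scores in the table above:

```python
def f1(p, r):
    """Harmonic mean of precision and recall."""
    return 2 * p * r / (p + r)

# Pooled (micro) precision/recall from the "micro avg" row
print(round(f1(0.6536, 0.7570), 4))  # 0.7015

# Macro F1: unweighted mean of per-class F1 (PER, LOC, ORG, HumanProd)
per_class = [0.7932, 0.6677, 0.3119, 0.6667]
print(round(sum(per_class) / len(per_class), 4))  # 0.6099
```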
2023-10-11 00:31:16,872 ----------------------------------------------------------------------------------------------------