|
2023-10-11 17:53:01,993 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 17:53:01,995 Model: "SequenceTagger(
  (embeddings): ByT5Embeddings(
    (model): T5EncoderModel(
      (shared): Embedding(384, 1472)
      (encoder): T5Stack(
        (embed_tokens): Embedding(384, 1472)
        (block): ModuleList(
          (0): T5Block(
            (layer): ModuleList(
              (0): T5LayerSelfAttention(
                (SelfAttention): T5Attention(
                  (q): Linear(in_features=1472, out_features=384, bias=False)
                  (k): Linear(in_features=1472, out_features=384, bias=False)
                  (v): Linear(in_features=1472, out_features=384, bias=False)
                  (o): Linear(in_features=384, out_features=1472, bias=False)
                  (relative_attention_bias): Embedding(32, 6)
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (1): T5LayerFF(
                (DenseReluDense): T5DenseGatedActDense(
                  (wi_0): Linear(in_features=1472, out_features=3584, bias=False)
                  (wi_1): Linear(in_features=1472, out_features=3584, bias=False)
                  (wo): Linear(in_features=3584, out_features=1472, bias=False)
                  (dropout): Dropout(p=0.1, inplace=False)
                  (act): NewGELUActivation()
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
          )
          (1-11): 11 x T5Block(
            (layer): ModuleList(
              (0): T5LayerSelfAttention(
                (SelfAttention): T5Attention(
                  (q): Linear(in_features=1472, out_features=384, bias=False)
                  (k): Linear(in_features=1472, out_features=384, bias=False)
                  (v): Linear(in_features=1472, out_features=384, bias=False)
                  (o): Linear(in_features=384, out_features=1472, bias=False)
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (1): T5LayerFF(
                (DenseReluDense): T5DenseGatedActDense(
                  (wi_0): Linear(in_features=1472, out_features=3584, bias=False)
                  (wi_1): Linear(in_features=1472, out_features=3584, bias=False)
                  (wo): Linear(in_features=3584, out_features=1472, bias=False)
                  (dropout): Dropout(p=0.1, inplace=False)
                  (act): NewGELUActivation()
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
          )
        )
        (final_layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=1472, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
|
2023-10-11 17:53:01,995 ---------------------------------------------------------------------------------------------------- |
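The layer shapes printed in the model summary above are enough to estimate the number of trainable parameters. The following is a rough sketch, not output from Flair: it assumes that `shared` and `embed_tokens` are the same tied weight matrix (counted once), that each `FusedRMSNorm` holds a single weight vector of size 1472 with no bias, and that only block 0 owns a `relative_attention_bias` table, as the printout suggests. The helper name `linear_params` is made up for this illustration.

```python
def linear_params(in_features, out_features, bias=False):
    """Parameter count of a dense layer with the given shape."""
    return in_features * out_features + (out_features if bias else 0)


# Shapes read directly from the printed model summary.
D_MODEL, D_ATTN, D_FF = 1472, 384, 3584
VOCAB, N_BLOCKS, N_TAGS = 384, 12, 13

# q, k, v project D_MODEL -> D_ATTN; o projects back.
attention = 3 * linear_params(D_MODEL, D_ATTN) + linear_params(D_ATTN, D_MODEL)
# Gated feed-forward: wi_0 and wi_1 expand, wo contracts.
feed_forward = 2 * linear_params(D_MODEL, D_FF) + linear_params(D_FF, D_MODEL)
# Assumption: one RMSNorm weight vector per sub-layer, no bias.
layer_norms = 2 * D_MODEL
block = attention + feed_forward + layer_norms

relative_bias = 32 * 6       # Embedding(32, 6), printed for block 0 only
embedding = VOCAB * D_MODEL  # assumption: shared/embed_tokens are tied
final_norm = D_MODEL

encoder_params = N_BLOCKS * block + relative_bias + final_norm + embedding
head_params = linear_params(D_MODEL, N_TAGS, bias=True)

print(f"encoder: {encoder_params:,}")
print(f"tagging head: {head_params:,}")
print(f"total: {encoder_params + head_params:,}")
```

Under these assumptions almost all of the weights sit in the twelve encoder blocks. The tagging head on top is tiny by comparison.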
|
2023-10-11 17:53:01,995 MultiCorpus: 5777 train + 722 dev + 723 test sentences |
|
- NER_ICDAR_EUROPEANA Corpus: 5777 train + 722 dev + 723 test sentences - /root/.flair/datasets/ner_icdar_europeana/nl |
|
2023-10-11 17:53:01,995 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 17:53:01,995 Train: 5777 sentences |
|
2023-10-11 17:53:01,995 (train_with_dev=False, train_with_test=False) |
|
2023-10-11 17:53:01,995 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 17:53:01,995 Training Params: |
|
2023-10-11 17:53:01,995 - learning_rate: "0.00016" |
|
2023-10-11 17:53:01,996 - mini_batch_size: "4" |
|
2023-10-11 17:53:01,996 - max_epochs: "10" |
|
2023-10-11 17:53:01,996 - shuffle: "True" |
|
2023-10-11 17:53:01,996 ---------------------------------------------------------------------------------------------------- |
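The `iter N/1445` counters that appear later in this log follow from the training parameters above. A minimal sketch of the arithmetic, assuming the last partial batch is kept and that progress is reported roughly every tenth of an epoch (the reporting interval is inferred from the log lines, not taken from Flair's source):

```python
import math

train_sentences = 5777  # "Train: 5777 sentences"
mini_batch_size = 4
max_epochs = 10

# Assumption: the final, smaller batch is not dropped.
batches_per_epoch = math.ceil(train_sentences / mini_batch_size)
total_steps = batches_per_epoch * max_epochs

# Assumption: one progress line every tenth of an epoch.
log_interval = max(1, batches_per_epoch // 10)

print(batches_per_epoch, total_steps, log_interval)
```

These values line up with the log: 1445 iterations per epoch, with progress lines at iterations 144, 288, and so on up to 1440.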
|
2023-10-11 17:53:01,996 Plugins: |
|
2023-10-11 17:53:01,996 - TensorboardLogger |
|
2023-10-11 17:53:01,996 - LinearScheduler | warmup_fraction: '0.1' |
|
2023-10-11 17:53:01,996 ---------------------------------------------------------------------------------------------------- |
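The `lr` column in the iteration lines below is consistent with a linear warmup over the first 10% of all steps followed by a linear decay to zero. The sketch below uses the common warmup/decay formula as an assumption about what `LinearScheduler` does; the function name `linear_schedule_lr` and the exact step indexing are illustrative and may differ slightly from Flair's implementation.

```python
def linear_schedule_lr(step, peak_lr=0.00016, total_steps=14450, warmup_fraction=0.1):
    """Learning rate after `step` optimizer steps (illustrative sketch).

    total_steps = 1445 batches per epoch * 10 epochs, taken from this log.
    """
    warmup_steps = round(total_steps * warmup_fraction)
    if step < warmup_steps:
        # Linear ramp from 0 up to peak_lr during warmup.
        return peak_lr * step / max(1, warmup_steps)
    # Linear decay from peak_lr down to 0 over the remaining steps.
    remaining = total_steps - step
    return peak_lr * max(0.0, remaining / max(1, total_steps - warmup_steps))


# Compare against a few logged values (the log prints six decimals).
for step in (144, 1440, 1445 + 144, 14445):
    print(step, f"{linear_schedule_lr(step):.6f}")
```

With a warmup fraction of 0.1 and ten epochs, the warmup covers exactly the first epoch, which is why the logged learning rate peaks near 0.00016 at the end of epoch 1 and falls from then on.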
|
2023-10-11 17:53:01,996 Final evaluation on model from best epoch (best-model.pt) |
|
2023-10-11 17:53:01,996 - metric: "('micro avg', 'f1-score')" |
|
2023-10-11 17:53:01,996 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 17:53:01,996 Computation: |
|
2023-10-11 17:53:01,996 - compute on device: cuda:0 |
|
2023-10-11 17:53:01,996 - embedding storage: none |
|
2023-10-11 17:53:01,996 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 17:53:01,996 Model training base path: "hmbench-icdar/nl-hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax-bs4-wsFalse-e10-lr0.00016-poolingfirst-layers-1-crfFalse-1" |
|
2023-10-11 17:53:01,996 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 17:53:01,997 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 17:53:01,997 Logging anything other than scalars to TensorBoard is currently not supported. |
|
2023-10-11 17:53:48,700 epoch 1 - iter 144/1445 - loss 2.57848411 - time (sec): 46.70 - samples/sec: 377.20 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-11 17:54:35,241 epoch 1 - iter 288/1445 - loss 2.46579897 - time (sec): 93.24 - samples/sec: 398.60 - lr: 0.000032 - momentum: 0.000000 |
|
2023-10-11 17:55:18,233 epoch 1 - iter 432/1445 - loss 2.21235139 - time (sec): 136.23 - samples/sec: 393.03 - lr: 0.000048 - momentum: 0.000000 |
|
2023-10-11 17:56:02,557 epoch 1 - iter 576/1445 - loss 1.90343871 - time (sec): 180.56 - samples/sec: 396.59 - lr: 0.000064 - momentum: 0.000000 |
|
2023-10-11 17:56:49,522 epoch 1 - iter 720/1445 - loss 1.59094900 - time (sec): 227.52 - samples/sec: 400.68 - lr: 0.000080 - momentum: 0.000000 |
|
2023-10-11 17:57:34,114 epoch 1 - iter 864/1445 - loss 1.38249162 - time (sec): 272.12 - samples/sec: 397.42 - lr: 0.000096 - momentum: 0.000000 |
|
2023-10-11 17:58:17,459 epoch 1 - iter 1008/1445 - loss 1.22459509 - time (sec): 315.46 - samples/sec: 398.14 - lr: 0.000112 - momentum: 0.000000 |
|
2023-10-11 17:59:01,341 epoch 1 - iter 1152/1445 - loss 1.10127657 - time (sec): 359.34 - samples/sec: 396.35 - lr: 0.000127 - momentum: 0.000000 |
|
2023-10-11 17:59:46,633 epoch 1 - iter 1296/1445 - loss 1.00799125 - time (sec): 404.63 - samples/sec: 390.82 - lr: 0.000143 - momentum: 0.000000 |
|
2023-10-11 18:00:30,412 epoch 1 - iter 1440/1445 - loss 0.92239231 - time (sec): 448.41 - samples/sec: 392.09 - lr: 0.000159 - momentum: 0.000000 |
|
2023-10-11 18:00:31,780 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 18:00:31,780 EPOCH 1 done: loss 0.9210 - lr: 0.000159 |
|
2023-10-11 18:00:56,886 DEV : loss 0.18090352416038513 - f1-score (micro avg) 0.3961 |
|
2023-10-11 18:00:56,921 saving best model |
|
2023-10-11 18:00:57,825 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 18:01:42,534 epoch 2 - iter 144/1445 - loss 0.15903434 - time (sec): 44.71 - samples/sec: 403.36 - lr: 0.000158 - momentum: 0.000000 |
|
2023-10-11 18:02:26,877 epoch 2 - iter 288/1445 - loss 0.13983359 - time (sec): 89.05 - samples/sec: 398.86 - lr: 0.000156 - momentum: 0.000000 |
|
2023-10-11 18:03:07,959 epoch 2 - iter 432/1445 - loss 0.13648926 - time (sec): 130.13 - samples/sec: 397.19 - lr: 0.000155 - momentum: 0.000000 |
|
2023-10-11 18:03:51,251 epoch 2 - iter 576/1445 - loss 0.13003560 - time (sec): 173.42 - samples/sec: 399.93 - lr: 0.000153 - momentum: 0.000000 |
|
2023-10-11 18:04:34,290 epoch 2 - iter 720/1445 - loss 0.12853763 - time (sec): 216.46 - samples/sec: 403.76 - lr: 0.000151 - momentum: 0.000000 |
|
2023-10-11 18:05:15,657 epoch 2 - iter 864/1445 - loss 0.12641822 - time (sec): 257.83 - samples/sec: 406.68 - lr: 0.000149 - momentum: 0.000000 |
|
2023-10-11 18:05:57,359 epoch 2 - iter 1008/1445 - loss 0.12127856 - time (sec): 299.53 - samples/sec: 411.68 - lr: 0.000148 - momentum: 0.000000 |
|
2023-10-11 18:06:40,653 epoch 2 - iter 1152/1445 - loss 0.11582879 - time (sec): 342.83 - samples/sec: 417.08 - lr: 0.000146 - momentum: 0.000000 |
|
2023-10-11 18:07:23,210 epoch 2 - iter 1296/1445 - loss 0.11382167 - time (sec): 385.38 - samples/sec: 412.81 - lr: 0.000144 - momentum: 0.000000 |
|
2023-10-11 18:08:09,871 epoch 2 - iter 1440/1445 - loss 0.11247559 - time (sec): 432.04 - samples/sec: 406.15 - lr: 0.000142 - momentum: 0.000000 |
|
2023-10-11 18:08:11,565 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 18:08:11,565 EPOCH 2 done: loss 0.1123 - lr: 0.000142 |
|
2023-10-11 18:08:36,039 DEV : loss 0.08681953698396683 - f1-score (micro avg) 0.7837 |
|
2023-10-11 18:08:36,078 saving best model |
|
2023-10-11 18:08:38,806 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 18:09:31,280 epoch 3 - iter 144/1445 - loss 0.06528553 - time (sec): 52.47 - samples/sec: 329.17 - lr: 0.000140 - momentum: 0.000000 |
|
2023-10-11 18:10:17,024 epoch 3 - iter 288/1445 - loss 0.07522556 - time (sec): 98.21 - samples/sec: 350.51 - lr: 0.000139 - momentum: 0.000000 |
|
2023-10-11 18:11:03,850 epoch 3 - iter 432/1445 - loss 0.06963333 - time (sec): 145.04 - samples/sec: 358.69 - lr: 0.000137 - momentum: 0.000000 |
|
2023-10-11 18:11:51,125 epoch 3 - iter 576/1445 - loss 0.07482525 - time (sec): 192.31 - samples/sec: 356.97 - lr: 0.000135 - momentum: 0.000000 |
|
2023-10-11 18:12:41,126 epoch 3 - iter 720/1445 - loss 0.07425414 - time (sec): 242.32 - samples/sec: 360.00 - lr: 0.000133 - momentum: 0.000000 |
|
2023-10-11 18:13:27,470 epoch 3 - iter 864/1445 - loss 0.07379735 - time (sec): 288.66 - samples/sec: 360.35 - lr: 0.000132 - momentum: 0.000000 |
|
2023-10-11 18:14:11,509 epoch 3 - iter 1008/1445 - loss 0.07309368 - time (sec): 332.70 - samples/sec: 365.23 - lr: 0.000130 - momentum: 0.000000 |
|
2023-10-11 18:14:54,347 epoch 3 - iter 1152/1445 - loss 0.07050961 - time (sec): 375.54 - samples/sec: 372.12 - lr: 0.000128 - momentum: 0.000000 |
|
2023-10-11 18:15:37,674 epoch 3 - iter 1296/1445 - loss 0.06815127 - time (sec): 418.86 - samples/sec: 377.35 - lr: 0.000126 - momentum: 0.000000 |
|
2023-10-11 18:16:22,160 epoch 3 - iter 1440/1445 - loss 0.06798324 - time (sec): 463.35 - samples/sec: 379.08 - lr: 0.000125 - momentum: 0.000000 |
|
2023-10-11 18:16:23,435 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 18:16:23,436 EPOCH 3 done: loss 0.0679 - lr: 0.000125 |
|
2023-10-11 18:16:44,977 DEV : loss 0.08381146192550659 - f1-score (micro avg) 0.836 |
|
2023-10-11 18:16:45,010 saving best model |
|
2023-10-11 18:16:47,752 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 18:17:31,679 epoch 4 - iter 144/1445 - loss 0.05699248 - time (sec): 43.92 - samples/sec: 411.69 - lr: 0.000123 - momentum: 0.000000 |
|
2023-10-11 18:18:15,766 epoch 4 - iter 288/1445 - loss 0.04727960 - time (sec): 88.01 - samples/sec: 398.50 - lr: 0.000121 - momentum: 0.000000 |
|
2023-10-11 18:18:59,968 epoch 4 - iter 432/1445 - loss 0.04792294 - time (sec): 132.21 - samples/sec: 399.98 - lr: 0.000119 - momentum: 0.000000 |
|
2023-10-11 18:19:45,773 epoch 4 - iter 576/1445 - loss 0.04725140 - time (sec): 178.02 - samples/sec: 394.65 - lr: 0.000117 - momentum: 0.000000 |
|
2023-10-11 18:20:31,218 epoch 4 - iter 720/1445 - loss 0.04662759 - time (sec): 223.46 - samples/sec: 390.40 - lr: 0.000116 - momentum: 0.000000 |
|
2023-10-11 18:21:12,959 epoch 4 - iter 864/1445 - loss 0.04657745 - time (sec): 265.20 - samples/sec: 391.32 - lr: 0.000114 - momentum: 0.000000 |
|
2023-10-11 18:21:55,577 epoch 4 - iter 1008/1445 - loss 0.04690755 - time (sec): 307.82 - samples/sec: 393.01 - lr: 0.000112 - momentum: 0.000000 |
|
2023-10-11 18:22:41,034 epoch 4 - iter 1152/1445 - loss 0.04926445 - time (sec): 353.28 - samples/sec: 394.39 - lr: 0.000110 - momentum: 0.000000 |
|
2023-10-11 18:23:27,336 epoch 4 - iter 1296/1445 - loss 0.04789025 - time (sec): 399.58 - samples/sec: 393.40 - lr: 0.000109 - momentum: 0.000000 |
|
2023-10-11 18:24:12,177 epoch 4 - iter 1440/1445 - loss 0.04577433 - time (sec): 444.42 - samples/sec: 395.52 - lr: 0.000107 - momentum: 0.000000 |
|
2023-10-11 18:24:13,407 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 18:24:13,407 EPOCH 4 done: loss 0.0457 - lr: 0.000107 |
|
2023-10-11 18:24:34,260 DEV : loss 0.08963057398796082 - f1-score (micro avg) 0.8319 |
|
2023-10-11 18:24:34,290 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 18:25:16,456 epoch 5 - iter 144/1445 - loss 0.01933052 - time (sec): 42.16 - samples/sec: 423.58 - lr: 0.000105 - momentum: 0.000000 |
|
2023-10-11 18:25:58,242 epoch 5 - iter 288/1445 - loss 0.02130787 - time (sec): 83.95 - samples/sec: 411.79 - lr: 0.000103 - momentum: 0.000000 |
|
2023-10-11 18:26:40,391 epoch 5 - iter 432/1445 - loss 0.02806167 - time (sec): 126.10 - samples/sec: 407.37 - lr: 0.000101 - momentum: 0.000000 |
|
2023-10-11 18:27:26,424 epoch 5 - iter 576/1445 - loss 0.02869156 - time (sec): 172.13 - samples/sec: 402.10 - lr: 0.000100 - momentum: 0.000000 |
|
2023-10-11 18:28:12,829 epoch 5 - iter 720/1445 - loss 0.03182868 - time (sec): 218.54 - samples/sec: 401.94 - lr: 0.000098 - momentum: 0.000000 |
|
2023-10-11 18:28:57,212 epoch 5 - iter 864/1445 - loss 0.03012182 - time (sec): 262.92 - samples/sec: 396.52 - lr: 0.000096 - momentum: 0.000000 |
|
2023-10-11 18:29:41,667 epoch 5 - iter 1008/1445 - loss 0.03206491 - time (sec): 307.38 - samples/sec: 396.37 - lr: 0.000094 - momentum: 0.000000 |
|
2023-10-11 18:30:27,402 epoch 5 - iter 1152/1445 - loss 0.03245719 - time (sec): 353.11 - samples/sec: 398.52 - lr: 0.000093 - momentum: 0.000000 |
|
2023-10-11 18:31:11,210 epoch 5 - iter 1296/1445 - loss 0.03258037 - time (sec): 396.92 - samples/sec: 396.84 - lr: 0.000091 - momentum: 0.000000 |
|
2023-10-11 18:31:55,181 epoch 5 - iter 1440/1445 - loss 0.03298262 - time (sec): 440.89 - samples/sec: 398.53 - lr: 0.000089 - momentum: 0.000000 |
|
2023-10-11 18:31:56,379 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 18:31:56,380 EPOCH 5 done: loss 0.0330 - lr: 0.000089 |
|
2023-10-11 18:32:18,188 DEV : loss 0.10669823735952377 - f1-score (micro avg) 0.8496 |
|
2023-10-11 18:32:18,220 saving best model |
|
2023-10-11 18:32:26,252 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 18:33:11,957 epoch 6 - iter 144/1445 - loss 0.02858235 - time (sec): 45.70 - samples/sec: 368.04 - lr: 0.000087 - momentum: 0.000000 |
|
2023-10-11 18:33:57,813 epoch 6 - iter 288/1445 - loss 0.02496902 - time (sec): 91.56 - samples/sec: 371.61 - lr: 0.000085 - momentum: 0.000000 |
|
2023-10-11 18:34:44,871 epoch 6 - iter 432/1445 - loss 0.02903031 - time (sec): 138.62 - samples/sec: 374.33 - lr: 0.000084 - momentum: 0.000000 |
|
2023-10-11 18:35:30,962 epoch 6 - iter 576/1445 - loss 0.02525110 - time (sec): 184.71 - samples/sec: 379.07 - lr: 0.000082 - momentum: 0.000000 |
|
2023-10-11 18:36:15,891 epoch 6 - iter 720/1445 - loss 0.02445476 - time (sec): 229.63 - samples/sec: 384.94 - lr: 0.000080 - momentum: 0.000000 |
|
2023-10-11 18:37:02,122 epoch 6 - iter 864/1445 - loss 0.02502730 - time (sec): 275.87 - samples/sec: 382.09 - lr: 0.000078 - momentum: 0.000000 |
|
2023-10-11 18:37:48,549 epoch 6 - iter 1008/1445 - loss 0.02407503 - time (sec): 322.29 - samples/sec: 379.12 - lr: 0.000076 - momentum: 0.000000 |
|
2023-10-11 18:38:33,282 epoch 6 - iter 1152/1445 - loss 0.02357627 - time (sec): 367.03 - samples/sec: 380.88 - lr: 0.000075 - momentum: 0.000000 |
|
2023-10-11 18:39:20,713 epoch 6 - iter 1296/1445 - loss 0.02464782 - time (sec): 414.46 - samples/sec: 384.58 - lr: 0.000073 - momentum: 0.000000 |
|
2023-10-11 18:40:05,384 epoch 6 - iter 1440/1445 - loss 0.02419495 - time (sec): 459.13 - samples/sec: 382.83 - lr: 0.000071 - momentum: 0.000000 |
|
2023-10-11 18:40:06,706 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 18:40:06,707 EPOCH 6 done: loss 0.0244 - lr: 0.000071 |
|
2023-10-11 18:40:29,735 DEV : loss 0.12241014838218689 - f1-score (micro avg) 0.8516 |
|
2023-10-11 18:40:29,808 saving best model |
|
2023-10-11 18:40:40,863 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 18:41:26,037 epoch 7 - iter 144/1445 - loss 0.01832273 - time (sec): 45.17 - samples/sec: 396.36 - lr: 0.000069 - momentum: 0.000000 |
|
2023-10-11 18:42:08,635 epoch 7 - iter 288/1445 - loss 0.01737041 - time (sec): 87.77 - samples/sec: 397.61 - lr: 0.000068 - momentum: 0.000000 |
|
2023-10-11 18:42:51,383 epoch 7 - iter 432/1445 - loss 0.01452621 - time (sec): 130.51 - samples/sec: 397.04 - lr: 0.000066 - momentum: 0.000000 |
|
2023-10-11 18:43:36,181 epoch 7 - iter 576/1445 - loss 0.01641296 - time (sec): 175.31 - samples/sec: 392.78 - lr: 0.000064 - momentum: 0.000000 |
|
2023-10-11 18:44:20,666 epoch 7 - iter 720/1445 - loss 0.01644292 - time (sec): 219.80 - samples/sec: 393.55 - lr: 0.000062 - momentum: 0.000000 |
|
2023-10-11 18:45:05,619 epoch 7 - iter 864/1445 - loss 0.01754543 - time (sec): 264.75 - samples/sec: 394.37 - lr: 0.000060 - momentum: 0.000000 |
|
2023-10-11 18:45:52,489 epoch 7 - iter 1008/1445 - loss 0.01846302 - time (sec): 311.62 - samples/sec: 393.10 - lr: 0.000059 - momentum: 0.000000 |
|
2023-10-11 18:46:36,969 epoch 7 - iter 1152/1445 - loss 0.01791464 - time (sec): 356.10 - samples/sec: 392.82 - lr: 0.000057 - momentum: 0.000000 |
|
2023-10-11 18:47:23,692 epoch 7 - iter 1296/1445 - loss 0.01888407 - time (sec): 402.82 - samples/sec: 390.17 - lr: 0.000055 - momentum: 0.000000 |
|
2023-10-11 18:48:09,718 epoch 7 - iter 1440/1445 - loss 0.01856727 - time (sec): 448.85 - samples/sec: 391.01 - lr: 0.000053 - momentum: 0.000000 |
|
2023-10-11 18:48:11,296 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 18:48:11,296 EPOCH 7 done: loss 0.0185 - lr: 0.000053 |
|
2023-10-11 18:48:33,997 DEV : loss 0.1283944845199585 - f1-score (micro avg) 0.8483 |
|
2023-10-11 18:48:34,041 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 18:49:21,193 epoch 8 - iter 144/1445 - loss 0.01377773 - time (sec): 47.15 - samples/sec: 398.08 - lr: 0.000052 - momentum: 0.000000 |
|
2023-10-11 18:50:07,645 epoch 8 - iter 288/1445 - loss 0.01338014 - time (sec): 93.60 - samples/sec: 384.26 - lr: 0.000050 - momentum: 0.000000 |
|
2023-10-11 18:50:50,610 epoch 8 - iter 432/1445 - loss 0.01240568 - time (sec): 136.57 - samples/sec: 388.41 - lr: 0.000048 - momentum: 0.000000 |
|
2023-10-11 18:51:35,104 epoch 8 - iter 576/1445 - loss 0.01139900 - time (sec): 181.06 - samples/sec: 385.89 - lr: 0.000046 - momentum: 0.000000 |
|
2023-10-11 18:52:21,530 epoch 8 - iter 720/1445 - loss 0.01311798 - time (sec): 227.49 - samples/sec: 386.11 - lr: 0.000044 - momentum: 0.000000 |
|
2023-10-11 18:53:07,917 epoch 8 - iter 864/1445 - loss 0.01251169 - time (sec): 273.87 - samples/sec: 387.76 - lr: 0.000043 - momentum: 0.000000 |
|
2023-10-11 18:53:53,699 epoch 8 - iter 1008/1445 - loss 0.01322670 - time (sec): 319.65 - samples/sec: 388.46 - lr: 0.000041 - momentum: 0.000000 |
|
2023-10-11 18:54:39,842 epoch 8 - iter 1152/1445 - loss 0.01450279 - time (sec): 365.80 - samples/sec: 389.42 - lr: 0.000039 - momentum: 0.000000 |
|
2023-10-11 18:55:25,240 epoch 8 - iter 1296/1445 - loss 0.01418221 - time (sec): 411.20 - samples/sec: 385.19 - lr: 0.000037 - momentum: 0.000000 |
|
2023-10-11 18:56:09,983 epoch 8 - iter 1440/1445 - loss 0.01430382 - time (sec): 455.94 - samples/sec: 385.30 - lr: 0.000036 - momentum: 0.000000 |
|
2023-10-11 18:56:11,369 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 18:56:11,370 EPOCH 8 done: loss 0.0143 - lr: 0.000036 |
|
2023-10-11 18:56:35,103 DEV : loss 0.1493687927722931 - f1-score (micro avg) 0.8463 |
|
2023-10-11 18:56:35,138 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 18:57:26,756 epoch 9 - iter 144/1445 - loss 0.01365623 - time (sec): 51.61 - samples/sec: 363.64 - lr: 0.000034 - momentum: 0.000000 |
|
2023-10-11 18:58:09,147 epoch 9 - iter 288/1445 - loss 0.01086009 - time (sec): 94.01 - samples/sec: 375.25 - lr: 0.000032 - momentum: 0.000000 |
|
2023-10-11 18:58:54,871 epoch 9 - iter 432/1445 - loss 0.01091733 - time (sec): 139.73 - samples/sec: 385.74 - lr: 0.000030 - momentum: 0.000000 |
|
2023-10-11 18:59:41,145 epoch 9 - iter 576/1445 - loss 0.00976057 - time (sec): 186.00 - samples/sec: 383.90 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-11 19:00:24,579 epoch 9 - iter 720/1445 - loss 0.00941457 - time (sec): 229.44 - samples/sec: 386.65 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-11 19:01:08,775 epoch 9 - iter 864/1445 - loss 0.00937481 - time (sec): 273.63 - samples/sec: 388.71 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-11 19:01:54,146 epoch 9 - iter 1008/1445 - loss 0.00942580 - time (sec): 319.01 - samples/sec: 391.24 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-11 19:02:36,804 epoch 9 - iter 1152/1445 - loss 0.00946647 - time (sec): 361.66 - samples/sec: 391.23 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-11 19:03:20,018 epoch 9 - iter 1296/1445 - loss 0.00909532 - time (sec): 404.88 - samples/sec: 391.44 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-11 19:04:06,074 epoch 9 - iter 1440/1445 - loss 0.00916284 - time (sec): 450.93 - samples/sec: 389.91 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-11 19:04:07,367 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 19:04:07,367 EPOCH 9 done: loss 0.0091 - lr: 0.000018 |
|
2023-10-11 19:04:31,956 DEV : loss 0.1440650075674057 - f1-score (micro avg) 0.8482 |
|
2023-10-11 19:04:31,994 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 19:05:16,374 epoch 10 - iter 144/1445 - loss 0.00340475 - time (sec): 44.38 - samples/sec: 378.02 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-11 19:06:03,828 epoch 10 - iter 288/1445 - loss 0.00685161 - time (sec): 91.83 - samples/sec: 383.93 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-11 19:06:50,612 epoch 10 - iter 432/1445 - loss 0.00634319 - time (sec): 138.62 - samples/sec: 378.20 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-11 19:07:34,275 epoch 10 - iter 576/1445 - loss 0.00601531 - time (sec): 182.28 - samples/sec: 379.74 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-11 19:08:18,247 epoch 10 - iter 720/1445 - loss 0.00683740 - time (sec): 226.25 - samples/sec: 387.69 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-11 19:09:03,786 epoch 10 - iter 864/1445 - loss 0.00666797 - time (sec): 271.79 - samples/sec: 390.11 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-11 19:09:49,595 epoch 10 - iter 1008/1445 - loss 0.00806627 - time (sec): 317.60 - samples/sec: 390.75 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-11 19:10:33,481 epoch 10 - iter 1152/1445 - loss 0.00767263 - time (sec): 361.49 - samples/sec: 389.28 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-11 19:11:20,051 epoch 10 - iter 1296/1445 - loss 0.00835646 - time (sec): 408.06 - samples/sec: 388.16 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-11 19:12:07,000 epoch 10 - iter 1440/1445 - loss 0.00799785 - time (sec): 455.00 - samples/sec: 385.90 - lr: 0.000000 - momentum: 0.000000 |
|
2023-10-11 19:12:08,397 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 19:12:08,398 EPOCH 10 done: loss 0.0080 - lr: 0.000000 |
|
2023-10-11 19:12:29,737 DEV : loss 0.15512152016162872 - f1-score (micro avg) 0.8485 |
|
2023-10-11 19:12:30,792 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 19:12:30,794 Loading model from best epoch ... |
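The "saving best model" lines above appear only after epochs whose dev micro F1 beats every earlier epoch. A small sketch of that selection rule, using the dev scores copied from this log (the strict "greater than" comparison is an assumption; it reproduces the checkpoints seen here):

```python
# Dev micro-avg F1 per epoch, copied from the DEV lines of this log.
dev_f1 = {
    1: 0.3961, 2: 0.7837, 3: 0.836, 4: 0.8319, 5: 0.8496,
    6: 0.8516, 7: 0.8483, 8: 0.8463, 9: 0.8482, 10: 0.8485,
}

best_score = float("-inf")
best_epoch = None
saved_at = []  # epochs after which a new best checkpoint would be written

for epoch, score in dev_f1.items():
    if score > best_score:  # assumption: strict improvement is required
        best_score, best_epoch = score, epoch
        saved_at.append(epoch)

print("checkpoints written after epochs:", saved_at)
print("best epoch:", best_epoch, "dev F1:", best_score)
```

By this rule the model reloaded for the final test evaluation is the one from epoch 6, even though training continued for four more epochs.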
|
2023-10-11 19:12:37,160 SequenceTagger predicts: Dictionary with 13 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG |
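The 13 tags listed above are one `O` tag plus four position prefixes (S, B, E, I) for each of the three entity types, which matches the `out_features=13` of the final linear layer. As a quick illustration, the list can be rebuilt like this; the ordering is chosen to match the logged dictionary:

```python
entity_types = ["LOC", "PER", "ORG"]
# S = single-token entity, B = begin, E = end, I = inside.
prefixes = ["S", "B", "E", "I"]

tags = ["O"] + [f"{prefix}-{etype}" for etype in entity_types for prefix in prefixes]

print(len(tags))
print(", ".join(tags))
```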
|
2023-10-11 19:12:57,570
Results:
- F-score (micro) 0.8349
- F-score (macro) 0.7208
- Accuracy 0.7272

By class:
              precision    recall  f1-score   support

         PER     0.8266    0.8506    0.8384       482
         LOC     0.8969    0.8734    0.8850       458
         ORG     0.5000    0.3913    0.4390        69

   micro avg     0.8404    0.8295    0.8349      1009
   macro avg     0.7412    0.7051    0.7208      1009
weighted avg     0.8362    0.8295    0.8322      1009
|
2023-10-11 19:12:57,570 ---------------------------------------------------------------------------------------------------- |
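The averaged rows of the per-class report can be recomputed from the per-class precision, recall, and support values. The sketch below uses the rounded numbers printed in this log, so the results agree with the report only to about four decimals:

```python
# (precision, recall, support) per class, copied from the report above.
report = {
    "PER": (0.8266, 0.8506, 482),
    "LOC": (0.8969, 0.8734, 458),
    "ORG": (0.5000, 0.3913, 69),
}


def f1(precision, recall):
    """Harmonic mean of precision and recall."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


per_class_f1 = {label: f1(p, r) for label, (p, r, _) in report.items()}

# Macro average: unweighted mean over classes.
macro_f1 = sum(per_class_f1.values()) / len(per_class_f1)

# Weighted average: mean over classes, weighted by support.
total_support = sum(support for _, _, support in report.values())
weighted_f1 = sum(
    per_class_f1[label] * support for label, (_, _, support) in report.items()
) / total_support

# Micro average: F1 of the pooled precision and recall from the report.
micro_f1 = f1(0.8404, 0.8295)

for name, value in [("macro", macro_f1), ("weighted", weighted_f1), ("micro", micro_f1)]:
    print(f"{name} F1: {value:.4f}")
```

The gap between micro F1 (0.8349) and macro F1 (0.7208) comes from the ORG class: it has only 69 of the 1009 gold entities, so its low score barely affects the micro average but counts for a full third of the macro average.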
|
|