|
2023-10-14 22:47:24,751 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 22:47:24,752 Model: "SequenceTagger( |
|
(embeddings): ByT5Embeddings( |
|
(model): T5EncoderModel( |
|
(shared): Embedding(384, 1472) |
|
(encoder): T5Stack( |
|
(embed_tokens): Embedding(384, 1472) |
|
(block): ModuleList( |
|
(0): T5Block( |
|
(layer): ModuleList( |
|
(0): T5LayerSelfAttention( |
|
(SelfAttention): T5Attention( |
|
(q): Linear(in_features=1472, out_features=384, bias=False) |
|
(k): Linear(in_features=1472, out_features=384, bias=False) |
|
(v): Linear(in_features=1472, out_features=384, bias=False) |
|
(o): Linear(in_features=384, out_features=1472, bias=False) |
|
(relative_attention_bias): Embedding(32, 6) |
|
) |
|
(layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
(1): T5LayerFF( |
|
(DenseReluDense): T5DenseGatedActDense( |
|
(wi_0): Linear(in_features=1472, out_features=3584, bias=False) |
|
(wi_1): Linear(in_features=1472, out_features=3584, bias=False) |
|
(wo): Linear(in_features=3584, out_features=1472, bias=False) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
(act): NewGELUActivation() |
|
) |
|
(layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
) |
|
(1-11): 11 x T5Block( |
|
(layer): ModuleList( |
|
(0): T5LayerSelfAttention( |
|
(SelfAttention): T5Attention( |
|
(q): Linear(in_features=1472, out_features=384, bias=False) |
|
(k): Linear(in_features=1472, out_features=384, bias=False) |
|
(v): Linear(in_features=1472, out_features=384, bias=False) |
|
(o): Linear(in_features=384, out_features=1472, bias=False) |
|
) |
|
(layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
(1): T5LayerFF( |
|
(DenseReluDense): T5DenseGatedActDense( |
|
(wi_0): Linear(in_features=1472, out_features=3584, bias=False) |
|
(wi_1): Linear(in_features=1472, out_features=3584, bias=False) |
|
(wo): Linear(in_features=3584, out_features=1472, bias=False) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
(act): NewGELUActivation() |
|
) |
|
(layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
) |
|
) |
|
(final_layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
) |
|
(locked_dropout): LockedDropout(p=0.5) |
|
(linear): Linear(in_features=1472, out_features=21, bias=True) |
|
(loss_function): CrossEntropyLoss() |
|
)" |
|
2023-10-14 22:47:24,752 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 22:47:24,752 MultiCorpus: 3575 train + 1235 dev + 1266 test sentences |
|
- NER_HIPE_2022 Corpus: 3575 train + 1235 dev + 1266 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/hipe2020/de/with_doc_seperator |
|
2023-10-14 22:47:24,752 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 22:47:24,752 Train: 3575 sentences |
|
2023-10-14 22:47:24,752 (train_with_dev=False, train_with_test=False) |
|
2023-10-14 22:47:24,752 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 22:47:24,752 Training Params: |
|
2023-10-14 22:47:24,752 - learning_rate: "0.00015" |
|
2023-10-14 22:47:24,752 - mini_batch_size: "8" |
|
2023-10-14 22:47:24,752 - max_epochs: "10" |
|
2023-10-14 22:47:24,753 - shuffle: "True" |
|
2023-10-14 22:47:24,753 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 22:47:24,753 Plugins: |
|
2023-10-14 22:47:24,753 - TensorboardLogger |
|
2023-10-14 22:47:24,753 - LinearScheduler | warmup_fraction: '0.1' |
|
2023-10-14 22:47:24,753 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 22:47:24,753 Final evaluation on model from best epoch (best-model.pt) |
|
2023-10-14 22:47:24,753 - metric: "('micro avg', 'f1-score')" |
|
2023-10-14 22:47:24,753 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 22:47:24,753 Computation: |
|
2023-10-14 22:47:24,753 - compute on device: cuda:0 |
|
2023-10-14 22:47:24,753 - embedding storage: none |
|
2023-10-14 22:47:24,753 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 22:47:24,753 Model training base path: "hmbench-hipe2020/de-hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax-bs8-wsFalse-e10-lr0.00015-poolingfirst-layers-1-crfFalse-3" |
|
2023-10-14 22:47:24,753 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 22:47:24,753 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 22:47:24,753 Logging anything other than scalars to TensorBoard is currently not supported. |
|
2023-10-14 22:47:40,117 epoch 1 - iter 44/447 - loss 3.00865473 - time (sec): 15.36 - samples/sec: 557.77 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-14 22:47:57,013 epoch 1 - iter 88/447 - loss 2.99088672 - time (sec): 32.26 - samples/sec: 546.36 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-14 22:48:12,490 epoch 1 - iter 132/447 - loss 2.93254972 - time (sec): 47.74 - samples/sec: 550.32 - lr: 0.000044 - momentum: 0.000000 |
|
2023-10-14 22:48:27,423 epoch 1 - iter 176/447 - loss 2.81928089 - time (sec): 62.67 - samples/sec: 544.20 - lr: 0.000059 - momentum: 0.000000 |
|
2023-10-14 22:48:43,197 epoch 1 - iter 220/447 - loss 2.66111726 - time (sec): 78.44 - samples/sec: 542.95 - lr: 0.000073 - momentum: 0.000000 |
|
2023-10-14 22:48:58,992 epoch 1 - iter 264/447 - loss 2.48228942 - time (sec): 94.24 - samples/sec: 544.00 - lr: 0.000088 - momentum: 0.000000 |
|
2023-10-14 22:49:14,416 epoch 1 - iter 308/447 - loss 2.30227376 - time (sec): 109.66 - samples/sec: 544.30 - lr: 0.000103 - momentum: 0.000000 |
|
2023-10-14 22:49:29,786 epoch 1 - iter 352/447 - loss 2.11867337 - time (sec): 125.03 - samples/sec: 544.63 - lr: 0.000118 - momentum: 0.000000 |
|
2023-10-14 22:49:44,919 epoch 1 - iter 396/447 - loss 1.95364436 - time (sec): 140.17 - samples/sec: 546.36 - lr: 0.000133 - momentum: 0.000000 |
|
2023-10-14 22:50:00,252 epoch 1 - iter 440/447 - loss 1.80859385 - time (sec): 155.50 - samples/sec: 547.62 - lr: 0.000147 - momentum: 0.000000 |
|
2023-10-14 22:50:02,675 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 22:50:02,676 EPOCH 1 done: loss 1.7889 - lr: 0.000147 |
|
2023-10-14 22:50:25,518 DEV : loss 0.46937498450279236 - f1-score (micro avg) 0.0 |
|
2023-10-14 22:50:25,546 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 22:50:40,868 epoch 2 - iter 44/447 - loss 0.55558054 - time (sec): 15.32 - samples/sec: 546.30 - lr: 0.000148 - momentum: 0.000000 |
|
2023-10-14 22:50:56,263 epoch 2 - iter 88/447 - loss 0.48845589 - time (sec): 30.72 - samples/sec: 542.32 - lr: 0.000147 - momentum: 0.000000 |
|
2023-10-14 22:51:12,284 epoch 2 - iter 132/447 - loss 0.43240536 - time (sec): 46.74 - samples/sec: 541.69 - lr: 0.000145 - momentum: 0.000000 |
|
2023-10-14 22:51:27,584 epoch 2 - iter 176/447 - loss 0.41054710 - time (sec): 62.04 - samples/sec: 542.03 - lr: 0.000143 - momentum: 0.000000 |
|
2023-10-14 22:51:43,251 epoch 2 - iter 220/447 - loss 0.40506841 - time (sec): 77.70 - samples/sec: 542.84 - lr: 0.000142 - momentum: 0.000000 |
|
2023-10-14 22:51:59,058 epoch 2 - iter 264/447 - loss 0.38451537 - time (sec): 93.51 - samples/sec: 540.88 - lr: 0.000140 - momentum: 0.000000 |
|
2023-10-14 22:52:14,301 epoch 2 - iter 308/447 - loss 0.36595975 - time (sec): 108.75 - samples/sec: 538.49 - lr: 0.000139 - momentum: 0.000000 |
|
2023-10-14 22:52:29,694 epoch 2 - iter 352/447 - loss 0.35174651 - time (sec): 124.15 - samples/sec: 540.48 - lr: 0.000137 - momentum: 0.000000 |
|
2023-10-14 22:52:45,174 epoch 2 - iter 396/447 - loss 0.34343160 - time (sec): 139.63 - samples/sec: 543.43 - lr: 0.000135 - momentum: 0.000000 |
|
2023-10-14 22:53:02,266 epoch 2 - iter 440/447 - loss 0.33222224 - time (sec): 156.72 - samples/sec: 544.40 - lr: 0.000134 - momentum: 0.000000 |
|
2023-10-14 22:53:04,637 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 22:53:04,637 EPOCH 2 done: loss 0.3329 - lr: 0.000134 |
|
2023-10-14 22:53:29,512 DEV : loss 0.23229412734508514 - f1-score (micro avg) 0.5356 |
|
2023-10-14 22:53:29,540 saving best model |
|
2023-10-14 22:53:30,215 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 22:53:45,406 epoch 3 - iter 44/447 - loss 0.21939656 - time (sec): 15.19 - samples/sec: 543.15 - lr: 0.000132 - momentum: 0.000000 |
|
2023-10-14 22:54:02,714 epoch 3 - iter 88/447 - loss 0.22487685 - time (sec): 32.50 - samples/sec: 536.01 - lr: 0.000130 - momentum: 0.000000 |
|
2023-10-14 22:54:18,169 epoch 3 - iter 132/447 - loss 0.21699101 - time (sec): 47.95 - samples/sec: 540.31 - lr: 0.000128 - momentum: 0.000000 |
|
2023-10-14 22:54:34,350 epoch 3 - iter 176/447 - loss 0.21393514 - time (sec): 64.13 - samples/sec: 546.69 - lr: 0.000127 - momentum: 0.000000 |
|
2023-10-14 22:54:49,539 epoch 3 - iter 220/447 - loss 0.21489365 - time (sec): 79.32 - samples/sec: 543.20 - lr: 0.000125 - momentum: 0.000000 |
|
2023-10-14 22:55:05,789 epoch 3 - iter 264/447 - loss 0.20878197 - time (sec): 95.57 - samples/sec: 548.60 - lr: 0.000124 - momentum: 0.000000 |
|
2023-10-14 22:55:21,170 epoch 3 - iter 308/447 - loss 0.20134815 - time (sec): 110.95 - samples/sec: 545.66 - lr: 0.000122 - momentum: 0.000000 |
|
2023-10-14 22:55:36,236 epoch 3 - iter 352/447 - loss 0.19647582 - time (sec): 126.02 - samples/sec: 542.97 - lr: 0.000120 - momentum: 0.000000 |
|
2023-10-14 22:55:51,596 epoch 3 - iter 396/447 - loss 0.19391200 - time (sec): 141.38 - samples/sec: 542.17 - lr: 0.000119 - momentum: 0.000000 |
|
2023-10-14 22:56:07,050 epoch 3 - iter 440/447 - loss 0.19035496 - time (sec): 156.83 - samples/sec: 542.76 - lr: 0.000117 - momentum: 0.000000 |
|
2023-10-14 22:56:09,506 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 22:56:09,506 EPOCH 3 done: loss 0.1897 - lr: 0.000117 |
|
2023-10-14 22:56:34,219 DEV : loss 0.16955290734767914 - f1-score (micro avg) 0.6845 |
|
2023-10-14 22:56:34,247 saving best model |
|
2023-10-14 22:56:34,981 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 22:56:50,693 epoch 4 - iter 44/447 - loss 0.12930627 - time (sec): 15.71 - samples/sec: 559.69 - lr: 0.000115 - momentum: 0.000000 |
|
2023-10-14 22:57:06,174 epoch 4 - iter 88/447 - loss 0.12478911 - time (sec): 31.19 - samples/sec: 558.14 - lr: 0.000113 - momentum: 0.000000 |
|
2023-10-14 22:57:21,378 epoch 4 - iter 132/447 - loss 0.13051898 - time (sec): 46.40 - samples/sec: 551.91 - lr: 0.000112 - momentum: 0.000000 |
|
2023-10-14 22:57:36,888 epoch 4 - iter 176/447 - loss 0.12838011 - time (sec): 61.91 - samples/sec: 552.92 - lr: 0.000110 - momentum: 0.000000 |
|
2023-10-14 22:57:53,017 epoch 4 - iter 220/447 - loss 0.12533019 - time (sec): 78.03 - samples/sec: 554.82 - lr: 0.000109 - momentum: 0.000000 |
|
2023-10-14 22:58:08,232 epoch 4 - iter 264/447 - loss 0.12547990 - time (sec): 93.25 - samples/sec: 550.58 - lr: 0.000107 - momentum: 0.000000 |
|
2023-10-14 22:58:23,408 epoch 4 - iter 308/447 - loss 0.11906755 - time (sec): 108.43 - samples/sec: 548.37 - lr: 0.000105 - momentum: 0.000000 |
|
2023-10-14 22:58:39,629 epoch 4 - iter 352/447 - loss 0.11500732 - time (sec): 124.65 - samples/sec: 548.31 - lr: 0.000104 - momentum: 0.000000 |
|
2023-10-14 22:58:56,563 epoch 4 - iter 396/447 - loss 0.11468376 - time (sec): 141.58 - samples/sec: 545.03 - lr: 0.000102 - momentum: 0.000000 |
|
2023-10-14 22:59:11,728 epoch 4 - iter 440/447 - loss 0.11244876 - time (sec): 156.75 - samples/sec: 543.21 - lr: 0.000100 - momentum: 0.000000 |
|
2023-10-14 22:59:14,225 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 22:59:14,226 EPOCH 4 done: loss 0.1121 - lr: 0.000100 |
|
2023-10-14 22:59:39,205 DEV : loss 0.14614106714725494 - f1-score (micro avg) 0.7341 |
|
2023-10-14 22:59:39,233 saving best model |
|
2023-10-14 22:59:40,179 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 22:59:58,642 epoch 5 - iter 44/447 - loss 0.08716574 - time (sec): 18.46 - samples/sec: 571.45 - lr: 0.000098 - momentum: 0.000000 |
|
2023-10-14 23:00:14,064 epoch 5 - iter 88/447 - loss 0.08039635 - time (sec): 33.88 - samples/sec: 565.17 - lr: 0.000097 - momentum: 0.000000 |
|
2023-10-14 23:00:29,036 epoch 5 - iter 132/447 - loss 0.07769323 - time (sec): 48.85 - samples/sec: 551.39 - lr: 0.000095 - momentum: 0.000000 |
|
2023-10-14 23:00:44,131 epoch 5 - iter 176/447 - loss 0.07931203 - time (sec): 63.95 - samples/sec: 546.43 - lr: 0.000094 - momentum: 0.000000 |
|
2023-10-14 23:00:59,214 epoch 5 - iter 220/447 - loss 0.07764385 - time (sec): 79.03 - samples/sec: 541.32 - lr: 0.000092 - momentum: 0.000000 |
|
2023-10-14 23:01:15,149 epoch 5 - iter 264/447 - loss 0.07421337 - time (sec): 94.97 - samples/sec: 546.71 - lr: 0.000090 - momentum: 0.000000 |
|
2023-10-14 23:01:30,532 epoch 5 - iter 308/447 - loss 0.07187376 - time (sec): 110.35 - samples/sec: 544.48 - lr: 0.000089 - momentum: 0.000000 |
|
2023-10-14 23:01:46,370 epoch 5 - iter 352/447 - loss 0.07065057 - time (sec): 126.19 - samples/sec: 542.88 - lr: 0.000087 - momentum: 0.000000 |
|
2023-10-14 23:02:02,080 epoch 5 - iter 396/447 - loss 0.07030662 - time (sec): 141.90 - samples/sec: 542.77 - lr: 0.000085 - momentum: 0.000000 |
|
2023-10-14 23:02:17,328 epoch 5 - iter 440/447 - loss 0.06922786 - time (sec): 157.15 - samples/sec: 541.90 - lr: 0.000084 - momentum: 0.000000 |
|
2023-10-14 23:02:19,791 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 23:02:19,791 EPOCH 5 done: loss 0.0690 - lr: 0.000084 |
|
2023-10-14 23:02:44,708 DEV : loss 0.1731875240802765 - f1-score (micro avg) 0.7457 |
|
2023-10-14 23:02:44,736 saving best model |
|
2023-10-14 23:02:45,715 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 23:03:02,976 epoch 6 - iter 44/447 - loss 0.04662483 - time (sec): 17.26 - samples/sec: 532.78 - lr: 0.000082 - momentum: 0.000000 |
|
2023-10-14 23:03:18,579 epoch 6 - iter 88/447 - loss 0.04380020 - time (sec): 32.86 - samples/sec: 537.65 - lr: 0.000080 - momentum: 0.000000 |
|
2023-10-14 23:03:34,440 epoch 6 - iter 132/447 - loss 0.04119501 - time (sec): 48.72 - samples/sec: 544.12 - lr: 0.000079 - momentum: 0.000000 |
|
2023-10-14 23:03:49,907 epoch 6 - iter 176/447 - loss 0.04533557 - time (sec): 64.19 - samples/sec: 539.22 - lr: 0.000077 - momentum: 0.000000 |
|
2023-10-14 23:04:07,017 epoch 6 - iter 220/447 - loss 0.04633177 - time (sec): 81.30 - samples/sec: 550.35 - lr: 0.000075 - momentum: 0.000000 |
|
2023-10-14 23:04:23,012 epoch 6 - iter 264/447 - loss 0.04554066 - time (sec): 97.30 - samples/sec: 550.94 - lr: 0.000074 - momentum: 0.000000 |
|
2023-10-14 23:04:38,612 epoch 6 - iter 308/447 - loss 0.04413655 - time (sec): 112.90 - samples/sec: 545.30 - lr: 0.000072 - momentum: 0.000000 |
|
2023-10-14 23:04:53,578 epoch 6 - iter 352/447 - loss 0.04274808 - time (sec): 127.86 - samples/sec: 540.19 - lr: 0.000070 - momentum: 0.000000 |
|
2023-10-14 23:05:09,206 epoch 6 - iter 396/447 - loss 0.04590217 - time (sec): 143.49 - samples/sec: 539.97 - lr: 0.000069 - momentum: 0.000000 |
|
2023-10-14 23:05:24,469 epoch 6 - iter 440/447 - loss 0.04676563 - time (sec): 158.75 - samples/sec: 537.67 - lr: 0.000067 - momentum: 0.000000 |
|
2023-10-14 23:05:26,820 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 23:05:26,821 EPOCH 6 done: loss 0.0464 - lr: 0.000067 |
|
2023-10-14 23:05:51,977 DEV : loss 0.17097032070159912 - f1-score (micro avg) 0.7267 |
|
2023-10-14 23:05:52,005 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 23:06:07,446 epoch 7 - iter 44/447 - loss 0.04825257 - time (sec): 15.44 - samples/sec: 545.37 - lr: 0.000065 - momentum: 0.000000 |
|
2023-10-14 23:06:24,420 epoch 7 - iter 88/447 - loss 0.03994302 - time (sec): 32.41 - samples/sec: 537.53 - lr: 0.000064 - momentum: 0.000000 |
|
2023-10-14 23:06:40,225 epoch 7 - iter 132/447 - loss 0.03429586 - time (sec): 48.22 - samples/sec: 544.38 - lr: 0.000062 - momentum: 0.000000 |
|
2023-10-14 23:06:55,473 epoch 7 - iter 176/447 - loss 0.03286941 - time (sec): 63.47 - samples/sec: 538.62 - lr: 0.000060 - momentum: 0.000000 |
|
2023-10-14 23:07:11,019 epoch 7 - iter 220/447 - loss 0.03066026 - time (sec): 79.01 - samples/sec: 541.42 - lr: 0.000059 - momentum: 0.000000 |
|
2023-10-14 23:07:27,202 epoch 7 - iter 264/447 - loss 0.03111309 - time (sec): 95.20 - samples/sec: 548.54 - lr: 0.000057 - momentum: 0.000000 |
|
2023-10-14 23:07:43,337 epoch 7 - iter 308/447 - loss 0.03123670 - time (sec): 111.33 - samples/sec: 543.15 - lr: 0.000055 - momentum: 0.000000 |
|
2023-10-14 23:07:58,901 epoch 7 - iter 352/447 - loss 0.03369142 - time (sec): 126.89 - samples/sec: 543.38 - lr: 0.000054 - momentum: 0.000000 |
|
2023-10-14 23:08:14,735 epoch 7 - iter 396/447 - loss 0.03375955 - time (sec): 142.73 - samples/sec: 543.05 - lr: 0.000052 - momentum: 0.000000 |
|
2023-10-14 23:08:30,087 epoch 7 - iter 440/447 - loss 0.03316793 - time (sec): 158.08 - samples/sec: 540.15 - lr: 0.000050 - momentum: 0.000000 |
|
2023-10-14 23:08:32,422 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 23:08:32,422 EPOCH 7 done: loss 0.0328 - lr: 0.000050 |
|
2023-10-14 23:08:57,836 DEV : loss 0.1885330080986023 - f1-score (micro avg) 0.7483 |
|
2023-10-14 23:08:57,864 saving best model |
|
2023-10-14 23:08:58,600 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 23:09:14,353 epoch 8 - iter 44/447 - loss 0.03181773 - time (sec): 15.75 - samples/sec: 559.26 - lr: 0.000049 - momentum: 0.000000 |
|
2023-10-14 23:09:29,490 epoch 8 - iter 88/447 - loss 0.02428238 - time (sec): 30.89 - samples/sec: 540.40 - lr: 0.000047 - momentum: 0.000000 |
|
2023-10-14 23:09:44,938 epoch 8 - iter 132/447 - loss 0.02205436 - time (sec): 46.34 - samples/sec: 542.86 - lr: 0.000045 - momentum: 0.000000 |
|
2023-10-14 23:09:59,962 epoch 8 - iter 176/447 - loss 0.02159482 - time (sec): 61.36 - samples/sec: 538.90 - lr: 0.000044 - momentum: 0.000000 |
|
2023-10-14 23:10:15,350 epoch 8 - iter 220/447 - loss 0.02165351 - time (sec): 76.75 - samples/sec: 537.77 - lr: 0.000042 - momentum: 0.000000 |
|
2023-10-14 23:10:32,647 epoch 8 - iter 264/447 - loss 0.02258370 - time (sec): 94.05 - samples/sec: 535.69 - lr: 0.000040 - momentum: 0.000000 |
|
2023-10-14 23:10:48,418 epoch 8 - iter 308/447 - loss 0.02205011 - time (sec): 109.82 - samples/sec: 540.49 - lr: 0.000039 - momentum: 0.000000 |
|
2023-10-14 23:11:04,619 epoch 8 - iter 352/447 - loss 0.02369386 - time (sec): 126.02 - samples/sec: 544.53 - lr: 0.000037 - momentum: 0.000000 |
|
2023-10-14 23:11:19,778 epoch 8 - iter 396/447 - loss 0.02290951 - time (sec): 141.18 - samples/sec: 540.71 - lr: 0.000035 - momentum: 0.000000 |
|
2023-10-14 23:11:36,007 epoch 8 - iter 440/447 - loss 0.02293108 - time (sec): 157.41 - samples/sec: 541.21 - lr: 0.000034 - momentum: 0.000000 |
|
2023-10-14 23:11:38,461 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 23:11:38,462 EPOCH 8 done: loss 0.0227 - lr: 0.000034 |
|
2023-10-14 23:12:04,733 DEV : loss 0.199814110994339 - f1-score (micro avg) 0.7555 |
|
2023-10-14 23:12:04,761 saving best model |
|
2023-10-14 23:12:05,783 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 23:12:20,951 epoch 9 - iter 44/447 - loss 0.04396758 - time (sec): 15.17 - samples/sec: 499.27 - lr: 0.000032 - momentum: 0.000000 |
|
2023-10-14 23:12:36,328 epoch 9 - iter 88/447 - loss 0.02894602 - time (sec): 30.54 - samples/sec: 503.91 - lr: 0.000030 - momentum: 0.000000 |
|
2023-10-14 23:12:52,115 epoch 9 - iter 132/447 - loss 0.02505307 - time (sec): 46.33 - samples/sec: 521.06 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-14 23:13:07,569 epoch 9 - iter 176/447 - loss 0.02257352 - time (sec): 61.78 - samples/sec: 525.69 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-14 23:13:23,280 epoch 9 - iter 220/447 - loss 0.02155230 - time (sec): 77.49 - samples/sec: 527.12 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-14 23:13:39,504 epoch 9 - iter 264/447 - loss 0.02018879 - time (sec): 93.72 - samples/sec: 528.61 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-14 23:13:57,367 epoch 9 - iter 308/447 - loss 0.01958928 - time (sec): 111.58 - samples/sec: 531.88 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-14 23:14:13,241 epoch 9 - iter 352/447 - loss 0.01999027 - time (sec): 127.46 - samples/sec: 531.80 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-14 23:14:29,020 epoch 9 - iter 396/447 - loss 0.01952782 - time (sec): 143.24 - samples/sec: 535.61 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-14 23:14:45,030 epoch 9 - iter 440/447 - loss 0.01960205 - time (sec): 159.25 - samples/sec: 535.34 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-14 23:14:47,567 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 23:14:47,567 EPOCH 9 done: loss 0.0194 - lr: 0.000017 |
|
2023-10-14 23:15:13,852 DEV : loss 0.21139812469482422 - f1-score (micro avg) 0.7611 |
|
2023-10-14 23:15:13,881 saving best model |
|
2023-10-14 23:15:16,742 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 23:15:32,924 epoch 10 - iter 44/447 - loss 0.01287936 - time (sec): 16.18 - samples/sec: 549.66 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-14 23:15:50,468 epoch 10 - iter 88/447 - loss 0.01616924 - time (sec): 33.73 - samples/sec: 559.34 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-14 23:16:05,420 epoch 10 - iter 132/447 - loss 0.01437107 - time (sec): 48.68 - samples/sec: 541.47 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-14 23:16:20,761 epoch 10 - iter 176/447 - loss 0.01312121 - time (sec): 64.02 - samples/sec: 538.41 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-14 23:16:36,868 epoch 10 - iter 220/447 - loss 0.01252890 - time (sec): 80.13 - samples/sec: 545.53 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-14 23:16:52,475 epoch 10 - iter 264/447 - loss 0.01373388 - time (sec): 95.73 - samples/sec: 537.95 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-14 23:17:08,408 epoch 10 - iter 308/447 - loss 0.01350192 - time (sec): 111.66 - samples/sec: 538.04 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-14 23:17:23,667 epoch 10 - iter 352/447 - loss 0.01591521 - time (sec): 126.92 - samples/sec: 535.10 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-14 23:17:39,557 epoch 10 - iter 396/447 - loss 0.01631178 - time (sec): 142.81 - samples/sec: 533.79 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-14 23:17:56,279 epoch 10 - iter 440/447 - loss 0.01595372 - time (sec): 159.54 - samples/sec: 535.28 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-14 23:17:58,644 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 23:17:58,644 EPOCH 10 done: loss 0.0158 - lr: 0.000001 |
|
2023-10-14 23:18:23,360 DEV : loss 0.21074163913726807 - f1-score (micro avg) 0.7603 |
|
2023-10-14 23:18:24,055 ---------------------------------------------------------------------------------------------------- |
|
2023-10-14 23:18:24,056 Loading model from best epoch ... |
|
2023-10-14 23:18:26,500 SequenceTagger predicts: Dictionary with 21 tags: O, S-loc, B-loc, E-loc, I-loc, S-pers, B-pers, E-pers, I-pers, S-org, B-org, E-org, I-org, S-prod, B-prod, E-prod, I-prod, S-time, B-time, E-time, I-time |
|
2023-10-14 23:18:47,840 |
|
Results: |
|
- F-score (micro) 0.7415 |
|
- F-score (macro) 0.6185 |
|
- Accuracy 0.6113 |
|
|
|
By class:
              precision    recall  f1-score   support

         loc     0.8581    0.8725    0.8652       596
        pers     0.6489    0.7658    0.7025       333
         org     0.4459    0.5303    0.4844       132
        prod     0.5854    0.3636    0.4486        66
        time     0.5918    0.5918    0.5918        49

   micro avg     0.7207    0.7636    0.7415      1176
   macro avg     0.6260    0.6248    0.6185      1176
weighted avg     0.7262    0.7636    0.7416      1176 |
|
|
|
2023-10-14 23:18:47,840 ---------------------------------------------------------------------------------------------------- |
|
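Note on the summary rows of the "By class" report: they can be re-derived from the per-class rows. The sketch below is not part of the logged run. It copies the per-class numbers from the report above and assumes the usual definitions: macro F1 is the unweighted mean of the per-class F1 scores, weighted F1 is the support-weighted mean, and micro F1 is the harmonic mean of micro precision and micro recall. The names `rows` and `harmonic_f1` are illustrative only.

```python
# Per-class rows copied from the "By class" report above:
# label -> (precision, recall, f1, support)
rows = {
    "loc":  (0.8581, 0.8725, 0.8652, 596),
    "pers": (0.6489, 0.7658, 0.7025, 333),
    "org":  (0.4459, 0.5303, 0.4844, 132),
    "prod": (0.5854, 0.3636, 0.4486, 66),
    "time": (0.5918, 0.5918, 0.5918, 49),
}


def harmonic_f1(precision: float, recall: float) -> float:
    """F1 as the harmonic mean of precision and recall (0.0 if both are 0)."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


# Macro F1: unweighted mean of the per-class F1 scores.
macro_f1 = sum(f1 for _, _, f1, _ in rows.values()) / len(rows)

# Weighted F1: mean of the per-class F1 scores weighted by support.
total_support = sum(support for _, _, _, support in rows.values())
weighted_f1 = sum(f1 * support for _, _, f1, support in rows.values()) / total_support

# Micro F1: harmonic mean of the micro-averaged precision and recall
# (values taken from the "micro avg" row of the report).
micro_f1 = harmonic_f1(0.7207, 0.7636)

print(f"macro F1:    {macro_f1:.4f}")
print(f"weighted F1: {weighted_f1:.4f}")
print(f"micro F1:    {micro_f1:.4f}")
print(f"support:     {total_support}")
```

Because the inputs are already rounded to four decimals, the recomputed values should only be expected to agree with the logged ones to about that precision.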
|