2023-10-14 22:47:24,751 ----------------------------------------------------------------------------------------------------
2023-10-14 22:47:24,752 Model: "SequenceTagger(
(embeddings): ByT5Embeddings(
(model): T5EncoderModel(
(shared): Embedding(384, 1472)
(encoder): T5Stack(
(embed_tokens): Embedding(384, 1472)
(block): ModuleList(
(0): T5Block(
(layer): ModuleList(
(0): T5LayerSelfAttention(
(SelfAttention): T5Attention(
(q): Linear(in_features=1472, out_features=384, bias=False)
(k): Linear(in_features=1472, out_features=384, bias=False)
(v): Linear(in_features=1472, out_features=384, bias=False)
(o): Linear(in_features=384, out_features=1472, bias=False)
(relative_attention_bias): Embedding(32, 6)
)
(layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(1): T5LayerFF(
(DenseReluDense): T5DenseGatedActDense(
(wi_0): Linear(in_features=1472, out_features=3584, bias=False)
(wi_1): Linear(in_features=1472, out_features=3584, bias=False)
(wo): Linear(in_features=3584, out_features=1472, bias=False)
(dropout): Dropout(p=0.1, inplace=False)
(act): NewGELUActivation()
)
(layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
)
(1-11): 11 x T5Block(
(layer): ModuleList(
(0): T5LayerSelfAttention(
(SelfAttention): T5Attention(
(q): Linear(in_features=1472, out_features=384, bias=False)
(k): Linear(in_features=1472, out_features=384, bias=False)
(v): Linear(in_features=1472, out_features=384, bias=False)
(o): Linear(in_features=384, out_features=1472, bias=False)
)
(layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(1): T5LayerFF(
(DenseReluDense): T5DenseGatedActDense(
(wi_0): Linear(in_features=1472, out_features=3584, bias=False)
(wi_1): Linear(in_features=1472, out_features=3584, bias=False)
(wo): Linear(in_features=3584, out_features=1472, bias=False)
(dropout): Dropout(p=0.1, inplace=False)
(act): NewGELUActivation()
)
(layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
)
)
(final_layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
)
(locked_dropout): LockedDropout(p=0.5)
(linear): Linear(in_features=1472, out_features=21, bias=True)
(loss_function): CrossEntropyLoss()
)"
2023-10-14 22:47:24,752 ----------------------------------------------------------------------------------------------------
2023-10-14 22:47:24,752 MultiCorpus: 3575 train + 1235 dev + 1266 test sentences
- NER_HIPE_2022 Corpus: 3575 train + 1235 dev + 1266 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/hipe2020/de/with_doc_seperator
2023-10-14 22:47:24,752 ----------------------------------------------------------------------------------------------------
2023-10-14 22:47:24,752 Train: 3575 sentences
2023-10-14 22:47:24,752 (train_with_dev=False, train_with_test=False)
2023-10-14 22:47:24,752 ----------------------------------------------------------------------------------------------------
2023-10-14 22:47:24,752 Training Params:
2023-10-14 22:47:24,752 - learning_rate: "0.00015"
2023-10-14 22:47:24,752 - mini_batch_size: "8"
2023-10-14 22:47:24,752 - max_epochs: "10"
2023-10-14 22:47:24,753 - shuffle: "True"
2023-10-14 22:47:24,753 ----------------------------------------------------------------------------------------------------
2023-10-14 22:47:24,753 Plugins:
2023-10-14 22:47:24,753 - TensorboardLogger
2023-10-14 22:47:24,753 - LinearScheduler | warmup_fraction: '0.1'
2023-10-14 22:47:24,753 ----------------------------------------------------------------------------------------------------
2023-10-14 22:47:24,753 Final evaluation on model from best epoch (best-model.pt)
2023-10-14 22:47:24,753 - metric: "('micro avg', 'f1-score')"
2023-10-14 22:47:24,753 ----------------------------------------------------------------------------------------------------
2023-10-14 22:47:24,753 Computation:
2023-10-14 22:47:24,753 - compute on device: cuda:0
2023-10-14 22:47:24,753 - embedding storage: none
2023-10-14 22:47:24,753 ----------------------------------------------------------------------------------------------------
2023-10-14 22:47:24,753 Model training base path: "hmbench-hipe2020/de-hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax-bs8-wsFalse-e10-lr0.00015-poolingfirst-layers-1-crfFalse-3"
2023-10-14 22:47:24,753 ----------------------------------------------------------------------------------------------------
2023-10-14 22:47:24,753 ----------------------------------------------------------------------------------------------------
2023-10-14 22:47:24,753 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-14 22:47:40,117 epoch 1 - iter 44/447 - loss 3.00865473 - time (sec): 15.36 - samples/sec: 557.77 - lr: 0.000014 - momentum: 0.000000
2023-10-14 22:47:57,013 epoch 1 - iter 88/447 - loss 2.99088672 - time (sec): 32.26 - samples/sec: 546.36 - lr: 0.000029 - momentum: 0.000000
2023-10-14 22:48:12,490 epoch 1 - iter 132/447 - loss 2.93254972 - time (sec): 47.74 - samples/sec: 550.32 - lr: 0.000044 - momentum: 0.000000
2023-10-14 22:48:27,423 epoch 1 - iter 176/447 - loss 2.81928089 - time (sec): 62.67 - samples/sec: 544.20 - lr: 0.000059 - momentum: 0.000000
2023-10-14 22:48:43,197 epoch 1 - iter 220/447 - loss 2.66111726 - time (sec): 78.44 - samples/sec: 542.95 - lr: 0.000073 - momentum: 0.000000
2023-10-14 22:48:58,992 epoch 1 - iter 264/447 - loss 2.48228942 - time (sec): 94.24 - samples/sec: 544.00 - lr: 0.000088 - momentum: 0.000000
2023-10-14 22:49:14,416 epoch 1 - iter 308/447 - loss 2.30227376 - time (sec): 109.66 - samples/sec: 544.30 - lr: 0.000103 - momentum: 0.000000
2023-10-14 22:49:29,786 epoch 1 - iter 352/447 - loss 2.11867337 - time (sec): 125.03 - samples/sec: 544.63 - lr: 0.000118 - momentum: 0.000000
2023-10-14 22:49:44,919 epoch 1 - iter 396/447 - loss 1.95364436 - time (sec): 140.17 - samples/sec: 546.36 - lr: 0.000133 - momentum: 0.000000
2023-10-14 22:50:00,252 epoch 1 - iter 440/447 - loss 1.80859385 - time (sec): 155.50 - samples/sec: 547.62 - lr: 0.000147 - momentum: 0.000000
2023-10-14 22:50:02,675 ----------------------------------------------------------------------------------------------------
2023-10-14 22:50:02,676 EPOCH 1 done: loss 1.7889 - lr: 0.000147
2023-10-14 22:50:25,518 DEV : loss 0.46937498450279236 - f1-score (micro avg) 0.0
2023-10-14 22:50:25,546 ----------------------------------------------------------------------------------------------------
2023-10-14 22:50:40,868 epoch 2 - iter 44/447 - loss 0.55558054 - time (sec): 15.32 - samples/sec: 546.30 - lr: 0.000148 - momentum: 0.000000
2023-10-14 22:50:56,263 epoch 2 - iter 88/447 - loss 0.48845589 - time (sec): 30.72 - samples/sec: 542.32 - lr: 0.000147 - momentum: 0.000000
2023-10-14 22:51:12,284 epoch 2 - iter 132/447 - loss 0.43240536 - time (sec): 46.74 - samples/sec: 541.69 - lr: 0.000145 - momentum: 0.000000
2023-10-14 22:51:27,584 epoch 2 - iter 176/447 - loss 0.41054710 - time (sec): 62.04 - samples/sec: 542.03 - lr: 0.000143 - momentum: 0.000000
2023-10-14 22:51:43,251 epoch 2 - iter 220/447 - loss 0.40506841 - time (sec): 77.70 - samples/sec: 542.84 - lr: 0.000142 - momentum: 0.000000
2023-10-14 22:51:59,058 epoch 2 - iter 264/447 - loss 0.38451537 - time (sec): 93.51 - samples/sec: 540.88 - lr: 0.000140 - momentum: 0.000000
2023-10-14 22:52:14,301 epoch 2 - iter 308/447 - loss 0.36595975 - time (sec): 108.75 - samples/sec: 538.49 - lr: 0.000139 - momentum: 0.000000
2023-10-14 22:52:29,694 epoch 2 - iter 352/447 - loss 0.35174651 - time (sec): 124.15 - samples/sec: 540.48 - lr: 0.000137 - momentum: 0.000000
2023-10-14 22:52:45,174 epoch 2 - iter 396/447 - loss 0.34343160 - time (sec): 139.63 - samples/sec: 543.43 - lr: 0.000135 - momentum: 0.000000
2023-10-14 22:53:02,266 epoch 2 - iter 440/447 - loss 0.33222224 - time (sec): 156.72 - samples/sec: 544.40 - lr: 0.000134 - momentum: 0.000000
2023-10-14 22:53:04,637 ----------------------------------------------------------------------------------------------------
2023-10-14 22:53:04,637 EPOCH 2 done: loss 0.3329 - lr: 0.000134
2023-10-14 22:53:29,512 DEV : loss 0.23229412734508514 - f1-score (micro avg) 0.5356
2023-10-14 22:53:29,540 saving best model
2023-10-14 22:53:30,215 ----------------------------------------------------------------------------------------------------
2023-10-14 22:53:45,406 epoch 3 - iter 44/447 - loss 0.21939656 - time (sec): 15.19 - samples/sec: 543.15 - lr: 0.000132 - momentum: 0.000000
2023-10-14 22:54:02,714 epoch 3 - iter 88/447 - loss 0.22487685 - time (sec): 32.50 - samples/sec: 536.01 - lr: 0.000130 - momentum: 0.000000
2023-10-14 22:54:18,169 epoch 3 - iter 132/447 - loss 0.21699101 - time (sec): 47.95 - samples/sec: 540.31 - lr: 0.000128 - momentum: 0.000000
2023-10-14 22:54:34,350 epoch 3 - iter 176/447 - loss 0.21393514 - time (sec): 64.13 - samples/sec: 546.69 - lr: 0.000127 - momentum: 0.000000
2023-10-14 22:54:49,539 epoch 3 - iter 220/447 - loss 0.21489365 - time (sec): 79.32 - samples/sec: 543.20 - lr: 0.000125 - momentum: 0.000000
2023-10-14 22:55:05,789 epoch 3 - iter 264/447 - loss 0.20878197 - time (sec): 95.57 - samples/sec: 548.60 - lr: 0.000124 - momentum: 0.000000
2023-10-14 22:55:21,170 epoch 3 - iter 308/447 - loss 0.20134815 - time (sec): 110.95 - samples/sec: 545.66 - lr: 0.000122 - momentum: 0.000000
2023-10-14 22:55:36,236 epoch 3 - iter 352/447 - loss 0.19647582 - time (sec): 126.02 - samples/sec: 542.97 - lr: 0.000120 - momentum: 0.000000
2023-10-14 22:55:51,596 epoch 3 - iter 396/447 - loss 0.19391200 - time (sec): 141.38 - samples/sec: 542.17 - lr: 0.000119 - momentum: 0.000000
2023-10-14 22:56:07,050 epoch 3 - iter 440/447 - loss 0.19035496 - time (sec): 156.83 - samples/sec: 542.76 - lr: 0.000117 - momentum: 0.000000
2023-10-14 22:56:09,506 ----------------------------------------------------------------------------------------------------
2023-10-14 22:56:09,506 EPOCH 3 done: loss 0.1897 - lr: 0.000117
2023-10-14 22:56:34,219 DEV : loss 0.16955290734767914 - f1-score (micro avg) 0.6845
2023-10-14 22:56:34,247 saving best model
2023-10-14 22:56:34,981 ----------------------------------------------------------------------------------------------------
2023-10-14 22:56:50,693 epoch 4 - iter 44/447 - loss 0.12930627 - time (sec): 15.71 - samples/sec: 559.69 - lr: 0.000115 - momentum: 0.000000
2023-10-14 22:57:06,174 epoch 4 - iter 88/447 - loss 0.12478911 - time (sec): 31.19 - samples/sec: 558.14 - lr: 0.000113 - momentum: 0.000000
2023-10-14 22:57:21,378 epoch 4 - iter 132/447 - loss 0.13051898 - time (sec): 46.40 - samples/sec: 551.91 - lr: 0.000112 - momentum: 0.000000
2023-10-14 22:57:36,888 epoch 4 - iter 176/447 - loss 0.12838011 - time (sec): 61.91 - samples/sec: 552.92 - lr: 0.000110 - momentum: 0.000000
2023-10-14 22:57:53,017 epoch 4 - iter 220/447 - loss 0.12533019 - time (sec): 78.03 - samples/sec: 554.82 - lr: 0.000109 - momentum: 0.000000
2023-10-14 22:58:08,232 epoch 4 - iter 264/447 - loss 0.12547990 - time (sec): 93.25 - samples/sec: 550.58 - lr: 0.000107 - momentum: 0.000000
2023-10-14 22:58:23,408 epoch 4 - iter 308/447 - loss 0.11906755 - time (sec): 108.43 - samples/sec: 548.37 - lr: 0.000105 - momentum: 0.000000
2023-10-14 22:58:39,629 epoch 4 - iter 352/447 - loss 0.11500732 - time (sec): 124.65 - samples/sec: 548.31 - lr: 0.000104 - momentum: 0.000000
2023-10-14 22:58:56,563 epoch 4 - iter 396/447 - loss 0.11468376 - time (sec): 141.58 - samples/sec: 545.03 - lr: 0.000102 - momentum: 0.000000
2023-10-14 22:59:11,728 epoch 4 - iter 440/447 - loss 0.11244876 - time (sec): 156.75 - samples/sec: 543.21 - lr: 0.000100 - momentum: 0.000000
2023-10-14 22:59:14,225 ----------------------------------------------------------------------------------------------------
2023-10-14 22:59:14,226 EPOCH 4 done: loss 0.1121 - lr: 0.000100
2023-10-14 22:59:39,205 DEV : loss 0.14614106714725494 - f1-score (micro avg) 0.7341
2023-10-14 22:59:39,233 saving best model
2023-10-14 22:59:40,179 ----------------------------------------------------------------------------------------------------
2023-10-14 22:59:58,642 epoch 5 - iter 44/447 - loss 0.08716574 - time (sec): 18.46 - samples/sec: 571.45 - lr: 0.000098 - momentum: 0.000000
2023-10-14 23:00:14,064 epoch 5 - iter 88/447 - loss 0.08039635 - time (sec): 33.88 - samples/sec: 565.17 - lr: 0.000097 - momentum: 0.000000
2023-10-14 23:00:29,036 epoch 5 - iter 132/447 - loss 0.07769323 - time (sec): 48.85 - samples/sec: 551.39 - lr: 0.000095 - momentum: 0.000000
2023-10-14 23:00:44,131 epoch 5 - iter 176/447 - loss 0.07931203 - time (sec): 63.95 - samples/sec: 546.43 - lr: 0.000094 - momentum: 0.000000
2023-10-14 23:00:59,214 epoch 5 - iter 220/447 - loss 0.07764385 - time (sec): 79.03 - samples/sec: 541.32 - lr: 0.000092 - momentum: 0.000000
2023-10-14 23:01:15,149 epoch 5 - iter 264/447 - loss 0.07421337 - time (sec): 94.97 - samples/sec: 546.71 - lr: 0.000090 - momentum: 0.000000
2023-10-14 23:01:30,532 epoch 5 - iter 308/447 - loss 0.07187376 - time (sec): 110.35 - samples/sec: 544.48 - lr: 0.000089 - momentum: 0.000000
2023-10-14 23:01:46,370 epoch 5 - iter 352/447 - loss 0.07065057 - time (sec): 126.19 - samples/sec: 542.88 - lr: 0.000087 - momentum: 0.000000
2023-10-14 23:02:02,080 epoch 5 - iter 396/447 - loss 0.07030662 - time (sec): 141.90 - samples/sec: 542.77 - lr: 0.000085 - momentum: 0.000000
2023-10-14 23:02:17,328 epoch 5 - iter 440/447 - loss 0.06922786 - time (sec): 157.15 - samples/sec: 541.90 - lr: 0.000084 - momentum: 0.000000
2023-10-14 23:02:19,791 ----------------------------------------------------------------------------------------------------
2023-10-14 23:02:19,791 EPOCH 5 done: loss 0.0690 - lr: 0.000084
2023-10-14 23:02:44,708 DEV : loss 0.1731875240802765 - f1-score (micro avg) 0.7457
2023-10-14 23:02:44,736 saving best model
2023-10-14 23:02:45,715 ----------------------------------------------------------------------------------------------------
2023-10-14 23:03:02,976 epoch 6 - iter 44/447 - loss 0.04662483 - time (sec): 17.26 - samples/sec: 532.78 - lr: 0.000082 - momentum: 0.000000
2023-10-14 23:03:18,579 epoch 6 - iter 88/447 - loss 0.04380020 - time (sec): 32.86 - samples/sec: 537.65 - lr: 0.000080 - momentum: 0.000000
2023-10-14 23:03:34,440 epoch 6 - iter 132/447 - loss 0.04119501 - time (sec): 48.72 - samples/sec: 544.12 - lr: 0.000079 - momentum: 0.000000
2023-10-14 23:03:49,907 epoch 6 - iter 176/447 - loss 0.04533557 - time (sec): 64.19 - samples/sec: 539.22 - lr: 0.000077 - momentum: 0.000000
2023-10-14 23:04:07,017 epoch 6 - iter 220/447 - loss 0.04633177 - time (sec): 81.30 - samples/sec: 550.35 - lr: 0.000075 - momentum: 0.000000
2023-10-14 23:04:23,012 epoch 6 - iter 264/447 - loss 0.04554066 - time (sec): 97.30 - samples/sec: 550.94 - lr: 0.000074 - momentum: 0.000000
2023-10-14 23:04:38,612 epoch 6 - iter 308/447 - loss 0.04413655 - time (sec): 112.90 - samples/sec: 545.30 - lr: 0.000072 - momentum: 0.000000
2023-10-14 23:04:53,578 epoch 6 - iter 352/447 - loss 0.04274808 - time (sec): 127.86 - samples/sec: 540.19 - lr: 0.000070 - momentum: 0.000000
2023-10-14 23:05:09,206 epoch 6 - iter 396/447 - loss 0.04590217 - time (sec): 143.49 - samples/sec: 539.97 - lr: 0.000069 - momentum: 0.000000
2023-10-14 23:05:24,469 epoch 6 - iter 440/447 - loss 0.04676563 - time (sec): 158.75 - samples/sec: 537.67 - lr: 0.000067 - momentum: 0.000000
2023-10-14 23:05:26,820 ----------------------------------------------------------------------------------------------------
2023-10-14 23:05:26,821 EPOCH 6 done: loss 0.0464 - lr: 0.000067
2023-10-14 23:05:51,977 DEV : loss 0.17097032070159912 - f1-score (micro avg) 0.7267
2023-10-14 23:05:52,005 ----------------------------------------------------------------------------------------------------
2023-10-14 23:06:07,446 epoch 7 - iter 44/447 - loss 0.04825257 - time (sec): 15.44 - samples/sec: 545.37 - lr: 0.000065 - momentum: 0.000000
2023-10-14 23:06:24,420 epoch 7 - iter 88/447 - loss 0.03994302 - time (sec): 32.41 - samples/sec: 537.53 - lr: 0.000064 - momentum: 0.000000
2023-10-14 23:06:40,225 epoch 7 - iter 132/447 - loss 0.03429586 - time (sec): 48.22 - samples/sec: 544.38 - lr: 0.000062 - momentum: 0.000000
2023-10-14 23:06:55,473 epoch 7 - iter 176/447 - loss 0.03286941 - time (sec): 63.47 - samples/sec: 538.62 - lr: 0.000060 - momentum: 0.000000
2023-10-14 23:07:11,019 epoch 7 - iter 220/447 - loss 0.03066026 - time (sec): 79.01 - samples/sec: 541.42 - lr: 0.000059 - momentum: 0.000000
2023-10-14 23:07:27,202 epoch 7 - iter 264/447 - loss 0.03111309 - time (sec): 95.20 - samples/sec: 548.54 - lr: 0.000057 - momentum: 0.000000
2023-10-14 23:07:43,337 epoch 7 - iter 308/447 - loss 0.03123670 - time (sec): 111.33 - samples/sec: 543.15 - lr: 0.000055 - momentum: 0.000000
2023-10-14 23:07:58,901 epoch 7 - iter 352/447 - loss 0.03369142 - time (sec): 126.89 - samples/sec: 543.38 - lr: 0.000054 - momentum: 0.000000
2023-10-14 23:08:14,735 epoch 7 - iter 396/447 - loss 0.03375955 - time (sec): 142.73 - samples/sec: 543.05 - lr: 0.000052 - momentum: 0.000000
2023-10-14 23:08:30,087 epoch 7 - iter 440/447 - loss 0.03316793 - time (sec): 158.08 - samples/sec: 540.15 - lr: 0.000050 - momentum: 0.000000
2023-10-14 23:08:32,422 ----------------------------------------------------------------------------------------------------
2023-10-14 23:08:32,422 EPOCH 7 done: loss 0.0328 - lr: 0.000050
2023-10-14 23:08:57,836 DEV : loss 0.1885330080986023 - f1-score (micro avg) 0.7483
2023-10-14 23:08:57,864 saving best model
2023-10-14 23:08:58,600 ----------------------------------------------------------------------------------------------------
2023-10-14 23:09:14,353 epoch 8 - iter 44/447 - loss 0.03181773 - time (sec): 15.75 - samples/sec: 559.26 - lr: 0.000049 - momentum: 0.000000
2023-10-14 23:09:29,490 epoch 8 - iter 88/447 - loss 0.02428238 - time (sec): 30.89 - samples/sec: 540.40 - lr: 0.000047 - momentum: 0.000000
2023-10-14 23:09:44,938 epoch 8 - iter 132/447 - loss 0.02205436 - time (sec): 46.34 - samples/sec: 542.86 - lr: 0.000045 - momentum: 0.000000
2023-10-14 23:09:59,962 epoch 8 - iter 176/447 - loss 0.02159482 - time (sec): 61.36 - samples/sec: 538.90 - lr: 0.000044 - momentum: 0.000000
2023-10-14 23:10:15,350 epoch 8 - iter 220/447 - loss 0.02165351 - time (sec): 76.75 - samples/sec: 537.77 - lr: 0.000042 - momentum: 0.000000
2023-10-14 23:10:32,647 epoch 8 - iter 264/447 - loss 0.02258370 - time (sec): 94.05 - samples/sec: 535.69 - lr: 0.000040 - momentum: 0.000000
2023-10-14 23:10:48,418 epoch 8 - iter 308/447 - loss 0.02205011 - time (sec): 109.82 - samples/sec: 540.49 - lr: 0.000039 - momentum: 0.000000
2023-10-14 23:11:04,619 epoch 8 - iter 352/447 - loss 0.02369386 - time (sec): 126.02 - samples/sec: 544.53 - lr: 0.000037 - momentum: 0.000000
2023-10-14 23:11:19,778 epoch 8 - iter 396/447 - loss 0.02290951 - time (sec): 141.18 - samples/sec: 540.71 - lr: 0.000035 - momentum: 0.000000
2023-10-14 23:11:36,007 epoch 8 - iter 440/447 - loss 0.02293108 - time (sec): 157.41 - samples/sec: 541.21 - lr: 0.000034 - momentum: 0.000000
2023-10-14 23:11:38,461 ----------------------------------------------------------------------------------------------------
2023-10-14 23:11:38,462 EPOCH 8 done: loss 0.0227 - lr: 0.000034
2023-10-14 23:12:04,733 DEV : loss 0.199814110994339 - f1-score (micro avg) 0.7555
2023-10-14 23:12:04,761 saving best model
2023-10-14 23:12:05,783 ----------------------------------------------------------------------------------------------------
2023-10-14 23:12:20,951 epoch 9 - iter 44/447 - loss 0.04396758 - time (sec): 15.17 - samples/sec: 499.27 - lr: 0.000032 - momentum: 0.000000
2023-10-14 23:12:36,328 epoch 9 - iter 88/447 - loss 0.02894602 - time (sec): 30.54 - samples/sec: 503.91 - lr: 0.000030 - momentum: 0.000000
2023-10-14 23:12:52,115 epoch 9 - iter 132/447 - loss 0.02505307 - time (sec): 46.33 - samples/sec: 521.06 - lr: 0.000029 - momentum: 0.000000
2023-10-14 23:13:07,569 epoch 9 - iter 176/447 - loss 0.02257352 - time (sec): 61.78 - samples/sec: 525.69 - lr: 0.000027 - momentum: 0.000000
2023-10-14 23:13:23,280 epoch 9 - iter 220/447 - loss 0.02155230 - time (sec): 77.49 - samples/sec: 527.12 - lr: 0.000025 - momentum: 0.000000
2023-10-14 23:13:39,504 epoch 9 - iter 264/447 - loss 0.02018879 - time (sec): 93.72 - samples/sec: 528.61 - lr: 0.000024 - momentum: 0.000000
2023-10-14 23:13:57,367 epoch 9 - iter 308/447 - loss 0.01958928 - time (sec): 111.58 - samples/sec: 531.88 - lr: 0.000022 - momentum: 0.000000
2023-10-14 23:14:13,241 epoch 9 - iter 352/447 - loss 0.01999027 - time (sec): 127.46 - samples/sec: 531.80 - lr: 0.000020 - momentum: 0.000000
2023-10-14 23:14:29,020 epoch 9 - iter 396/447 - loss 0.01952782 - time (sec): 143.24 - samples/sec: 535.61 - lr: 0.000019 - momentum: 0.000000
2023-10-14 23:14:45,030 epoch 9 - iter 440/447 - loss 0.01960205 - time (sec): 159.25 - samples/sec: 535.34 - lr: 0.000017 - momentum: 0.000000
2023-10-14 23:14:47,567 ----------------------------------------------------------------------------------------------------
2023-10-14 23:14:47,567 EPOCH 9 done: loss 0.0194 - lr: 0.000017
2023-10-14 23:15:13,852 DEV : loss 0.21139812469482422 - f1-score (micro avg) 0.7611
2023-10-14 23:15:13,881 saving best model
2023-10-14 23:15:16,742 ----------------------------------------------------------------------------------------------------
2023-10-14 23:15:32,924 epoch 10 - iter 44/447 - loss 0.01287936 - time (sec): 16.18 - samples/sec: 549.66 - lr: 0.000015 - momentum: 0.000000
2023-10-14 23:15:50,468 epoch 10 - iter 88/447 - loss 0.01616924 - time (sec): 33.73 - samples/sec: 559.34 - lr: 0.000014 - momentum: 0.000000
2023-10-14 23:16:05,420 epoch 10 - iter 132/447 - loss 0.01437107 - time (sec): 48.68 - samples/sec: 541.47 - lr: 0.000012 - momentum: 0.000000
2023-10-14 23:16:20,761 epoch 10 - iter 176/447 - loss 0.01312121 - time (sec): 64.02 - samples/sec: 538.41 - lr: 0.000010 - momentum: 0.000000
2023-10-14 23:16:36,868 epoch 10 - iter 220/447 - loss 0.01252890 - time (sec): 80.13 - samples/sec: 545.53 - lr: 0.000009 - momentum: 0.000000
2023-10-14 23:16:52,475 epoch 10 - iter 264/447 - loss 0.01373388 - time (sec): 95.73 - samples/sec: 537.95 - lr: 0.000007 - momentum: 0.000000
2023-10-14 23:17:08,408 epoch 10 - iter 308/447 - loss 0.01350192 - time (sec): 111.66 - samples/sec: 538.04 - lr: 0.000005 - momentum: 0.000000
2023-10-14 23:17:23,667 epoch 10 - iter 352/447 - loss 0.01591521 - time (sec): 126.92 - samples/sec: 535.10 - lr: 0.000004 - momentum: 0.000000
2023-10-14 23:17:39,557 epoch 10 - iter 396/447 - loss 0.01631178 - time (sec): 142.81 - samples/sec: 533.79 - lr: 0.000002 - momentum: 0.000000
2023-10-14 23:17:56,279 epoch 10 - iter 440/447 - loss 0.01595372 - time (sec): 159.54 - samples/sec: 535.28 - lr: 0.000001 - momentum: 0.000000
2023-10-14 23:17:58,644 ----------------------------------------------------------------------------------------------------
2023-10-14 23:17:58,644 EPOCH 10 done: loss 0.0158 - lr: 0.000001
2023-10-14 23:18:23,360 DEV : loss 0.21074163913726807 - f1-score (micro avg) 0.7603
2023-10-14 23:18:24,055 ----------------------------------------------------------------------------------------------------
2023-10-14 23:18:24,056 Loading model from best epoch ...
2023-10-14 23:18:26,500 SequenceTagger predicts: Dictionary with 21 tags: O, S-loc, B-loc, E-loc, I-loc, S-pers, B-pers, E-pers, I-pers, S-org, B-org, E-org, I-org, S-prod, B-prod, E-prod, I-prod, S-time, B-time, E-time, I-time
2023-10-14 23:18:47,840
Results:
- F-score (micro) 0.7415
- F-score (macro) 0.6185
- Accuracy 0.6113
By class:
              precision    recall  f1-score   support
         loc     0.8581    0.8725    0.8652       596
        pers     0.6489    0.7658    0.7025       333
         org     0.4459    0.5303    0.4844       132
        prod     0.5854    0.3636    0.4486        66
        time     0.5918    0.5918    0.5918        49
   micro avg     0.7207    0.7636    0.7415      1176
   macro avg     0.6260    0.6248    0.6185      1176
weighted avg     0.7262    0.7636    0.7416      1176
2023-10-14 23:18:47,840 ----------------------------------------------------------------------------------------------------