2023-10-14 20:04:40,573 ----------------------------------------------------------------------------------------------------
2023-10-14 20:04:40,575 Model: "SequenceTagger(
  (embeddings): ByT5Embeddings(
    (model): T5EncoderModel(
      (shared): Embedding(384, 1472)
      (encoder): T5Stack(
        (embed_tokens): Embedding(384, 1472)
        (block): ModuleList(
          (0): T5Block(
            (layer): ModuleList(
              (0): T5LayerSelfAttention(
                (SelfAttention): T5Attention(
                  (q): Linear(in_features=1472, out_features=384, bias=False)
                  (k): Linear(in_features=1472, out_features=384, bias=False)
                  (v): Linear(in_features=1472, out_features=384, bias=False)
                  (o): Linear(in_features=384, out_features=1472, bias=False)
                  (relative_attention_bias): Embedding(32, 6)
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (1): T5LayerFF(
                (DenseReluDense): T5DenseGatedActDense(
                  (wi_0): Linear(in_features=1472, out_features=3584, bias=False)
                  (wi_1): Linear(in_features=1472, out_features=3584, bias=False)
                  (wo): Linear(in_features=3584, out_features=1472, bias=False)
                  (dropout): Dropout(p=0.1, inplace=False)
                  (act): NewGELUActivation()
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
          )
          (1-11): 11 x T5Block(
            (layer): ModuleList(
              (0): T5LayerSelfAttention(
                (SelfAttention): T5Attention(
                  (q): Linear(in_features=1472, out_features=384, bias=False)
                  (k): Linear(in_features=1472, out_features=384, bias=False)
                  (v): Linear(in_features=1472, out_features=384, bias=False)
                  (o): Linear(in_features=384, out_features=1472, bias=False)
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (1): T5LayerFF(
                (DenseReluDense): T5DenseGatedActDense(
                  (wi_0): Linear(in_features=1472, out_features=3584, bias=False)
                  (wi_1): Linear(in_features=1472, out_features=3584, bias=False)
                  (wo): Linear(in_features=3584, out_features=1472, bias=False)
                  (dropout): Dropout(p=0.1, inplace=False)
                  (act): NewGELUActivation()
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
          )
        )
        (final_layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=1472, out_features=21, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-14 20:04:40,575 ----------------------------------------------------------------------------------------------------
2023-10-14 20:04:40,575 MultiCorpus: 3575 train + 1235 dev + 1266 test sentences
- NER_HIPE_2022 Corpus: 3575 train + 1235 dev + 1266 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/hipe2020/de/with_doc_seperator
2023-10-14 20:04:40,575 ----------------------------------------------------------------------------------------------------
2023-10-14 20:04:40,575 Train: 3575 sentences
2023-10-14 20:04:40,575 (train_with_dev=False, train_with_test=False)
2023-10-14 20:04:40,575 ----------------------------------------------------------------------------------------------------
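For reference, the data and model setup logged above (HIPE-2020 German wrapped in a MultiCorpus, a ByT5 encoder as embeddings, a plain linear tag head without CRF or RNN) could be reconstructed with a Flair script roughly like the sketch below. This is a hedged approximation, not the original hmBench script: the stock TransformerWordEmbeddings class stands in for the ByT5Embeddings wrapper in the model dump, the dataset-loader arguments are guessed from the dataset path, and hidden_size is only a placeholder because no RNN is used.

from flair.data import MultiCorpus
from flair.datasets import NER_HIPE_2022
from flair.embeddings import TransformerWordEmbeddings
from flair.models import SequenceTagger

# HIPE-2022 release of the HIPE-2020 German data; the document-separator
# option is assumed from the ".../with_doc_seperator" dataset path above.
corpus = MultiCorpus([
    NER_HIPE_2022(dataset_name="hipe2020", language="de", add_document_separator=True)
])
label_dict = corpus.make_label_dictionary(label_type="ner")

# Stand-in for ByT5Embeddings: last layer, first-subtoken pooling, fine-tuned
# end to end (matching "layers-1" and "poolingfirst" in the base path below).
embeddings = TransformerWordEmbeddings(
    model="hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax",
    layers="-1",
    subtoken_pooling="first",
    fine_tune=True,
)

# Single linear projection from the 1472-dim embeddings to the 21 tags,
# no CRF and no RNN ("crfFalse" in the base path below).
tagger = SequenceTagger(
    hidden_size=256,  # unused without an RNN; value is a placeholder
    embeddings=embeddings,
    tag_dictionary=label_dict,
    tag_type="ner",
    use_crf=False,
    use_rnn=False,
    reproject_embeddings=False,
)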
2023-10-14 20:04:40,575 Training Params:
2023-10-14 20:04:40,575 - learning_rate: "0.00016"
2023-10-14 20:04:40,575 - mini_batch_size: "4"
2023-10-14 20:04:40,575 - max_epochs: "10"
2023-10-14 20:04:40,575 - shuffle: "True"
2023-10-14 20:04:40,575 ----------------------------------------------------------------------------------------------------
2023-10-14 20:04:40,575 Plugins:
2023-10-14 20:04:40,575 - TensorboardLogger
2023-10-14 20:04:40,575 - LinearScheduler | warmup_fraction: '0.1'
2023-10-14 20:04:40,575 ----------------------------------------------------------------------------------------------------
2023-10-14 20:04:40,575 Final evaluation on model from best epoch (best-model.pt)
2023-10-14 20:04:40,575 - metric: "('micro avg', 'f1-score')"
2023-10-14 20:04:40,575 ----------------------------------------------------------------------------------------------------
2023-10-14 20:04:40,575 Computation:
2023-10-14 20:04:40,575 - compute on device: cuda:0
2023-10-14 20:04:40,575 - embedding storage: none
2023-10-14 20:04:40,575 ----------------------------------------------------------------------------------------------------
2023-10-14 20:04:40,576 Model training base path: "hmbench-hipe2020/de-hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax-bs4-wsFalse-e10-lr0.00016-poolingfirst-layers-1-crfFalse-1"
2023-10-14 20:04:40,576 ----------------------------------------------------------------------------------------------------
2023-10-14 20:04:40,576 ----------------------------------------------------------------------------------------------------
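The training parameters, scheduler, and evaluation settings listed above correspond to Flair's ModelTrainer.fine_tune interface. A minimal sketch follows, assuming the tagger and corpus from the previous sketch; argument names such as warmup_fraction and embeddings_storage_mode reflect recent Flair releases and may differ in other versions, and TensorBoard logging is only noted in a comment because its wiring is version-dependent.

from flair.trainers import ModelTrainer

trainer = ModelTrainer(tagger, corpus)  # tagger/corpus from the sketch above

trainer.fine_tune(
    "hmbench-hipe2020/de-hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax-bs4-wsFalse-e10-lr0.00016-poolingfirst-layers-1-crfFalse-1",
    learning_rate=0.00016,
    mini_batch_size=4,
    max_epochs=10,
    shuffle=True,
    embeddings_storage_mode="none",                    # "embedding storage: none"
    main_evaluation_metric=("micro avg", "f1-score"),  # metric used for best-model selection
    warmup_fraction=0.1,  # linear warmup then decay, as in the LinearScheduler plugin line
    # TensorBoard logging (the TensorboardLogger plugin above) would be attached
    # here via the trainer's plugin mechanism in recent Flair versions.
)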
2023-10-14 20:04:40,576 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-14 20:04:57,508 epoch 1 - iter 89/894 - loss 3.04433325 - time (sec): 16.93 - samples/sec: 546.74 - lr: 0.000016 - momentum: 0.000000
2023-10-14 20:05:14,096 epoch 1 - iter 178/894 - loss 3.00697322 - time (sec): 33.52 - samples/sec: 521.10 - lr: 0.000032 - momentum: 0.000000
2023-10-14 20:05:30,548 epoch 1 - iter 267/894 - loss 2.85458665 - time (sec): 49.97 - samples/sec: 514.10 - lr: 0.000048 - momentum: 0.000000
2023-10-14 20:05:47,119 epoch 1 - iter 356/894 - loss 2.63070465 - time (sec): 66.54 - samples/sec: 516.75 - lr: 0.000064 - momentum: 0.000000
2023-10-14 20:06:02,771 epoch 1 - iter 445/894 - loss 2.41357097 - time (sec): 82.19 - samples/sec: 508.03 - lr: 0.000079 - momentum: 0.000000
2023-10-14 20:06:19,218 epoch 1 - iter 534/894 - loss 2.15809760 - time (sec): 98.64 - samples/sec: 507.30 - lr: 0.000095 - momentum: 0.000000
2023-10-14 20:06:36,432 epoch 1 - iter 623/894 - loss 1.90629951 - time (sec): 115.86 - samples/sec: 512.68 - lr: 0.000111 - momentum: 0.000000
2023-10-14 20:06:52,947 epoch 1 - iter 712/894 - loss 1.73655924 - time (sec): 132.37 - samples/sec: 513.88 - lr: 0.000127 - momentum: 0.000000
2023-10-14 20:07:11,799 epoch 1 - iter 801/894 - loss 1.57462985 - time (sec): 151.22 - samples/sec: 516.25 - lr: 0.000143 - momentum: 0.000000
2023-10-14 20:07:28,037 epoch 1 - iter 890/894 - loss 1.46293211 - time (sec): 167.46 - samples/sec: 514.09 - lr: 0.000159 - momentum: 0.000000
2023-10-14 20:07:28,794 ----------------------------------------------------------------------------------------------------
2023-10-14 20:07:28,794 EPOCH 1 done: loss 1.4575 - lr: 0.000159
2023-10-14 20:07:51,443 DEV : loss 0.339751273393631 - f1-score (micro avg) 0.0234
2023-10-14 20:07:51,469 saving best model
2023-10-14 20:07:52,081 ----------------------------------------------------------------------------------------------------
2023-10-14 20:08:08,569 epoch 2 - iter 89/894 - loss 0.36784261 - time (sec): 16.49 - samples/sec: 517.02 - lr: 0.000158 - momentum: 0.000000
2023-10-14 20:08:25,152 epoch 2 - iter 178/894 - loss 0.35533195 - time (sec): 33.07 - samples/sec: 517.57 - lr: 0.000156 - momentum: 0.000000
2023-10-14 20:08:42,147 epoch 2 - iter 267/894 - loss 0.33472166 - time (sec): 50.06 - samples/sec: 529.77 - lr: 0.000155 - momentum: 0.000000
2023-10-14 20:09:00,581 epoch 2 - iter 356/894 - loss 0.32234839 - time (sec): 68.50 - samples/sec: 525.59 - lr: 0.000153 - momentum: 0.000000
2023-10-14 20:09:17,317 epoch 2 - iter 445/894 - loss 0.30927331 - time (sec): 85.23 - samples/sec: 523.45 - lr: 0.000151 - momentum: 0.000000
2023-10-14 20:09:34,281 epoch 2 - iter 534/894 - loss 0.29646500 - time (sec): 102.20 - samples/sec: 523.10 - lr: 0.000149 - momentum: 0.000000
2023-10-14 20:09:50,684 epoch 2 - iter 623/894 - loss 0.29525988 - time (sec): 118.60 - samples/sec: 518.76 - lr: 0.000148 - momentum: 0.000000
2023-10-14 20:10:07,451 epoch 2 - iter 712/894 - loss 0.29082523 - time (sec): 135.37 - samples/sec: 517.81 - lr: 0.000146 - momentum: 0.000000
2023-10-14 20:10:24,559 epoch 2 - iter 801/894 - loss 0.28139580 - time (sec): 152.48 - samples/sec: 516.17 - lr: 0.000144 - momentum: 0.000000
2023-10-14 20:10:40,763 epoch 2 - iter 890/894 - loss 0.27716837 - time (sec): 168.68 - samples/sec: 511.28 - lr: 0.000142 - momentum: 0.000000
2023-10-14 20:10:41,447 ----------------------------------------------------------------------------------------------------
2023-10-14 20:10:41,448 EPOCH 2 done: loss 0.2772 - lr: 0.000142
2023-10-14 20:11:06,565 DEV : loss 0.19352570176124573 - f1-score (micro avg) 0.6235
2023-10-14 20:11:06,591 saving best model
2023-10-14 20:11:11,217 ----------------------------------------------------------------------------------------------------
2023-10-14 20:11:27,822 epoch 3 - iter 89/894 - loss 0.20345638 - time (sec): 16.60 - samples/sec: 498.08 - lr: 0.000140 - momentum: 0.000000
2023-10-14 20:11:44,065 epoch 3 - iter 178/894 - loss 0.18152700 - time (sec): 32.85 - samples/sec: 500.96 - lr: 0.000139 - momentum: 0.000000
2023-10-14 20:12:01,084 epoch 3 - iter 267/894 - loss 0.17490578 - time (sec): 49.87 - samples/sec: 504.95 - lr: 0.000137 - momentum: 0.000000
2023-10-14 20:12:17,425 epoch 3 - iter 356/894 - loss 0.17202406 - time (sec): 66.21 - samples/sec: 508.41 - lr: 0.000135 - momentum: 0.000000
2023-10-14 20:12:36,005 epoch 3 - iter 445/894 - loss 0.16603411 - time (sec): 84.79 - samples/sec: 517.44 - lr: 0.000133 - momentum: 0.000000
2023-10-14 20:12:52,303 epoch 3 - iter 534/894 - loss 0.16465909 - time (sec): 101.08 - samples/sec: 516.75 - lr: 0.000132 - momentum: 0.000000
2023-10-14 20:13:08,435 epoch 3 - iter 623/894 - loss 0.15551952 - time (sec): 117.22 - samples/sec: 513.59 - lr: 0.000130 - momentum: 0.000000
2023-10-14 20:13:24,463 epoch 3 - iter 712/894 - loss 0.14985602 - time (sec): 133.24 - samples/sec: 512.13 - lr: 0.000128 - momentum: 0.000000
2023-10-14 20:13:41,373 epoch 3 - iter 801/894 - loss 0.14468106 - time (sec): 150.15 - samples/sec: 515.69 - lr: 0.000126 - momentum: 0.000000
2023-10-14 20:13:57,801 epoch 3 - iter 890/894 - loss 0.14031154 - time (sec): 166.58 - samples/sec: 516.59 - lr: 0.000125 - momentum: 0.000000
2023-10-14 20:13:58,586 ----------------------------------------------------------------------------------------------------
2023-10-14 20:13:58,587 EPOCH 3 done: loss 0.1400 - lr: 0.000125
2023-10-14 20:14:23,740 DEV : loss 0.16954360902309418 - f1-score (micro avg) 0.6643
2023-10-14 20:14:23,767 saving best model
2023-10-14 20:14:27,052 ----------------------------------------------------------------------------------------------------
2023-10-14 20:14:43,421 epoch 4 - iter 89/894 - loss 0.08846701 - time (sec): 16.37 - samples/sec: 515.92 - lr: 0.000123 - momentum: 0.000000
2023-10-14 20:14:59,309 epoch 4 - iter 178/894 - loss 0.09164804 - time (sec): 32.26 - samples/sec: 501.01 - lr: 0.000121 - momentum: 0.000000
2023-10-14 20:15:15,593 epoch 4 - iter 267/894 - loss 0.08809052 - time (sec): 48.54 - samples/sec: 501.81 - lr: 0.000119 - momentum: 0.000000
2023-10-14 20:15:32,110 epoch 4 - iter 356/894 - loss 0.09020029 - time (sec): 65.06 - samples/sec: 501.19 - lr: 0.000117 - momentum: 0.000000
2023-10-14 20:15:48,299 epoch 4 - iter 445/894 - loss 0.08499223 - time (sec): 81.24 - samples/sec: 500.34 - lr: 0.000116 - momentum: 0.000000
2023-10-14 20:16:05,170 epoch 4 - iter 534/894 - loss 0.08110597 - time (sec): 98.12 - samples/sec: 506.20 - lr: 0.000114 - momentum: 0.000000
2023-10-14 20:16:21,605 epoch 4 - iter 623/894 - loss 0.07754561 - time (sec): 114.55 - samples/sec: 505.87 - lr: 0.000112 - momentum: 0.000000
2023-10-14 20:16:38,148 epoch 4 - iter 712/894 - loss 0.07756119 - time (sec): 131.09 - samples/sec: 504.72 - lr: 0.000110 - momentum: 0.000000
2023-10-14 20:16:57,069 epoch 4 - iter 801/894 - loss 0.07686532 - time (sec): 150.01 - samples/sec: 508.09 - lr: 0.000109 - momentum: 0.000000
2023-10-14 20:17:15,151 epoch 4 - iter 890/894 - loss 0.07342823 - time (sec): 168.10 - samples/sec: 512.26 - lr: 0.000107 - momentum: 0.000000
2023-10-14 20:17:15,919 ----------------------------------------------------------------------------------------------------
2023-10-14 20:17:15,920 EPOCH 4 done: loss 0.0732 - lr: 0.000107
2023-10-14 20:17:41,184 DEV : loss 0.17608195543289185 - f1-score (micro avg) 0.7396
2023-10-14 20:17:41,211 saving best model
2023-10-14 20:17:41,886 ----------------------------------------------------------------------------------------------------
2023-10-14 20:17:58,085 epoch 5 - iter 89/894 - loss 0.04773510 - time (sec): 16.20 - samples/sec: 480.18 - lr: 0.000105 - momentum: 0.000000
2023-10-14 20:18:14,911 epoch 5 - iter 178/894 - loss 0.04110256 - time (sec): 33.02 - samples/sec: 489.92 - lr: 0.000103 - momentum: 0.000000
2023-10-14 20:18:32,101 epoch 5 - iter 267/894 - loss 0.04134861 - time (sec): 50.21 - samples/sec: 500.72 - lr: 0.000101 - momentum: 0.000000
2023-10-14 20:18:48,790 epoch 5 - iter 356/894 - loss 0.04668182 - time (sec): 66.90 - samples/sec: 503.49 - lr: 0.000100 - momentum: 0.000000
2023-10-14 20:19:05,227 epoch 5 - iter 445/894 - loss 0.04405493 - time (sec): 83.34 - samples/sec: 503.64 - lr: 0.000098 - momentum: 0.000000
2023-10-14 20:19:23,909 epoch 5 - iter 534/894 - loss 0.04580574 - time (sec): 102.02 - samples/sec: 505.88 - lr: 0.000096 - momentum: 0.000000
2023-10-14 20:19:40,166 epoch 5 - iter 623/894 - loss 0.04527345 - time (sec): 118.28 - samples/sec: 505.78 - lr: 0.000094 - momentum: 0.000000
2023-10-14 20:19:56,921 epoch 5 - iter 712/894 - loss 0.04590300 - time (sec): 135.03 - samples/sec: 509.19 - lr: 0.000093 - momentum: 0.000000
2023-10-14 20:20:13,926 epoch 5 - iter 801/894 - loss 0.04752093 - time (sec): 152.04 - samples/sec: 509.41 - lr: 0.000091 - momentum: 0.000000
2023-10-14 20:20:30,503 epoch 5 - iter 890/894 - loss 0.04824072 - time (sec): 168.62 - samples/sec: 510.41 - lr: 0.000089 - momentum: 0.000000
2023-10-14 20:20:31,281 ----------------------------------------------------------------------------------------------------
2023-10-14 20:20:31,281 EPOCH 5 done: loss 0.0486 - lr: 0.000089
2023-10-14 20:20:56,057 DEV : loss 0.20735777914524078 - f1-score (micro avg) 0.7229
2023-10-14 20:20:56,085 ----------------------------------------------------------------------------------------------------
2023-10-14 20:21:12,577 epoch 6 - iter 89/894 - loss 0.01695411 - time (sec): 16.49 - samples/sec: 525.25 - lr: 0.000087 - momentum: 0.000000
2023-10-14 20:21:28,712 epoch 6 - iter 178/894 - loss 0.02446819 - time (sec): 32.63 - samples/sec: 520.04 - lr: 0.000085 - momentum: 0.000000
2023-10-14 20:21:45,112 epoch 6 - iter 267/894 - loss 0.02659230 - time (sec): 49.03 - samples/sec: 518.96 - lr: 0.000084 - momentum: 0.000000
2023-10-14 20:22:01,419 epoch 6 - iter 356/894 - loss 0.02527246 - time (sec): 65.33 - samples/sec: 520.98 - lr: 0.000082 - momentum: 0.000000
2023-10-14 20:22:17,554 epoch 6 - iter 445/894 - loss 0.02492951 - time (sec): 81.47 - samples/sec: 517.28 - lr: 0.000080 - momentum: 0.000000
2023-10-14 20:22:35,796 epoch 6 - iter 534/894 - loss 0.02839357 - time (sec): 99.71 - samples/sec: 519.78 - lr: 0.000078 - momentum: 0.000000
2023-10-14 20:22:52,541 epoch 6 - iter 623/894 - loss 0.02825206 - time (sec): 116.45 - samples/sec: 523.70 - lr: 0.000077 - momentum: 0.000000
2023-10-14 20:23:09,200 epoch 6 - iter 712/894 - loss 0.02825234 - time (sec): 133.11 - samples/sec: 521.37 - lr: 0.000075 - momentum: 0.000000
2023-10-14 20:23:25,181 epoch 6 - iter 801/894 - loss 0.03010817 - time (sec): 149.10 - samples/sec: 518.59 - lr: 0.000073 - momentum: 0.000000
2023-10-14 20:23:41,905 epoch 6 - iter 890/894 - loss 0.03021347 - time (sec): 165.82 - samples/sec: 519.76 - lr: 0.000071 - momentum: 0.000000
2023-10-14 20:23:42,597 ----------------------------------------------------------------------------------------------------
2023-10-14 20:23:42,597 EPOCH 6 done: loss 0.0302 - lr: 0.000071
2023-10-14 20:24:07,543 DEV : loss 0.2202872484922409 - f1-score (micro avg) 0.7455
2023-10-14 20:24:07,569 saving best model
2023-10-14 20:24:11,367 ----------------------------------------------------------------------------------------------------
2023-10-14 20:24:29,869 epoch 7 - iter 89/894 - loss 0.02780046 - time (sec): 18.50 - samples/sec: 527.16 - lr: 0.000069 - momentum: 0.000000
2023-10-14 20:24:46,557 epoch 7 - iter 178/894 - loss 0.02662654 - time (sec): 35.19 - samples/sec: 523.71 - lr: 0.000068 - momentum: 0.000000
2023-10-14 20:25:03,009 epoch 7 - iter 267/894 - loss 0.03213013 - time (sec): 51.64 - samples/sec: 510.98 - lr: 0.000066 - momentum: 0.000000
2023-10-14 20:25:19,405 epoch 7 - iter 356/894 - loss 0.02646254 - time (sec): 68.03 - samples/sec: 511.03 - lr: 0.000064 - momentum: 0.000000
2023-10-14 20:25:36,170 epoch 7 - iter 445/894 - loss 0.02557549 - time (sec): 84.80 - samples/sec: 512.87 - lr: 0.000062 - momentum: 0.000000
2023-10-14 20:25:53,249 epoch 7 - iter 534/894 - loss 0.02364430 - time (sec): 101.88 - samples/sec: 514.82 - lr: 0.000061 - momentum: 0.000000
2023-10-14 20:26:09,746 epoch 7 - iter 623/894 - loss 0.02404203 - time (sec): 118.38 - samples/sec: 513.81 - lr: 0.000059 - momentum: 0.000000
2023-10-14 20:26:26,140 epoch 7 - iter 712/894 - loss 0.02297158 - time (sec): 134.77 - samples/sec: 514.00 - lr: 0.000057 - momentum: 0.000000
2023-10-14 20:26:42,777 epoch 7 - iter 801/894 - loss 0.02189655 - time (sec): 151.41 - samples/sec: 514.26 - lr: 0.000055 - momentum: 0.000000
2023-10-14 20:26:59,291 epoch 7 - iter 890/894 - loss 0.02080018 - time (sec): 167.92 - samples/sec: 513.25 - lr: 0.000053 - momentum: 0.000000
2023-10-14 20:27:00,016 ----------------------------------------------------------------------------------------------------
2023-10-14 20:27:00,016 EPOCH 7 done: loss 0.0208 - lr: 0.000053
2023-10-14 20:27:25,243 DEV : loss 0.24006003141403198 - f1-score (micro avg) 0.7596
2023-10-14 20:27:25,270 saving best model
2023-10-14 20:27:29,580 ----------------------------------------------------------------------------------------------------
2023-10-14 20:27:45,981 epoch 8 - iter 89/894 - loss 0.01819202 - time (sec): 16.40 - samples/sec: 505.35 - lr: 0.000052 - momentum: 0.000000
2023-10-14 20:28:02,806 epoch 8 - iter 178/894 - loss 0.02002471 - time (sec): 33.22 - samples/sec: 508.31 - lr: 0.000050 - momentum: 0.000000
2023-10-14 20:28:18,996 epoch 8 - iter 267/894 - loss 0.01725648 - time (sec): 49.41 - samples/sec: 504.36 - lr: 0.000048 - momentum: 0.000000
2023-10-14 20:28:36,157 epoch 8 - iter 356/894 - loss 0.01616480 - time (sec): 66.57 - samples/sec: 518.25 - lr: 0.000046 - momentum: 0.000000
2023-10-14 20:28:53,072 epoch 8 - iter 445/894 - loss 0.01677385 - time (sec): 83.49 - samples/sec: 523.17 - lr: 0.000045 - momentum: 0.000000
2023-10-14 20:29:09,355 epoch 8 - iter 534/894 - loss 0.01546777 - time (sec): 99.77 - samples/sec: 518.34 - lr: 0.000043 - momentum: 0.000000
2023-10-14 20:29:27,470 epoch 8 - iter 623/894 - loss 0.01582854 - time (sec): 117.89 - samples/sec: 514.01 - lr: 0.000041 - momentum: 0.000000
2023-10-14 20:29:44,191 epoch 8 - iter 712/894 - loss 0.01615447 - time (sec): 134.61 - samples/sec: 513.46 - lr: 0.000039 - momentum: 0.000000
2023-10-14 20:30:00,775 epoch 8 - iter 801/894 - loss 0.01593930 - time (sec): 151.19 - samples/sec: 511.91 - lr: 0.000038 - momentum: 0.000000
2023-10-14 20:30:17,616 epoch 8 - iter 890/894 - loss 0.01516070 - time (sec): 168.03 - samples/sec: 513.74 - lr: 0.000036 - momentum: 0.000000
2023-10-14 20:30:18,252 ----------------------------------------------------------------------------------------------------
2023-10-14 20:30:18,252 EPOCH 8 done: loss 0.0151 - lr: 0.000036
2023-10-14 20:30:43,109 DEV : loss 0.23652133345603943 - f1-score (micro avg) 0.7519
2023-10-14 20:30:43,135 ----------------------------------------------------------------------------------------------------
2023-10-14 20:31:01,634 epoch 9 - iter 89/894 - loss 0.01814129 - time (sec): 18.50 - samples/sec: 529.30 - lr: 0.000034 - momentum: 0.000000
2023-10-14 20:31:18,702 epoch 9 - iter 178/894 - loss 0.01179305 - time (sec): 35.57 - samples/sec: 533.30 - lr: 0.000032 - momentum: 0.000000
2023-10-14 20:31:35,620 epoch 9 - iter 267/894 - loss 0.01076027 - time (sec): 52.48 - samples/sec: 527.10 - lr: 0.000030 - momentum: 0.000000
2023-10-14 20:31:52,398 epoch 9 - iter 356/894 - loss 0.00961319 - time (sec): 69.26 - samples/sec: 528.25 - lr: 0.000029 - momentum: 0.000000
2023-10-14 20:32:08,429 epoch 9 - iter 445/894 - loss 0.01201137 - time (sec): 85.29 - samples/sec: 520.40 - lr: 0.000027 - momentum: 0.000000
2023-10-14 20:32:24,799 epoch 9 - iter 534/894 - loss 0.01096827 - time (sec): 101.66 - samples/sec: 517.67 - lr: 0.000025 - momentum: 0.000000
2023-10-14 20:32:41,012 epoch 9 - iter 623/894 - loss 0.01012014 - time (sec): 117.87 - samples/sec: 513.10 - lr: 0.000023 - momentum: 0.000000
2023-10-14 20:32:57,699 epoch 9 - iter 712/894 - loss 0.01039388 - time (sec): 134.56 - samples/sec: 513.70 - lr: 0.000022 - momentum: 0.000000
2023-10-14 20:33:14,257 epoch 9 - iter 801/894 - loss 0.01006521 - time (sec): 151.12 - samples/sec: 512.95 - lr: 0.000020 - momentum: 0.000000
2023-10-14 20:33:31,112 epoch 9 - iter 890/894 - loss 0.01011928 - time (sec): 167.98 - samples/sec: 513.55 - lr: 0.000018 - momentum: 0.000000
2023-10-14 20:33:31,781 ----------------------------------------------------------------------------------------------------
2023-10-14 20:33:31,781 EPOCH 9 done: loss 0.0101 - lr: 0.000018
2023-10-14 20:33:57,240 DEV : loss 0.25627970695495605 - f1-score (micro avg) 0.7519
2023-10-14 20:33:57,266 ----------------------------------------------------------------------------------------------------
2023-10-14 20:34:14,039 epoch 10 - iter 89/894 - loss 0.01200696 - time (sec): 16.77 - samples/sec: 526.50 - lr: 0.000016 - momentum: 0.000000
2023-10-14 20:34:30,197 epoch 10 - iter 178/894 - loss 0.00857499 - time (sec): 32.93 - samples/sec: 504.69 - lr: 0.000014 - momentum: 0.000000
2023-10-14 20:34:46,697 epoch 10 - iter 267/894 - loss 0.00703922 - time (sec): 49.43 - samples/sec: 506.28 - lr: 0.000013 - momentum: 0.000000
2023-10-14 20:35:04,040 epoch 10 - iter 356/894 - loss 0.00654044 - time (sec): 66.77 - samples/sec: 513.17 - lr: 0.000011 - momentum: 0.000000
2023-10-14 20:35:22,595 epoch 10 - iter 445/894 - loss 0.00710455 - time (sec): 85.33 - samples/sec: 517.56 - lr: 0.000009 - momentum: 0.000000
2023-10-14 20:35:39,208 epoch 10 - iter 534/894 - loss 0.00709956 - time (sec): 101.94 - samples/sec: 517.20 - lr: 0.000007 - momentum: 0.000000
2023-10-14 20:35:55,393 epoch 10 - iter 623/894 - loss 0.00690966 - time (sec): 118.13 - samples/sec: 512.29 - lr: 0.000006 - momentum: 0.000000
2023-10-14 20:36:11,258 epoch 10 - iter 712/894 - loss 0.00715382 - time (sec): 133.99 - samples/sec: 509.56 - lr: 0.000004 - momentum: 0.000000
2023-10-14 20:36:28,507 epoch 10 - iter 801/894 - loss 0.00646675 - time (sec): 151.24 - samples/sec: 513.66 - lr: 0.000002 - momentum: 0.000000
2023-10-14 20:36:44,895 epoch 10 - iter 890/894 - loss 0.00727492 - time (sec): 167.63 - samples/sec: 514.63 - lr: 0.000000 - momentum: 0.000000
2023-10-14 20:36:45,547 ----------------------------------------------------------------------------------------------------
2023-10-14 20:36:45,548 EPOCH 10 done: loss 0.0073 - lr: 0.000000
2023-10-14 20:37:10,711 DEV : loss 0.26143890619277954 - f1-score (micro avg) 0.75
2023-10-14 20:37:11,338 ----------------------------------------------------------------------------------------------------
2023-10-14 20:37:11,339 Loading model from best epoch ...
2023-10-14 20:37:13,530 SequenceTagger predicts: Dictionary with 21 tags: O, S-loc, B-loc, E-loc, I-loc, S-pers, B-pers, E-pers, I-pers, S-org, B-org, E-org, I-org, S-prod, B-prod, E-prod, I-prod, S-time, B-time, E-time, I-time
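For inference, the best checkpoint saved during this run can be reloaded with the standard Flair API; a brief usage sketch (the German example sentence is made up):

from flair.data import Sentence
from flair.models import SequenceTagger

# best-model.pt inside the training base path logged above
tagger = SequenceTagger.load(
    "hmbench-hipe2020/de-hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax-bs4-wsFalse-e10-lr0.00016-poolingfirst-layers-1-crfFalse-1/best-model.pt"
)

sentence = Sentence("Theodor Fontane wurde in Neuruppin geboren.")  # hypothetical input
tagger.predict(sentence)

for span in sentence.get_spans("ner"):
    print(span.text, span.get_label("ner").value, span.get_label("ner").score)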
2023-10-14 20:37:35,355
Results:
- F-score (micro) 0.759
- F-score (macro) 0.6755
- Accuracy 0.6254
By class:
              precision    recall  f1-score   support

         loc     0.8396    0.8607    0.8500       596
        pers     0.6815    0.7838    0.7291       333
         org     0.5397    0.5152    0.5271       132
        prod     0.6140    0.5303    0.5691        66
        time     0.7333    0.6735    0.7021        49

   micro avg     0.7447    0.7738    0.7590      1176
   macro avg     0.6816    0.6727    0.6755      1176
weighted avg     0.7441    0.7738    0.7576      1176
2023-10-14 20:37:35,355 ----------------------------------------------------------------------------------------------------
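The classification report above is Flair's final test-set evaluation of the best checkpoint. It could be reproduced offline with something along these lines (a sketch; the corpus-loader arguments carry the same assumptions as in the first code example, and the model path is abbreviated here):

from flair.datasets import NER_HIPE_2022
from flair.models import SequenceTagger

corpus = NER_HIPE_2022(dataset_name="hipe2020", language="de", add_document_separator=True)
tagger = SequenceTagger.load(".../best-model.pt")  # full training base path as logged above

result = tagger.evaluate(
    corpus.test,
    gold_label_type="ner",
    mini_batch_size=4,
)
print(result.detailed_results)  # per-class precision/recall/F1, as printed above
print(result.main_score)        # micro-average F1 (0.759 in this run)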