2023-10-11 23:01:47,070 ----------------------------------------------------------------------------------------------------
2023-10-11 23:01:47,072 Model: "SequenceTagger(
  (embeddings): ByT5Embeddings(
    (model): T5EncoderModel(
      (shared): Embedding(384, 1472)
      (encoder): T5Stack(
        (embed_tokens): Embedding(384, 1472)
        (block): ModuleList(
          (0): T5Block(
            (layer): ModuleList(
              (0): T5LayerSelfAttention(
                (SelfAttention): T5Attention(
                  (q): Linear(in_features=1472, out_features=384, bias=False)
                  (k): Linear(in_features=1472, out_features=384, bias=False)
                  (v): Linear(in_features=1472, out_features=384, bias=False)
                  (o): Linear(in_features=384, out_features=1472, bias=False)
                  (relative_attention_bias): Embedding(32, 6)
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (1): T5LayerFF(
                (DenseReluDense): T5DenseGatedActDense(
                  (wi_0): Linear(in_features=1472, out_features=3584, bias=False)
                  (wi_1): Linear(in_features=1472, out_features=3584, bias=False)
                  (wo): Linear(in_features=3584, out_features=1472, bias=False)
                  (dropout): Dropout(p=0.1, inplace=False)
                  (act): NewGELUActivation()
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
          )
          (1-11): 11 x T5Block(
            (layer): ModuleList(
              (0): T5LayerSelfAttention(
                (SelfAttention): T5Attention(
                  (q): Linear(in_features=1472, out_features=384, bias=False)
                  (k): Linear(in_features=1472, out_features=384, bias=False)
                  (v): Linear(in_features=1472, out_features=384, bias=False)
                  (o): Linear(in_features=384, out_features=1472, bias=False)
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (1): T5LayerFF(
                (DenseReluDense): T5DenseGatedActDense(
                  (wi_0): Linear(in_features=1472, out_features=3584, bias=False)
                  (wi_1): Linear(in_features=1472, out_features=3584, bias=False)
                  (wo): Linear(in_features=3584, out_features=1472, bias=False)
                  (dropout): Dropout(p=0.1, inplace=False)
                  (act): NewGELUActivation()
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
          )
        )
        (final_layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=1472, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-11 23:01:47,072 ----------------------------------------------------------------------------------------------------
2023-10-11 23:01:47,072 MultiCorpus: 5777 train + 722 dev + 723 test sentences
- NER_ICDAR_EUROPEANA Corpus: 5777 train + 722 dev + 723 test sentences - /root/.flair/datasets/ner_icdar_europeana/nl
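
A quick way to load and inspect this corpus with Flair (the constructor keyword is an assumption; the cache path above matches the NER_ICDAR_EUROPEANA loader):

    from flair.datasets import NER_ICDAR_EUROPEANA

    corpus = NER_ICDAR_EUROPEANA(language="nl")  # downloads to ~/.flair/datasets/ner_icdar_europeana/nl
    print(corpus)           # should report 5777 train + 722 dev + 723 test sentences
    print(corpus.train[0])  # first training sentence with its gold NER spans
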
2023-10-11 23:01:47,072 ----------------------------------------------------------------------------------------------------
2023-10-11 23:01:47,072 Train: 5777 sentences
2023-10-11 23:01:47,073 (train_with_dev=False, train_with_test=False)
2023-10-11 23:01:47,073 ----------------------------------------------------------------------------------------------------
2023-10-11 23:01:47,073 Training Params:
2023-10-11 23:01:47,073 - learning_rate: "0.00016"
2023-10-11 23:01:47,073 - mini_batch_size: "4"
2023-10-11 23:01:47,073 - max_epochs: "10"
2023-10-11 23:01:47,073 - shuffle: "True"
2023-10-11 23:01:47,073 ----------------------------------------------------------------------------------------------------
2023-10-11 23:01:47,073 Plugins:
2023-10-11 23:01:47,073 - TensorboardLogger
2023-10-11 23:01:47,073 - LinearScheduler | warmup_fraction: '0.1'
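
A hedged sketch of how these parameters map onto Flair's fine-tuning API (fine_tune applies a linear learning-rate schedule with warmup internally; the TensorBoard plugin and the exact warmup keyword are omitted because their names are not confirmed by this log):

    from flair.trainers import ModelTrainer

    trainer = ModelTrainer(tagger, corpus)  # tagger and corpus as in the sketches above
    trainer.fine_tune(
        "hmbench-icdar/nl-hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax-bs4-wsFalse-e10-lr0.00016-poolingfirst-layers-1-crfFalse-2",
        learning_rate=0.00016,
        mini_batch_size=4,
        max_epochs=10,
        shuffle=True,
    )
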
2023-10-11 23:01:47,073 ----------------------------------------------------------------------------------------------------
2023-10-11 23:01:47,073 Final evaluation on model from best epoch (best-model.pt)
2023-10-11 23:01:47,073 - metric: "('micro avg', 'f1-score')"
2023-10-11 23:01:47,073 ----------------------------------------------------------------------------------------------------
2023-10-11 23:01:47,073 Computation:
2023-10-11 23:01:47,074 - compute on device: cuda:0
2023-10-11 23:01:47,074 - embedding storage: none
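
The device and embedding-storage settings above correspond to the following; flair.device is Flair's module-level device switch, and "embedding storage: none" matches the embeddings_storage_mode="none" trainer argument:

    import torch
    import flair

    flair.device = torch.device("cuda:0")  # run the model on the first GPU
    # embeddings are not cached between batches, i.e. embeddings_storage_mode="none"
    # (passed to ModelTrainer.fine_tune / train; kwarg name assumed from the log wording)
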
2023-10-11 23:01:47,074 ----------------------------------------------------------------------------------------------------
2023-10-11 23:01:47,074 Model training base path: "hmbench-icdar/nl-hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax-bs4-wsFalse-e10-lr0.00016-poolingfirst-layers-1-crfFalse-2"
2023-10-11 23:01:47,074 ----------------------------------------------------------------------------------------------------
2023-10-11 23:01:47,074 ----------------------------------------------------------------------------------------------------
2023-10-11 23:01:47,074 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-11 23:02:29,577 epoch 1 - iter 144/1445 - loss 2.56504692 - time (sec): 42.50 - samples/sec: 437.44 - lr: 0.000016 - momentum: 0.000000
2023-10-11 23:03:12,798 epoch 1 - iter 288/1445 - loss 2.44150093 - time (sec): 85.72 - samples/sec: 416.17 - lr: 0.000032 - momentum: 0.000000
2023-10-11 23:03:56,536 epoch 1 - iter 432/1445 - loss 2.16436375 - time (sec): 129.46 - samples/sec: 416.45 - lr: 0.000048 - momentum: 0.000000
2023-10-11 23:04:38,360 epoch 1 - iter 576/1445 - loss 1.86790557 - time (sec): 171.28 - samples/sec: 416.44 - lr: 0.000064 - momentum: 0.000000
2023-10-11 23:05:20,176 epoch 1 - iter 720/1445 - loss 1.58401234 - time (sec): 213.10 - samples/sec: 418.18 - lr: 0.000080 - momentum: 0.000000
2023-10-11 23:06:04,468 epoch 1 - iter 864/1445 - loss 1.36702605 - time (sec): 257.39 - samples/sec: 414.93 - lr: 0.000096 - momentum: 0.000000
2023-10-11 23:06:47,420 epoch 1 - iter 1008/1445 - loss 1.20723711 - time (sec): 300.34 - samples/sec: 415.07 - lr: 0.000112 - momentum: 0.000000
2023-10-11 23:07:28,055 epoch 1 - iter 1152/1445 - loss 1.08380572 - time (sec): 340.98 - samples/sec: 415.54 - lr: 0.000127 - momentum: 0.000000
2023-10-11 23:08:10,527 epoch 1 - iter 1296/1445 - loss 0.98304740 - time (sec): 383.45 - samples/sec: 415.51 - lr: 0.000143 - momentum: 0.000000
2023-10-11 23:08:51,785 epoch 1 - iter 1440/1445 - loss 0.90575201 - time (sec): 424.71 - samples/sec: 413.89 - lr: 0.000159 - momentum: 0.000000
2023-10-11 23:08:53,040 ----------------------------------------------------------------------------------------------------
2023-10-11 23:08:53,041 EPOCH 1 done: loss 0.9041 - lr: 0.000159
2023-10-11 23:09:12,824 DEV : loss 0.19637209177017212 - f1-score (micro avg) 0.337
2023-10-11 23:09:12,864 saving best model
2023-10-11 23:09:13,889 ----------------------------------------------------------------------------------------------------
2023-10-11 23:09:55,884 epoch 2 - iter 144/1445 - loss 0.14820675 - time (sec): 41.99 - samples/sec: 405.54 - lr: 0.000158 - momentum: 0.000000
2023-10-11 23:10:39,422 epoch 2 - iter 288/1445 - loss 0.14776595 - time (sec): 85.53 - samples/sec: 405.66 - lr: 0.000156 - momentum: 0.000000
2023-10-11 23:11:20,957 epoch 2 - iter 432/1445 - loss 0.14142232 - time (sec): 127.07 - samples/sec: 417.19 - lr: 0.000155 - momentum: 0.000000
2023-10-11 23:12:04,561 epoch 2 - iter 576/1445 - loss 0.14046314 - time (sec): 170.67 - samples/sec: 416.71 - lr: 0.000153 - momentum: 0.000000
2023-10-11 23:12:47,276 epoch 2 - iter 720/1445 - loss 0.13346839 - time (sec): 213.39 - samples/sec: 415.77 - lr: 0.000151 - momentum: 0.000000
2023-10-11 23:13:29,863 epoch 2 - iter 864/1445 - loss 0.12999825 - time (sec): 255.97 - samples/sec: 413.52 - lr: 0.000149 - momentum: 0.000000
2023-10-11 23:14:11,807 epoch 2 - iter 1008/1445 - loss 0.12549631 - time (sec): 297.92 - samples/sec: 413.66 - lr: 0.000148 - momentum: 0.000000
2023-10-11 23:14:55,284 epoch 2 - iter 1152/1445 - loss 0.12210863 - time (sec): 341.39 - samples/sec: 412.48 - lr: 0.000146 - momentum: 0.000000
2023-10-11 23:15:44,328 epoch 2 - iter 1296/1445 - loss 0.11930857 - time (sec): 390.44 - samples/sec: 403.41 - lr: 0.000144 - momentum: 0.000000
2023-10-11 23:16:29,752 epoch 2 - iter 1440/1445 - loss 0.11702306 - time (sec): 435.86 - samples/sec: 403.17 - lr: 0.000142 - momentum: 0.000000
2023-10-11 23:16:31,116 ----------------------------------------------------------------------------------------------------
2023-10-11 23:16:31,117 EPOCH 2 done: loss 0.1170 - lr: 0.000142
2023-10-11 23:16:52,882 DEV : loss 0.09597545862197876 - f1-score (micro avg) 0.7901
2023-10-11 23:16:52,914 saving best model
2023-10-11 23:16:56,865 ----------------------------------------------------------------------------------------------------
2023-10-11 23:17:42,241 epoch 3 - iter 144/1445 - loss 0.08787870 - time (sec): 45.37 - samples/sec: 400.96 - lr: 0.000140 - momentum: 0.000000
2023-10-11 23:18:24,660 epoch 3 - iter 288/1445 - loss 0.07849694 - time (sec): 87.79 - samples/sec: 402.17 - lr: 0.000139 - momentum: 0.000000
2023-10-11 23:19:10,100 epoch 3 - iter 432/1445 - loss 0.07359975 - time (sec): 133.23 - samples/sec: 393.96 - lr: 0.000137 - momentum: 0.000000
2023-10-11 23:19:54,534 epoch 3 - iter 576/1445 - loss 0.07148524 - time (sec): 177.66 - samples/sec: 390.16 - lr: 0.000135 - momentum: 0.000000
2023-10-11 23:20:36,239 epoch 3 - iter 720/1445 - loss 0.06904176 - time (sec): 219.37 - samples/sec: 395.42 - lr: 0.000133 - momentum: 0.000000
2023-10-11 23:21:20,175 epoch 3 - iter 864/1445 - loss 0.07225162 - time (sec): 263.30 - samples/sec: 401.89 - lr: 0.000132 - momentum: 0.000000
2023-10-11 23:22:03,866 epoch 3 - iter 1008/1445 - loss 0.07242454 - time (sec): 306.99 - samples/sec: 400.44 - lr: 0.000130 - momentum: 0.000000
2023-10-11 23:22:51,028 epoch 3 - iter 1152/1445 - loss 0.07061901 - time (sec): 354.16 - samples/sec: 395.86 - lr: 0.000128 - momentum: 0.000000
2023-10-11 23:23:31,796 epoch 3 - iter 1296/1445 - loss 0.06930577 - time (sec): 394.92 - samples/sec: 397.98 - lr: 0.000126 - momentum: 0.000000
2023-10-11 23:24:17,693 epoch 3 - iter 1440/1445 - loss 0.06817481 - time (sec): 440.82 - samples/sec: 398.56 - lr: 0.000125 - momentum: 0.000000
2023-10-11 23:24:18,983 ----------------------------------------------------------------------------------------------------
2023-10-11 23:24:18,984 EPOCH 3 done: loss 0.0681 - lr: 0.000125
2023-10-11 23:24:41,500 DEV : loss 0.07571297883987427 - f1-score (micro avg) 0.8424
2023-10-11 23:24:41,540 saving best model
2023-10-11 23:24:44,350 ----------------------------------------------------------------------------------------------------
2023-10-11 23:25:28,755 epoch 4 - iter 144/1445 - loss 0.05317648 - time (sec): 44.40 - samples/sec: 394.61 - lr: 0.000123 - momentum: 0.000000
2023-10-11 23:26:14,399 epoch 4 - iter 288/1445 - loss 0.04776722 - time (sec): 90.05 - samples/sec: 394.64 - lr: 0.000121 - momentum: 0.000000
2023-10-11 23:26:59,460 epoch 4 - iter 432/1445 - loss 0.04761497 - time (sec): 135.11 - samples/sec: 390.73 - lr: 0.000119 - momentum: 0.000000
2023-10-11 23:27:43,444 epoch 4 - iter 576/1445 - loss 0.04723012 - time (sec): 179.09 - samples/sec: 392.28 - lr: 0.000117 - momentum: 0.000000
2023-10-11 23:28:25,872 epoch 4 - iter 720/1445 - loss 0.05074390 - time (sec): 221.52 - samples/sec: 390.64 - lr: 0.000116 - momentum: 0.000000
2023-10-11 23:29:09,101 epoch 4 - iter 864/1445 - loss 0.04975381 - time (sec): 264.75 - samples/sec: 395.49 - lr: 0.000114 - momentum: 0.000000
2023-10-11 23:29:52,115 epoch 4 - iter 1008/1445 - loss 0.04920143 - time (sec): 307.76 - samples/sec: 399.94 - lr: 0.000112 - momentum: 0.000000
2023-10-11 23:30:39,362 epoch 4 - iter 1152/1445 - loss 0.04742537 - time (sec): 355.01 - samples/sec: 396.13 - lr: 0.000110 - momentum: 0.000000
2023-10-11 23:31:28,500 epoch 4 - iter 1296/1445 - loss 0.04477836 - time (sec): 404.15 - samples/sec: 395.10 - lr: 0.000109 - momentum: 0.000000
2023-10-11 23:32:10,884 epoch 4 - iter 1440/1445 - loss 0.04528448 - time (sec): 446.53 - samples/sec: 393.84 - lr: 0.000107 - momentum: 0.000000
2023-10-11 23:32:12,067 ----------------------------------------------------------------------------------------------------
2023-10-11 23:32:12,068 EPOCH 4 done: loss 0.0452 - lr: 0.000107
2023-10-11 23:32:32,644 DEV : loss 0.07717934250831604 - f1-score (micro avg) 0.8654
2023-10-11 23:32:32,675 saving best model
2023-10-11 23:32:35,406 ----------------------------------------------------------------------------------------------------
2023-10-11 23:33:19,520 epoch 5 - iter 144/1445 - loss 0.03679728 - time (sec): 44.11 - samples/sec: 400.38 - lr: 0.000105 - momentum: 0.000000
2023-10-11 23:34:05,960 epoch 5 - iter 288/1445 - loss 0.03243108 - time (sec): 90.55 - samples/sec: 389.52 - lr: 0.000103 - momentum: 0.000000
2023-10-11 23:34:49,836 epoch 5 - iter 432/1445 - loss 0.03251890 - time (sec): 134.43 - samples/sec: 397.93 - lr: 0.000101 - momentum: 0.000000
2023-10-11 23:35:33,322 epoch 5 - iter 576/1445 - loss 0.03153095 - time (sec): 177.91 - samples/sec: 397.85 - lr: 0.000100 - momentum: 0.000000
2023-10-11 23:36:19,573 epoch 5 - iter 720/1445 - loss 0.03166998 - time (sec): 224.16 - samples/sec: 399.23 - lr: 0.000098 - momentum: 0.000000
2023-10-11 23:37:04,869 epoch 5 - iter 864/1445 - loss 0.03099000 - time (sec): 269.46 - samples/sec: 392.76 - lr: 0.000096 - momentum: 0.000000
2023-10-11 23:37:50,323 epoch 5 - iter 1008/1445 - loss 0.03399168 - time (sec): 314.91 - samples/sec: 392.74 - lr: 0.000094 - momentum: 0.000000
2023-10-11 23:38:33,420 epoch 5 - iter 1152/1445 - loss 0.03356007 - time (sec): 358.01 - samples/sec: 393.99 - lr: 0.000093 - momentum: 0.000000
2023-10-11 23:39:18,483 epoch 5 - iter 1296/1445 - loss 0.03705131 - time (sec): 403.07 - samples/sec: 393.32 - lr: 0.000091 - momentum: 0.000000
2023-10-11 23:40:01,118 epoch 5 - iter 1440/1445 - loss 0.03691198 - time (sec): 445.71 - samples/sec: 393.96 - lr: 0.000089 - momentum: 0.000000
2023-10-11 23:40:02,393 ----------------------------------------------------------------------------------------------------
2023-10-11 23:40:02,394 EPOCH 5 done: loss 0.0368 - lr: 0.000089
2023-10-11 23:40:23,145 DEV : loss 0.10290254652500153 - f1-score (micro avg) 0.8485
2023-10-11 23:40:23,176 ----------------------------------------------------------------------------------------------------
2023-10-11 23:41:06,055 epoch 6 - iter 144/1445 - loss 0.02233664 - time (sec): 42.88 - samples/sec: 433.91 - lr: 0.000087 - momentum: 0.000000
2023-10-11 23:41:49,040 epoch 6 - iter 288/1445 - loss 0.01799245 - time (sec): 85.86 - samples/sec: 429.14 - lr: 0.000085 - momentum: 0.000000
2023-10-11 23:42:30,919 epoch 6 - iter 432/1445 - loss 0.02283230 - time (sec): 127.74 - samples/sec: 415.18 - lr: 0.000084 - momentum: 0.000000
2023-10-11 23:43:15,917 epoch 6 - iter 576/1445 - loss 0.02348982 - time (sec): 172.74 - samples/sec: 408.45 - lr: 0.000082 - momentum: 0.000000
2023-10-11 23:44:02,655 epoch 6 - iter 720/1445 - loss 0.02282929 - time (sec): 219.48 - samples/sec: 395.58 - lr: 0.000080 - momentum: 0.000000
2023-10-11 23:44:45,177 epoch 6 - iter 864/1445 - loss 0.02242678 - time (sec): 262.00 - samples/sec: 400.20 - lr: 0.000078 - momentum: 0.000000
2023-10-11 23:45:28,671 epoch 6 - iter 1008/1445 - loss 0.02528479 - time (sec): 305.49 - samples/sec: 404.33 - lr: 0.000076 - momentum: 0.000000
2023-10-11 23:46:11,961 epoch 6 - iter 1152/1445 - loss 0.02440097 - time (sec): 348.78 - samples/sec: 405.32 - lr: 0.000075 - momentum: 0.000000
2023-10-11 23:46:55,331 epoch 6 - iter 1296/1445 - loss 0.02461694 - time (sec): 392.15 - samples/sec: 402.25 - lr: 0.000073 - momentum: 0.000000
2023-10-11 23:47:40,310 epoch 6 - iter 1440/1445 - loss 0.02524712 - time (sec): 437.13 - samples/sec: 401.24 - lr: 0.000071 - momentum: 0.000000
2023-10-11 23:47:42,045 ----------------------------------------------------------------------------------------------------
2023-10-11 23:47:42,045 EPOCH 6 done: loss 0.0253 - lr: 0.000071
2023-10-11 23:48:04,506 DEV : loss 0.11330018192529678 - f1-score (micro avg) 0.8482
2023-10-11 23:48:04,540 ----------------------------------------------------------------------------------------------------
2023-10-11 23:48:52,986 epoch 7 - iter 144/1445 - loss 0.01961613 - time (sec): 48.44 - samples/sec: 366.64 - lr: 0.000069 - momentum: 0.000000
2023-10-11 23:49:39,670 epoch 7 - iter 288/1445 - loss 0.01737598 - time (sec): 95.13 - samples/sec: 359.72 - lr: 0.000068 - momentum: 0.000000
2023-10-11 23:50:21,081 epoch 7 - iter 432/1445 - loss 0.01519697 - time (sec): 136.54 - samples/sec: 369.81 - lr: 0.000066 - momentum: 0.000000
2023-10-11 23:51:06,573 epoch 7 - iter 576/1445 - loss 0.01797181 - time (sec): 182.03 - samples/sec: 383.46 - lr: 0.000064 - momentum: 0.000000
2023-10-11 23:51:55,719 epoch 7 - iter 720/1445 - loss 0.02192375 - time (sec): 231.18 - samples/sec: 379.57 - lr: 0.000062 - momentum: 0.000000
2023-10-11 23:52:42,777 epoch 7 - iter 864/1445 - loss 0.02114113 - time (sec): 278.23 - samples/sec: 380.61 - lr: 0.000060 - momentum: 0.000000
2023-10-11 23:53:29,128 epoch 7 - iter 1008/1445 - loss 0.01941452 - time (sec): 324.58 - samples/sec: 378.01 - lr: 0.000059 - momentum: 0.000000
2023-10-11 23:54:14,079 epoch 7 - iter 1152/1445 - loss 0.01868922 - time (sec): 369.54 - samples/sec: 380.56 - lr: 0.000057 - momentum: 0.000000
2023-10-11 23:54:59,710 epoch 7 - iter 1296/1445 - loss 0.01897559 - time (sec): 415.17 - samples/sec: 381.19 - lr: 0.000055 - momentum: 0.000000
2023-10-11 23:55:42,976 epoch 7 - iter 1440/1445 - loss 0.01837138 - time (sec): 458.43 - samples/sec: 382.79 - lr: 0.000053 - momentum: 0.000000
2023-10-11 23:55:44,456 ----------------------------------------------------------------------------------------------------
2023-10-11 23:55:44,456 EPOCH 7 done: loss 0.0183 - lr: 0.000053
2023-10-11 23:56:05,378 DEV : loss 0.1252562254667282 - f1-score (micro avg) 0.8553
2023-10-11 23:56:05,414 ----------------------------------------------------------------------------------------------------
2023-10-11 23:56:52,075 epoch 8 - iter 144/1445 - loss 0.00746075 - time (sec): 46.66 - samples/sec: 384.79 - lr: 0.000052 - momentum: 0.000000
2023-10-11 23:57:33,156 epoch 8 - iter 288/1445 - loss 0.01120551 - time (sec): 87.74 - samples/sec: 389.64 - lr: 0.000050 - momentum: 0.000000
2023-10-11 23:58:14,451 epoch 8 - iter 432/1445 - loss 0.01190775 - time (sec): 129.03 - samples/sec: 393.45 - lr: 0.000048 - momentum: 0.000000
2023-10-11 23:58:56,599 epoch 8 - iter 576/1445 - loss 0.01075084 - time (sec): 171.18 - samples/sec: 396.20 - lr: 0.000046 - momentum: 0.000000
2023-10-11 23:59:40,000 epoch 8 - iter 720/1445 - loss 0.01066676 - time (sec): 214.58 - samples/sec: 398.15 - lr: 0.000044 - momentum: 0.000000
2023-10-12 00:00:24,020 epoch 8 - iter 864/1445 - loss 0.01162819 - time (sec): 258.60 - samples/sec: 401.54 - lr: 0.000043 - momentum: 0.000000
2023-10-12 00:01:08,431 epoch 8 - iter 1008/1445 - loss 0.01257430 - time (sec): 303.01 - samples/sec: 404.82 - lr: 0.000041 - momentum: 0.000000
2023-10-12 00:01:51,744 epoch 8 - iter 1152/1445 - loss 0.01300952 - time (sec): 346.33 - samples/sec: 403.84 - lr: 0.000039 - momentum: 0.000000
2023-10-12 00:02:35,889 epoch 8 - iter 1296/1445 - loss 0.01290861 - time (sec): 390.47 - samples/sec: 405.12 - lr: 0.000037 - momentum: 0.000000
2023-10-12 00:03:22,108 epoch 8 - iter 1440/1445 - loss 0.01355414 - time (sec): 436.69 - samples/sec: 402.32 - lr: 0.000036 - momentum: 0.000000
2023-10-12 00:03:23,511 ----------------------------------------------------------------------------------------------------
2023-10-12 00:03:23,511 EPOCH 8 done: loss 0.0135 - lr: 0.000036
2023-10-12 00:03:47,026 DEV : loss 0.15301184356212616 - f1-score (micro avg) 0.8507
2023-10-12 00:03:47,058 ----------------------------------------------------------------------------------------------------
2023-10-12 00:04:31,022 epoch 9 - iter 144/1445 - loss 0.01407469 - time (sec): 43.96 - samples/sec: 390.32 - lr: 0.000034 - momentum: 0.000000
2023-10-12 00:05:14,446 epoch 9 - iter 288/1445 - loss 0.01237596 - time (sec): 87.39 - samples/sec: 389.46 - lr: 0.000032 - momentum: 0.000000
2023-10-12 00:05:56,362 epoch 9 - iter 432/1445 - loss 0.01114705 - time (sec): 129.30 - samples/sec: 395.67 - lr: 0.000030 - momentum: 0.000000
2023-10-12 00:06:39,472 epoch 9 - iter 576/1445 - loss 0.01148971 - time (sec): 172.41 - samples/sec: 404.79 - lr: 0.000028 - momentum: 0.000000
2023-10-12 00:07:23,878 epoch 9 - iter 720/1445 - loss 0.01198347 - time (sec): 216.82 - samples/sec: 408.58 - lr: 0.000027 - momentum: 0.000000
2023-10-12 00:08:07,659 epoch 9 - iter 864/1445 - loss 0.01175221 - time (sec): 260.60 - samples/sec: 407.94 - lr: 0.000025 - momentum: 0.000000
2023-10-12 00:08:51,892 epoch 9 - iter 1008/1445 - loss 0.01179629 - time (sec): 304.83 - samples/sec: 407.14 - lr: 0.000023 - momentum: 0.000000
2023-10-12 00:09:41,030 epoch 9 - iter 1152/1445 - loss 0.01124818 - time (sec): 353.97 - samples/sec: 400.00 - lr: 0.000021 - momentum: 0.000000
2023-10-12 00:10:25,562 epoch 9 - iter 1296/1445 - loss 0.01114700 - time (sec): 398.50 - samples/sec: 397.74 - lr: 0.000020 - momentum: 0.000000
2023-10-12 00:11:09,740 epoch 9 - iter 1440/1445 - loss 0.01070460 - time (sec): 442.68 - samples/sec: 396.70 - lr: 0.000018 - momentum: 0.000000
2023-10-12 00:11:11,213 ----------------------------------------------------------------------------------------------------
2023-10-12 00:11:11,214 EPOCH 9 done: loss 0.0107 - lr: 0.000018
2023-10-12 00:11:32,499 DEV : loss 0.14755892753601074 - f1-score (micro avg) 0.8556
2023-10-12 00:11:32,530 ----------------------------------------------------------------------------------------------------
2023-10-12 00:12:16,419 epoch 10 - iter 144/1445 - loss 0.00621849 - time (sec): 43.89 - samples/sec: 410.35 - lr: 0.000016 - momentum: 0.000000
2023-10-12 00:13:00,915 epoch 10 - iter 288/1445 - loss 0.00813280 - time (sec): 88.38 - samples/sec: 410.23 - lr: 0.000014 - momentum: 0.000000
2023-10-12 00:13:46,203 epoch 10 - iter 432/1445 - loss 0.00928314 - time (sec): 133.67 - samples/sec: 414.42 - lr: 0.000012 - momentum: 0.000000
2023-10-12 00:14:30,616 epoch 10 - iter 576/1445 - loss 0.00802405 - time (sec): 178.08 - samples/sec: 409.80 - lr: 0.000011 - momentum: 0.000000
2023-10-12 00:15:14,826 epoch 10 - iter 720/1445 - loss 0.00735469 - time (sec): 222.29 - samples/sec: 409.84 - lr: 0.000009 - momentum: 0.000000
2023-10-12 00:15:58,215 epoch 10 - iter 864/1445 - loss 0.00740270 - time (sec): 265.68 - samples/sec: 404.22 - lr: 0.000007 - momentum: 0.000000
2023-10-12 00:16:46,891 epoch 10 - iter 1008/1445 - loss 0.00785369 - time (sec): 314.36 - samples/sec: 395.84 - lr: 0.000005 - momentum: 0.000000
2023-10-12 00:17:31,973 epoch 10 - iter 1152/1445 - loss 0.00750830 - time (sec): 359.44 - samples/sec: 394.28 - lr: 0.000004 - momentum: 0.000000
2023-10-12 00:18:16,241 epoch 10 - iter 1296/1445 - loss 0.00743922 - time (sec): 403.71 - samples/sec: 393.59 - lr: 0.000002 - momentum: 0.000000
2023-10-12 00:19:00,020 epoch 10 - iter 1440/1445 - loss 0.00777974 - time (sec): 447.49 - samples/sec: 392.69 - lr: 0.000000 - momentum: 0.000000
2023-10-12 00:19:01,330 ----------------------------------------------------------------------------------------------------
2023-10-12 00:19:01,331 EPOCH 10 done: loss 0.0078 - lr: 0.000000
2023-10-12 00:19:23,771 DEV : loss 0.15156520903110504 - f1-score (micro avg) 0.8564
2023-10-12 00:19:24,773 ----------------------------------------------------------------------------------------------------
2023-10-12 00:19:24,775 Loading model from best epoch ...
2023-10-12 00:19:28,666 SequenceTagger predicts: Dictionary with 13 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG
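
To reuse the saved best model for inference, something along these lines should work (the path and the example sentence are illustrative only):

    from flair.data import Sentence
    from flair.models import SequenceTagger

    # best-model.pt lives under the training base path shown above
    tagger = SequenceTagger.load("best-model.pt")

    sentence = Sentence("Rembrandt van Rijn werd geboren te Leiden .")  # made-up Dutch example
    tagger.predict(sentence)
    for entity in sentence.get_spans("ner"):
        print(entity)  # predicted PER / LOC / ORG spans
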
2023-10-12 00:19:50,872
Results:
- F-score (micro) 0.8487
- F-score (macro) 0.7431
- Accuracy 0.7507
By class:
              precision    recall  f1-score   support

         PER     0.8300    0.8714    0.8502       482
         LOC     0.8894    0.8952    0.8923       458
         ORG     0.6087    0.4058    0.4870        69

   micro avg     0.8470    0.8503    0.8487      1009
   macro avg     0.7760    0.7241    0.7431      1009
weighted avg     0.8418    0.8503    0.8445      1009
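
The report above can be reproduced approximately with Flair's evaluate call; a minimal sketch, assuming the same corpus loader and model path as in the earlier snippets:

    from flair.datasets import NER_ICDAR_EUROPEANA
    from flair.models import SequenceTagger

    corpus = NER_ICDAR_EUROPEANA(language="nl")
    tagger = SequenceTagger.load("best-model.pt")

    result = tagger.evaluate(corpus.test, gold_label_type="ner", mini_batch_size=4)
    print(result.detailed_results)  # per-class precision / recall / F1, as in the table above
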
2023-10-12 00:19:50,872 ----------------------------------------------------------------------------------------------------