2023-10-12 21:23:45,902 ----------------------------------------------------------------------------------------------------
2023-10-12 21:23:45,904 Model: "SequenceTagger(
  (embeddings): ByT5Embeddings(
    (model): T5EncoderModel(
      (shared): Embedding(384, 1472)
      (encoder): T5Stack(
        (embed_tokens): Embedding(384, 1472)
        (block): ModuleList(
          (0): T5Block(
            (layer): ModuleList(
              (0): T5LayerSelfAttention(
                (SelfAttention): T5Attention(
                  (q): Linear(in_features=1472, out_features=384, bias=False)
                  (k): Linear(in_features=1472, out_features=384, bias=False)
                  (v): Linear(in_features=1472, out_features=384, bias=False)
                  (o): Linear(in_features=384, out_features=1472, bias=False)
                  (relative_attention_bias): Embedding(32, 6)
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (1): T5LayerFF(
                (DenseReluDense): T5DenseGatedActDense(
                  (wi_0): Linear(in_features=1472, out_features=3584, bias=False)
                  (wi_1): Linear(in_features=1472, out_features=3584, bias=False)
                  (wo): Linear(in_features=3584, out_features=1472, bias=False)
                  (dropout): Dropout(p=0.1, inplace=False)
                  (act): NewGELUActivation()
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
          )
          (1-11): 11 x T5Block(
            (layer): ModuleList(
              (0): T5LayerSelfAttention(
                (SelfAttention): T5Attention(
                  (q): Linear(in_features=1472, out_features=384, bias=False)
                  (k): Linear(in_features=1472, out_features=384, bias=False)
                  (v): Linear(in_features=1472, out_features=384, bias=False)
                  (o): Linear(in_features=384, out_features=1472, bias=False)
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (1): T5LayerFF(
                (DenseReluDense): T5DenseGatedActDense(
                  (wi_0): Linear(in_features=1472, out_features=3584, bias=False)
                  (wi_1): Linear(in_features=1472, out_features=3584, bias=False)
                  (wo): Linear(in_features=3584, out_features=1472, bias=False)
                  (dropout): Dropout(p=0.1, inplace=False)
                  (act): NewGELUActivation()
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
          )
        )
        (final_layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=1472, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-12 21:23:45,904 ----------------------------------------------------------------------------------------------------
2023-10-12 21:23:45,904 MultiCorpus: 5777 train + 722 dev + 723 test sentences
- NER_ICDAR_EUROPEANA Corpus: 5777 train + 722 dev + 723 test sentences - /root/.flair/datasets/ner_icdar_europeana/nl
2023-10-12 21:23:45,904 ----------------------------------------------------------------------------------------------------
2023-10-12 21:23:45,904 Train: 5777 sentences
2023-10-12 21:23:45,905 (train_with_dev=False, train_with_test=False)
2023-10-12 21:23:45,905 ----------------------------------------------------------------------------------------------------
2023-10-12 21:23:45,905 Training Params:
2023-10-12 21:23:45,905 - learning_rate: "0.00016"
2023-10-12 21:23:45,905 - mini_batch_size: "4"
2023-10-12 21:23:45,905 - max_epochs: "10"
2023-10-12 21:23:45,905 - shuffle: "True"
2023-10-12 21:23:45,905 ----------------------------------------------------------------------------------------------------
2023-10-12 21:23:45,905 Plugins:
2023-10-12 21:23:45,905 - TensorboardLogger
2023-10-12 21:23:45,905 - LinearScheduler | warmup_fraction: '0.1'
2023-10-12 21:23:45,905 ----------------------------------------------------------------------------------------------------
2023-10-12 21:23:45,905 Final evaluation on model from best epoch (best-model.pt)
2023-10-12 21:23:45,905 - metric: "('micro avg', 'f1-score')"
2023-10-12 21:23:45,905 ----------------------------------------------------------------------------------------------------
2023-10-12 21:23:45,906 Computation:
2023-10-12 21:23:45,906 - compute on device: cuda:0
2023-10-12 21:23:45,906 - embedding storage: none
2023-10-12 21:23:45,906 ----------------------------------------------------------------------------------------------------
2023-10-12 21:23:45,906 Model training base path: "hmbench-icdar/nl-hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax-bs4-wsFalse-e10-lr0.00016-poolingfirst-layers-1-crfFalse-5"
2023-10-12 21:23:45,906 ----------------------------------------------------------------------------------------------------
2023-10-12 21:23:45,906 ----------------------------------------------------------------------------------------------------
2023-10-12 21:23:45,906 Logging anything other than scalars to TensorBoard is currently not supported.
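Note: the setup logged above (corpus, backbone, head, and hyperparameters) can be read off the model printout and the base path "...hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax-bs4-wsFalse-e10-lr0.00016-poolingfirst-layers-1-crfFalse-5". The following is a minimal, untested sketch of an equivalent Flair fine-tuning script; the actual launcher is not part of this log, so everything beyond the logged values is an assumption.

# Sketch only: reconstructs the logged configuration, not the original training script.
from flair.datasets import NER_ICDAR_EUROPEANA
from flair.embeddings import TransformerWordEmbeddings
from flair.models import SequenceTagger
from flair.trainers import ModelTrainer

# Dutch ICDAR-Europeana NER corpus (5777 train / 722 dev / 723 test sentences, see above).
corpus = NER_ICDAR_EUROPEANA(language="nl")
label_dict = corpus.make_label_dictionary(label_type="ner")

# ByT5 historic multilingual backbone, last layer only, first-subtoken pooling
# (read off the base path: "...poolingfirst-layers-1-crfFalse...").
embeddings = TransformerWordEmbeddings(
    model="hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax",
    layers="-1",
    subtoken_pooling="first",
    fine_tune=True,
)

# Plain linear head on the 1472-dim embeddings (no CRF, no RNN),
# matching the printed SequenceTagger (linear: 1472 -> 13 tags).
tagger = SequenceTagger(
    hidden_size=256,  # unused when use_rnn=False
    embeddings=embeddings,
    tag_dictionary=label_dict,
    tag_type="ner",
    use_crf=False,
    use_rnn=False,
    reproject_embeddings=False,
)

trainer = ModelTrainer(tagger, corpus)
trainer.fine_tune(
    "hmbench-icdar/nl-hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax-bs4-wsFalse-e10-lr0.00016-poolingfirst-layers-1-crfFalse-5",
    learning_rate=0.00016,
    mini_batch_size=4,
    max_epochs=10,
)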
2023-10-12 21:24:27,641 epoch 1 - iter 144/1445 - loss 2.53329630 - time (sec): 41.73 - samples/sec: 432.73 - lr: 0.000016 - momentum: 0.000000
2023-10-12 21:25:09,203 epoch 1 - iter 288/1445 - loss 2.36537339 - time (sec): 83.29 - samples/sec: 433.39 - lr: 0.000032 - momentum: 0.000000
2023-10-12 21:25:49,786 epoch 1 - iter 432/1445 - loss 2.11838959 - time (sec): 123.88 - samples/sec: 422.80 - lr: 0.000048 - momentum: 0.000000
2023-10-12 21:26:31,250 epoch 1 - iter 576/1445 - loss 1.81958359 - time (sec): 165.34 - samples/sec: 423.75 - lr: 0.000064 - momentum: 0.000000
2023-10-12 21:27:12,928 epoch 1 - iter 720/1445 - loss 1.54330067 - time (sec): 207.02 - samples/sec: 424.24 - lr: 0.000080 - momentum: 0.000000
2023-10-12 21:27:54,484 epoch 1 - iter 864/1445 - loss 1.33675258 - time (sec): 248.58 - samples/sec: 421.19 - lr: 0.000096 - momentum: 0.000000
2023-10-12 21:28:36,755 epoch 1 - iter 1008/1445 - loss 1.17087202 - time (sec): 290.85 - samples/sec: 420.93 - lr: 0.000112 - momentum: 0.000000
2023-10-12 21:29:17,487 epoch 1 - iter 1152/1445 - loss 1.05166695 - time (sec): 331.58 - samples/sec: 419.84 - lr: 0.000127 - momentum: 0.000000
2023-10-12 21:29:59,649 epoch 1 - iter 1296/1445 - loss 0.95132545 - time (sec): 373.74 - samples/sec: 419.88 - lr: 0.000143 - momentum: 0.000000
2023-10-12 21:30:42,244 epoch 1 - iter 1440/1445 - loss 0.86538888 - time (sec): 416.34 - samples/sec: 421.44 - lr: 0.000159 - momentum: 0.000000
2023-10-12 21:30:43,683 ----------------------------------------------------------------------------------------------------
2023-10-12 21:30:43,684 EPOCH 1 done: loss 0.8621 - lr: 0.000159
2023-10-12 21:31:04,241 DEV : loss 0.1847972571849823 - f1-score (micro avg) 0.3705
2023-10-12 21:31:04,273 saving best model
2023-10-12 21:31:05,195 ----------------------------------------------------------------------------------------------------
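The lr column above follows the LinearScheduler announced in the Plugins block: with warmup_fraction 0.1 and 10 epochs of 1445 mini-batches (14450 steps), the first 1445 steps ramp linearly from 0 to the peak of 0.00016 (hence ~0.000159 at iter 1440 of epoch 1), after which the rate decays linearly to 0 by the end of epoch 10. A small sketch of that schedule, assuming plain linear warmup and decay:

# Illustrative only: a linear warmup/decay rule consistent with the logged lr values.
def linear_warmup_decay_lr(step: int, peak_lr: float = 0.00016,
                           total_steps: int = 10 * 1445,
                           warmup_fraction: float = 0.1) -> float:
    warmup_steps = int(total_steps * warmup_fraction)  # 1445 steps = epoch 1
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)

# e.g. step 144        -> ~0.000016 (epoch 1, iter 144)
#      step 1440       -> ~0.000159 (epoch 1, iter 1440)
#      step 1445 + 144 -> ~0.000158 (epoch 2, iter 144)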
2023-10-12 21:31:48,840 epoch 2 - iter 144/1445 - loss 0.13150717 - time (sec): 43.64 - samples/sec: 399.70 - lr: 0.000158 - momentum: 0.000000
2023-10-12 21:32:32,839 epoch 2 - iter 288/1445 - loss 0.12547482 - time (sec): 87.64 - samples/sec: 404.66 - lr: 0.000156 - momentum: 0.000000
2023-10-12 21:33:15,269 epoch 2 - iter 432/1445 - loss 0.12341173 - time (sec): 130.07 - samples/sec: 399.66 - lr: 0.000155 - momentum: 0.000000
2023-10-12 21:33:58,085 epoch 2 - iter 576/1445 - loss 0.12210825 - time (sec): 172.89 - samples/sec: 403.62 - lr: 0.000153 - momentum: 0.000000
2023-10-12 21:34:41,150 epoch 2 - iter 720/1445 - loss 0.11883011 - time (sec): 215.95 - samples/sec: 403.98 - lr: 0.000151 - momentum: 0.000000
2023-10-12 21:35:27,590 epoch 2 - iter 864/1445 - loss 0.11676973 - time (sec): 262.39 - samples/sec: 399.41 - lr: 0.000149 - momentum: 0.000000
2023-10-12 21:36:12,186 epoch 2 - iter 1008/1445 - loss 0.11778461 - time (sec): 306.99 - samples/sec: 397.72 - lr: 0.000148 - momentum: 0.000000
2023-10-12 21:36:53,431 epoch 2 - iter 1152/1445 - loss 0.11553749 - time (sec): 348.23 - samples/sec: 401.78 - lr: 0.000146 - momentum: 0.000000
2023-10-12 21:37:33,796 epoch 2 - iter 1296/1445 - loss 0.11346210 - time (sec): 388.60 - samples/sec: 407.08 - lr: 0.000144 - momentum: 0.000000
2023-10-12 21:38:13,915 epoch 2 - iter 1440/1445 - loss 0.11050610 - time (sec): 428.72 - samples/sec: 409.60 - lr: 0.000142 - momentum: 0.000000
2023-10-12 21:38:15,240 ----------------------------------------------------------------------------------------------------
2023-10-12 21:38:15,241 EPOCH 2 done: loss 0.1103 - lr: 0.000142
2023-10-12 21:38:35,365 DEV : loss 0.08923686295747757 - f1-score (micro avg) 0.8234
2023-10-12 21:38:35,394 saving best model
2023-10-12 21:38:37,920 ----------------------------------------------------------------------------------------------------
2023-10-12 21:39:18,782 epoch 3 - iter 144/1445 - loss 0.06834371 - time (sec): 40.86 - samples/sec: 438.20 - lr: 0.000140 - momentum: 0.000000
2023-10-12 21:40:00,281 epoch 3 - iter 288/1445 - loss 0.06815199 - time (sec): 82.36 - samples/sec: 436.70 - lr: 0.000139 - momentum: 0.000000
2023-10-12 21:40:40,704 epoch 3 - iter 432/1445 - loss 0.06735091 - time (sec): 122.78 - samples/sec: 435.93 - lr: 0.000137 - momentum: 0.000000
2023-10-12 21:41:21,257 epoch 3 - iter 576/1445 - loss 0.06974133 - time (sec): 163.33 - samples/sec: 435.20 - lr: 0.000135 - momentum: 0.000000
2023-10-12 21:42:01,798 epoch 3 - iter 720/1445 - loss 0.06899677 - time (sec): 203.87 - samples/sec: 436.74 - lr: 0.000133 - momentum: 0.000000
2023-10-12 21:42:42,864 epoch 3 - iter 864/1445 - loss 0.06959133 - time (sec): 244.94 - samples/sec: 439.98 - lr: 0.000132 - momentum: 0.000000
2023-10-12 21:43:24,863 epoch 3 - iter 1008/1445 - loss 0.06960467 - time (sec): 286.94 - samples/sec: 436.21 - lr: 0.000130 - momentum: 0.000000
2023-10-12 21:44:06,858 epoch 3 - iter 1152/1445 - loss 0.07023728 - time (sec): 328.93 - samples/sec: 432.23 - lr: 0.000128 - momentum: 0.000000
2023-10-12 21:44:49,005 epoch 3 - iter 1296/1445 - loss 0.06915927 - time (sec): 371.08 - samples/sec: 427.73 - lr: 0.000126 - momentum: 0.000000
2023-10-12 21:45:31,391 epoch 3 - iter 1440/1445 - loss 0.06784203 - time (sec): 413.47 - samples/sec: 424.50 - lr: 0.000125 - momentum: 0.000000
2023-10-12 21:45:32,761 ----------------------------------------------------------------------------------------------------
2023-10-12 21:45:32,762 EPOCH 3 done: loss 0.0679 - lr: 0.000125
2023-10-12 21:45:54,049 DEV : loss 0.07864446192979813 - f1-score (micro avg) 0.8472
2023-10-12 21:45:54,079 saving best model
2023-10-12 21:45:56,622 ----------------------------------------------------------------------------------------------------
2023-10-12 21:46:38,942 epoch 4 - iter 144/1445 - loss 0.05236873 - time (sec): 42.32 - samples/sec: 423.83 - lr: 0.000123 - momentum: 0.000000
2023-10-12 21:47:20,089 epoch 4 - iter 288/1445 - loss 0.05180904 - time (sec): 83.46 - samples/sec: 416.34 - lr: 0.000121 - momentum: 0.000000
2023-10-12 21:48:02,270 epoch 4 - iter 432/1445 - loss 0.04908008 - time (sec): 125.64 - samples/sec: 417.28 - lr: 0.000119 - momentum: 0.000000
2023-10-12 21:48:45,487 epoch 4 - iter 576/1445 - loss 0.04796946 - time (sec): 168.86 - samples/sec: 423.21 - lr: 0.000117 - momentum: 0.000000
2023-10-12 21:49:28,393 epoch 4 - iter 720/1445 - loss 0.04610333 - time (sec): 211.77 - samples/sec: 421.32 - lr: 0.000116 - momentum: 0.000000
2023-10-12 21:50:09,873 epoch 4 - iter 864/1445 - loss 0.04493061 - time (sec): 253.25 - samples/sec: 419.33 - lr: 0.000114 - momentum: 0.000000
2023-10-12 21:50:50,880 epoch 4 - iter 1008/1445 - loss 0.04548811 - time (sec): 294.25 - samples/sec: 418.63 - lr: 0.000112 - momentum: 0.000000
2023-10-12 21:51:32,768 epoch 4 - iter 1152/1445 - loss 0.04476085 - time (sec): 336.14 - samples/sec: 420.71 - lr: 0.000110 - momentum: 0.000000
2023-10-12 21:52:13,889 epoch 4 - iter 1296/1445 - loss 0.04656794 - time (sec): 377.26 - samples/sec: 421.08 - lr: 0.000109 - momentum: 0.000000
2023-10-12 21:52:54,707 epoch 4 - iter 1440/1445 - loss 0.04588922 - time (sec): 418.08 - samples/sec: 420.56 - lr: 0.000107 - momentum: 0.000000
2023-10-12 21:52:55,823 ----------------------------------------------------------------------------------------------------
2023-10-12 21:52:55,823 EPOCH 4 done: loss 0.0460 - lr: 0.000107
2023-10-12 21:53:16,107 DEV : loss 0.10065485537052155 - f1-score (micro avg) 0.8398
2023-10-12 21:53:16,137 ----------------------------------------------------------------------------------------------------
2023-10-12 21:53:57,924 epoch 5 - iter 144/1445 - loss 0.03754111 - time (sec): 41.79 - samples/sec: 451.73 - lr: 0.000105 - momentum: 0.000000
2023-10-12 21:54:37,644 epoch 5 - iter 288/1445 - loss 0.03295251 - time (sec): 81.51 - samples/sec: 445.25 - lr: 0.000103 - momentum: 0.000000
2023-10-12 21:55:16,790 epoch 5 - iter 432/1445 - loss 0.03034951 - time (sec): 120.65 - samples/sec: 429.78 - lr: 0.000101 - momentum: 0.000000
2023-10-12 21:55:55,747 epoch 5 - iter 576/1445 - loss 0.02962774 - time (sec): 159.61 - samples/sec: 426.38 - lr: 0.000100 - momentum: 0.000000
2023-10-12 21:56:36,572 epoch 5 - iter 720/1445 - loss 0.03171128 - time (sec): 200.43 - samples/sec: 432.19 - lr: 0.000098 - momentum: 0.000000
2023-10-12 21:57:16,464 epoch 5 - iter 864/1445 - loss 0.03189659 - time (sec): 240.33 - samples/sec: 432.78 - lr: 0.000096 - momentum: 0.000000
2023-10-12 21:57:57,974 epoch 5 - iter 1008/1445 - loss 0.03208778 - time (sec): 281.83 - samples/sec: 434.75 - lr: 0.000094 - momentum: 0.000000
2023-10-12 21:58:38,623 epoch 5 - iter 1152/1445 - loss 0.03196953 - time (sec): 322.48 - samples/sec: 435.07 - lr: 0.000093 - momentum: 0.000000
2023-10-12 21:59:18,873 epoch 5 - iter 1296/1445 - loss 0.03247366 - time (sec): 362.73 - samples/sec: 435.30 - lr: 0.000091 - momentum: 0.000000
2023-10-12 21:59:58,904 epoch 5 - iter 1440/1445 - loss 0.03343762 - time (sec): 402.77 - samples/sec: 435.41 - lr: 0.000089 - momentum: 0.000000
2023-10-12 22:00:00,335 ----------------------------------------------------------------------------------------------------
2023-10-12 22:00:00,336 EPOCH 5 done: loss 0.0340 - lr: 0.000089
2023-10-12 22:00:21,562 DEV : loss 0.11156909167766571 - f1-score (micro avg) 0.8332
2023-10-12 22:00:21,591 ----------------------------------------------------------------------------------------------------
2023-10-12 22:01:02,121 epoch 6 - iter 144/1445 - loss 0.01887123 - time (sec): 40.53 - samples/sec: 424.57 - lr: 0.000087 - momentum: 0.000000
2023-10-12 22:01:42,563 epoch 6 - iter 288/1445 - loss 0.02116022 - time (sec): 80.97 - samples/sec: 426.42 - lr: 0.000085 - momentum: 0.000000
2023-10-12 22:02:23,726 epoch 6 - iter 432/1445 - loss 0.02492081 - time (sec): 122.13 - samples/sec: 429.07 - lr: 0.000084 - momentum: 0.000000
2023-10-12 22:03:05,157 epoch 6 - iter 576/1445 - loss 0.02312836 - time (sec): 163.56 - samples/sec: 430.48 - lr: 0.000082 - momentum: 0.000000
2023-10-12 22:03:46,615 epoch 6 - iter 720/1445 - loss 0.02397003 - time (sec): 205.02 - samples/sec: 429.86 - lr: 0.000080 - momentum: 0.000000
2023-10-12 22:04:29,524 epoch 6 - iter 864/1445 - loss 0.02169080 - time (sec): 247.93 - samples/sec: 429.81 - lr: 0.000078 - momentum: 0.000000
2023-10-12 22:05:11,526 epoch 6 - iter 1008/1445 - loss 0.02492991 - time (sec): 289.93 - samples/sec: 429.05 - lr: 0.000076 - momentum: 0.000000
2023-10-12 22:05:52,796 epoch 6 - iter 1152/1445 - loss 0.02363443 - time (sec): 331.20 - samples/sec: 425.58 - lr: 0.000075 - momentum: 0.000000
2023-10-12 22:06:32,978 epoch 6 - iter 1296/1445 - loss 0.02330251 - time (sec): 371.38 - samples/sec: 424.51 - lr: 0.000073 - momentum: 0.000000
2023-10-12 22:07:14,891 epoch 6 - iter 1440/1445 - loss 0.02376795 - time (sec): 413.30 - samples/sec: 425.05 - lr: 0.000071 - momentum: 0.000000
2023-10-12 22:07:16,126 ----------------------------------------------------------------------------------------------------
2023-10-12 22:07:16,126 EPOCH 6 done: loss 0.0237 - lr: 0.000071
2023-10-12 22:07:36,356 DEV : loss 0.13551419973373413 - f1-score (micro avg) 0.841
2023-10-12 22:07:36,386 ----------------------------------------------------------------------------------------------------
2023-10-12 22:08:18,583 epoch 7 - iter 144/1445 - loss 0.01993712 - time (sec): 42.19 - samples/sec: 418.04 - lr: 0.000069 - momentum: 0.000000
2023-10-12 22:08:59,833 epoch 7 - iter 288/1445 - loss 0.01779429 - time (sec): 83.45 - samples/sec: 426.34 - lr: 0.000068 - momentum: 0.000000
2023-10-12 22:09:40,252 epoch 7 - iter 432/1445 - loss 0.01748285 - time (sec): 123.86 - samples/sec: 420.58 - lr: 0.000066 - momentum: 0.000000
2023-10-12 22:10:20,324 epoch 7 - iter 576/1445 - loss 0.01656374 - time (sec): 163.94 - samples/sec: 418.75 - lr: 0.000064 - momentum: 0.000000
2023-10-12 22:11:01,416 epoch 7 - iter 720/1445 - loss 0.01878790 - time (sec): 205.03 - samples/sec: 423.33 - lr: 0.000062 - momentum: 0.000000
2023-10-12 22:11:42,695 epoch 7 - iter 864/1445 - loss 0.01829691 - time (sec): 246.31 - samples/sec: 423.11 - lr: 0.000060 - momentum: 0.000000
2023-10-12 22:12:23,106 epoch 7 - iter 1008/1445 - loss 0.01767695 - time (sec): 286.72 - samples/sec: 425.26 - lr: 0.000059 - momentum: 0.000000
2023-10-12 22:13:03,909 epoch 7 - iter 1152/1445 - loss 0.01746928 - time (sec): 327.52 - samples/sec: 424.22 - lr: 0.000057 - momentum: 0.000000
2023-10-12 22:13:45,103 epoch 7 - iter 1296/1445 - loss 0.01724052 - time (sec): 368.71 - samples/sec: 424.49 - lr: 0.000055 - momentum: 0.000000
2023-10-12 22:14:27,824 epoch 7 - iter 1440/1445 - loss 0.01810091 - time (sec): 411.44 - samples/sec: 426.53 - lr: 0.000053 - momentum: 0.000000
2023-10-12 22:14:29,309 ----------------------------------------------------------------------------------------------------
2023-10-12 22:14:29,310 EPOCH 7 done: loss 0.0180 - lr: 0.000053
2023-10-12 22:14:51,394 DEV : loss 0.13199612498283386 - f1-score (micro avg) 0.8541
2023-10-12 22:14:51,426 saving best model
2023-10-12 22:14:54,014 ----------------------------------------------------------------------------------------------------
2023-10-12 22:15:35,081 epoch 8 - iter 144/1445 - loss 0.01400894 - time (sec): 41.06 - samples/sec: 452.00 - lr: 0.000052 - momentum: 0.000000
2023-10-12 22:16:16,384 epoch 8 - iter 288/1445 - loss 0.01190023 - time (sec): 82.36 - samples/sec: 435.97 - lr: 0.000050 - momentum: 0.000000
2023-10-12 22:16:57,421 epoch 8 - iter 432/1445 - loss 0.01305794 - time (sec): 123.40 - samples/sec: 428.05 - lr: 0.000048 - momentum: 0.000000
2023-10-12 22:17:40,099 epoch 8 - iter 576/1445 - loss 0.01195701 - time (sec): 166.08 - samples/sec: 433.10 - lr: 0.000046 - momentum: 0.000000
2023-10-12 22:18:22,708 epoch 8 - iter 720/1445 - loss 0.01228432 - time (sec): 208.69 - samples/sec: 427.92 - lr: 0.000044 - momentum: 0.000000
2023-10-12 22:19:04,704 epoch 8 - iter 864/1445 - loss 0.01257074 - time (sec): 250.69 - samples/sec: 421.89 - lr: 0.000043 - momentum: 0.000000
2023-10-12 22:19:47,463 epoch 8 - iter 1008/1445 - loss 0.01349553 - time (sec): 293.44 - samples/sec: 419.32 - lr: 0.000041 - momentum: 0.000000
2023-10-12 22:20:29,739 epoch 8 - iter 1152/1445 - loss 0.01302670 - time (sec): 335.72 - samples/sec: 415.59 - lr: 0.000039 - momentum: 0.000000
2023-10-12 22:21:13,081 epoch 8 - iter 1296/1445 - loss 0.01428236 - time (sec): 379.06 - samples/sec: 416.56 - lr: 0.000037 - momentum: 0.000000
2023-10-12 22:21:55,650 epoch 8 - iter 1440/1445 - loss 0.01424477 - time (sec): 421.63 - samples/sec: 416.77 - lr: 0.000036 - momentum: 0.000000
2023-10-12 22:21:56,886 ----------------------------------------------------------------------------------------------------
2023-10-12 22:21:56,887 EPOCH 8 done: loss 0.0143 - lr: 0.000036
2023-10-12 22:22:17,436 DEV : loss 0.15470005571842194 - f1-score (micro avg) 0.8457
2023-10-12 22:22:17,465 ----------------------------------------------------------------------------------------------------
2023-10-12 22:22:59,069 epoch 9 - iter 144/1445 - loss 0.00569141 - time (sec): 41.60 - samples/sec: 442.36 - lr: 0.000034 - momentum: 0.000000
2023-10-12 22:23:40,717 epoch 9 - iter 288/1445 - loss 0.01301404 - time (sec): 83.25 - samples/sec: 446.83 - lr: 0.000032 - momentum: 0.000000
2023-10-12 22:24:21,009 epoch 9 - iter 432/1445 - loss 0.01230076 - time (sec): 123.54 - samples/sec: 445.61 - lr: 0.000030 - momentum: 0.000000
2023-10-12 22:25:00,563 epoch 9 - iter 576/1445 - loss 0.01178494 - time (sec): 163.10 - samples/sec: 436.19 - lr: 0.000028 - momentum: 0.000000
2023-10-12 22:25:39,478 epoch 9 - iter 720/1445 - loss 0.01121005 - time (sec): 202.01 - samples/sec: 430.67 - lr: 0.000027 - momentum: 0.000000
2023-10-12 22:26:19,982 epoch 9 - iter 864/1445 - loss 0.01117518 - time (sec): 242.51 - samples/sec: 432.62 - lr: 0.000025 - momentum: 0.000000
2023-10-12 22:27:00,522 epoch 9 - iter 1008/1445 - loss 0.01142421 - time (sec): 283.05 - samples/sec: 432.90 - lr: 0.000023 - momentum: 0.000000
2023-10-12 22:27:42,905 epoch 9 - iter 1152/1445 - loss 0.01188551 - time (sec): 325.44 - samples/sec: 434.68 - lr: 0.000021 - momentum: 0.000000
2023-10-12 22:28:23,774 epoch 9 - iter 1296/1445 - loss 0.01117904 - time (sec): 366.31 - samples/sec: 432.74 - lr: 0.000020 - momentum: 0.000000
2023-10-12 22:29:04,715 epoch 9 - iter 1440/1445 - loss 0.01054392 - time (sec): 407.25 - samples/sec: 431.36 - lr: 0.000018 - momentum: 0.000000
2023-10-12 22:29:05,942 ----------------------------------------------------------------------------------------------------
2023-10-12 22:29:05,943 EPOCH 9 done: loss 0.0105 - lr: 0.000018
2023-10-12 22:29:27,685 DEV : loss 0.15782958269119263 - f1-score (micro avg) 0.851
2023-10-12 22:29:27,716 ----------------------------------------------------------------------------------------------------
2023-10-12 22:30:09,377 epoch 10 - iter 144/1445 - loss 0.00649677 - time (sec): 41.66 - samples/sec: 432.41 - lr: 0.000016 - momentum: 0.000000
2023-10-12 22:30:48,589 epoch 10 - iter 288/1445 - loss 0.00717470 - time (sec): 80.87 - samples/sec: 417.09 - lr: 0.000014 - momentum: 0.000000
2023-10-12 22:31:28,912 epoch 10 - iter 432/1445 - loss 0.00796770 - time (sec): 121.19 - samples/sec: 418.26 - lr: 0.000012 - momentum: 0.000000
2023-10-12 22:32:11,533 epoch 10 - iter 576/1445 - loss 0.00925299 - time (sec): 163.82 - samples/sec: 421.90 - lr: 0.000011 - momentum: 0.000000
2023-10-12 22:32:53,189 epoch 10 - iter 720/1445 - loss 0.00823255 - time (sec): 205.47 - samples/sec: 420.03 - lr: 0.000009 - momentum: 0.000000
2023-10-12 22:33:35,725 epoch 10 - iter 864/1445 - loss 0.00732982 - time (sec): 248.01 - samples/sec: 422.20 - lr: 0.000007 - momentum: 0.000000
2023-10-12 22:34:18,703 epoch 10 - iter 1008/1445 - loss 0.00779755 - time (sec): 290.98 - samples/sec: 423.65 - lr: 0.000005 - momentum: 0.000000
2023-10-12 22:35:00,516 epoch 10 - iter 1152/1445 - loss 0.00738921 - time (sec): 332.80 - samples/sec: 420.58 - lr: 0.000004 - momentum: 0.000000
2023-10-12 22:35:42,966 epoch 10 - iter 1296/1445 - loss 0.00811633 - time (sec): 375.25 - samples/sec: 420.22 - lr: 0.000002 - momentum: 0.000000
2023-10-12 22:36:25,327 epoch 10 - iter 1440/1445 - loss 0.00775884 - time (sec): 417.61 - samples/sec: 420.82 - lr: 0.000000 - momentum: 0.000000
2023-10-12 22:36:26,519 ----------------------------------------------------------------------------------------------------
2023-10-12 22:36:26,519 EPOCH 10 done: loss 0.0077 - lr: 0.000000
2023-10-12 22:36:47,942 DEV : loss 0.16641554236412048 - f1-score (micro avg) 0.8455
2023-10-12 22:36:48,832 ----------------------------------------------------------------------------------------------------
2023-10-12 22:36:48,834 Loading model from best epoch ...
2023-10-12 22:36:52,922 SequenceTagger predicts: Dictionary with 13 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG
2023-10-12 22:37:13,988
Results:
- F-score (micro) 0.8543
- F-score (macro) 0.7635
- Accuracy 0.7602
By class:
              precision    recall  f1-score   support

         PER     0.8577    0.8631    0.8604       482
         LOC     0.9238    0.8734    0.8979       458
         ORG     0.5286    0.5362    0.5324        69

   micro avg     0.8634    0.8454    0.8543      1009
   macro avg     0.7700    0.7576    0.7635      1009
weighted avg     0.8652    0.8454    0.8550      1009
2023-10-12 22:37:13,988 ----------------------------------------------------------------------------------------------------
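For reference, the best-model.pt checkpoint saved above can be loaded back with Flair and used for tagging. A minimal usage sketch (the example sentence is illustrative, not taken from the corpus):

from flair.data import Sentence
from flair.models import SequenceTagger

# Load the checkpoint written at the "saving best model" steps above.
tagger = SequenceTagger.load("best-model.pt")

# Tag a Dutch sentence; the model predicts the 13 BIOES tags listed above (LOC/PER/ORG).
sentence = Sentence("Willem-Alexander bezocht Amsterdam .")
tagger.predict(sentence)

for span in sentence.get_spans("ner"):
    print(span)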