2023-10-13 20:58:05,783 ----------------------------------------------------------------------------------------------------
2023-10-13 20:58:05,786 Model: "SequenceTagger(
  (embeddings): ByT5Embeddings(
    (model): T5EncoderModel(
      (shared): Embedding(384, 1472)
      (encoder): T5Stack(
        (embed_tokens): Embedding(384, 1472)
        (block): ModuleList(
          (0): T5Block(
            (layer): ModuleList(
              (0): T5LayerSelfAttention(
                (SelfAttention): T5Attention(
                  (q): Linear(in_features=1472, out_features=384, bias=False)
                  (k): Linear(in_features=1472, out_features=384, bias=False)
                  (v): Linear(in_features=1472, out_features=384, bias=False)
                  (o): Linear(in_features=384, out_features=1472, bias=False)
                  (relative_attention_bias): Embedding(32, 6)
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (1): T5LayerFF(
                (DenseReluDense): T5DenseGatedActDense(
                  (wi_0): Linear(in_features=1472, out_features=3584, bias=False)
                  (wi_1): Linear(in_features=1472, out_features=3584, bias=False)
                  (wo): Linear(in_features=3584, out_features=1472, bias=False)
                  (dropout): Dropout(p=0.1, inplace=False)
                  (act): NewGELUActivation()
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
          )
          (1-11): 11 x T5Block(
            (layer): ModuleList(
              (0): T5LayerSelfAttention(
                (SelfAttention): T5Attention(
                  (q): Linear(in_features=1472, out_features=384, bias=False)
                  (k): Linear(in_features=1472, out_features=384, bias=False)
                  (v): Linear(in_features=1472, out_features=384, bias=False)
                  (o): Linear(in_features=384, out_features=1472, bias=False)
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (1): T5LayerFF(
                (DenseReluDense): T5DenseGatedActDense(
                  (wi_0): Linear(in_features=1472, out_features=3584, bias=False)
                  (wi_1): Linear(in_features=1472, out_features=3584, bias=False)
                  (wo): Linear(in_features=3584, out_features=1472, bias=False)
                  (dropout): Dropout(p=0.1, inplace=False)
                  (act): NewGELUActivation()
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
          )
        )
        (final_layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=1472, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-13 20:58:05,786 ----------------------------------------------------------------------------------------------------
2023-10-13 20:58:05,787 MultiCorpus: 6183 train + 680 dev + 2113 test sentences
- NER_HIPE_2022 Corpus: 6183 train + 680 dev + 2113 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/topres19th/en/with_doc_seperator
2023-10-13 20:58:05,787 ----------------------------------------------------------------------------------------------------
2023-10-13 20:58:05,787 Train: 6183 sentences
2023-10-13 20:58:05,787 (train_with_dev=False, train_with_test=False)
2023-10-13 20:58:05,787 ----------------------------------------------------------------------------------------------------
2023-10-13 20:58:05,787 Training Params:
2023-10-13 20:58:05,787 - learning_rate: "0.00016"
2023-10-13 20:58:05,787 - mini_batch_size: "4"
2023-10-13 20:58:05,787 - max_epochs: "10"
2023-10-13 20:58:05,787 - shuffle: "True"
2023-10-13 20:58:05,787 ----------------------------------------------------------------------------------------------------
2023-10-13 20:58:05,787 Plugins:
2023-10-13 20:58:05,787 - TensorboardLogger
2023-10-13 20:58:05,788 - LinearScheduler | warmup_fraction: '0.1'
2023-10-13 20:58:05,788 ----------------------------------------------------------------------------------------------------
2023-10-13 20:58:05,788 Final evaluation on model from best epoch (best-model.pt)
2023-10-13 20:58:05,788 - metric: "('micro avg', 'f1-score')"
2023-10-13 20:58:05,788 ----------------------------------------------------------------------------------------------------
2023-10-13 20:58:05,788 Computation:
2023-10-13 20:58:05,788 - compute on device: cuda:0
2023-10-13 20:58:05,788 - embedding storage: none
2023-10-13 20:58:05,788 ----------------------------------------------------------------------------------------------------
2023-10-13 20:58:05,788 Model training base path: "hmbench-topres19th/en-hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax-bs4-wsFalse-e10-lr0.00016-poolingfirst-layers-1-crfFalse-3"
2023-10-13 20:58:05,788 ----------------------------------------------------------------------------------------------------
2023-10-13 20:58:05,788 ----------------------------------------------------------------------------------------------------
2023-10-13 20:58:05,789 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-13 20:58:49,004 epoch 1 - iter 154/1546 - loss 2.53231029 - time (sec): 43.21 - samples/sec: 267.16 - lr: 0.000016 - momentum: 0.000000
2023-10-13 20:59:32,658 epoch 1 - iter 308/1546 - loss 2.41126778 - time (sec): 86.87 - samples/sec: 277.17 - lr: 0.000032 - momentum: 0.000000
2023-10-13 21:00:16,388 epoch 1 - iter 462/1546 - loss 2.13403547 - time (sec): 130.60 - samples/sec: 280.34 - lr: 0.000048 - momentum: 0.000000
2023-10-13 21:01:00,109 epoch 1 - iter 616/1546 - loss 1.82534353 - time (sec): 174.32 - samples/sec: 282.12 - lr: 0.000064 - momentum: 0.000000
2023-10-13 21:01:43,709 epoch 1 - iter 770/1546 - loss 1.55027770 - time (sec): 217.92 - samples/sec: 280.43 - lr: 0.000080 - momentum: 0.000000
2023-10-13 21:02:26,951 epoch 1 - iter 924/1546 - loss 1.33327065 - time (sec): 261.16 - samples/sec: 280.72 - lr: 0.000096 - momentum: 0.000000
2023-10-13 21:03:11,451 epoch 1 - iter 1078/1546 - loss 1.17102836 - time (sec): 305.66 - samples/sec: 280.17 - lr: 0.000111 - momentum: 0.000000
2023-10-13 21:03:54,682 epoch 1 - iter 1232/1546 - loss 1.04226789 - time (sec): 348.89 - samples/sec: 281.74 - lr: 0.000127 - momentum: 0.000000
2023-10-13 21:04:38,843 epoch 1 - iter 1386/1546 - loss 0.94299904 - time (sec): 393.05 - samples/sec: 281.61 - lr: 0.000143 - momentum: 0.000000
2023-10-13 21:05:23,431 epoch 1 - iter 1540/1546 - loss 0.85609274 - time (sec): 437.64 - samples/sec: 283.15 - lr: 0.000159 - momentum: 0.000000
2023-10-13 21:05:24,901 ----------------------------------------------------------------------------------------------------
2023-10-13 21:05:24,901 EPOCH 1 done: loss 0.8540 - lr: 0.000159
2023-10-13 21:05:41,576 DEV : loss 0.09267304092645645 - f1-score (micro avg) 0.6009
2023-10-13 21:05:41,608 saving best model
2023-10-13 21:05:42,536 ----------------------------------------------------------------------------------------------------
2023-10-13 21:06:26,579 epoch 2 - iter 154/1546 - loss 0.11615964 - time (sec): 44.04 - samples/sec: 254.88 - lr: 0.000158 - momentum: 0.000000
2023-10-13 21:07:10,522 epoch 2 - iter 308/1546 - loss 0.11061761 - time (sec): 87.98 - samples/sec: 263.62 - lr: 0.000156 - momentum: 0.000000
2023-10-13 21:07:55,175 epoch 2 - iter 462/1546 - loss 0.10758656 - time (sec): 132.64 - samples/sec: 273.51 - lr: 0.000155 - momentum: 0.000000
2023-10-13 21:08:41,329 epoch 2 - iter 616/1546 - loss 0.10808746 - time (sec): 178.79 - samples/sec: 273.48 - lr: 0.000153 - momentum: 0.000000
2023-10-13 21:09:27,938 epoch 2 - iter 770/1546 - loss 0.10648110 - time (sec): 225.40 - samples/sec: 273.00 - lr: 0.000151 - momentum: 0.000000
2023-10-13 21:10:14,282 epoch 2 - iter 924/1546 - loss 0.10566464 - time (sec): 271.74 - samples/sec: 269.16 - lr: 0.000149 - momentum: 0.000000
2023-10-13 21:11:00,253 epoch 2 - iter 1078/1546 - loss 0.10384736 - time (sec): 317.71 - samples/sec: 267.01 - lr: 0.000148 - momentum: 0.000000
2023-10-13 21:11:48,176 epoch 2 - iter 1232/1546 - loss 0.10079906 - time (sec): 365.64 - samples/sec: 269.51 - lr: 0.000146 - momentum: 0.000000
2023-10-13 21:12:35,146 epoch 2 - iter 1386/1546 - loss 0.09804350 - time (sec): 412.61 - samples/sec: 267.81 - lr: 0.000144 - momentum: 0.000000
2023-10-13 21:13:22,800 epoch 2 - iter 1540/1546 - loss 0.09640835 - time (sec): 460.26 - samples/sec: 269.31 - lr: 0.000142 - momentum: 0.000000
2023-10-13 21:13:24,440 ----------------------------------------------------------------------------------------------------
2023-10-13 21:13:24,441 EPOCH 2 done: loss 0.0965 - lr: 0.000142
2023-10-13 21:13:41,749 DEV : loss 0.06338806450366974 - f1-score (micro avg) 0.7627
2023-10-13 21:13:41,778 saving best model
2023-10-13 21:13:44,817 ----------------------------------------------------------------------------------------------------
2023-10-13 21:14:29,695 epoch 3 - iter 154/1546 - loss 0.05410580 - time (sec): 44.87 - samples/sec: 286.43 - lr: 0.000140 - momentum: 0.000000
2023-10-13 21:15:13,367 epoch 3 - iter 308/1546 - loss 0.06094115 - time (sec): 88.55 - samples/sec: 284.61 - lr: 0.000139 - momentum: 0.000000
2023-10-13 21:15:57,684 epoch 3 - iter 462/1546 - loss 0.05647310 - time (sec): 132.86 - samples/sec: 281.56 - lr: 0.000137 - momentum: 0.000000
2023-10-13 21:16:41,468 epoch 3 - iter 616/1546 - loss 0.05893520 - time (sec): 176.65 - samples/sec: 280.16 - lr: 0.000135 - momentum: 0.000000
2023-10-13 21:17:22,843 epoch 3 - iter 770/1546 - loss 0.05949715 - time (sec): 218.02 - samples/sec: 281.27 - lr: 0.000133 - momentum: 0.000000
2023-10-13 21:18:03,773 epoch 3 - iter 924/1546 - loss 0.06026043 - time (sec): 258.95 - samples/sec: 285.37 - lr: 0.000132 - momentum: 0.000000
2023-10-13 21:18:44,789 epoch 3 - iter 1078/1546 - loss 0.05796925 - time (sec): 299.97 - samples/sec: 288.16 - lr: 0.000130 - momentum: 0.000000
2023-10-13 21:19:26,339 epoch 3 - iter 1232/1546 - loss 0.05578592 - time (sec): 341.52 - samples/sec: 289.43 - lr: 0.000128 - momentum: 0.000000
2023-10-13 21:20:09,255 epoch 3 - iter 1386/1546 - loss 0.05615699 - time (sec): 384.43 - samples/sec: 289.28 - lr: 0.000126 - momentum: 0.000000
2023-10-13 21:20:52,873 epoch 3 - iter 1540/1546 - loss 0.05656863 - time (sec): 428.05 - samples/sec: 288.85 - lr: 0.000125 - momentum: 0.000000
2023-10-13 21:20:54,649 ----------------------------------------------------------------------------------------------------
2023-10-13 21:20:54,649 EPOCH 3 done: loss 0.0567 - lr: 0.000125
2023-10-13 21:21:11,549 DEV : loss 0.0654740035533905 - f1-score (micro avg) 0.766
2023-10-13 21:21:11,579 saving best model
2023-10-13 21:21:14,186 ----------------------------------------------------------------------------------------------------
2023-10-13 21:21:57,158 epoch 4 - iter 154/1546 - loss 0.03165886 - time (sec): 42.97 - samples/sec: 278.38 - lr: 0.000123 - momentum: 0.000000
2023-10-13 21:22:39,777 epoch 4 - iter 308/1546 - loss 0.03776274 - time (sec): 85.59 - samples/sec: 281.61 - lr: 0.000121 - momentum: 0.000000
2023-10-13 21:23:23,569 epoch 4 - iter 462/1546 - loss 0.03624070 - time (sec): 129.38 - samples/sec: 284.14 - lr: 0.000119 - momentum: 0.000000
2023-10-13 21:24:06,560 epoch 4 - iter 616/1546 - loss 0.03377782 - time (sec): 172.37 - samples/sec: 280.75 - lr: 0.000117 - momentum: 0.000000
2023-10-13 21:24:49,543 epoch 4 - iter 770/1546 - loss 0.03634239 - time (sec): 215.35 - samples/sec: 282.12 - lr: 0.000116 - momentum: 0.000000
2023-10-13 21:25:33,555 epoch 4 - iter 924/1546 - loss 0.03508408 - time (sec): 259.36 - samples/sec: 284.97 - lr: 0.000114 - momentum: 0.000000
2023-10-13 21:26:17,081 epoch 4 - iter 1078/1546 - loss 0.03498885 - time (sec): 302.89 - samples/sec: 283.82 - lr: 0.000112 - momentum: 0.000000
2023-10-13 21:27:01,339 epoch 4 - iter 1232/1546 - loss 0.03528830 - time (sec): 347.15 - samples/sec: 284.53 - lr: 0.000110 - momentum: 0.000000
2023-10-13 21:27:45,612 epoch 4 - iter 1386/1546 - loss 0.03507875 - time (sec): 391.42 - samples/sec: 285.14 - lr: 0.000109 - momentum: 0.000000
2023-10-13 21:28:29,305 epoch 4 - iter 1540/1546 - loss 0.03462670 - time (sec): 435.11 - samples/sec: 284.48 - lr: 0.000107 - momentum: 0.000000
2023-10-13 21:28:31,045 ----------------------------------------------------------------------------------------------------
2023-10-13 21:28:31,046 EPOCH 4 done: loss 0.0347 - lr: 0.000107
2023-10-13 21:28:48,762 DEV : loss 0.08016947656869888 - f1-score (micro avg) 0.7992
2023-10-13 21:28:48,791 saving best model
2023-10-13 21:28:51,424 ----------------------------------------------------------------------------------------------------
2023-10-13 21:29:35,833 epoch 5 - iter 154/1546 - loss 0.02532910 - time (sec): 44.40 - samples/sec: 294.86 - lr: 0.000105 - momentum: 0.000000
2023-10-13 21:30:17,792 epoch 5 - iter 308/1546 - loss 0.01972287 - time (sec): 86.36 - samples/sec: 283.14 - lr: 0.000103 - momentum: 0.000000
2023-10-13 21:31:00,804 epoch 5 - iter 462/1546 - loss 0.02131220 - time (sec): 129.38 - samples/sec: 288.04 - lr: 0.000101 - momentum: 0.000000
2023-10-13 21:31:44,172 epoch 5 - iter 616/1546 - loss 0.02104739 - time (sec): 172.74 - samples/sec: 288.32 - lr: 0.000100 - momentum: 0.000000
2023-10-13 21:32:27,586 epoch 5 - iter 770/1546 - loss 0.02339049 - time (sec): 216.16 - samples/sec: 290.50 - lr: 0.000098 - momentum: 0.000000
2023-10-13 21:33:12,384 epoch 5 - iter 924/1546 - loss 0.02327555 - time (sec): 260.96 - samples/sec: 289.34 - lr: 0.000096 - momentum: 0.000000
2023-10-13 21:33:56,075 epoch 5 - iter 1078/1546 - loss 0.02357816 - time (sec): 304.65 - samples/sec: 287.95 - lr: 0.000094 - momentum: 0.000000
2023-10-13 21:34:40,157 epoch 5 - iter 1232/1546 - loss 0.02310391 - time (sec): 348.73 - samples/sec: 287.65 - lr: 0.000093 - momentum: 0.000000
2023-10-13 21:35:23,594 epoch 5 - iter 1386/1546 - loss 0.02246346 - time (sec): 392.17 - samples/sec: 286.67 - lr: 0.000091 - momentum: 0.000000
2023-10-13 21:36:06,499 epoch 5 - iter 1540/1546 - loss 0.02243696 - time (sec): 435.07 - samples/sec: 284.77 - lr: 0.000089 - momentum: 0.000000
2023-10-13 21:36:08,071 ----------------------------------------------------------------------------------------------------
2023-10-13 21:36:08,071 EPOCH 5 done: loss 0.0224 - lr: 0.000089
2023-10-13 21:36:25,916 DEV : loss 0.08204298466444016 - f1-score (micro avg) 0.7821
2023-10-13 21:36:25,946 ----------------------------------------------------------------------------------------------------
2023-10-13 21:37:11,154 epoch 6 - iter 154/1546 - loss 0.01814063 - time (sec): 45.21 - samples/sec: 292.26 - lr: 0.000087 - momentum: 0.000000
2023-10-13 21:37:54,160 epoch 6 - iter 308/1546 - loss 0.01774314 - time (sec): 88.21 - samples/sec: 279.12 - lr: 0.000085 - momentum: 0.000000
2023-10-13 21:38:37,357 epoch 6 - iter 462/1546 - loss 0.01713563 - time (sec): 131.41 - samples/sec: 288.35 - lr: 0.000084 - momentum: 0.000000
2023-10-13 21:39:20,425 epoch 6 - iter 616/1546 - loss 0.01628056 - time (sec): 174.48 - samples/sec: 287.49 - lr: 0.000082 - momentum: 0.000000
2023-10-13 21:40:04,240 epoch 6 - iter 770/1546 - loss 0.01595664 - time (sec): 218.29 - samples/sec: 282.96 - lr: 0.000080 - momentum: 0.000000
2023-10-13 21:40:48,691 epoch 6 - iter 924/1546 - loss 0.01621568 - time (sec): 262.74 - samples/sec: 283.64 - lr: 0.000078 - momentum: 0.000000
2023-10-13 21:41:31,631 epoch 6 - iter 1078/1546 - loss 0.01541703 - time (sec): 305.68 - samples/sec: 282.50 - lr: 0.000077 - momentum: 0.000000
2023-10-13 21:42:14,622 epoch 6 - iter 1232/1546 - loss 0.01535718 - time (sec): 348.67 - samples/sec: 281.17 - lr: 0.000075 - momentum: 0.000000
2023-10-13 21:42:58,358 epoch 6 - iter 1386/1546 - loss 0.01520733 - time (sec): 392.41 - samples/sec: 281.05 - lr: 0.000073 - momentum: 0.000000
2023-10-13 21:43:42,477 epoch 6 - iter 1540/1546 - loss 0.01526695 - time (sec): 436.53 - samples/sec: 283.83 - lr: 0.000071 - momentum: 0.000000
2023-10-13 21:43:44,085 ----------------------------------------------------------------------------------------------------
2023-10-13 21:43:44,086 EPOCH 6 done: loss 0.0153 - lr: 0.000071
2023-10-13 21:44:01,974 DEV : loss 0.0963883250951767 - f1-score (micro avg) 0.7893
2023-10-13 21:44:02,014 ----------------------------------------------------------------------------------------------------
2023-10-13 21:44:46,895 epoch 7 - iter 154/1546 - loss 0.00960781 - time (sec): 44.88 - samples/sec: 284.46 - lr: 0.000069 - momentum: 0.000000
2023-10-13 21:45:29,473 epoch 7 - iter 308/1546 - loss 0.01117041 - time (sec): 87.46 - samples/sec: 284.15 - lr: 0.000068 - momentum: 0.000000
2023-10-13 21:46:12,373 epoch 7 - iter 462/1546 - loss 0.01014518 - time (sec): 130.36 - samples/sec: 285.83 - lr: 0.000066 - momentum: 0.000000
2023-10-13 21:46:54,873 epoch 7 - iter 616/1546 - loss 0.01028503 - time (sec): 172.86 - samples/sec: 289.07 - lr: 0.000064 - momentum: 0.000000
2023-10-13 21:47:37,986 epoch 7 - iter 770/1546 - loss 0.00991293 - time (sec): 215.97 - samples/sec: 288.02 - lr: 0.000062 - momentum: 0.000000
2023-10-13 21:48:20,564 epoch 7 - iter 924/1546 - loss 0.01000884 - time (sec): 258.55 - samples/sec: 286.75 - lr: 0.000061 - momentum: 0.000000
2023-10-13 21:49:03,107 epoch 7 - iter 1078/1546 - loss 0.01016652 - time (sec): 301.09 - samples/sec: 287.15 - lr: 0.000059 - momentum: 0.000000
2023-10-13 21:49:47,239 epoch 7 - iter 1232/1546 - loss 0.00978862 - time (sec): 345.22 - samples/sec: 287.20 - lr: 0.000057 - momentum: 0.000000
2023-10-13 21:50:29,729 epoch 7 - iter 1386/1546 - loss 0.00993261 - time (sec): 387.71 - samples/sec: 286.92 - lr: 0.000055 - momentum: 0.000000
2023-10-13 21:51:12,855 epoch 7 - iter 1540/1546 - loss 0.00981392 - time (sec): 430.84 - samples/sec: 287.31 - lr: 0.000053 - momentum: 0.000000
2023-10-13 21:51:14,469 ----------------------------------------------------------------------------------------------------
2023-10-13 21:51:14,469 EPOCH 7 done: loss 0.0098 - lr: 0.000053
2023-10-13 21:51:31,675 DEV : loss 0.10647968202829361 - f1-score (micro avg) 0.809
2023-10-13 21:51:31,704 saving best model
2023-10-13 21:51:34,296 ----------------------------------------------------------------------------------------------------
2023-10-13 21:52:18,945 epoch 8 - iter 154/1546 - loss 0.00622852 - time (sec): 44.64 - samples/sec: 298.56 - lr: 0.000052 - momentum: 0.000000
2023-10-13 21:53:02,997 epoch 8 - iter 308/1546 - loss 0.00494139 - time (sec): 88.70 - samples/sec: 293.22 - lr: 0.000050 - momentum: 0.000000
2023-10-13 21:53:45,552 epoch 8 - iter 462/1546 - loss 0.00507210 - time (sec): 131.25 - samples/sec: 288.71 - lr: 0.000048 - momentum: 0.000000
2023-10-13 21:54:29,414 epoch 8 - iter 616/1546 - loss 0.00497498 - time (sec): 175.11 - samples/sec: 287.41 - lr: 0.000046 - momentum: 0.000000
2023-10-13 21:55:13,230 epoch 8 - iter 770/1546 - loss 0.00534911 - time (sec): 218.93 - samples/sec: 291.42 - lr: 0.000045 - momentum: 0.000000
2023-10-13 21:55:56,708 epoch 8 - iter 924/1546 - loss 0.00637688 - time (sec): 262.41 - samples/sec: 289.55 - lr: 0.000043 - momentum: 0.000000
2023-10-13 21:56:39,906 epoch 8 - iter 1078/1546 - loss 0.00653282 - time (sec): 305.61 - samples/sec: 288.47 - lr: 0.000041 - momentum: 0.000000
2023-10-13 21:57:22,544 epoch 8 - iter 1232/1546 - loss 0.00664004 - time (sec): 348.24 - samples/sec: 284.57 - lr: 0.000039 - momentum: 0.000000
2023-10-13 21:58:06,005 epoch 8 - iter 1386/1546 - loss 0.00638657 - time (sec): 391.70 - samples/sec: 283.64 - lr: 0.000037 - momentum: 0.000000
2023-10-13 21:58:49,445 epoch 8 - iter 1540/1546 - loss 0.00613878 - time (sec): 435.14 - samples/sec: 284.69 - lr: 0.000036 - momentum: 0.000000
2023-10-13 21:58:51,045 ----------------------------------------------------------------------------------------------------
2023-10-13 21:58:51,045 EPOCH 8 done: loss 0.0062 - lr: 0.000036
2023-10-13 21:59:07,991 DEV : loss 0.1150372177362442 - f1-score (micro avg) 0.7935
2023-10-13 21:59:08,019 ----------------------------------------------------------------------------------------------------
2023-10-13 21:59:52,685 epoch 9 - iter 154/1546 - loss 0.00366725 - time (sec): 44.66 - samples/sec: 287.39 - lr: 0.000034 - momentum: 0.000000
2023-10-13 22:00:35,676 epoch 9 - iter 308/1546 - loss 0.00268623 - time (sec): 87.65 - samples/sec: 294.17 - lr: 0.000032 - momentum: 0.000000
2023-10-13 22:01:17,375 epoch 9 - iter 462/1546 - loss 0.00276166 - time (sec): 129.35 - samples/sec: 288.20 - lr: 0.000030 - momentum: 0.000000
2023-10-13 22:02:01,087 epoch 9 - iter 616/1546 - loss 0.00250196 - time (sec): 173.07 - samples/sec: 284.98 - lr: 0.000029 - momentum: 0.000000
2023-10-13 22:02:44,103 epoch 9 - iter 770/1546 - loss 0.00303337 - time (sec): 216.08 - samples/sec: 283.98 - lr: 0.000027 - momentum: 0.000000
2023-10-13 22:03:27,347 epoch 9 - iter 924/1546 - loss 0.00314542 - time (sec): 259.33 - samples/sec: 280.87 - lr: 0.000025 - momentum: 0.000000
2023-10-13 22:04:11,161 epoch 9 - iter 1078/1546 - loss 0.00328251 - time (sec): 303.14 - samples/sec: 282.99 - lr: 0.000023 - momentum: 0.000000
2023-10-13 22:04:54,454 epoch 9 - iter 1232/1546 - loss 0.00306510 - time (sec): 346.43 - samples/sec: 283.05 - lr: 0.000021 - momentum: 0.000000
2023-10-13 22:05:37,259 epoch 9 - iter 1386/1546 - loss 0.00332886 - time (sec): 389.24 - samples/sec: 283.27 - lr: 0.000020 - momentum: 0.000000
2023-10-13 22:06:21,528 epoch 9 - iter 1540/1546 - loss 0.00316735 - time (sec): 433.51 - samples/sec: 285.69 - lr: 0.000018 - momentum: 0.000000
2023-10-13 22:06:23,212 ----------------------------------------------------------------------------------------------------
2023-10-13 22:06:23,213 EPOCH 9 done: loss 0.0032 - lr: 0.000018
2023-10-13 22:06:40,917 DEV : loss 0.12184790521860123 - f1-score (micro avg) 0.8
2023-10-13 22:06:40,949 ----------------------------------------------------------------------------------------------------
2023-10-13 22:07:23,978 epoch 10 - iter 154/1546 - loss 0.00257130 - time (sec): 43.03 - samples/sec: 282.43 - lr: 0.000016 - momentum: 0.000000
2023-10-13 22:08:06,526 epoch 10 - iter 308/1546 - loss 0.00146925 - time (sec): 85.57 - samples/sec: 275.91 - lr: 0.000014 - momentum: 0.000000
2023-10-13 22:08:49,779 epoch 10 - iter 462/1546 - loss 0.00267861 - time (sec): 128.83 - samples/sec: 274.48 - lr: 0.000013 - momentum: 0.000000
2023-10-13 22:09:32,666 epoch 10 - iter 616/1546 - loss 0.00254695 - time (sec): 171.71 - samples/sec: 280.28 - lr: 0.000011 - momentum: 0.000000
2023-10-13 22:10:15,409 epoch 10 - iter 770/1546 - loss 0.00258972 - time (sec): 214.46 - samples/sec: 282.09 - lr: 0.000009 - momentum: 0.000000
2023-10-13 22:10:59,014 epoch 10 - iter 924/1546 - loss 0.00290086 - time (sec): 258.06 - samples/sec: 285.29 - lr: 0.000007 - momentum: 0.000000
2023-10-13 22:11:42,430 epoch 10 - iter 1078/1546 - loss 0.00274036 - time (sec): 301.48 - samples/sec: 285.89 - lr: 0.000005 - momentum: 0.000000
2023-10-13 22:12:27,030 epoch 10 - iter 1232/1546 - loss 0.00254910 - time (sec): 346.08 - samples/sec: 287.04 - lr: 0.000004 - momentum: 0.000000
2023-10-13 22:13:10,817 epoch 10 - iter 1386/1546 - loss 0.00252901 - time (sec): 389.87 - samples/sec: 286.35 - lr: 0.000002 - momentum: 0.000000
2023-10-13 22:13:54,027 epoch 10 - iter 1540/1546 - loss 0.00239455 - time (sec): 433.08 - samples/sec: 285.77 - lr: 0.000000 - momentum: 0.000000
2023-10-13 22:13:55,684 ----------------------------------------------------------------------------------------------------
2023-10-13 22:13:55,684 EPOCH 10 done: loss 0.0024 - lr: 0.000000
2023-10-13 22:14:12,745 DEV : loss 0.1256277710199356 - f1-score (micro avg) 0.7844
2023-10-13 22:14:13,677 ----------------------------------------------------------------------------------------------------
2023-10-13 22:14:13,679 Loading model from best epoch ...
2023-10-13 22:14:18,037 SequenceTagger predicts: Dictionary with 13 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-BUILDING, B-BUILDING, E-BUILDING, I-BUILDING, S-STREET, B-STREET, E-STREET, I-STREET
2023-10-13 22:15:13,065
Results:
- F-score (micro) 0.7916
- F-score (macro) 0.7114
- Accuracy 0.6714
By class:
              precision    recall  f1-score   support

         LOC     0.8150    0.8615    0.8376       946
    BUILDING     0.5973    0.4811    0.5329       185
      STREET     0.7778    0.7500    0.7636        56

   micro avg     0.7864    0.7970    0.7916      1187
   macro avg     0.7300    0.6975    0.7114      1187
weighted avg     0.7793    0.7970    0.7866      1187
2023-10-13 22:15:13,066 ----------------------------------------------------------------------------------------------------
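
The averaged rows in the final report can be cross-checked against the per-class rows. The sketch below is not part of the training run: it only re-derives the macro and weighted F1 from the rounded per-class values printed above, and the micro F1 from the printed micro precision and recall. The names `rows`, `f1`, `macro_f1`, `weighted_f1`, and `micro_f1` are illustrative. Because the inputs are already rounded to four decimals, the results are expected to agree with the log only to about three decimal places.

```python
# Per-class rows copied from the "By class" table: (precision, recall, f1, support).
rows = {
    "LOC": (0.8150, 0.8615, 0.8376, 946),
    "BUILDING": (0.5973, 0.4811, 0.5329, 185),
    "STREET": (0.7778, 0.7500, 0.7636, 56),
}


def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


total_support = sum(support for _, _, _, support in rows.values())

# Macro average: unweighted mean of the per-class F1 scores.
macro_f1 = sum(score for _, _, score, _ in rows.values()) / len(rows)

# Weighted average: per-class F1 weighted by the number of gold spans.
weighted_f1 = sum(score * support for _, _, score, support in rows.values()) / total_support

# Micro average: F1 of the pooled precision and recall, taken from the "micro avg" row.
micro_f1 = f1(0.7864, 0.7970)

print(f"support={total_support}")
print(f"macro F1={macro_f1:.4f} weighted F1={weighted_f1:.4f} micro F1={micro_f1:.4f}")
```

The macro score sits well below the micro score because BUILDING, the weakest class, counts as much as LOC in the macro average even though it has roughly a fifth of the support.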