2023-10-13 14:39:41,971 ----------------------------------------------------------------------------------------------------
2023-10-13 14:39:41,974 Model: "SequenceTagger(
  (embeddings): ByT5Embeddings(
    (model): T5EncoderModel(
      (shared): Embedding(384, 1472)
      (encoder): T5Stack(
        (embed_tokens): Embedding(384, 1472)
        (block): ModuleList(
          (0): T5Block(
            (layer): ModuleList(
              (0): T5LayerSelfAttention(
                (SelfAttention): T5Attention(
                  (q): Linear(in_features=1472, out_features=384, bias=False)
                  (k): Linear(in_features=1472, out_features=384, bias=False)
                  (v): Linear(in_features=1472, out_features=384, bias=False)
                  (o): Linear(in_features=384, out_features=1472, bias=False)
                  (relative_attention_bias): Embedding(32, 6)
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (1): T5LayerFF(
                (DenseReluDense): T5DenseGatedActDense(
                  (wi_0): Linear(in_features=1472, out_features=3584, bias=False)
                  (wi_1): Linear(in_features=1472, out_features=3584, bias=False)
                  (wo): Linear(in_features=3584, out_features=1472, bias=False)
                  (dropout): Dropout(p=0.1, inplace=False)
                  (act): NewGELUActivation()
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
          )
          (1-11): 11 x T5Block(
            (layer): ModuleList(
              (0): T5LayerSelfAttention(
                (SelfAttention): T5Attention(
                  (q): Linear(in_features=1472, out_features=384, bias=False)
                  (k): Linear(in_features=1472, out_features=384, bias=False)
                  (v): Linear(in_features=1472, out_features=384, bias=False)
                  (o): Linear(in_features=384, out_features=1472, bias=False)
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (1): T5LayerFF(
                (DenseReluDense): T5DenseGatedActDense(
                  (wi_0): Linear(in_features=1472, out_features=3584, bias=False)
                  (wi_1): Linear(in_features=1472, out_features=3584, bias=False)
                  (wo): Linear(in_features=3584, out_features=1472, bias=False)
                  (dropout): Dropout(p=0.1, inplace=False)
                  (act): NewGELUActivation()
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
          )
        )
        (final_layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=1472, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-13 14:39:41,974 ----------------------------------------------------------------------------------------------------
2023-10-13 14:39:41,975 MultiCorpus: 6183 train + 680 dev + 2113 test sentences
- NER_HIPE_2022 Corpus: 6183 train + 680 dev + 2113 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/topres19th/en/with_doc_seperator
2023-10-13 14:39:41,975 ----------------------------------------------------------------------------------------------------
2023-10-13 14:39:41,975 Train: 6183 sentences
2023-10-13 14:39:41,975 (train_with_dev=False, train_with_test=False)
2023-10-13 14:39:41,975 ----------------------------------------------------------------------------------------------------
2023-10-13 14:39:41,975 Training Params:
2023-10-13 14:39:41,975 - learning_rate: "0.00015"
2023-10-13 14:39:41,975 - mini_batch_size: "4"
2023-10-13 14:39:41,975 - max_epochs: "10"
2023-10-13 14:39:41,975 - shuffle: "True"
2023-10-13 14:39:41,975 ----------------------------------------------------------------------------------------------------
2023-10-13 14:39:41,975 Plugins:
2023-10-13 14:39:41,976 - TensorboardLogger
2023-10-13 14:39:41,976 - LinearScheduler | warmup_fraction: '0.1'
2023-10-13 14:39:41,976 ----------------------------------------------------------------------------------------------------
2023-10-13 14:39:41,976 Final evaluation on model from best epoch (best-model.pt)
2023-10-13 14:39:41,976 - metric: "('micro avg', 'f1-score')"
2023-10-13 14:39:41,976 ----------------------------------------------------------------------------------------------------
2023-10-13 14:39:41,976 Computation:
2023-10-13 14:39:41,976 - compute on device: cuda:0
2023-10-13 14:39:41,976 - embedding storage: none
2023-10-13 14:39:41,976 ----------------------------------------------------------------------------------------------------
2023-10-13 14:39:41,976 Model training base path: "hmbench-topres19th/en-hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax-bs4-wsFalse-e10-lr0.00015-poolingfirst-layers-1-crfFalse-2"
2023-10-13 14:39:41,976 ----------------------------------------------------------------------------------------------------
2023-10-13 14:39:41,976 ----------------------------------------------------------------------------------------------------
2023-10-13 14:39:41,977 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-13 14:40:25,037 epoch 1 - iter 154/1546 - loss 2.56709894 - time (sec): 43.06 - samples/sec: 287.17 - lr: 0.000015 - momentum: 0.000000
2023-10-13 14:41:07,444 epoch 1 - iter 308/1546 - loss 2.48412840 - time (sec): 85.46 - samples/sec: 279.88 - lr: 0.000030 - momentum: 0.000000
2023-10-13 14:41:51,772 epoch 1 - iter 462/1546 - loss 2.20266011 - time (sec): 129.79 - samples/sec: 282.66 - lr: 0.000045 - momentum: 0.000000
2023-10-13 14:42:36,163 epoch 1 - iter 616/1546 - loss 1.87306658 - time (sec): 174.18 - samples/sec: 289.15 - lr: 0.000060 - momentum: 0.000000
2023-10-13 14:43:19,358 epoch 1 - iter 770/1546 - loss 1.58444351 - time (sec): 217.38 - samples/sec: 289.31 - lr: 0.000075 - momentum: 0.000000
2023-10-13 14:44:01,679 epoch 1 - iter 924/1546 - loss 1.37749591 - time (sec): 259.70 - samples/sec: 286.84 - lr: 0.000090 - momentum: 0.000000
2023-10-13 14:44:46,279 epoch 1 - iter 1078/1546 - loss 1.22214309 - time (sec): 304.30 - samples/sec: 283.07 - lr: 0.000104 - momentum: 0.000000
2023-10-13 14:45:31,848 epoch 1 - iter 1232/1546 - loss 1.09399401 - time (sec): 349.87 - samples/sec: 280.27 - lr: 0.000119 - momentum: 0.000000
2023-10-13 14:46:16,699 epoch 1 - iter 1386/1546 - loss 0.98589314 - time (sec): 394.72 - samples/sec: 281.21 - lr: 0.000134 - momentum: 0.000000
2023-10-13 14:47:00,509 epoch 1 - iter 1540/1546 - loss 0.89919584 - time (sec): 438.53 - samples/sec: 282.41 - lr: 0.000149 - momentum: 0.000000
2023-10-13 14:47:02,133 ----------------------------------------------------------------------------------------------------
2023-10-13 14:47:02,134 EPOCH 1 done: loss 0.8964 - lr: 0.000149
2023-10-13 14:47:19,087 DEV : loss 0.08445192873477936 - f1-score (micro avg) 0.5466
2023-10-13 14:47:19,116 saving best model
2023-10-13 14:47:20,059 ----------------------------------------------------------------------------------------------------
2023-10-13 14:48:03,582 epoch 2 - iter 154/1546 - loss 0.13170571 - time (sec): 43.52 - samples/sec: 278.35 - lr: 0.000148 - momentum: 0.000000
2023-10-13 14:48:52,529 epoch 2 - iter 308/1546 - loss 0.13132400 - time (sec): 92.47 - samples/sec: 269.64 - lr: 0.000147 - momentum: 0.000000
2023-10-13 14:49:35,879 epoch 2 - iter 462/1546 - loss 0.12013285 - time (sec): 135.82 - samples/sec: 274.07 - lr: 0.000145 - momentum: 0.000000
2023-10-13 14:50:19,305 epoch 2 - iter 616/1546 - loss 0.11282047 - time (sec): 179.24 - samples/sec: 278.65 - lr: 0.000143 - momentum: 0.000000
2023-10-13 14:51:01,645 epoch 2 - iter 770/1546 - loss 0.10814538 - time (sec): 221.58 - samples/sec: 277.82 - lr: 0.000142 - momentum: 0.000000
2023-10-13 14:51:44,848 epoch 2 - iter 924/1546 - loss 0.10261037 - time (sec): 264.79 - samples/sec: 281.61 - lr: 0.000140 - momentum: 0.000000
2023-10-13 14:52:28,795 epoch 2 - iter 1078/1546 - loss 0.10334400 - time (sec): 308.73 - samples/sec: 283.16 - lr: 0.000138 - momentum: 0.000000
2023-10-13 14:53:11,898 epoch 2 - iter 1232/1546 - loss 0.10194205 - time (sec): 351.84 - samples/sec: 282.94 - lr: 0.000137 - momentum: 0.000000
2023-10-13 14:53:55,412 epoch 2 - iter 1386/1546 - loss 0.10006275 - time (sec): 395.35 - samples/sec: 283.00 - lr: 0.000135 - momentum: 0.000000
2023-10-13 14:54:38,244 epoch 2 - iter 1540/1546 - loss 0.09708740 - time (sec): 438.18 - samples/sec: 282.73 - lr: 0.000133 - momentum: 0.000000
2023-10-13 14:54:39,784 ----------------------------------------------------------------------------------------------------
2023-10-13 14:54:39,785 EPOCH 2 done: loss 0.0969 - lr: 0.000133
2023-10-13 14:54:57,370 DEV : loss 0.06155720353126526 - f1-score (micro avg) 0.7849
2023-10-13 14:54:57,399 saving best model
2023-10-13 14:55:00,436 ----------------------------------------------------------------------------------------------------
2023-10-13 14:55:43,838 epoch 3 - iter 154/1546 - loss 0.06344340 - time (sec): 43.40 - samples/sec: 293.78 - lr: 0.000132 - momentum: 0.000000
2023-10-13 14:56:26,615 epoch 3 - iter 308/1546 - loss 0.06088793 - time (sec): 86.17 - samples/sec: 292.89 - lr: 0.000130 - momentum: 0.000000
2023-10-13 14:57:09,377 epoch 3 - iter 462/1546 - loss 0.06172965 - time (sec): 128.94 - samples/sec: 286.07 - lr: 0.000128 - momentum: 0.000000
2023-10-13 14:57:53,053 epoch 3 - iter 616/1546 - loss 0.06467305 - time (sec): 172.61 - samples/sec: 290.06 - lr: 0.000127 - momentum: 0.000000
2023-10-13 14:58:36,231 epoch 3 - iter 770/1546 - loss 0.06158844 - time (sec): 215.79 - samples/sec: 287.28 - lr: 0.000125 - momentum: 0.000000
2023-10-13 14:59:19,684 epoch 3 - iter 924/1546 - loss 0.06134919 - time (sec): 259.24 - samples/sec: 286.38 - lr: 0.000123 - momentum: 0.000000
2023-10-13 15:00:04,246 epoch 3 - iter 1078/1546 - loss 0.05888760 - time (sec): 303.80 - samples/sec: 286.51 - lr: 0.000122 - momentum: 0.000000
2023-10-13 15:00:48,093 epoch 3 - iter 1232/1546 - loss 0.05684456 - time (sec): 347.65 - samples/sec: 287.67 - lr: 0.000120 - momentum: 0.000000
2023-10-13 15:01:31,209 epoch 3 - iter 1386/1546 - loss 0.05625824 - time (sec): 390.77 - samples/sec: 287.53 - lr: 0.000118 - momentum: 0.000000
2023-10-13 15:02:13,997 epoch 3 - iter 1540/1546 - loss 0.05575664 - time (sec): 433.56 - samples/sec: 285.29 - lr: 0.000117 - momentum: 0.000000
2023-10-13 15:02:15,696 ----------------------------------------------------------------------------------------------------
2023-10-13 15:02:15,696 EPOCH 3 done: loss 0.0558 - lr: 0.000117
2023-10-13 15:02:32,617 DEV : loss 0.050705861300230026 - f1-score (micro avg) 0.8048
2023-10-13 15:02:32,645 saving best model
2023-10-13 15:02:35,269 ----------------------------------------------------------------------------------------------------
2023-10-13 15:03:18,055 epoch 4 - iter 154/1546 - loss 0.03891303 - time (sec): 42.78 - samples/sec: 264.46 - lr: 0.000115 - momentum: 0.000000
2023-10-13 15:04:01,155 epoch 4 - iter 308/1546 - loss 0.03407498 - time (sec): 85.88 - samples/sec: 284.12 - lr: 0.000113 - momentum: 0.000000
2023-10-13 15:04:44,213 epoch 4 - iter 462/1546 - loss 0.03585217 - time (sec): 128.94 - samples/sec: 280.08 - lr: 0.000112 - momentum: 0.000000
2023-10-13 15:05:28,633 epoch 4 - iter 616/1546 - loss 0.03650901 - time (sec): 173.36 - samples/sec: 288.59 - lr: 0.000110 - momentum: 0.000000
2023-10-13 15:06:11,835 epoch 4 - iter 770/1546 - loss 0.03523602 - time (sec): 216.56 - samples/sec: 286.03 - lr: 0.000108 - momentum: 0.000000
2023-10-13 15:06:56,009 epoch 4 - iter 924/1546 - loss 0.03397520 - time (sec): 260.74 - samples/sec: 286.51 - lr: 0.000107 - momentum: 0.000000
2023-10-13 15:07:39,889 epoch 4 - iter 1078/1546 - loss 0.03395534 - time (sec): 304.62 - samples/sec: 285.94 - lr: 0.000105 - momentum: 0.000000
2023-10-13 15:08:24,583 epoch 4 - iter 1232/1546 - loss 0.03243616 - time (sec): 349.31 - samples/sec: 285.86 - lr: 0.000103 - momentum: 0.000000
2023-10-13 15:09:07,318 epoch 4 - iter 1386/1546 - loss 0.03431532 - time (sec): 392.05 - samples/sec: 284.94 - lr: 0.000102 - momentum: 0.000000
2023-10-13 15:09:50,205 epoch 4 - iter 1540/1546 - loss 0.03415158 - time (sec): 434.93 - samples/sec: 284.68 - lr: 0.000100 - momentum: 0.000000
2023-10-13 15:09:51,764 ----------------------------------------------------------------------------------------------------
2023-10-13 15:09:51,764 EPOCH 4 done: loss 0.0340 - lr: 0.000100
2023-10-13 15:10:08,641 DEV : loss 0.066124826669693 - f1-score (micro avg) 0.8102
2023-10-13 15:10:08,670 saving best model
2023-10-13 15:10:11,310 ----------------------------------------------------------------------------------------------------
2023-10-13 15:10:55,368 epoch 5 - iter 154/1546 - loss 0.01656157 - time (sec): 44.05 - samples/sec: 278.05 - lr: 0.000098 - momentum: 0.000000
2023-10-13 15:11:38,874 epoch 5 - iter 308/1546 - loss 0.02090640 - time (sec): 87.56 - samples/sec: 278.77 - lr: 0.000097 - momentum: 0.000000
2023-10-13 15:12:23,039 epoch 5 - iter 462/1546 - loss 0.02203142 - time (sec): 131.72 - samples/sec: 286.68 - lr: 0.000095 - momentum: 0.000000
2023-10-13 15:13:06,817 epoch 5 - iter 616/1546 - loss 0.02298316 - time (sec): 175.50 - samples/sec: 288.75 - lr: 0.000093 - momentum: 0.000000
2023-10-13 15:13:49,728 epoch 5 - iter 770/1546 - loss 0.02244836 - time (sec): 218.41 - samples/sec: 285.26 - lr: 0.000092 - momentum: 0.000000
2023-10-13 15:14:34,037 epoch 5 - iter 924/1546 - loss 0.02169351 - time (sec): 262.72 - samples/sec: 284.14 - lr: 0.000090 - momentum: 0.000000
2023-10-13 15:15:17,495 epoch 5 - iter 1078/1546 - loss 0.02237350 - time (sec): 306.18 - samples/sec: 282.21 - lr: 0.000088 - momentum: 0.000000
2023-10-13 15:16:01,516 epoch 5 - iter 1232/1546 - loss 0.02299554 - time (sec): 350.20 - samples/sec: 284.15 - lr: 0.000087 - momentum: 0.000000
2023-10-13 15:16:46,257 epoch 5 - iter 1386/1546 - loss 0.02252792 - time (sec): 394.94 - samples/sec: 282.82 - lr: 0.000085 - momentum: 0.000000
2023-10-13 15:17:30,215 epoch 5 - iter 1540/1546 - loss 0.02243190 - time (sec): 438.90 - samples/sec: 281.86 - lr: 0.000083 - momentum: 0.000000
2023-10-13 15:17:31,904 ----------------------------------------------------------------------------------------------------
2023-10-13 15:17:31,905 EPOCH 5 done: loss 0.0226 - lr: 0.000083
2023-10-13 15:17:49,053 DEV : loss 0.07412274181842804 - f1-score (micro avg) 0.8226
2023-10-13 15:17:49,082 saving best model
2023-10-13 15:17:51,714 ----------------------------------------------------------------------------------------------------
2023-10-13 15:18:34,691 epoch 6 - iter 154/1546 - loss 0.01385871 - time (sec): 42.97 - samples/sec: 260.49 - lr: 0.000082 - momentum: 0.000000
2023-10-13 15:19:18,577 epoch 6 - iter 308/1546 - loss 0.01249211 - time (sec): 86.86 - samples/sec: 278.41 - lr: 0.000080 - momentum: 0.000000
2023-10-13 15:20:02,193 epoch 6 - iter 462/1546 - loss 0.01538100 - time (sec): 130.47 - samples/sec: 278.15 - lr: 0.000078 - momentum: 0.000000
2023-10-13 15:20:45,434 epoch 6 - iter 616/1546 - loss 0.01532389 - time (sec): 173.72 - samples/sec: 279.20 - lr: 0.000077 - momentum: 0.000000
2023-10-13 15:21:28,645 epoch 6 - iter 770/1546 - loss 0.01606409 - time (sec): 216.93 - samples/sec: 279.60 - lr: 0.000075 - momentum: 0.000000
2023-10-13 15:22:12,505 epoch 6 - iter 924/1546 - loss 0.01591365 - time (sec): 260.79 - samples/sec: 282.76 - lr: 0.000073 - momentum: 0.000000
2023-10-13 15:22:56,045 epoch 6 - iter 1078/1546 - loss 0.01567690 - time (sec): 304.33 - samples/sec: 282.86 - lr: 0.000072 - momentum: 0.000000
2023-10-13 15:23:39,407 epoch 6 - iter 1232/1546 - loss 0.01638299 - time (sec): 347.69 - samples/sec: 284.80 - lr: 0.000070 - momentum: 0.000000
2023-10-13 15:24:24,289 epoch 6 - iter 1386/1546 - loss 0.01612279 - time (sec): 392.57 - samples/sec: 283.77 - lr: 0.000068 - momentum: 0.000000
2023-10-13 15:25:07,833 epoch 6 - iter 1540/1546 - loss 0.01576791 - time (sec): 436.11 - samples/sec: 283.67 - lr: 0.000067 - momentum: 0.000000
2023-10-13 15:25:09,543 ----------------------------------------------------------------------------------------------------
2023-10-13 15:25:09,543 EPOCH 6 done: loss 0.0157 - lr: 0.000067
2023-10-13 15:25:27,177 DEV : loss 0.08703919500112534 - f1-score (micro avg) 0.8206
2023-10-13 15:25:27,215 ----------------------------------------------------------------------------------------------------
2023-10-13 15:26:11,634 epoch 7 - iter 154/1546 - loss 0.00874232 - time (sec): 44.42 - samples/sec: 278.32 - lr: 0.000065 - momentum: 0.000000
2023-10-13 15:26:54,285 epoch 7 - iter 308/1546 - loss 0.00758555 - time (sec): 87.07 - samples/sec: 279.90 - lr: 0.000063 - momentum: 0.000000
2023-10-13 15:27:36,729 epoch 7 - iter 462/1546 - loss 0.00810969 - time (sec): 129.51 - samples/sec: 281.85 - lr: 0.000062 - momentum: 0.000000
2023-10-13 15:28:20,391 epoch 7 - iter 616/1546 - loss 0.00755985 - time (sec): 173.17 - samples/sec: 288.76 - lr: 0.000060 - momentum: 0.000000
2023-10-13 15:29:03,797 epoch 7 - iter 770/1546 - loss 0.00752106 - time (sec): 216.58 - samples/sec: 289.20 - lr: 0.000058 - momentum: 0.000000
2023-10-13 15:29:47,298 epoch 7 - iter 924/1546 - loss 0.00814139 - time (sec): 260.08 - samples/sec: 286.24 - lr: 0.000057 - momentum: 0.000000
2023-10-13 15:30:30,413 epoch 7 - iter 1078/1546 - loss 0.00855511 - time (sec): 303.19 - samples/sec: 284.75 - lr: 0.000055 - momentum: 0.000000
2023-10-13 15:31:13,916 epoch 7 - iter 1232/1546 - loss 0.00863815 - time (sec): 346.70 - samples/sec: 285.55 - lr: 0.000053 - momentum: 0.000000
2023-10-13 15:31:58,289 epoch 7 - iter 1386/1546 - loss 0.00930480 - time (sec): 391.07 - samples/sec: 285.43 - lr: 0.000052 - momentum: 0.000000
2023-10-13 15:32:41,299 epoch 7 - iter 1540/1546 - loss 0.00918102 - time (sec): 434.08 - samples/sec: 285.55 - lr: 0.000050 - momentum: 0.000000
2023-10-13 15:32:42,819 ----------------------------------------------------------------------------------------------------
2023-10-13 15:32:42,819 EPOCH 7 done: loss 0.0095 - lr: 0.000050
2023-10-13 15:33:01,318 DEV : loss 0.10020530968904495 - f1-score (micro avg) 0.804
2023-10-13 15:33:01,350 ----------------------------------------------------------------------------------------------------
2023-10-13 15:33:44,939 epoch 8 - iter 154/1546 - loss 0.00581384 - time (sec): 43.59 - samples/sec: 276.90 - lr: 0.000048 - momentum: 0.000000
2023-10-13 15:34:29,547 epoch 8 - iter 308/1546 - loss 0.00724462 - time (sec): 88.20 - samples/sec: 285.29 - lr: 0.000047 - momentum: 0.000000
2023-10-13 15:35:12,982 epoch 8 - iter 462/1546 - loss 0.00578644 - time (sec): 131.63 - samples/sec: 282.42 - lr: 0.000045 - momentum: 0.000000
2023-10-13 15:35:55,543 epoch 8 - iter 616/1546 - loss 0.00591053 - time (sec): 174.19 - samples/sec: 279.73 - lr: 0.000043 - momentum: 0.000000
2023-10-13 15:36:37,769 epoch 8 - iter 770/1546 - loss 0.00635947 - time (sec): 216.42 - samples/sec: 277.28 - lr: 0.000042 - momentum: 0.000000
2023-10-13 15:37:23,308 epoch 8 - iter 924/1546 - loss 0.00689383 - time (sec): 261.96 - samples/sec: 279.13 - lr: 0.000040 - momentum: 0.000000
2023-10-13 15:38:07,842 epoch 8 - iter 1078/1546 - loss 0.00758832 - time (sec): 306.49 - samples/sec: 282.13 - lr: 0.000038 - momentum: 0.000000
2023-10-13 15:38:51,680 epoch 8 - iter 1232/1546 - loss 0.00694871 - time (sec): 350.33 - samples/sec: 283.60 - lr: 0.000037 - momentum: 0.000000
2023-10-13 15:39:34,163 epoch 8 - iter 1386/1546 - loss 0.00657395 - time (sec): 392.81 - samples/sec: 284.32 - lr: 0.000035 - momentum: 0.000000
2023-10-13 15:40:17,601 epoch 8 - iter 1540/1546 - loss 0.00608630 - time (sec): 436.25 - samples/sec: 283.45 - lr: 0.000033 - momentum: 0.000000
2023-10-13 15:40:19,349 ----------------------------------------------------------------------------------------------------
2023-10-13 15:40:19,350 EPOCH 8 done: loss 0.0062 - lr: 0.000033
2023-10-13 15:40:37,076 DEV : loss 0.1116044670343399 - f1-score (micro avg) 0.8016
2023-10-13 15:40:37,106 ----------------------------------------------------------------------------------------------------
2023-10-13 15:41:20,911 epoch 9 - iter 154/1546 - loss 0.00265755 - time (sec): 43.80 - samples/sec: 285.63 - lr: 0.000032 - momentum: 0.000000
2023-10-13 15:42:04,191 epoch 9 - iter 308/1546 - loss 0.00354680 - time (sec): 87.08 - samples/sec: 291.14 - lr: 0.000030 - momentum: 0.000000
2023-10-13 15:42:48,608 epoch 9 - iter 462/1546 - loss 0.00293509 - time (sec): 131.50 - samples/sec: 289.46 - lr: 0.000028 - momentum: 0.000000
2023-10-13 15:43:32,585 epoch 9 - iter 616/1546 - loss 0.00389616 - time (sec): 175.48 - samples/sec: 285.75 - lr: 0.000027 - momentum: 0.000000
2023-10-13 15:44:16,508 epoch 9 - iter 770/1546 - loss 0.00411035 - time (sec): 219.40 - samples/sec: 286.87 - lr: 0.000025 - momentum: 0.000000
2023-10-13 15:44:59,188 epoch 9 - iter 924/1546 - loss 0.00460724 - time (sec): 262.08 - samples/sec: 286.97 - lr: 0.000023 - momentum: 0.000000
2023-10-13 15:45:41,536 epoch 9 - iter 1078/1546 - loss 0.00470863 - time (sec): 304.43 - samples/sec: 285.12 - lr: 0.000022 - momentum: 0.000000
2023-10-13 15:46:24,703 epoch 9 - iter 1232/1546 - loss 0.00446922 - time (sec): 347.59 - samples/sec: 286.82 - lr: 0.000020 - momentum: 0.000000
2023-10-13 15:47:06,807 epoch 9 - iter 1386/1546 - loss 0.00474740 - time (sec): 389.70 - samples/sec: 286.97 - lr: 0.000018 - momentum: 0.000000
2023-10-13 15:47:49,686 epoch 9 - iter 1540/1546 - loss 0.00495037 - time (sec): 432.58 - samples/sec: 286.68 - lr: 0.000017 - momentum: 0.000000
2023-10-13 15:47:51,269 ----------------------------------------------------------------------------------------------------
2023-10-13 15:47:51,269 EPOCH 9 done: loss 0.0049 - lr: 0.000017
2023-10-13 15:48:09,434 DEV : loss 0.11771833896636963 - f1-score (micro avg) 0.8055
2023-10-13 15:48:09,464 ----------------------------------------------------------------------------------------------------
2023-10-13 15:48:52,741 epoch 10 - iter 154/1546 - loss 0.00245226 - time (sec): 43.27 - samples/sec: 283.79 - lr: 0.000015 - momentum: 0.000000
2023-10-13 15:49:36,642 epoch 10 - iter 308/1546 - loss 0.00267455 - time (sec): 87.18 - samples/sec: 288.77 - lr: 0.000013 - momentum: 0.000000
2023-10-13 15:50:20,139 epoch 10 - iter 462/1546 - loss 0.00312780 - time (sec): 130.67 - samples/sec: 286.68 - lr: 0.000012 - momentum: 0.000000
2023-10-13 15:51:03,639 epoch 10 - iter 616/1546 - loss 0.00294916 - time (sec): 174.17 - samples/sec: 286.04 - lr: 0.000010 - momentum: 0.000000
2023-10-13 15:51:47,281 epoch 10 - iter 770/1546 - loss 0.00279124 - time (sec): 217.82 - samples/sec: 285.29 - lr: 0.000008 - momentum: 0.000000
2023-10-13 15:52:29,781 epoch 10 - iter 924/1546 - loss 0.00302675 - time (sec): 260.32 - samples/sec: 283.54 - lr: 0.000007 - momentum: 0.000000
2023-10-13 15:53:13,281 epoch 10 - iter 1078/1546 - loss 0.00305268 - time (sec): 303.82 - samples/sec: 283.37 - lr: 0.000005 - momentum: 0.000000
2023-10-13 15:53:55,974 epoch 10 - iter 1232/1546 - loss 0.00337322 - time (sec): 346.51 - samples/sec: 282.16 - lr: 0.000003 - momentum: 0.000000
2023-10-13 15:54:40,527 epoch 10 - iter 1386/1546 - loss 0.00314639 - time (sec): 391.06 - samples/sec: 284.24 - lr: 0.000002 - momentum: 0.000000
2023-10-13 15:55:23,930 epoch 10 - iter 1540/1546 - loss 0.00309558 - time (sec): 434.46 - samples/sec: 284.63 - lr: 0.000000 - momentum: 0.000000
2023-10-13 15:55:25,701 ----------------------------------------------------------------------------------------------------
2023-10-13 15:55:25,701 EPOCH 10 done: loss 0.0031 - lr: 0.000000
2023-10-13 15:55:42,753 DEV : loss 0.1177176758646965 - f1-score (micro avg) 0.8072
2023-10-13 15:55:43,706 ----------------------------------------------------------------------------------------------------
2023-10-13 15:55:43,708 Loading model from best epoch ...
2023-10-13 15:55:48,227 SequenceTagger predicts: Dictionary with 13 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-BUILDING, B-BUILDING, E-BUILDING, I-BUILDING, S-STREET, B-STREET, E-STREET, I-STREET
2023-10-13 15:56:42,961
Results:
- F-score (micro) 0.7962
- F-score (macro) 0.711
- Accuracy 0.6819
By class:
              precision    recall  f1-score   support

         LOC     0.8097    0.8679    0.8378       946
    BUILDING     0.6264    0.5892    0.6072       185
      STREET     0.6232    0.7679    0.6880        56

   micro avg     0.7741    0.8197    0.7962      1187
   macro avg     0.6864    0.7416    0.7110      1187
weighted avg     0.7723    0.8197    0.7948      1187
2023-10-13 15:56:42,961 ----------------------------------------------------------------------------------------------------