|
2023-10-13 14:39:41,971 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 14:39:41,974 Model: "SequenceTagger( |
|
(embeddings): ByT5Embeddings( |
|
(model): T5EncoderModel( |
|
(shared): Embedding(384, 1472) |
|
(encoder): T5Stack( |
|
(embed_tokens): Embedding(384, 1472) |
|
(block): ModuleList( |
|
(0): T5Block( |
|
(layer): ModuleList( |
|
(0): T5LayerSelfAttention( |
|
(SelfAttention): T5Attention( |
|
(q): Linear(in_features=1472, out_features=384, bias=False) |
|
(k): Linear(in_features=1472, out_features=384, bias=False) |
|
(v): Linear(in_features=1472, out_features=384, bias=False) |
|
(o): Linear(in_features=384, out_features=1472, bias=False) |
|
(relative_attention_bias): Embedding(32, 6) |
|
) |
|
(layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
(1): T5LayerFF( |
|
(DenseReluDense): T5DenseGatedActDense( |
|
(wi_0): Linear(in_features=1472, out_features=3584, bias=False) |
|
(wi_1): Linear(in_features=1472, out_features=3584, bias=False) |
|
(wo): Linear(in_features=3584, out_features=1472, bias=False) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
(act): NewGELUActivation() |
|
) |
|
(layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
) |
|
(1-11): 11 x T5Block( |
|
(layer): ModuleList( |
|
(0): T5LayerSelfAttention( |
|
(SelfAttention): T5Attention( |
|
(q): Linear(in_features=1472, out_features=384, bias=False) |
|
(k): Linear(in_features=1472, out_features=384, bias=False) |
|
(v): Linear(in_features=1472, out_features=384, bias=False) |
|
(o): Linear(in_features=384, out_features=1472, bias=False) |
|
) |
|
(layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
(1): T5LayerFF( |
|
(DenseReluDense): T5DenseGatedActDense( |
|
(wi_0): Linear(in_features=1472, out_features=3584, bias=False) |
|
(wi_1): Linear(in_features=1472, out_features=3584, bias=False) |
|
(wo): Linear(in_features=3584, out_features=1472, bias=False) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
(act): NewGELUActivation() |
|
) |
|
(layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
) |
|
) |
|
(final_layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
) |
|
(locked_dropout): LockedDropout(p=0.5) |
|
(linear): Linear(in_features=1472, out_features=13, bias=True) |
|
(loss_function): CrossEntropyLoss() |
|
)" |
|
2023-10-13 14:39:41,974 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 14:39:41,975 MultiCorpus: 6183 train + 680 dev + 2113 test sentences |
|
- NER_HIPE_2022 Corpus: 6183 train + 680 dev + 2113 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/topres19th/en/with_doc_seperator |
|
2023-10-13 14:39:41,975 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 14:39:41,975 Train: 6183 sentences |
|
2023-10-13 14:39:41,975 (train_with_dev=False, train_with_test=False) |
|
2023-10-13 14:39:41,975 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 14:39:41,975 Training Params: |
|
2023-10-13 14:39:41,975 - learning_rate: "0.00015" |
|
2023-10-13 14:39:41,975 - mini_batch_size: "4" |
|
2023-10-13 14:39:41,975 - max_epochs: "10" |
|
2023-10-13 14:39:41,975 - shuffle: "True" |
|
2023-10-13 14:39:41,975 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 14:39:41,975 Plugins: |
|
2023-10-13 14:39:41,976 - TensorboardLogger |
|
2023-10-13 14:39:41,976 - LinearScheduler | warmup_fraction: '0.1' |
|
2023-10-13 14:39:41,976 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 14:39:41,976 Final evaluation on model from best epoch (best-model.pt) |
|
2023-10-13 14:39:41,976 - metric: "('micro avg', 'f1-score')" |
|
2023-10-13 14:39:41,976 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 14:39:41,976 Computation: |
|
2023-10-13 14:39:41,976 - compute on device: cuda:0 |
|
2023-10-13 14:39:41,976 - embedding storage: none |
|
2023-10-13 14:39:41,976 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 14:39:41,976 Model training base path: "hmbench-topres19th/en-hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax-bs4-wsFalse-e10-lr0.00015-poolingfirst-layers-1-crfFalse-2" |
|
2023-10-13 14:39:41,976 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 14:39:41,976 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 14:39:41,977 Logging anything other than scalars to TensorBoard is currently not supported. |
|
2023-10-13 14:40:25,037 epoch 1 - iter 154/1546 - loss 2.56709894 - time (sec): 43.06 - samples/sec: 287.17 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-13 14:41:07,444 epoch 1 - iter 308/1546 - loss 2.48412840 - time (sec): 85.46 - samples/sec: 279.88 - lr: 0.000030 - momentum: 0.000000 |
|
2023-10-13 14:41:51,772 epoch 1 - iter 462/1546 - loss 2.20266011 - time (sec): 129.79 - samples/sec: 282.66 - lr: 0.000045 - momentum: 0.000000 |
|
2023-10-13 14:42:36,163 epoch 1 - iter 616/1546 - loss 1.87306658 - time (sec): 174.18 - samples/sec: 289.15 - lr: 0.000060 - momentum: 0.000000 |
|
2023-10-13 14:43:19,358 epoch 1 - iter 770/1546 - loss 1.58444351 - time (sec): 217.38 - samples/sec: 289.31 - lr: 0.000075 - momentum: 0.000000 |
|
2023-10-13 14:44:01,679 epoch 1 - iter 924/1546 - loss 1.37749591 - time (sec): 259.70 - samples/sec: 286.84 - lr: 0.000090 - momentum: 0.000000 |
|
2023-10-13 14:44:46,279 epoch 1 - iter 1078/1546 - loss 1.22214309 - time (sec): 304.30 - samples/sec: 283.07 - lr: 0.000104 - momentum: 0.000000 |
|
2023-10-13 14:45:31,848 epoch 1 - iter 1232/1546 - loss 1.09399401 - time (sec): 349.87 - samples/sec: 280.27 - lr: 0.000119 - momentum: 0.000000 |
|
2023-10-13 14:46:16,699 epoch 1 - iter 1386/1546 - loss 0.98589314 - time (sec): 394.72 - samples/sec: 281.21 - lr: 0.000134 - momentum: 0.000000 |
|
2023-10-13 14:47:00,509 epoch 1 - iter 1540/1546 - loss 0.89919584 - time (sec): 438.53 - samples/sec: 282.41 - lr: 0.000149 - momentum: 0.000000 |
|
2023-10-13 14:47:02,133 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 14:47:02,134 EPOCH 1 done: loss 0.8964 - lr: 0.000149 |
|
2023-10-13 14:47:19,087 DEV : loss 0.08445192873477936 - f1-score (micro avg) 0.5466 |
|
2023-10-13 14:47:19,116 saving best model |
|
2023-10-13 14:47:20,059 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 14:48:03,582 epoch 2 - iter 154/1546 - loss 0.13170571 - time (sec): 43.52 - samples/sec: 278.35 - lr: 0.000148 - momentum: 0.000000 |
|
2023-10-13 14:48:52,529 epoch 2 - iter 308/1546 - loss 0.13132400 - time (sec): 92.47 - samples/sec: 269.64 - lr: 0.000147 - momentum: 0.000000 |
|
2023-10-13 14:49:35,879 epoch 2 - iter 462/1546 - loss 0.12013285 - time (sec): 135.82 - samples/sec: 274.07 - lr: 0.000145 - momentum: 0.000000 |
|
2023-10-13 14:50:19,305 epoch 2 - iter 616/1546 - loss 0.11282047 - time (sec): 179.24 - samples/sec: 278.65 - lr: 0.000143 - momentum: 0.000000 |
|
2023-10-13 14:51:01,645 epoch 2 - iter 770/1546 - loss 0.10814538 - time (sec): 221.58 - samples/sec: 277.82 - lr: 0.000142 - momentum: 0.000000 |
|
2023-10-13 14:51:44,848 epoch 2 - iter 924/1546 - loss 0.10261037 - time (sec): 264.79 - samples/sec: 281.61 - lr: 0.000140 - momentum: 0.000000 |
|
2023-10-13 14:52:28,795 epoch 2 - iter 1078/1546 - loss 0.10334400 - time (sec): 308.73 - samples/sec: 283.16 - lr: 0.000138 - momentum: 0.000000 |
|
2023-10-13 14:53:11,898 epoch 2 - iter 1232/1546 - loss 0.10194205 - time (sec): 351.84 - samples/sec: 282.94 - lr: 0.000137 - momentum: 0.000000 |
|
2023-10-13 14:53:55,412 epoch 2 - iter 1386/1546 - loss 0.10006275 - time (sec): 395.35 - samples/sec: 283.00 - lr: 0.000135 - momentum: 0.000000 |
|
2023-10-13 14:54:38,244 epoch 2 - iter 1540/1546 - loss 0.09708740 - time (sec): 438.18 - samples/sec: 282.73 - lr: 0.000133 - momentum: 0.000000 |
|
2023-10-13 14:54:39,784 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 14:54:39,785 EPOCH 2 done: loss 0.0969 - lr: 0.000133 |
|
2023-10-13 14:54:57,370 DEV : loss 0.06155720353126526 - f1-score (micro avg) 0.7849 |
|
2023-10-13 14:54:57,399 saving best model |
|
2023-10-13 14:55:00,436 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 14:55:43,838 epoch 3 - iter 154/1546 - loss 0.06344340 - time (sec): 43.40 - samples/sec: 293.78 - lr: 0.000132 - momentum: 0.000000 |
|
2023-10-13 14:56:26,615 epoch 3 - iter 308/1546 - loss 0.06088793 - time (sec): 86.17 - samples/sec: 292.89 - lr: 0.000130 - momentum: 0.000000 |
|
2023-10-13 14:57:09,377 epoch 3 - iter 462/1546 - loss 0.06172965 - time (sec): 128.94 - samples/sec: 286.07 - lr: 0.000128 - momentum: 0.000000 |
|
2023-10-13 14:57:53,053 epoch 3 - iter 616/1546 - loss 0.06467305 - time (sec): 172.61 - samples/sec: 290.06 - lr: 0.000127 - momentum: 0.000000 |
|
2023-10-13 14:58:36,231 epoch 3 - iter 770/1546 - loss 0.06158844 - time (sec): 215.79 - samples/sec: 287.28 - lr: 0.000125 - momentum: 0.000000 |
|
2023-10-13 14:59:19,684 epoch 3 - iter 924/1546 - loss 0.06134919 - time (sec): 259.24 - samples/sec: 286.38 - lr: 0.000123 - momentum: 0.000000 |
|
2023-10-13 15:00:04,246 epoch 3 - iter 1078/1546 - loss 0.05888760 - time (sec): 303.80 - samples/sec: 286.51 - lr: 0.000122 - momentum: 0.000000 |
|
2023-10-13 15:00:48,093 epoch 3 - iter 1232/1546 - loss 0.05684456 - time (sec): 347.65 - samples/sec: 287.67 - lr: 0.000120 - momentum: 0.000000 |
|
2023-10-13 15:01:31,209 epoch 3 - iter 1386/1546 - loss 0.05625824 - time (sec): 390.77 - samples/sec: 287.53 - lr: 0.000118 - momentum: 0.000000 |
|
2023-10-13 15:02:13,997 epoch 3 - iter 1540/1546 - loss 0.05575664 - time (sec): 433.56 - samples/sec: 285.29 - lr: 0.000117 - momentum: 0.000000 |
|
2023-10-13 15:02:15,696 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 15:02:15,696 EPOCH 3 done: loss 0.0558 - lr: 0.000117 |
|
2023-10-13 15:02:32,617 DEV : loss 0.050705861300230026 - f1-score (micro avg) 0.8048 |
|
2023-10-13 15:02:32,645 saving best model |
|
2023-10-13 15:02:35,269 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 15:03:18,055 epoch 4 - iter 154/1546 - loss 0.03891303 - time (sec): 42.78 - samples/sec: 264.46 - lr: 0.000115 - momentum: 0.000000 |
|
2023-10-13 15:04:01,155 epoch 4 - iter 308/1546 - loss 0.03407498 - time (sec): 85.88 - samples/sec: 284.12 - lr: 0.000113 - momentum: 0.000000 |
|
2023-10-13 15:04:44,213 epoch 4 - iter 462/1546 - loss 0.03585217 - time (sec): 128.94 - samples/sec: 280.08 - lr: 0.000112 - momentum: 0.000000 |
|
2023-10-13 15:05:28,633 epoch 4 - iter 616/1546 - loss 0.03650901 - time (sec): 173.36 - samples/sec: 288.59 - lr: 0.000110 - momentum: 0.000000 |
|
2023-10-13 15:06:11,835 epoch 4 - iter 770/1546 - loss 0.03523602 - time (sec): 216.56 - samples/sec: 286.03 - lr: 0.000108 - momentum: 0.000000 |
|
2023-10-13 15:06:56,009 epoch 4 - iter 924/1546 - loss 0.03397520 - time (sec): 260.74 - samples/sec: 286.51 - lr: 0.000107 - momentum: 0.000000 |
|
2023-10-13 15:07:39,889 epoch 4 - iter 1078/1546 - loss 0.03395534 - time (sec): 304.62 - samples/sec: 285.94 - lr: 0.000105 - momentum: 0.000000 |
|
2023-10-13 15:08:24,583 epoch 4 - iter 1232/1546 - loss 0.03243616 - time (sec): 349.31 - samples/sec: 285.86 - lr: 0.000103 - momentum: 0.000000 |
|
2023-10-13 15:09:07,318 epoch 4 - iter 1386/1546 - loss 0.03431532 - time (sec): 392.05 - samples/sec: 284.94 - lr: 0.000102 - momentum: 0.000000 |
|
2023-10-13 15:09:50,205 epoch 4 - iter 1540/1546 - loss 0.03415158 - time (sec): 434.93 - samples/sec: 284.68 - lr: 0.000100 - momentum: 0.000000 |
|
2023-10-13 15:09:51,764 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 15:09:51,764 EPOCH 4 done: loss 0.0340 - lr: 0.000100 |
|
2023-10-13 15:10:08,641 DEV : loss 0.066124826669693 - f1-score (micro avg) 0.8102 |
|
2023-10-13 15:10:08,670 saving best model |
|
2023-10-13 15:10:11,310 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 15:10:55,368 epoch 5 - iter 154/1546 - loss 0.01656157 - time (sec): 44.05 - samples/sec: 278.05 - lr: 0.000098 - momentum: 0.000000 |
|
2023-10-13 15:11:38,874 epoch 5 - iter 308/1546 - loss 0.02090640 - time (sec): 87.56 - samples/sec: 278.77 - lr: 0.000097 - momentum: 0.000000 |
|
2023-10-13 15:12:23,039 epoch 5 - iter 462/1546 - loss 0.02203142 - time (sec): 131.72 - samples/sec: 286.68 - lr: 0.000095 - momentum: 0.000000 |
|
2023-10-13 15:13:06,817 epoch 5 - iter 616/1546 - loss 0.02298316 - time (sec): 175.50 - samples/sec: 288.75 - lr: 0.000093 - momentum: 0.000000 |
|
2023-10-13 15:13:49,728 epoch 5 - iter 770/1546 - loss 0.02244836 - time (sec): 218.41 - samples/sec: 285.26 - lr: 0.000092 - momentum: 0.000000 |
|
2023-10-13 15:14:34,037 epoch 5 - iter 924/1546 - loss 0.02169351 - time (sec): 262.72 - samples/sec: 284.14 - lr: 0.000090 - momentum: 0.000000 |
|
2023-10-13 15:15:17,495 epoch 5 - iter 1078/1546 - loss 0.02237350 - time (sec): 306.18 - samples/sec: 282.21 - lr: 0.000088 - momentum: 0.000000 |
|
2023-10-13 15:16:01,516 epoch 5 - iter 1232/1546 - loss 0.02299554 - time (sec): 350.20 - samples/sec: 284.15 - lr: 0.000087 - momentum: 0.000000 |
|
2023-10-13 15:16:46,257 epoch 5 - iter 1386/1546 - loss 0.02252792 - time (sec): 394.94 - samples/sec: 282.82 - lr: 0.000085 - momentum: 0.000000 |
|
2023-10-13 15:17:30,215 epoch 5 - iter 1540/1546 - loss 0.02243190 - time (sec): 438.90 - samples/sec: 281.86 - lr: 0.000083 - momentum: 0.000000 |
|
2023-10-13 15:17:31,904 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 15:17:31,905 EPOCH 5 done: loss 0.0226 - lr: 0.000083 |
|
2023-10-13 15:17:49,053 DEV : loss 0.07412274181842804 - f1-score (micro avg) 0.8226 |
|
2023-10-13 15:17:49,082 saving best model |
|
2023-10-13 15:17:51,714 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 15:18:34,691 epoch 6 - iter 154/1546 - loss 0.01385871 - time (sec): 42.97 - samples/sec: 260.49 - lr: 0.000082 - momentum: 0.000000 |
|
2023-10-13 15:19:18,577 epoch 6 - iter 308/1546 - loss 0.01249211 - time (sec): 86.86 - samples/sec: 278.41 - lr: 0.000080 - momentum: 0.000000 |
|
2023-10-13 15:20:02,193 epoch 6 - iter 462/1546 - loss 0.01538100 - time (sec): 130.47 - samples/sec: 278.15 - lr: 0.000078 - momentum: 0.000000 |
|
2023-10-13 15:20:45,434 epoch 6 - iter 616/1546 - loss 0.01532389 - time (sec): 173.72 - samples/sec: 279.20 - lr: 0.000077 - momentum: 0.000000 |
|
2023-10-13 15:21:28,645 epoch 6 - iter 770/1546 - loss 0.01606409 - time (sec): 216.93 - samples/sec: 279.60 - lr: 0.000075 - momentum: 0.000000 |
|
2023-10-13 15:22:12,505 epoch 6 - iter 924/1546 - loss 0.01591365 - time (sec): 260.79 - samples/sec: 282.76 - lr: 0.000073 - momentum: 0.000000 |
|
2023-10-13 15:22:56,045 epoch 6 - iter 1078/1546 - loss 0.01567690 - time (sec): 304.33 - samples/sec: 282.86 - lr: 0.000072 - momentum: 0.000000 |
|
2023-10-13 15:23:39,407 epoch 6 - iter 1232/1546 - loss 0.01638299 - time (sec): 347.69 - samples/sec: 284.80 - lr: 0.000070 - momentum: 0.000000 |
|
2023-10-13 15:24:24,289 epoch 6 - iter 1386/1546 - loss 0.01612279 - time (sec): 392.57 - samples/sec: 283.77 - lr: 0.000068 - momentum: 0.000000 |
|
2023-10-13 15:25:07,833 epoch 6 - iter 1540/1546 - loss 0.01576791 - time (sec): 436.11 - samples/sec: 283.67 - lr: 0.000067 - momentum: 0.000000 |
|
2023-10-13 15:25:09,543 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 15:25:09,543 EPOCH 6 done: loss 0.0157 - lr: 0.000067 |
|
2023-10-13 15:25:27,177 DEV : loss 0.08703919500112534 - f1-score (micro avg) 0.8206 |
|
2023-10-13 15:25:27,215 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 15:26:11,634 epoch 7 - iter 154/1546 - loss 0.00874232 - time (sec): 44.42 - samples/sec: 278.32 - lr: 0.000065 - momentum: 0.000000 |
|
2023-10-13 15:26:54,285 epoch 7 - iter 308/1546 - loss 0.00758555 - time (sec): 87.07 - samples/sec: 279.90 - lr: 0.000063 - momentum: 0.000000 |
|
2023-10-13 15:27:36,729 epoch 7 - iter 462/1546 - loss 0.00810969 - time (sec): 129.51 - samples/sec: 281.85 - lr: 0.000062 - momentum: 0.000000 |
|
2023-10-13 15:28:20,391 epoch 7 - iter 616/1546 - loss 0.00755985 - time (sec): 173.17 - samples/sec: 288.76 - lr: 0.000060 - momentum: 0.000000 |
|
2023-10-13 15:29:03,797 epoch 7 - iter 770/1546 - loss 0.00752106 - time (sec): 216.58 - samples/sec: 289.20 - lr: 0.000058 - momentum: 0.000000 |
|
2023-10-13 15:29:47,298 epoch 7 - iter 924/1546 - loss 0.00814139 - time (sec): 260.08 - samples/sec: 286.24 - lr: 0.000057 - momentum: 0.000000 |
|
2023-10-13 15:30:30,413 epoch 7 - iter 1078/1546 - loss 0.00855511 - time (sec): 303.19 - samples/sec: 284.75 - lr: 0.000055 - momentum: 0.000000 |
|
2023-10-13 15:31:13,916 epoch 7 - iter 1232/1546 - loss 0.00863815 - time (sec): 346.70 - samples/sec: 285.55 - lr: 0.000053 - momentum: 0.000000 |
|
2023-10-13 15:31:58,289 epoch 7 - iter 1386/1546 - loss 0.00930480 - time (sec): 391.07 - samples/sec: 285.43 - lr: 0.000052 - momentum: 0.000000 |
|
2023-10-13 15:32:41,299 epoch 7 - iter 1540/1546 - loss 0.00918102 - time (sec): 434.08 - samples/sec: 285.55 - lr: 0.000050 - momentum: 0.000000 |
|
2023-10-13 15:32:42,819 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 15:32:42,819 EPOCH 7 done: loss 0.0095 - lr: 0.000050 |
|
2023-10-13 15:33:01,318 DEV : loss 0.10020530968904495 - f1-score (micro avg) 0.804 |
|
2023-10-13 15:33:01,350 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 15:33:44,939 epoch 8 - iter 154/1546 - loss 0.00581384 - time (sec): 43.59 - samples/sec: 276.90 - lr: 0.000048 - momentum: 0.000000 |
|
2023-10-13 15:34:29,547 epoch 8 - iter 308/1546 - loss 0.00724462 - time (sec): 88.20 - samples/sec: 285.29 - lr: 0.000047 - momentum: 0.000000 |
|
2023-10-13 15:35:12,982 epoch 8 - iter 462/1546 - loss 0.00578644 - time (sec): 131.63 - samples/sec: 282.42 - lr: 0.000045 - momentum: 0.000000 |
|
2023-10-13 15:35:55,543 epoch 8 - iter 616/1546 - loss 0.00591053 - time (sec): 174.19 - samples/sec: 279.73 - lr: 0.000043 - momentum: 0.000000 |
|
2023-10-13 15:36:37,769 epoch 8 - iter 770/1546 - loss 0.00635947 - time (sec): 216.42 - samples/sec: 277.28 - lr: 0.000042 - momentum: 0.000000 |
|
2023-10-13 15:37:23,308 epoch 8 - iter 924/1546 - loss 0.00689383 - time (sec): 261.96 - samples/sec: 279.13 - lr: 0.000040 - momentum: 0.000000 |
|
2023-10-13 15:38:07,842 epoch 8 - iter 1078/1546 - loss 0.00758832 - time (sec): 306.49 - samples/sec: 282.13 - lr: 0.000038 - momentum: 0.000000 |
|
2023-10-13 15:38:51,680 epoch 8 - iter 1232/1546 - loss 0.00694871 - time (sec): 350.33 - samples/sec: 283.60 - lr: 0.000037 - momentum: 0.000000 |
|
2023-10-13 15:39:34,163 epoch 8 - iter 1386/1546 - loss 0.00657395 - time (sec): 392.81 - samples/sec: 284.32 - lr: 0.000035 - momentum: 0.000000 |
|
2023-10-13 15:40:17,601 epoch 8 - iter 1540/1546 - loss 0.00608630 - time (sec): 436.25 - samples/sec: 283.45 - lr: 0.000033 - momentum: 0.000000 |
|
2023-10-13 15:40:19,349 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 15:40:19,350 EPOCH 8 done: loss 0.0062 - lr: 0.000033 |
|
2023-10-13 15:40:37,076 DEV : loss 0.1116044670343399 - f1-score (micro avg) 0.8016 |
|
2023-10-13 15:40:37,106 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 15:41:20,911 epoch 9 - iter 154/1546 - loss 0.00265755 - time (sec): 43.80 - samples/sec: 285.63 - lr: 0.000032 - momentum: 0.000000 |
|
2023-10-13 15:42:04,191 epoch 9 - iter 308/1546 - loss 0.00354680 - time (sec): 87.08 - samples/sec: 291.14 - lr: 0.000030 - momentum: 0.000000 |
|
2023-10-13 15:42:48,608 epoch 9 - iter 462/1546 - loss 0.00293509 - time (sec): 131.50 - samples/sec: 289.46 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-13 15:43:32,585 epoch 9 - iter 616/1546 - loss 0.00389616 - time (sec): 175.48 - samples/sec: 285.75 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-13 15:44:16,508 epoch 9 - iter 770/1546 - loss 0.00411035 - time (sec): 219.40 - samples/sec: 286.87 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-13 15:44:59,188 epoch 9 - iter 924/1546 - loss 0.00460724 - time (sec): 262.08 - samples/sec: 286.97 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-13 15:45:41,536 epoch 9 - iter 1078/1546 - loss 0.00470863 - time (sec): 304.43 - samples/sec: 285.12 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-13 15:46:24,703 epoch 9 - iter 1232/1546 - loss 0.00446922 - time (sec): 347.59 - samples/sec: 286.82 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-13 15:47:06,807 epoch 9 - iter 1386/1546 - loss 0.00474740 - time (sec): 389.70 - samples/sec: 286.97 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-13 15:47:49,686 epoch 9 - iter 1540/1546 - loss 0.00495037 - time (sec): 432.58 - samples/sec: 286.68 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-13 15:47:51,269 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 15:47:51,269 EPOCH 9 done: loss 0.0049 - lr: 0.000017 |
|
2023-10-13 15:48:09,434 DEV : loss 0.11771833896636963 - f1-score (micro avg) 0.8055 |
|
2023-10-13 15:48:09,464 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 15:48:52,741 epoch 10 - iter 154/1546 - loss 0.00245226 - time (sec): 43.27 - samples/sec: 283.79 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-13 15:49:36,642 epoch 10 - iter 308/1546 - loss 0.00267455 - time (sec): 87.18 - samples/sec: 288.77 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-13 15:50:20,139 epoch 10 - iter 462/1546 - loss 0.00312780 - time (sec): 130.67 - samples/sec: 286.68 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-13 15:51:03,639 epoch 10 - iter 616/1546 - loss 0.00294916 - time (sec): 174.17 - samples/sec: 286.04 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-13 15:51:47,281 epoch 10 - iter 770/1546 - loss 0.00279124 - time (sec): 217.82 - samples/sec: 285.29 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-13 15:52:29,781 epoch 10 - iter 924/1546 - loss 0.00302675 - time (sec): 260.32 - samples/sec: 283.54 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-13 15:53:13,281 epoch 10 - iter 1078/1546 - loss 0.00305268 - time (sec): 303.82 - samples/sec: 283.37 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-13 15:53:55,974 epoch 10 - iter 1232/1546 - loss 0.00337322 - time (sec): 346.51 - samples/sec: 282.16 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-13 15:54:40,527 epoch 10 - iter 1386/1546 - loss 0.00314639 - time (sec): 391.06 - samples/sec: 284.24 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-13 15:55:23,930 epoch 10 - iter 1540/1546 - loss 0.00309558 - time (sec): 434.46 - samples/sec: 284.63 - lr: 0.000000 - momentum: 0.000000 |
|
2023-10-13 15:55:25,701 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 15:55:25,701 EPOCH 10 done: loss 0.0031 - lr: 0.000000 |
|
2023-10-13 15:55:42,753 DEV : loss 0.1177176758646965 - f1-score (micro avg) 0.8072 |
|
2023-10-13 15:55:43,706 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 15:55:43,708 Loading model from best epoch ... |
|
2023-10-13 15:55:48,227 SequenceTagger predicts: Dictionary with 13 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-BUILDING, B-BUILDING, E-BUILDING, I-BUILDING, S-STREET, B-STREET, E-STREET, I-STREET |
|
2023-10-13 15:56:42,961 |
|
Results: |
|
- F-score (micro) 0.7962 |
|
- F-score (macro) 0.711 |
|
- Accuracy 0.6819 |
|
|
|
By class: |
|
             precision    recall  f1-score   support |
|

|
         LOC    0.8097    0.8679    0.8378       946 |
|
    BUILDING    0.6264    0.5892    0.6072       185 |
|
      STREET    0.6232    0.7679    0.6880        56 |
|

|
   micro avg    0.7741    0.8197    0.7962      1187 |
|
   macro avg    0.6864    0.7416    0.7110      1187 |
|
weighted avg    0.7723    0.8197    0.7948      1187 |
|
|
|
2023-10-13 15:56:42,961 ---------------------------------------------------------------------------------------------------- |
|
|