2023-10-13 18:27:41,589 ----------------------------------------------------------------------------------------------------
2023-10-13 18:27:41,592 Model: "SequenceTagger(
  (embeddings): ByT5Embeddings(
    (model): T5EncoderModel(
      (shared): Embedding(384, 1472)
      (encoder): T5Stack(
        (embed_tokens): Embedding(384, 1472)
        (block): ModuleList(
          (0): T5Block(
            (layer): ModuleList(
              (0): T5LayerSelfAttention(
                (SelfAttention): T5Attention(
                  (q): Linear(in_features=1472, out_features=384, bias=False)
                  (k): Linear(in_features=1472, out_features=384, bias=False)
                  (v): Linear(in_features=1472, out_features=384, bias=False)
                  (o): Linear(in_features=384, out_features=1472, bias=False)
                  (relative_attention_bias): Embedding(32, 6)
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (1): T5LayerFF(
                (DenseReluDense): T5DenseGatedActDense(
                  (wi_0): Linear(in_features=1472, out_features=3584, bias=False)
                  (wi_1): Linear(in_features=1472, out_features=3584, bias=False)
                  (wo): Linear(in_features=3584, out_features=1472, bias=False)
                  (dropout): Dropout(p=0.1, inplace=False)
                  (act): NewGELUActivation()
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
          )
          (1-11): 11 x T5Block(
            (layer): ModuleList(
              (0): T5LayerSelfAttention(
                (SelfAttention): T5Attention(
                  (q): Linear(in_features=1472, out_features=384, bias=False)
                  (k): Linear(in_features=1472, out_features=384, bias=False)
                  (v): Linear(in_features=1472, out_features=384, bias=False)
                  (o): Linear(in_features=384, out_features=1472, bias=False)
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (1): T5LayerFF(
                (DenseReluDense): T5DenseGatedActDense(
                  (wi_0): Linear(in_features=1472, out_features=3584, bias=False)
                  (wi_1): Linear(in_features=1472, out_features=3584, bias=False)
                  (wo): Linear(in_features=3584, out_features=1472, bias=False)
                  (dropout): Dropout(p=0.1, inplace=False)
                  (act): NewGELUActivation()
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
          )
        )
        (final_layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=1472, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
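The printed shapes are enough for a back-of-the-envelope parameter count of the encoder. A minimal sketch (all constants read off the module dump above; the totals are an illustrative estimate, not figures reported by the log):

```python
# Dimensions taken from the printed module shapes above.
D_MODEL, INNER, D_FF, VOCAB, BLOCKS, TAGS = 1472, 384, 3584, 384, 12, 13

attn = 3 * D_MODEL * INNER + INNER * D_MODEL  # q, k, v projections plus o
ff = 2 * D_MODEL * D_FF + D_FF * D_MODEL      # wi_0, wi_1, wo (gated FFN)
norms = 2 * D_MODEL                           # two RMSNorm weight vectors per block
per_block = attn + ff + norms                 # = 18,090,880

encoder = (BLOCKS * per_block   # 12 T5 blocks
           + VOCAB * D_MODEL    # embed_tokens (tied with `shared`, counted once)
           + 32 * 6             # relative_attention_bias in block 0 only
           + D_MODEL)           # final_layer_norm
head = D_MODEL * TAGS + TAGS    # linear tag head, weights + bias
```

So the encoder alone sits near 218M parameters, which is consistent with the ByT5-small encoder's heavy-encoder design.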
2023-10-13 18:27:41,592 ----------------------------------------------------------------------------------------------------
2023-10-13 18:27:41,593 MultiCorpus: 6183 train + 680 dev + 2113 test sentences
 - NER_HIPE_2022 Corpus: 6183 train + 680 dev + 2113 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/topres19th/en/with_doc_seperator
2023-10-13 18:27:41,593 ----------------------------------------------------------------------------------------------------
2023-10-13 18:27:41,593 Train: 6183 sentences
2023-10-13 18:27:41,593 (train_with_dev=False, train_with_test=False)
2023-10-13 18:27:41,593 ----------------------------------------------------------------------------------------------------
2023-10-13 18:27:41,593 Training Params:
2023-10-13 18:27:41,593 - learning_rate: "0.00016"
2023-10-13 18:27:41,593 - mini_batch_size: "8"
2023-10-13 18:27:41,593 - max_epochs: "10"
2023-10-13 18:27:41,593 - shuffle: "True"
2023-10-13 18:27:41,593 ----------------------------------------------------------------------------------------------------
2023-10-13 18:27:41,593 Plugins:
2023-10-13 18:27:41,594 - TensorboardLogger
2023-10-13 18:27:41,594 - LinearScheduler | warmup_fraction: '0.1'
2023-10-13 18:27:41,594 ----------------------------------------------------------------------------------------------------
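A LinearScheduler with warmup_fraction 0.1 ramps the learning rate linearly to the configured peak of 0.00016 over the first 10% of steps, then decays it linearly to zero. A minimal sketch of such a schedule (a hypothetical helper illustrating the shape, not Flair's implementation):

```python
def linear_warmup_decay_lr(step: int, total_steps: int, peak_lr: float,
                           warmup_fraction: float = 0.1) -> float:
    """Ramp linearly from 0 to peak_lr, then decay linearly back to 0."""
    warmup_steps = int(total_steps * warmup_fraction)
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)

# As in the log below: 773 mini-batches per epoch, 10 epochs, peak lr 0.00016.
TOTAL_STEPS = 10 * 773
lr_early = linear_warmup_decay_lr(77, TOTAL_STEPS, 0.00016)    # ~0.000016
lr_final = linear_warmup_decay_lr(TOTAL_STEPS, TOTAL_STEPS, 0.00016)  # 0.0
```

Rounded to six decimals this reproduces the lr column in the per-iteration lines below (0.000016 at iteration 77 of epoch 1, 0.000000 at the final step).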
2023-10-13 18:27:41,594 Final evaluation on model from best epoch (best-model.pt)
2023-10-13 18:27:41,594 - metric: "('micro avg', 'f1-score')"
2023-10-13 18:27:41,594 ----------------------------------------------------------------------------------------------------
2023-10-13 18:27:41,594 Computation:
2023-10-13 18:27:41,594 - compute on device: cuda:0
2023-10-13 18:27:41,594 - embedding storage: none
2023-10-13 18:27:41,594 ----------------------------------------------------------------------------------------------------
2023-10-13 18:27:41,594 Model training base path: "hmbench-topres19th/en-hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax-bs8-wsFalse-e10-lr0.00016-poolingfirst-layers-1-crfFalse-3"
2023-10-13 18:27:41,594 ----------------------------------------------------------------------------------------------------
2023-10-13 18:27:41,594 ----------------------------------------------------------------------------------------------------
2023-10-13 18:27:41,595 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-13 18:28:20,526 epoch 1 - iter 77/773 - loss 2.53718000 - time (sec): 38.93 - samples/sec: 296.56 - lr: 0.000016 - momentum: 0.000000
2023-10-13 18:29:01,595 epoch 1 - iter 154/773 - loss 2.48898463 - time (sec): 80.00 - samples/sec: 300.97 - lr: 0.000032 - momentum: 0.000000
2023-10-13 18:29:42,476 epoch 1 - iter 231/773 - loss 2.31540883 - time (sec): 120.88 - samples/sec: 302.87 - lr: 0.000048 - momentum: 0.000000
2023-10-13 18:30:23,429 epoch 1 - iter 308/773 - loss 2.09255303 - time (sec): 161.83 - samples/sec: 303.88 - lr: 0.000064 - momentum: 0.000000
2023-10-13 18:31:04,341 epoch 1 - iter 385/773 - loss 1.87066774 - time (sec): 202.74 - samples/sec: 301.42 - lr: 0.000079 - momentum: 0.000000
2023-10-13 18:31:45,007 epoch 1 - iter 462/773 - loss 1.63966595 - time (sec): 243.41 - samples/sec: 301.19 - lr: 0.000095 - momentum: 0.000000
2023-10-13 18:32:25,330 epoch 1 - iter 539/773 - loss 1.44183445 - time (sec): 283.73 - samples/sec: 301.82 - lr: 0.000111 - momentum: 0.000000
2023-10-13 18:33:05,452 epoch 1 - iter 616/773 - loss 1.28582098 - time (sec): 323.85 - samples/sec: 303.52 - lr: 0.000127 - momentum: 0.000000
2023-10-13 18:33:45,138 epoch 1 - iter 693/773 - loss 1.16590252 - time (sec): 363.54 - samples/sec: 304.47 - lr: 0.000143 - momentum: 0.000000
2023-10-13 18:34:25,653 epoch 1 - iter 770/773 - loss 1.05920427 - time (sec): 404.06 - samples/sec: 306.69 - lr: 0.000159 - momentum: 0.000000
2023-10-13 18:34:27,131 ----------------------------------------------------------------------------------------------------
2023-10-13 18:34:27,131 EPOCH 1 done: loss 1.0564 - lr: 0.000159
2023-10-13 18:34:44,289 DEV : loss 0.09498981386423111 - f1-score (micro avg)  0.1032
2023-10-13 18:34:44,326 saving best model
2023-10-13 18:34:45,254 ----------------------------------------------------------------------------------------------------
2023-10-13 18:35:24,550 epoch 2 - iter 77/773 - loss 0.13415039 - time (sec): 39.29 - samples/sec: 285.68 - lr: 0.000158 - momentum: 0.000000
2023-10-13 18:36:04,547 epoch 2 - iter 154/773 - loss 0.12506112 - time (sec): 79.29 - samples/sec: 292.52 - lr: 0.000156 - momentum: 0.000000
2023-10-13 18:36:45,745 epoch 2 - iter 231/773 - loss 0.12251225 - time (sec): 120.49 - samples/sec: 301.08 - lr: 0.000155 - momentum: 0.000000
2023-10-13 18:37:25,876 epoch 2 - iter 308/773 - loss 0.12159004 - time (sec): 160.62 - samples/sec: 304.42 - lr: 0.000153 - momentum: 0.000000
2023-10-13 18:38:06,724 epoch 2 - iter 385/773 - loss 0.11966517 - time (sec): 201.47 - samples/sec: 305.43 - lr: 0.000151 - momentum: 0.000000
2023-10-13 18:38:47,004 epoch 2 - iter 462/773 - loss 0.11708834 - time (sec): 241.75 - samples/sec: 302.56 - lr: 0.000149 - momentum: 0.000000
2023-10-13 18:39:27,016 epoch 2 - iter 539/773 - loss 0.11412131 - time (sec): 281.76 - samples/sec: 301.08 - lr: 0.000148 - momentum: 0.000000
2023-10-13 18:40:09,251 epoch 2 - iter 616/773 - loss 0.11026100 - time (sec): 323.99 - samples/sec: 304.15 - lr: 0.000146 - momentum: 0.000000
2023-10-13 18:40:50,301 epoch 2 - iter 693/773 - loss 0.10712482 - time (sec): 365.04 - samples/sec: 302.70 - lr: 0.000144 - momentum: 0.000000
2023-10-13 18:41:31,846 epoch 2 - iter 770/773 - loss 0.10501421 - time (sec): 406.59 - samples/sec: 304.87 - lr: 0.000142 - momentum: 0.000000
2023-10-13 18:41:33,254 ----------------------------------------------------------------------------------------------------
2023-10-13 18:41:33,255 EPOCH 2 done: loss 0.1052 - lr: 0.000142
2023-10-13 18:41:50,173 DEV : loss 0.06088424101471901 - f1-score (micro avg)  0.7383
2023-10-13 18:41:50,217 saving best model
2023-10-13 18:41:52,843 ----------------------------------------------------------------------------------------------------
2023-10-13 18:42:34,205 epoch 3 - iter 77/773 - loss 0.06486646 - time (sec): 41.36 - samples/sec: 310.77 - lr: 0.000140 - momentum: 0.000000
2023-10-13 18:43:14,268 epoch 3 - iter 154/773 - loss 0.07228987 - time (sec): 81.42 - samples/sec: 309.51 - lr: 0.000139 - momentum: 0.000000
2023-10-13 18:43:54,505 epoch 3 - iter 231/773 - loss 0.06639483 - time (sec): 121.66 - samples/sec: 307.49 - lr: 0.000137 - momentum: 0.000000
2023-10-13 18:44:34,248 epoch 3 - iter 308/773 - loss 0.06650222 - time (sec): 161.40 - samples/sec: 306.63 - lr: 0.000135 - momentum: 0.000000
2023-10-13 18:45:13,628 epoch 3 - iter 385/773 - loss 0.06605857 - time (sec): 200.78 - samples/sec: 305.42 - lr: 0.000133 - momentum: 0.000000
2023-10-13 18:45:53,131 epoch 3 - iter 462/773 - loss 0.06528790 - time (sec): 240.28 - samples/sec: 307.54 - lr: 0.000132 - momentum: 0.000000
2023-10-13 18:46:33,090 epoch 3 - iter 539/773 - loss 0.06329194 - time (sec): 280.24 - samples/sec: 308.45 - lr: 0.000130 - momentum: 0.000000
2023-10-13 18:47:13,860 epoch 3 - iter 616/773 - loss 0.06104666 - time (sec): 321.01 - samples/sec: 307.92 - lr: 0.000128 - momentum: 0.000000
2023-10-13 18:47:54,228 epoch 3 - iter 693/773 - loss 0.06137439 - time (sec): 361.38 - samples/sec: 307.73 - lr: 0.000126 - momentum: 0.000000
2023-10-13 18:48:34,430 epoch 3 - iter 770/773 - loss 0.06200833 - time (sec): 401.58 - samples/sec: 307.89 - lr: 0.000125 - momentum: 0.000000
2023-10-13 18:48:36,054 ----------------------------------------------------------------------------------------------------
2023-10-13 18:48:36,054 EPOCH 3 done: loss 0.0622 - lr: 0.000125
2023-10-13 18:48:53,579 DEV : loss 0.06357744336128235 - f1-score (micro avg)  0.766
2023-10-13 18:48:53,609 saving best model
2023-10-13 18:48:56,619 ----------------------------------------------------------------------------------------------------
2023-10-13 18:49:36,633 epoch 4 - iter 77/773 - loss 0.03897836 - time (sec): 40.01 - samples/sec: 298.95 - lr: 0.000123 - momentum: 0.000000
2023-10-13 18:50:16,835 epoch 4 - iter 154/773 - loss 0.04517362 - time (sec): 80.21 - samples/sec: 300.48 - lr: 0.000121 - momentum: 0.000000
2023-10-13 18:50:57,845 epoch 4 - iter 231/773 - loss 0.04320695 - time (sec): 121.22 - samples/sec: 303.25 - lr: 0.000119 - momentum: 0.000000
2023-10-13 18:51:37,322 epoch 4 - iter 308/773 - loss 0.04049507 - time (sec): 160.70 - samples/sec: 301.14 - lr: 0.000117 - momentum: 0.000000
2023-10-13 18:52:16,144 epoch 4 - iter 385/773 - loss 0.04133745 - time (sec): 199.52 - samples/sec: 304.51 - lr: 0.000116 - momentum: 0.000000
2023-10-13 18:52:58,384 epoch 4 - iter 462/773 - loss 0.04043949 - time (sec): 241.76 - samples/sec: 305.71 - lr: 0.000114 - momentum: 0.000000
2023-10-13 18:53:39,335 epoch 4 - iter 539/773 - loss 0.04060850 - time (sec): 282.71 - samples/sec: 304.07 - lr: 0.000112 - momentum: 0.000000
2023-10-13 18:54:20,890 epoch 4 - iter 616/773 - loss 0.04091750 - time (sec): 324.27 - samples/sec: 304.60 - lr: 0.000110 - momentum: 0.000000
2023-10-13 18:55:02,576 epoch 4 - iter 693/773 - loss 0.04019435 - time (sec): 365.95 - samples/sec: 304.98 - lr: 0.000109 - momentum: 0.000000
2023-10-13 18:55:44,589 epoch 4 - iter 770/773 - loss 0.03982455 - time (sec): 407.97 - samples/sec: 303.41 - lr: 0.000107 - momentum: 0.000000
2023-10-13 18:55:46,197 ----------------------------------------------------------------------------------------------------
2023-10-13 18:55:46,198 EPOCH 4 done: loss 0.0398 - lr: 0.000107
2023-10-13 18:56:02,928 DEV : loss 0.06280948221683502 - f1-score (micro avg)  0.7692
2023-10-13 18:56:02,957 saving best model
2023-10-13 18:56:05,585 ----------------------------------------------------------------------------------------------------
2023-10-13 18:56:47,185 epoch 5 - iter 77/773 - loss 0.02565174 - time (sec): 41.60 - samples/sec: 314.77 - lr: 0.000105 - momentum: 0.000000
2023-10-13 18:57:28,583 epoch 5 - iter 154/773 - loss 0.02318898 - time (sec): 82.99 - samples/sec: 294.64 - lr: 0.000103 - momentum: 0.000000
2023-10-13 18:58:10,782 epoch 5 - iter 231/773 - loss 0.02462375 - time (sec): 125.19 - samples/sec: 297.66 - lr: 0.000101 - momentum: 0.000000
2023-10-13 18:58:53,107 epoch 5 - iter 308/773 - loss 0.02474654 - time (sec): 167.52 - samples/sec: 297.31 - lr: 0.000100 - momentum: 0.000000
2023-10-13 18:59:35,526 epoch 5 - iter 385/773 - loss 0.02677004 - time (sec): 209.94 - samples/sec: 299.11 - lr: 0.000098 - momentum: 0.000000
2023-10-13 19:00:18,220 epoch 5 - iter 462/773 - loss 0.02659318 - time (sec): 252.63 - samples/sec: 298.87 - lr: 0.000096 - momentum: 0.000000
2023-10-13 19:01:00,301 epoch 5 - iter 539/773 - loss 0.02661496 - time (sec): 294.71 - samples/sec: 297.66 - lr: 0.000094 - momentum: 0.000000
2023-10-13 19:01:41,476 epoch 5 - iter 616/773 - loss 0.02613464 - time (sec): 335.89 - samples/sec: 298.65 - lr: 0.000093 - momentum: 0.000000
2023-10-13 19:02:22,988 epoch 5 - iter 693/773 - loss 0.02685763 - time (sec): 377.40 - samples/sec: 297.89 - lr: 0.000091 - momentum: 0.000000
2023-10-13 19:03:02,927 epoch 5 - iter 770/773 - loss 0.02640570 - time (sec): 417.34 - samples/sec: 296.87 - lr: 0.000089 - momentum: 0.000000
2023-10-13 19:03:04,362 ----------------------------------------------------------------------------------------------------
2023-10-13 19:03:04,362 EPOCH 5 done: loss 0.0265 - lr: 0.000089
2023-10-13 19:03:21,469 DEV : loss 0.07900705188512802 - f1-score (micro avg)  0.7648
2023-10-13 19:03:21,500 ----------------------------------------------------------------------------------------------------
2023-10-13 19:03:59,895 epoch 6 - iter 77/773 - loss 0.01890750 - time (sec): 38.39 - samples/sec: 344.13 - lr: 0.000087 - momentum: 0.000000
2023-10-13 19:04:36,506 epoch 6 - iter 154/773 - loss 0.01803645 - time (sec): 75.00 - samples/sec: 328.28 - lr: 0.000085 - momentum: 0.000000
2023-10-13 19:05:14,634 epoch 6 - iter 231/773 - loss 0.01845279 - time (sec): 113.13 - samples/sec: 334.94 - lr: 0.000084 - momentum: 0.000000
2023-10-13 19:05:52,977 epoch 6 - iter 308/773 - loss 0.01799186 - time (sec): 151.48 - samples/sec: 331.14 - lr: 0.000082 - momentum: 0.000000
2023-10-13 19:06:33,411 epoch 6 - iter 385/773 - loss 0.01752207 - time (sec): 191.91 - samples/sec: 321.86 - lr: 0.000080 - momentum: 0.000000
2023-10-13 19:07:14,518 epoch 6 - iter 462/773 - loss 0.01782621 - time (sec): 233.02 - samples/sec: 319.82 - lr: 0.000078 - momentum: 0.000000
2023-10-13 19:07:54,522 epoch 6 - iter 539/773 - loss 0.01774540 - time (sec): 273.02 - samples/sec: 316.30 - lr: 0.000077 - momentum: 0.000000
2023-10-13 19:08:33,471 epoch 6 - iter 616/773 - loss 0.01739339 - time (sec): 311.97 - samples/sec: 314.26 - lr: 0.000075 - momentum: 0.000000
2023-10-13 19:09:13,056 epoch 6 - iter 693/773 - loss 0.01719655 - time (sec): 351.55 - samples/sec: 313.71 - lr: 0.000073 - momentum: 0.000000
2023-10-13 19:09:54,830 epoch 6 - iter 770/773 - loss 0.01749948 - time (sec): 393.33 - samples/sec: 315.00 - lr: 0.000071 - momentum: 0.000000
2023-10-13 19:09:56,293 ----------------------------------------------------------------------------------------------------
2023-10-13 19:09:56,293 EPOCH 6 done: loss 0.0176 - lr: 0.000071
2023-10-13 19:10:13,261 DEV : loss 0.08565299212932587 - f1-score (micro avg)  0.7816
2023-10-13 19:10:13,290 saving best model
2023-10-13 19:10:15,923 ----------------------------------------------------------------------------------------------------
2023-10-13 19:10:57,165 epoch 7 - iter 77/773 - loss 0.00810359 - time (sec): 41.24 - samples/sec: 309.57 - lr: 0.000069 - momentum: 0.000000
2023-10-13 19:11:38,072 epoch 7 - iter 154/773 - loss 0.01107264 - time (sec): 82.14 - samples/sec: 302.53 - lr: 0.000068 - momentum: 0.000000
2023-10-13 19:12:18,844 epoch 7 - iter 231/773 - loss 0.01177975 - time (sec): 122.92 - samples/sec: 303.13 - lr: 0.000066 - momentum: 0.000000
2023-10-13 19:12:59,044 epoch 7 - iter 308/773 - loss 0.01189685 - time (sec): 163.12 - samples/sec: 306.33 - lr: 0.000064 - momentum: 0.000000
2023-10-13 19:13:38,918 epoch 7 - iter 385/773 - loss 0.01172982 - time (sec): 202.99 - samples/sec: 306.43 - lr: 0.000062 - momentum: 0.000000
2023-10-13 19:14:19,066 epoch 7 - iter 462/773 - loss 0.01138355 - time (sec): 243.14 - samples/sec: 304.92 - lr: 0.000061 - momentum: 0.000000
2023-10-13 19:14:58,917 epoch 7 - iter 539/773 - loss 0.01127963 - time (sec): 282.99 - samples/sec: 305.52 - lr: 0.000059 - momentum: 0.000000
2023-10-13 19:15:38,339 epoch 7 - iter 616/773 - loss 0.01107735 - time (sec): 322.41 - samples/sec: 307.52 - lr: 0.000057 - momentum: 0.000000
2023-10-13 19:16:18,228 epoch 7 - iter 693/773 - loss 0.01205704 - time (sec): 362.30 - samples/sec: 307.05 - lr: 0.000055 - momentum: 0.000000
2023-10-13 19:16:58,631 epoch 7 - iter 770/773 - loss 0.01195266 - time (sec): 402.70 - samples/sec: 307.39 - lr: 0.000054 - momentum: 0.000000
2023-10-13 19:17:00,148 ----------------------------------------------------------------------------------------------------
2023-10-13 19:17:00,148 EPOCH 7 done: loss 0.0120 - lr: 0.000054
2023-10-13 19:17:17,015 DEV : loss 0.08720681816339493 - f1-score (micro avg)  0.7968
2023-10-13 19:17:17,044 saving best model
2023-10-13 19:17:19,668 ----------------------------------------------------------------------------------------------------
2023-10-13 19:18:00,532 epoch 8 - iter 77/773 - loss 0.01118203 - time (sec): 40.86 - samples/sec: 326.21 - lr: 0.000052 - momentum: 0.000000
2023-10-13 19:18:40,280 epoch 8 - iter 154/773 - loss 0.01018958 - time (sec): 80.61 - samples/sec: 322.64 - lr: 0.000050 - momentum: 0.000000
2023-10-13 19:19:19,475 epoch 8 - iter 231/773 - loss 0.01021190 - time (sec): 119.80 - samples/sec: 316.30 - lr: 0.000048 - momentum: 0.000000
2023-10-13 19:19:58,615 epoch 8 - iter 308/773 - loss 0.00973996 - time (sec): 158.94 - samples/sec: 316.65 - lr: 0.000046 - momentum: 0.000000
2023-10-13 19:20:39,703 epoch 8 - iter 385/773 - loss 0.01031258 - time (sec): 200.03 - samples/sec: 318.96 - lr: 0.000045 - momentum: 0.000000
2023-10-13 19:21:21,213 epoch 8 - iter 462/773 - loss 0.01139138 - time (sec): 241.54 - samples/sec: 314.56 - lr: 0.000043 - momentum: 0.000000
2023-10-13 19:22:00,888 epoch 8 - iter 539/773 - loss 0.01032455 - time (sec): 281.22 - samples/sec: 313.49 - lr: 0.000041 - momentum: 0.000000
2023-10-13 19:22:39,467 epoch 8 - iter 616/773 - loss 0.01013119 - time (sec): 319.79 - samples/sec: 309.88 - lr: 0.000039 - momentum: 0.000000
2023-10-13 19:23:19,791 epoch 8 - iter 693/773 - loss 0.00963902 - time (sec): 360.12 - samples/sec: 308.51 - lr: 0.000038 - momentum: 0.000000
2023-10-13 19:24:00,541 epoch 8 - iter 770/773 - loss 0.00920310 - time (sec): 400.87 - samples/sec: 309.03 - lr: 0.000036 - momentum: 0.000000
2023-10-13 19:24:02,059 ----------------------------------------------------------------------------------------------------
2023-10-13 19:24:02,059 EPOCH 8 done: loss 0.0093 - lr: 0.000036
2023-10-13 19:24:20,534 DEV : loss 0.09538504481315613 - f1-score (micro avg)  0.7842
2023-10-13 19:24:20,568 ----------------------------------------------------------------------------------------------------
2023-10-13 19:25:04,481 epoch 9 - iter 77/773 - loss 0.00761122 - time (sec): 43.91 - samples/sec: 292.32 - lr: 0.000034 - momentum: 0.000000
2023-10-13 19:25:47,057 epoch 9 - iter 154/773 - loss 0.00548572 - time (sec): 86.49 - samples/sec: 298.14 - lr: 0.000032 - momentum: 0.000000
2023-10-13 19:26:27,944 epoch 9 - iter 231/773 - loss 0.00513351 - time (sec): 127.37 - samples/sec: 292.68 - lr: 0.000030 - momentum: 0.000000
2023-10-13 19:27:07,960 epoch 9 - iter 308/773 - loss 0.00485361 - time (sec): 167.39 - samples/sec: 294.65 - lr: 0.000029 - momentum: 0.000000
2023-10-13 19:27:49,243 epoch 9 - iter 385/773 - loss 0.00561098 - time (sec): 208.67 - samples/sec: 294.06 - lr: 0.000027 - momentum: 0.000000
2023-10-13 19:28:30,009 epoch 9 - iter 462/773 - loss 0.00629165 - time (sec): 249.44 - samples/sec: 292.00 - lr: 0.000025 - momentum: 0.000000
2023-10-13 19:29:10,691 epoch 9 - iter 539/773 - loss 0.00715536 - time (sec): 290.12 - samples/sec: 295.69 - lr: 0.000023 - momentum: 0.000000
2023-10-13 19:29:50,910 epoch 9 - iter 616/773 - loss 0.00747397 - time (sec): 330.34 - samples/sec: 296.84 - lr: 0.000022 - momentum: 0.000000
2023-10-13 19:30:30,991 epoch 9 - iter 693/773 - loss 0.00752234 - time (sec): 370.42 - samples/sec: 297.66 - lr: 0.000020 - momentum: 0.000000
2023-10-13 19:31:12,553 epoch 9 - iter 770/773 - loss 0.00710341 - time (sec): 411.98 - samples/sec: 300.62 - lr: 0.000018 - momentum: 0.000000
2023-10-13 19:31:14,034 ----------------------------------------------------------------------------------------------------
2023-10-13 19:31:14,034 EPOCH 9 done: loss 0.0071 - lr: 0.000018
2023-10-13 19:31:31,165 DEV : loss 0.09801825881004333 - f1-score (micro avg)  0.8065
2023-10-13 19:31:31,194 saving best model
2023-10-13 19:31:33,834 ----------------------------------------------------------------------------------------------------
2023-10-13 19:32:15,371 epoch 10 - iter 77/773 - loss 0.00436375 - time (sec): 41.53 - samples/sec: 292.59 - lr: 0.000016 - momentum: 0.000000
2023-10-13 19:32:55,311 epoch 10 - iter 154/773 - loss 0.00453929 - time (sec): 81.47 - samples/sec: 289.80 - lr: 0.000014 - momentum: 0.000000
2023-10-13 19:33:36,007 epoch 10 - iter 231/773 - loss 0.00590183 - time (sec): 122.17 - samples/sec: 289.44 - lr: 0.000013 - momentum: 0.000000
2023-10-13 19:34:16,478 epoch 10 - iter 308/773 - loss 0.00508722 - time (sec): 162.64 - samples/sec: 295.92 - lr: 0.000011 - momentum: 0.000000
2023-10-13 19:34:57,766 epoch 10 - iter 385/773 - loss 0.00493063 - time (sec): 203.93 - samples/sec: 296.65 - lr: 0.000009 - momentum: 0.000000
2023-10-13 19:35:38,905 epoch 10 - iter 462/773 - loss 0.00476355 - time (sec): 245.07 - samples/sec: 300.42 - lr: 0.000007 - momentum: 0.000000
2023-10-13 19:36:19,615 epoch 10 - iter 539/773 - loss 0.00511343 - time (sec): 285.78 - samples/sec: 301.60 - lr: 0.000006 - momentum: 0.000000
2023-10-13 19:37:01,549 epoch 10 - iter 616/773 - loss 0.00517065 - time (sec): 327.71 - samples/sec: 303.13 - lr: 0.000004 - momentum: 0.000000
2023-10-13 19:37:42,385 epoch 10 - iter 693/773 - loss 0.00527857 - time (sec): 368.55 - samples/sec: 302.91 - lr: 0.000002 - momentum: 0.000000
2023-10-13 19:38:22,556 epoch 10 - iter 770/773 - loss 0.00509281 - time (sec): 408.72 - samples/sec: 302.80 - lr: 0.000000 - momentum: 0.000000
2023-10-13 19:38:24,138 ----------------------------------------------------------------------------------------------------
2023-10-13 19:38:24,138 EPOCH 10 done: loss 0.0051 - lr: 0.000000
2023-10-13 19:38:41,062 DEV : loss 0.10153676569461823 - f1-score (micro avg)  0.8008
2023-10-13 19:38:42,015 ----------------------------------------------------------------------------------------------------
2023-10-13 19:38:42,017 Loading model from best epoch ...
2023-10-13 19:38:46,544 SequenceTagger predicts: Dictionary with 13 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-BUILDING, B-BUILDING, E-BUILDING, I-BUILDING, S-STREET, B-STREET, E-STREET, I-STREET
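The tag dictionary above follows the BIOES scheme: S- marks a single-token entity, B-/I-/E- mark the begin, inside, and end of a multi-token entity, and O is outside any entity. A minimal sketch of decoding such a tag sequence into spans (a hypothetical helper for illustration, not Flair's decoder):

```python
def bioes_to_spans(tags):
    """Decode a BIOES tag sequence into (label, start, end) spans, end exclusive."""
    spans, start = [], None
    for i, tag in enumerate(tags):
        if tag == "O":
            start = None
            continue
        prefix, label = tag.split("-", 1)
        if prefix == "S":                       # single-token entity
            spans.append((label, i, i + 1))
            start = None
        elif prefix == "B":                     # open a multi-token entity
            start = i
        elif prefix == "E" and start is not None:  # close it
            spans.append((label, start, i + 1))
            start = None
        # "I" simply continues a span opened by "B"
    return spans

# e.g. bioes_to_spans(["O", "S-LOC", "B-STREET", "I-STREET", "E-STREET", "O"])
# yields one LOC span and one STREET span
```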
2023-10-13 19:39:42,682
Results:
- F-score (micro) 0.7988
- F-score (macro) 0.7062
- Accuracy 0.6839

By class:
              precision    recall  f1-score   support

         LOC     0.8541    0.8541    0.8541       946
    BUILDING     0.5500    0.4757    0.5101       185
      STREET     0.7414    0.7679    0.7544        56

   micro avg     0.8067    0.7911    0.7988      1187
   macro avg     0.7152    0.6992    0.7062      1187
weighted avg     0.8014    0.7911    0.7958      1187
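The micro and macro averages can be reproduced from per-class entity counts. The (true positive, predicted, gold) triples below are reconstructed from the precision/recall/support columns in the table, for illustration only; they are not values printed by the log itself:

```python
# Per-class (tp, predicted, gold) entity counts, reconstructed from the
# report: tp = round(recall * support), predicted = round(tp / precision).
counts = {
    "LOC":      (808, 946, 946),
    "BUILDING": (88, 160, 185),
    "STREET":   (43, 58, 56),
}

def f1(tp, pred, gold):
    # F1 = 2PR/(P+R) with P = tp/pred, R = tp/gold simplifies to 2*tp/(pred+gold)
    return 2 * tp / (pred + gold) if pred + gold else 0.0

# Micro: pool counts over all classes, then compute one F1.
tp, pred, gold = (sum(c[i] for c in counts.values()) for i in range(3))
micro_f1 = f1(tp, pred, gold)

# Macro: average the per-class F1 scores with equal weight.
macro_f1 = sum(f1(*c) for c in counts.values()) / len(counts)
```

Rounded to four decimals, micro_f1 and macro_f1 match the 0.7988 and 0.7062 reported above; the gap between them reflects the weaker BUILDING class dragging down the unweighted macro average.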
2023-10-13 19:39:42,682 ----------------------------------------------------------------------------------------------------