|
2023-10-13 20:58:05,783 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 20:58:05,786 Model: "SequenceTagger( |
|
(embeddings): ByT5Embeddings( |
|
(model): T5EncoderModel( |
|
(shared): Embedding(384, 1472) |
|
(encoder): T5Stack( |
|
(embed_tokens): Embedding(384, 1472) |
|
(block): ModuleList( |
|
(0): T5Block( |
|
(layer): ModuleList( |
|
(0): T5LayerSelfAttention( |
|
(SelfAttention): T5Attention( |
|
(q): Linear(in_features=1472, out_features=384, bias=False) |
|
(k): Linear(in_features=1472, out_features=384, bias=False) |
|
(v): Linear(in_features=1472, out_features=384, bias=False) |
|
(o): Linear(in_features=384, out_features=1472, bias=False) |
|
(relative_attention_bias): Embedding(32, 6) |
|
) |
|
(layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
(1): T5LayerFF( |
|
(DenseReluDense): T5DenseGatedActDense( |
|
(wi_0): Linear(in_features=1472, out_features=3584, bias=False) |
|
(wi_1): Linear(in_features=1472, out_features=3584, bias=False) |
|
(wo): Linear(in_features=3584, out_features=1472, bias=False) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
(act): NewGELUActivation() |
|
) |
|
(layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
) |
|
(1-11): 11 x T5Block( |
|
(layer): ModuleList( |
|
(0): T5LayerSelfAttention( |
|
(SelfAttention): T5Attention( |
|
(q): Linear(in_features=1472, out_features=384, bias=False) |
|
(k): Linear(in_features=1472, out_features=384, bias=False) |
|
(v): Linear(in_features=1472, out_features=384, bias=False) |
|
(o): Linear(in_features=384, out_features=1472, bias=False) |
|
) |
|
(layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
(1): T5LayerFF( |
|
(DenseReluDense): T5DenseGatedActDense( |
|
(wi_0): Linear(in_features=1472, out_features=3584, bias=False) |
|
(wi_1): Linear(in_features=1472, out_features=3584, bias=False) |
|
(wo): Linear(in_features=3584, out_features=1472, bias=False) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
(act): NewGELUActivation() |
|
) |
|
(layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
) |
|
) |
|
(final_layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
) |
|
(locked_dropout): LockedDropout(p=0.5) |
|
(linear): Linear(in_features=1472, out_features=13, bias=True) |
|
(loss_function): CrossEntropyLoss() |
|
)" |
|
2023-10-13 20:58:05,786 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 20:58:05,787 MultiCorpus: 6183 train + 680 dev + 2113 test sentences |
|
- NER_HIPE_2022 Corpus: 6183 train + 680 dev + 2113 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/topres19th/en/with_doc_seperator |
|
2023-10-13 20:58:05,787 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 20:58:05,787 Train: 6183 sentences |
|
2023-10-13 20:58:05,787 (train_with_dev=False, train_with_test=False) |
|
2023-10-13 20:58:05,787 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 20:58:05,787 Training Params: |
|
2023-10-13 20:58:05,787 - learning_rate: "0.00016" |
|
2023-10-13 20:58:05,787 - mini_batch_size: "4" |
|
2023-10-13 20:58:05,787 - max_epochs: "10" |
|
2023-10-13 20:58:05,787 - shuffle: "True" |
|
2023-10-13 20:58:05,787 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 20:58:05,787 Plugins: |
|
2023-10-13 20:58:05,787 - TensorboardLogger |
|
2023-10-13 20:58:05,788 - LinearScheduler | warmup_fraction: '0.1' |
|
2023-10-13 20:58:05,788 ---------------------------------------------------------------------------------------------------- |
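Note: the `lr` values logged during training are consistent with the Training Params and the LinearScheduler entry above (peak learning rate 0.00016, warmup fraction 0.1, 6183 training sentences in mini-batches of 4 for 10 epochs). The following is a minimal sketch of a generic linear warmup and linear decay schedule under those settings. The function name `linear_schedule_lr` is hypothetical and this is not taken from the Flair implementation; it only illustrates how the logged values can be reproduced.

```python
import math


def linear_schedule_lr(step, total_steps, peak_lr, warmup_fraction):
    """Hypothetical helper: learning rate after `step` optimizer steps.

    Assumes a linear ramp from 0 to peak_lr over the warmup steps,
    followed by a linear decay from peak_lr to 0 over the remaining steps.
    """
    warmup_steps = round(total_steps * warmup_fraction)
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    remaining = total_steps - step
    return peak_lr * max(0.0, remaining / max(1, total_steps - warmup_steps))


# Values taken from the log: 6183 train sentences, mini_batch_size 4, max_epochs 10.
steps_per_epoch = math.ceil(6183 / 4)   # matches "iter .../1546" in the log
total_steps = steps_per_epoch * 10

# Learning rate at epoch 1, iter 154 and at epoch 2, iter 154.
lr_epoch1_iter154 = linear_schedule_lr(154, total_steps, 0.00016, 0.1)
lr_epoch2_iter154 = linear_schedule_lr(steps_per_epoch + 154, total_steps, 0.00016, 0.1)
print(f"{lr_epoch1_iter154:.6f} {lr_epoch2_iter154:.6f}")
```

Under these assumptions the warmup covers exactly the first epoch (1546 of 15460 steps), which agrees with the logged learning rate peaking at the end of epoch 1 and decaying to zero by the end of epoch 10.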
|
2023-10-13 20:58:05,788 Final evaluation on model from best epoch (best-model.pt) |
|
2023-10-13 20:58:05,788 - metric: "('micro avg', 'f1-score')" |
|
2023-10-13 20:58:05,788 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 20:58:05,788 Computation: |
|
2023-10-13 20:58:05,788 - compute on device: cuda:0 |
|
2023-10-13 20:58:05,788 - embedding storage: none |
|
2023-10-13 20:58:05,788 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 20:58:05,788 Model training base path: "hmbench-topres19th/en-hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax-bs4-wsFalse-e10-lr0.00016-poolingfirst-layers-1-crfFalse-3" |
|
2023-10-13 20:58:05,788 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 20:58:05,788 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 20:58:05,789 Logging anything other than scalars to TensorBoard is currently not supported. |
|
2023-10-13 20:58:49,004 epoch 1 - iter 154/1546 - loss 2.53231029 - time (sec): 43.21 - samples/sec: 267.16 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-13 20:59:32,658 epoch 1 - iter 308/1546 - loss 2.41126778 - time (sec): 86.87 - samples/sec: 277.17 - lr: 0.000032 - momentum: 0.000000 |
|
2023-10-13 21:00:16,388 epoch 1 - iter 462/1546 - loss 2.13403547 - time (sec): 130.60 - samples/sec: 280.34 - lr: 0.000048 - momentum: 0.000000 |
|
2023-10-13 21:01:00,109 epoch 1 - iter 616/1546 - loss 1.82534353 - time (sec): 174.32 - samples/sec: 282.12 - lr: 0.000064 - momentum: 0.000000 |
|
2023-10-13 21:01:43,709 epoch 1 - iter 770/1546 - loss 1.55027770 - time (sec): 217.92 - samples/sec: 280.43 - lr: 0.000080 - momentum: 0.000000 |
|
2023-10-13 21:02:26,951 epoch 1 - iter 924/1546 - loss 1.33327065 - time (sec): 261.16 - samples/sec: 280.72 - lr: 0.000096 - momentum: 0.000000 |
|
2023-10-13 21:03:11,451 epoch 1 - iter 1078/1546 - loss 1.17102836 - time (sec): 305.66 - samples/sec: 280.17 - lr: 0.000111 - momentum: 0.000000 |
|
2023-10-13 21:03:54,682 epoch 1 - iter 1232/1546 - loss 1.04226789 - time (sec): 348.89 - samples/sec: 281.74 - lr: 0.000127 - momentum: 0.000000 |
|
2023-10-13 21:04:38,843 epoch 1 - iter 1386/1546 - loss 0.94299904 - time (sec): 393.05 - samples/sec: 281.61 - lr: 0.000143 - momentum: 0.000000 |
|
2023-10-13 21:05:23,431 epoch 1 - iter 1540/1546 - loss 0.85609274 - time (sec): 437.64 - samples/sec: 283.15 - lr: 0.000159 - momentum: 0.000000 |
|
2023-10-13 21:05:24,901 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 21:05:24,901 EPOCH 1 done: loss 0.8540 - lr: 0.000159 |
|
2023-10-13 21:05:41,576 DEV : loss 0.09267304092645645 - f1-score (micro avg) 0.6009 |
|
2023-10-13 21:05:41,608 saving best model |
|
2023-10-13 21:05:42,536 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 21:06:26,579 epoch 2 - iter 154/1546 - loss 0.11615964 - time (sec): 44.04 - samples/sec: 254.88 - lr: 0.000158 - momentum: 0.000000 |
|
2023-10-13 21:07:10,522 epoch 2 - iter 308/1546 - loss 0.11061761 - time (sec): 87.98 - samples/sec: 263.62 - lr: 0.000156 - momentum: 0.000000 |
|
2023-10-13 21:07:55,175 epoch 2 - iter 462/1546 - loss 0.10758656 - time (sec): 132.64 - samples/sec: 273.51 - lr: 0.000155 - momentum: 0.000000 |
|
2023-10-13 21:08:41,329 epoch 2 - iter 616/1546 - loss 0.10808746 - time (sec): 178.79 - samples/sec: 273.48 - lr: 0.000153 - momentum: 0.000000 |
|
2023-10-13 21:09:27,938 epoch 2 - iter 770/1546 - loss 0.10648110 - time (sec): 225.40 - samples/sec: 273.00 - lr: 0.000151 - momentum: 0.000000 |
|
2023-10-13 21:10:14,282 epoch 2 - iter 924/1546 - loss 0.10566464 - time (sec): 271.74 - samples/sec: 269.16 - lr: 0.000149 - momentum: 0.000000 |
|
2023-10-13 21:11:00,253 epoch 2 - iter 1078/1546 - loss 0.10384736 - time (sec): 317.71 - samples/sec: 267.01 - lr: 0.000148 - momentum: 0.000000 |
|
2023-10-13 21:11:48,176 epoch 2 - iter 1232/1546 - loss 0.10079906 - time (sec): 365.64 - samples/sec: 269.51 - lr: 0.000146 - momentum: 0.000000 |
|
2023-10-13 21:12:35,146 epoch 2 - iter 1386/1546 - loss 0.09804350 - time (sec): 412.61 - samples/sec: 267.81 - lr: 0.000144 - momentum: 0.000000 |
|
2023-10-13 21:13:22,800 epoch 2 - iter 1540/1546 - loss 0.09640835 - time (sec): 460.26 - samples/sec: 269.31 - lr: 0.000142 - momentum: 0.000000 |
|
2023-10-13 21:13:24,440 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 21:13:24,441 EPOCH 2 done: loss 0.0965 - lr: 0.000142 |
|
2023-10-13 21:13:41,749 DEV : loss 0.06338806450366974 - f1-score (micro avg) 0.7627 |
|
2023-10-13 21:13:41,778 saving best model |
|
2023-10-13 21:13:44,817 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 21:14:29,695 epoch 3 - iter 154/1546 - loss 0.05410580 - time (sec): 44.87 - samples/sec: 286.43 - lr: 0.000140 - momentum: 0.000000 |
|
2023-10-13 21:15:13,367 epoch 3 - iter 308/1546 - loss 0.06094115 - time (sec): 88.55 - samples/sec: 284.61 - lr: 0.000139 - momentum: 0.000000 |
|
2023-10-13 21:15:57,684 epoch 3 - iter 462/1546 - loss 0.05647310 - time (sec): 132.86 - samples/sec: 281.56 - lr: 0.000137 - momentum: 0.000000 |
|
2023-10-13 21:16:41,468 epoch 3 - iter 616/1546 - loss 0.05893520 - time (sec): 176.65 - samples/sec: 280.16 - lr: 0.000135 - momentum: 0.000000 |
|
2023-10-13 21:17:22,843 epoch 3 - iter 770/1546 - loss 0.05949715 - time (sec): 218.02 - samples/sec: 281.27 - lr: 0.000133 - momentum: 0.000000 |
|
2023-10-13 21:18:03,773 epoch 3 - iter 924/1546 - loss 0.06026043 - time (sec): 258.95 - samples/sec: 285.37 - lr: 0.000132 - momentum: 0.000000 |
|
2023-10-13 21:18:44,789 epoch 3 - iter 1078/1546 - loss 0.05796925 - time (sec): 299.97 - samples/sec: 288.16 - lr: 0.000130 - momentum: 0.000000 |
|
2023-10-13 21:19:26,339 epoch 3 - iter 1232/1546 - loss 0.05578592 - time (sec): 341.52 - samples/sec: 289.43 - lr: 0.000128 - momentum: 0.000000 |
|
2023-10-13 21:20:09,255 epoch 3 - iter 1386/1546 - loss 0.05615699 - time (sec): 384.43 - samples/sec: 289.28 - lr: 0.000126 - momentum: 0.000000 |
|
2023-10-13 21:20:52,873 epoch 3 - iter 1540/1546 - loss 0.05656863 - time (sec): 428.05 - samples/sec: 288.85 - lr: 0.000125 - momentum: 0.000000 |
|
2023-10-13 21:20:54,649 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 21:20:54,649 EPOCH 3 done: loss 0.0567 - lr: 0.000125 |
|
2023-10-13 21:21:11,549 DEV : loss 0.0654740035533905 - f1-score (micro avg) 0.766 |
|
2023-10-13 21:21:11,579 saving best model |
|
2023-10-13 21:21:14,186 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 21:21:57,158 epoch 4 - iter 154/1546 - loss 0.03165886 - time (sec): 42.97 - samples/sec: 278.38 - lr: 0.000123 - momentum: 0.000000 |
|
2023-10-13 21:22:39,777 epoch 4 - iter 308/1546 - loss 0.03776274 - time (sec): 85.59 - samples/sec: 281.61 - lr: 0.000121 - momentum: 0.000000 |
|
2023-10-13 21:23:23,569 epoch 4 - iter 462/1546 - loss 0.03624070 - time (sec): 129.38 - samples/sec: 284.14 - lr: 0.000119 - momentum: 0.000000 |
|
2023-10-13 21:24:06,560 epoch 4 - iter 616/1546 - loss 0.03377782 - time (sec): 172.37 - samples/sec: 280.75 - lr: 0.000117 - momentum: 0.000000 |
|
2023-10-13 21:24:49,543 epoch 4 - iter 770/1546 - loss 0.03634239 - time (sec): 215.35 - samples/sec: 282.12 - lr: 0.000116 - momentum: 0.000000 |
|
2023-10-13 21:25:33,555 epoch 4 - iter 924/1546 - loss 0.03508408 - time (sec): 259.36 - samples/sec: 284.97 - lr: 0.000114 - momentum: 0.000000 |
|
2023-10-13 21:26:17,081 epoch 4 - iter 1078/1546 - loss 0.03498885 - time (sec): 302.89 - samples/sec: 283.82 - lr: 0.000112 - momentum: 0.000000 |
|
2023-10-13 21:27:01,339 epoch 4 - iter 1232/1546 - loss 0.03528830 - time (sec): 347.15 - samples/sec: 284.53 - lr: 0.000110 - momentum: 0.000000 |
|
2023-10-13 21:27:45,612 epoch 4 - iter 1386/1546 - loss 0.03507875 - time (sec): 391.42 - samples/sec: 285.14 - lr: 0.000109 - momentum: 0.000000 |
|
2023-10-13 21:28:29,305 epoch 4 - iter 1540/1546 - loss 0.03462670 - time (sec): 435.11 - samples/sec: 284.48 - lr: 0.000107 - momentum: 0.000000 |
|
2023-10-13 21:28:31,045 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 21:28:31,046 EPOCH 4 done: loss 0.0347 - lr: 0.000107 |
|
2023-10-13 21:28:48,762 DEV : loss 0.08016947656869888 - f1-score (micro avg) 0.7992 |
|
2023-10-13 21:28:48,791 saving best model |
|
2023-10-13 21:28:51,424 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 21:29:35,833 epoch 5 - iter 154/1546 - loss 0.02532910 - time (sec): 44.40 - samples/sec: 294.86 - lr: 0.000105 - momentum: 0.000000 |
|
2023-10-13 21:30:17,792 epoch 5 - iter 308/1546 - loss 0.01972287 - time (sec): 86.36 - samples/sec: 283.14 - lr: 0.000103 - momentum: 0.000000 |
|
2023-10-13 21:31:00,804 epoch 5 - iter 462/1546 - loss 0.02131220 - time (sec): 129.38 - samples/sec: 288.04 - lr: 0.000101 - momentum: 0.000000 |
|
2023-10-13 21:31:44,172 epoch 5 - iter 616/1546 - loss 0.02104739 - time (sec): 172.74 - samples/sec: 288.32 - lr: 0.000100 - momentum: 0.000000 |
|
2023-10-13 21:32:27,586 epoch 5 - iter 770/1546 - loss 0.02339049 - time (sec): 216.16 - samples/sec: 290.50 - lr: 0.000098 - momentum: 0.000000 |
|
2023-10-13 21:33:12,384 epoch 5 - iter 924/1546 - loss 0.02327555 - time (sec): 260.96 - samples/sec: 289.34 - lr: 0.000096 - momentum: 0.000000 |
|
2023-10-13 21:33:56,075 epoch 5 - iter 1078/1546 - loss 0.02357816 - time (sec): 304.65 - samples/sec: 287.95 - lr: 0.000094 - momentum: 0.000000 |
|
2023-10-13 21:34:40,157 epoch 5 - iter 1232/1546 - loss 0.02310391 - time (sec): 348.73 - samples/sec: 287.65 - lr: 0.000093 - momentum: 0.000000 |
|
2023-10-13 21:35:23,594 epoch 5 - iter 1386/1546 - loss 0.02246346 - time (sec): 392.17 - samples/sec: 286.67 - lr: 0.000091 - momentum: 0.000000 |
|
2023-10-13 21:36:06,499 epoch 5 - iter 1540/1546 - loss 0.02243696 - time (sec): 435.07 - samples/sec: 284.77 - lr: 0.000089 - momentum: 0.000000 |
|
2023-10-13 21:36:08,071 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 21:36:08,071 EPOCH 5 done: loss 0.0224 - lr: 0.000089 |
|
2023-10-13 21:36:25,916 DEV : loss 0.08204298466444016 - f1-score (micro avg) 0.7821 |
|
2023-10-13 21:36:25,946 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 21:37:11,154 epoch 6 - iter 154/1546 - loss 0.01814063 - time (sec): 45.21 - samples/sec: 292.26 - lr: 0.000087 - momentum: 0.000000 |
|
2023-10-13 21:37:54,160 epoch 6 - iter 308/1546 - loss 0.01774314 - time (sec): 88.21 - samples/sec: 279.12 - lr: 0.000085 - momentum: 0.000000 |
|
2023-10-13 21:38:37,357 epoch 6 - iter 462/1546 - loss 0.01713563 - time (sec): 131.41 - samples/sec: 288.35 - lr: 0.000084 - momentum: 0.000000 |
|
2023-10-13 21:39:20,425 epoch 6 - iter 616/1546 - loss 0.01628056 - time (sec): 174.48 - samples/sec: 287.49 - lr: 0.000082 - momentum: 0.000000 |
|
2023-10-13 21:40:04,240 epoch 6 - iter 770/1546 - loss 0.01595664 - time (sec): 218.29 - samples/sec: 282.96 - lr: 0.000080 - momentum: 0.000000 |
|
2023-10-13 21:40:48,691 epoch 6 - iter 924/1546 - loss 0.01621568 - time (sec): 262.74 - samples/sec: 283.64 - lr: 0.000078 - momentum: 0.000000 |
|
2023-10-13 21:41:31,631 epoch 6 - iter 1078/1546 - loss 0.01541703 - time (sec): 305.68 - samples/sec: 282.50 - lr: 0.000077 - momentum: 0.000000 |
|
2023-10-13 21:42:14,622 epoch 6 - iter 1232/1546 - loss 0.01535718 - time (sec): 348.67 - samples/sec: 281.17 - lr: 0.000075 - momentum: 0.000000 |
|
2023-10-13 21:42:58,358 epoch 6 - iter 1386/1546 - loss 0.01520733 - time (sec): 392.41 - samples/sec: 281.05 - lr: 0.000073 - momentum: 0.000000 |
|
2023-10-13 21:43:42,477 epoch 6 - iter 1540/1546 - loss 0.01526695 - time (sec): 436.53 - samples/sec: 283.83 - lr: 0.000071 - momentum: 0.000000 |
|
2023-10-13 21:43:44,085 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 21:43:44,086 EPOCH 6 done: loss 0.0153 - lr: 0.000071 |
|
2023-10-13 21:44:01,974 DEV : loss 0.0963883250951767 - f1-score (micro avg) 0.7893 |
|
2023-10-13 21:44:02,014 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 21:44:46,895 epoch 7 - iter 154/1546 - loss 0.00960781 - time (sec): 44.88 - samples/sec: 284.46 - lr: 0.000069 - momentum: 0.000000 |
|
2023-10-13 21:45:29,473 epoch 7 - iter 308/1546 - loss 0.01117041 - time (sec): 87.46 - samples/sec: 284.15 - lr: 0.000068 - momentum: 0.000000 |
|
2023-10-13 21:46:12,373 epoch 7 - iter 462/1546 - loss 0.01014518 - time (sec): 130.36 - samples/sec: 285.83 - lr: 0.000066 - momentum: 0.000000 |
|
2023-10-13 21:46:54,873 epoch 7 - iter 616/1546 - loss 0.01028503 - time (sec): 172.86 - samples/sec: 289.07 - lr: 0.000064 - momentum: 0.000000 |
|
2023-10-13 21:47:37,986 epoch 7 - iter 770/1546 - loss 0.00991293 - time (sec): 215.97 - samples/sec: 288.02 - lr: 0.000062 - momentum: 0.000000 |
|
2023-10-13 21:48:20,564 epoch 7 - iter 924/1546 - loss 0.01000884 - time (sec): 258.55 - samples/sec: 286.75 - lr: 0.000061 - momentum: 0.000000 |
|
2023-10-13 21:49:03,107 epoch 7 - iter 1078/1546 - loss 0.01016652 - time (sec): 301.09 - samples/sec: 287.15 - lr: 0.000059 - momentum: 0.000000 |
|
2023-10-13 21:49:47,239 epoch 7 - iter 1232/1546 - loss 0.00978862 - time (sec): 345.22 - samples/sec: 287.20 - lr: 0.000057 - momentum: 0.000000 |
|
2023-10-13 21:50:29,729 epoch 7 - iter 1386/1546 - loss 0.00993261 - time (sec): 387.71 - samples/sec: 286.92 - lr: 0.000055 - momentum: 0.000000 |
|
2023-10-13 21:51:12,855 epoch 7 - iter 1540/1546 - loss 0.00981392 - time (sec): 430.84 - samples/sec: 287.31 - lr: 0.000053 - momentum: 0.000000 |
|
2023-10-13 21:51:14,469 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 21:51:14,469 EPOCH 7 done: loss 0.0098 - lr: 0.000053 |
|
2023-10-13 21:51:31,675 DEV : loss 0.10647968202829361 - f1-score (micro avg) 0.809 |
|
2023-10-13 21:51:31,704 saving best model |
|
2023-10-13 21:51:34,296 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 21:52:18,945 epoch 8 - iter 154/1546 - loss 0.00622852 - time (sec): 44.64 - samples/sec: 298.56 - lr: 0.000052 - momentum: 0.000000 |
|
2023-10-13 21:53:02,997 epoch 8 - iter 308/1546 - loss 0.00494139 - time (sec): 88.70 - samples/sec: 293.22 - lr: 0.000050 - momentum: 0.000000 |
|
2023-10-13 21:53:45,552 epoch 8 - iter 462/1546 - loss 0.00507210 - time (sec): 131.25 - samples/sec: 288.71 - lr: 0.000048 - momentum: 0.000000 |
|
2023-10-13 21:54:29,414 epoch 8 - iter 616/1546 - loss 0.00497498 - time (sec): 175.11 - samples/sec: 287.41 - lr: 0.000046 - momentum: 0.000000 |
|
2023-10-13 21:55:13,230 epoch 8 - iter 770/1546 - loss 0.00534911 - time (sec): 218.93 - samples/sec: 291.42 - lr: 0.000045 - momentum: 0.000000 |
|
2023-10-13 21:55:56,708 epoch 8 - iter 924/1546 - loss 0.00637688 - time (sec): 262.41 - samples/sec: 289.55 - lr: 0.000043 - momentum: 0.000000 |
|
2023-10-13 21:56:39,906 epoch 8 - iter 1078/1546 - loss 0.00653282 - time (sec): 305.61 - samples/sec: 288.47 - lr: 0.000041 - momentum: 0.000000 |
|
2023-10-13 21:57:22,544 epoch 8 - iter 1232/1546 - loss 0.00664004 - time (sec): 348.24 - samples/sec: 284.57 - lr: 0.000039 - momentum: 0.000000 |
|
2023-10-13 21:58:06,005 epoch 8 - iter 1386/1546 - loss 0.00638657 - time (sec): 391.70 - samples/sec: 283.64 - lr: 0.000037 - momentum: 0.000000 |
|
2023-10-13 21:58:49,445 epoch 8 - iter 1540/1546 - loss 0.00613878 - time (sec): 435.14 - samples/sec: 284.69 - lr: 0.000036 - momentum: 0.000000 |
|
2023-10-13 21:58:51,045 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 21:58:51,045 EPOCH 8 done: loss 0.0062 - lr: 0.000036 |
|
2023-10-13 21:59:07,991 DEV : loss 0.1150372177362442 - f1-score (micro avg) 0.7935 |
|
2023-10-13 21:59:08,019 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 21:59:52,685 epoch 9 - iter 154/1546 - loss 0.00366725 - time (sec): 44.66 - samples/sec: 287.39 - lr: 0.000034 - momentum: 0.000000 |
|
2023-10-13 22:00:35,676 epoch 9 - iter 308/1546 - loss 0.00268623 - time (sec): 87.65 - samples/sec: 294.17 - lr: 0.000032 - momentum: 0.000000 |
|
2023-10-13 22:01:17,375 epoch 9 - iter 462/1546 - loss 0.00276166 - time (sec): 129.35 - samples/sec: 288.20 - lr: 0.000030 - momentum: 0.000000 |
|
2023-10-13 22:02:01,087 epoch 9 - iter 616/1546 - loss 0.00250196 - time (sec): 173.07 - samples/sec: 284.98 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-13 22:02:44,103 epoch 9 - iter 770/1546 - loss 0.00303337 - time (sec): 216.08 - samples/sec: 283.98 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-13 22:03:27,347 epoch 9 - iter 924/1546 - loss 0.00314542 - time (sec): 259.33 - samples/sec: 280.87 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-13 22:04:11,161 epoch 9 - iter 1078/1546 - loss 0.00328251 - time (sec): 303.14 - samples/sec: 282.99 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-13 22:04:54,454 epoch 9 - iter 1232/1546 - loss 0.00306510 - time (sec): 346.43 - samples/sec: 283.05 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-13 22:05:37,259 epoch 9 - iter 1386/1546 - loss 0.00332886 - time (sec): 389.24 - samples/sec: 283.27 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-13 22:06:21,528 epoch 9 - iter 1540/1546 - loss 0.00316735 - time (sec): 433.51 - samples/sec: 285.69 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-13 22:06:23,212 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 22:06:23,213 EPOCH 9 done: loss 0.0032 - lr: 0.000018 |
|
2023-10-13 22:06:40,917 DEV : loss 0.12184790521860123 - f1-score (micro avg) 0.8 |
|
2023-10-13 22:06:40,949 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 22:07:23,978 epoch 10 - iter 154/1546 - loss 0.00257130 - time (sec): 43.03 - samples/sec: 282.43 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-13 22:08:06,526 epoch 10 - iter 308/1546 - loss 0.00146925 - time (sec): 85.57 - samples/sec: 275.91 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-13 22:08:49,779 epoch 10 - iter 462/1546 - loss 0.00267861 - time (sec): 128.83 - samples/sec: 274.48 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-13 22:09:32,666 epoch 10 - iter 616/1546 - loss 0.00254695 - time (sec): 171.71 - samples/sec: 280.28 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-13 22:10:15,409 epoch 10 - iter 770/1546 - loss 0.00258972 - time (sec): 214.46 - samples/sec: 282.09 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-13 22:10:59,014 epoch 10 - iter 924/1546 - loss 0.00290086 - time (sec): 258.06 - samples/sec: 285.29 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-13 22:11:42,430 epoch 10 - iter 1078/1546 - loss 0.00274036 - time (sec): 301.48 - samples/sec: 285.89 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-13 22:12:27,030 epoch 10 - iter 1232/1546 - loss 0.00254910 - time (sec): 346.08 - samples/sec: 287.04 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-13 22:13:10,817 epoch 10 - iter 1386/1546 - loss 0.00252901 - time (sec): 389.87 - samples/sec: 286.35 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-13 22:13:54,027 epoch 10 - iter 1540/1546 - loss 0.00239455 - time (sec): 433.08 - samples/sec: 285.77 - lr: 0.000000 - momentum: 0.000000 |
|
2023-10-13 22:13:55,684 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 22:13:55,684 EPOCH 10 done: loss 0.0024 - lr: 0.000000 |
|
2023-10-13 22:14:12,745 DEV : loss 0.1256277710199356 - f1-score (micro avg) 0.7844 |
|
2023-10-13 22:14:13,677 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 22:14:13,679 Loading model from best epoch ... |
|
2023-10-13 22:14:18,037 SequenceTagger predicts: Dictionary with 13 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-BUILDING, B-BUILDING, E-BUILDING, I-BUILDING, S-STREET, B-STREET, E-STREET, I-STREET |
|
2023-10-13 22:15:13,065 |
|
Results: |
|
- F-score (micro) 0.7916 |
|
- F-score (macro) 0.7114 |
|
- Accuracy 0.6714 |
|
|
|
By class: |
|
              precision    recall  f1-score   support |
|
|
|
         LOC     0.8150    0.8615    0.8376       946 |
|
    BUILDING     0.5973    0.4811    0.5329       185 |
|
      STREET     0.7778    0.7500    0.7636        56 |
|
|
|
   micro avg     0.7864    0.7970    0.7916      1187 |
|
   macro avg     0.7300    0.6975    0.7114      1187 |
|
weighted avg     0.7793    0.7970    0.7866      1187 |
|
|
|
2023-10-13 22:15:13,066 ---------------------------------------------------------------------------------------------------- |
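Note: the averaged rows of the per-class report above can be cross-checked from the per-class rows. The sketch below recomputes the macro and support-weighted averages and derives the micro F1 from the reported micro precision and recall. All numbers are copied from the report; the helper names are hypothetical, and small differences in the last digit are expected because the inputs are already rounded to four decimals.

```python
# (precision, recall, f1, support) per class, copied from the report above.
per_class = {
    "LOC": (0.8150, 0.8615, 0.8376, 946),
    "BUILDING": (0.5973, 0.4811, 0.5329, 185),
    "STREET": (0.7778, 0.7500, 0.7636, 56),
}


def macro_avg(rows, idx):
    """Unweighted mean of one metric column across classes."""
    return sum(r[idx] for r in rows.values()) / len(rows)


def weighted_avg(rows, idx):
    """Mean of one metric column, weighted by each class's support."""
    total = sum(r[3] for r in rows.values())
    return sum(r[idx] * r[3] for r in rows.values()) / total


def f1(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


macro_f1 = macro_avg(per_class, 2)
weighted_f1 = weighted_avg(per_class, 2)
micro_f1 = f1(0.7864, 0.7970)  # micro precision and recall from the report
print(f"macro {macro_f1:.4f}  weighted {weighted_f1:.4f}  micro {micro_f1:.4f}")
```

The macro F1 (0.7114) sits well below the micro F1 (0.7916) because the macro average gives the weaker BUILDING class the same weight as LOC, which accounts for 946 of the 1187 gold entities.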
|
|