2023-10-11 09:22:17,384 ----------------------------------------------------------------------------------------------------
2023-10-11 09:22:17,386 Model: "SequenceTagger(
  (embeddings): ByT5Embeddings(
    (model): T5EncoderModel(
      (shared): Embedding(384, 1472)
      (encoder): T5Stack(
        (embed_tokens): Embedding(384, 1472)
        (block): ModuleList(
          (0): T5Block(
            (layer): ModuleList(
              (0): T5LayerSelfAttention(
                (SelfAttention): T5Attention(
                  (q): Linear(in_features=1472, out_features=384, bias=False)
                  (k): Linear(in_features=1472, out_features=384, bias=False)
                  (v): Linear(in_features=1472, out_features=384, bias=False)
                  (o): Linear(in_features=384, out_features=1472, bias=False)
                  (relative_attention_bias): Embedding(32, 6)
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (1): T5LayerFF(
                (DenseReluDense): T5DenseGatedActDense(
                  (wi_0): Linear(in_features=1472, out_features=3584, bias=False)
                  (wi_1): Linear(in_features=1472, out_features=3584, bias=False)
                  (wo): Linear(in_features=3584, out_features=1472, bias=False)
                  (dropout): Dropout(p=0.1, inplace=False)
                  (act): NewGELUActivation()
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
          )
          (1-11): 11 x T5Block(
            (layer): ModuleList(
              (0): T5LayerSelfAttention(
                (SelfAttention): T5Attention(
                  (q): Linear(in_features=1472, out_features=384, bias=False)
                  (k): Linear(in_features=1472, out_features=384, bias=False)
                  (v): Linear(in_features=1472, out_features=384, bias=False)
                  (o): Linear(in_features=384, out_features=1472, bias=False)
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (1): T5LayerFF(
                (DenseReluDense): T5DenseGatedActDense(
                  (wi_0): Linear(in_features=1472, out_features=3584, bias=False)
                  (wi_1): Linear(in_features=1472, out_features=3584, bias=False)
                  (wo): Linear(in_features=3584, out_features=1472, bias=False)
                  (dropout): Dropout(p=0.1, inplace=False)
                  (act): NewGELUActivation()
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
          )
        )
        (final_layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=1472, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
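A quick sanity check on the shapes printed above (plain Python; the head count of 6 is inferred from the `(relative_attention_bias): Embedding(32, 6)` line, since T5 keeps one bias column per attention head — an inference, not something the log states directly):

```python
# Values read off the model summary above (ByT5 encoder).
d_model = 1472                    # hidden size of each block
inner_dim = 384                   # output size of the q/k/v projections
n_heads = 6                       # one relative-attention-bias column per head
head_dim = inner_dim // n_heads   # per-head dimension

# Parameters of one self-attention block: bias-free q, k, v
# (d_model -> inner_dim) plus the output projection o (inner_dim -> d_model).
attn_params = 3 * d_model * inner_dim + inner_dim * d_model

print(head_dim)     # 64
print(attn_params)  # 2260992
```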
2023-10-11 09:22:17,387 ----------------------------------------------------------------------------------------------------
2023-10-11 09:22:17,387 MultiCorpus: 1085 train + 148 dev + 364 test sentences
 - NER_HIPE_2022 Corpus: 1085 train + 148 dev + 364 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/sv/with_doc_seperator
2023-10-11 09:22:17,387 ----------------------------------------------------------------------------------------------------
2023-10-11 09:22:17,387 Train:  1085 sentences
2023-10-11 09:22:17,387         (train_with_dev=False, train_with_test=False)
2023-10-11 09:22:17,387 ----------------------------------------------------------------------------------------------------
2023-10-11 09:22:17,387 Training Params:
2023-10-11 09:22:17,387  - learning_rate: "0.00016"
2023-10-11 09:22:17,387  - mini_batch_size: "8"
2023-10-11 09:22:17,387  - max_epochs: "10"
2023-10-11 09:22:17,388  - shuffle: "True"
2023-10-11 09:22:17,388 ----------------------------------------------------------------------------------------------------
2023-10-11 09:22:17,388 Plugins:
2023-10-11 09:22:17,388  - TensorboardLogger
2023-10-11 09:22:17,388  - LinearScheduler | warmup_fraction: '0.1'
2023-10-11 09:22:17,388 ----------------------------------------------------------------------------------------------------
2023-10-11 09:22:17,388 Final evaluation on model from best epoch (best-model.pt)
2023-10-11 09:22:17,388  - metric: "('micro avg', 'f1-score')"
2023-10-11 09:22:17,388 ----------------------------------------------------------------------------------------------------
2023-10-11 09:22:17,388 Computation:
2023-10-11 09:22:17,388  - compute on device: cuda:0
2023-10-11 09:22:17,388  - embedding storage: none
2023-10-11 09:22:17,388 ----------------------------------------------------------------------------------------------------
2023-10-11 09:22:17,388 Model training base path: "hmbench-newseye/sv-hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax-bs8-wsFalse-e10-lr0.00016-poolingfirst-layers-1-crfFalse-2"
2023-10-11 09:22:17,388 ----------------------------------------------------------------------------------------------------
2023-10-11 09:22:17,389 ----------------------------------------------------------------------------------------------------
2023-10-11 09:22:17,389 Logging anything other than scalars to TensorBoard is currently not supported.
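The LinearScheduler with warmup_fraction '0.1' corresponds to a one-cycle linear schedule over all 10 × 136 = 1360 batches, which matches the lr column in the iteration lines below (rising through epoch 1, peaking around 0.00016, then decaying toward zero). A minimal reconstruction of that shape (my own sketch, not Flair's exact implementation; step-count conventions may differ by one):

```python
def linear_lr(step, total_steps=1360, peak_lr=0.00016, warmup_fraction=0.1):
    """Linear warmup to peak_lr over the first warmup_fraction of training,
    then linear decay to zero (sketch of a one-cycle linear schedule)."""
    warmup_steps = int(total_steps * warmup_fraction)
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)

print(round(linear_lr(136), 6))  # peak learning rate at the end of warmup
```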
2023-10-11 09:22:26,095 epoch 1 - iter 13/136 - loss 2.85446923 - time (sec): 8.70 - samples/sec: 588.34 - lr: 0.000014 - momentum: 0.000000
2023-10-11 09:22:34,155 epoch 1 - iter 26/136 - loss 2.84819784 - time (sec): 16.76 - samples/sec: 557.37 - lr: 0.000029 - momentum: 0.000000
2023-10-11 09:22:43,000 epoch 1 - iter 39/136 - loss 2.83695644 - time (sec): 25.61 - samples/sec: 571.65 - lr: 0.000045 - momentum: 0.000000
2023-10-11 09:22:51,863 epoch 1 - iter 52/136 - loss 2.81681872 - time (sec): 34.47 - samples/sec: 574.45 - lr: 0.000060 - momentum: 0.000000
2023-10-11 09:23:00,750 epoch 1 - iter 65/136 - loss 2.78286085 - time (sec): 43.36 - samples/sec: 577.54 - lr: 0.000075 - momentum: 0.000000
2023-10-11 09:23:09,466 epoch 1 - iter 78/136 - loss 2.72872399 - time (sec): 52.08 - samples/sec: 573.28 - lr: 0.000091 - momentum: 0.000000
2023-10-11 09:23:18,296 epoch 1 - iter 91/136 - loss 2.65661692 - time (sec): 60.91 - samples/sec: 572.03 - lr: 0.000106 - momentum: 0.000000
2023-10-11 09:23:26,980 epoch 1 - iter 104/136 - loss 2.58114034 - time (sec): 69.59 - samples/sec: 569.30 - lr: 0.000121 - momentum: 0.000000
2023-10-11 09:23:36,080 epoch 1 - iter 117/136 - loss 2.49034961 - time (sec): 78.69 - samples/sec: 571.68 - lr: 0.000136 - momentum: 0.000000
2023-10-11 09:23:44,902 epoch 1 - iter 130/136 - loss 2.40576009 - time (sec): 87.51 - samples/sec: 573.18 - lr: 0.000152 - momentum: 0.000000
2023-10-11 09:23:48,395 ----------------------------------------------------------------------------------------------------
2023-10-11 09:23:48,396 EPOCH 1 done: loss 2.3753 - lr: 0.000152
2023-10-11 09:23:53,682 DEV : loss 1.356597661972046 - f1-score (micro avg) 0.0
2023-10-11 09:23:53,690 ----------------------------------------------------------------------------------------------------
2023-10-11 09:24:02,835 epoch 2 - iter 13/136 - loss 1.33385722 - time (sec): 9.14 - samples/sec: 608.94 - lr: 0.000158 - momentum: 0.000000
2023-10-11 09:24:11,349 epoch 2 - iter 26/136 - loss 1.23868983 - time (sec): 17.66 - samples/sec: 587.74 - lr: 0.000157 - momentum: 0.000000
2023-10-11 09:24:20,755 epoch 2 - iter 39/136 - loss 1.17813516 - time (sec): 27.06 - samples/sec: 597.44 - lr: 0.000155 - momentum: 0.000000
2023-10-11 09:24:29,258 epoch 2 - iter 52/136 - loss 1.09171959 - time (sec): 35.57 - samples/sec: 589.32 - lr: 0.000153 - momentum: 0.000000
2023-10-11 09:24:37,955 epoch 2 - iter 65/136 - loss 1.03129374 - time (sec): 44.26 - samples/sec: 583.57 - lr: 0.000152 - momentum: 0.000000
2023-10-11 09:24:46,413 epoch 2 - iter 78/136 - loss 0.97605171 - time (sec): 52.72 - samples/sec: 573.76 - lr: 0.000150 - momentum: 0.000000
2023-10-11 09:24:55,176 epoch 2 - iter 91/136 - loss 0.93113778 - time (sec): 61.48 - samples/sec: 567.38 - lr: 0.000148 - momentum: 0.000000
2023-10-11 09:25:04,202 epoch 2 - iter 104/136 - loss 0.87668593 - time (sec): 70.51 - samples/sec: 565.45 - lr: 0.000147 - momentum: 0.000000
2023-10-11 09:25:12,821 epoch 2 - iter 117/136 - loss 0.84924304 - time (sec): 79.13 - samples/sec: 563.58 - lr: 0.000145 - momentum: 0.000000
2023-10-11 09:25:21,129 epoch 2 - iter 130/136 - loss 0.83219789 - time (sec): 87.44 - samples/sec: 563.70 - lr: 0.000143 - momentum: 0.000000
2023-10-11 09:25:25,283 ----------------------------------------------------------------------------------------------------
2023-10-11 09:25:25,283 EPOCH 2 done: loss 0.8170 - lr: 0.000143
2023-10-11 09:25:31,157 DEV : loss 0.4814496636390686 - f1-score (micro avg) 0.0
2023-10-11 09:25:31,166 ----------------------------------------------------------------------------------------------------
2023-10-11 09:25:39,752 epoch 3 - iter 13/136 - loss 0.52849933 - time (sec): 8.58 - samples/sec: 605.95 - lr: 0.000141 - momentum: 0.000000
2023-10-11 09:25:48,099 epoch 3 - iter 26/136 - loss 0.51390315 - time (sec): 16.93 - samples/sec: 590.19 - lr: 0.000139 - momentum: 0.000000
2023-10-11 09:25:56,461 epoch 3 - iter 39/136 - loss 0.47008844 - time (sec): 25.29 - samples/sec: 584.37 - lr: 0.000137 - momentum: 0.000000
2023-10-11 09:26:04,473 epoch 3 - iter 52/136 - loss 0.45885997 - time (sec): 33.31 - samples/sec: 577.32 - lr: 0.000136 - momentum: 0.000000
2023-10-11 09:26:13,660 epoch 3 - iter 65/136 - loss 0.44940332 - time (sec): 42.49 - samples/sec: 588.92 - lr: 0.000134 - momentum: 0.000000
2023-10-11 09:26:22,010 epoch 3 - iter 78/136 - loss 0.42990217 - time (sec): 50.84 - samples/sec: 587.54 - lr: 0.000132 - momentum: 0.000000
2023-10-11 09:26:31,612 epoch 3 - iter 91/136 - loss 0.41954018 - time (sec): 60.44 - samples/sec: 594.21 - lr: 0.000131 - momentum: 0.000000
2023-10-11 09:26:40,220 epoch 3 - iter 104/136 - loss 0.41242956 - time (sec): 69.05 - samples/sec: 593.42 - lr: 0.000129 - momentum: 0.000000
2023-10-11 09:26:49,193 epoch 3 - iter 117/136 - loss 0.39861445 - time (sec): 78.03 - samples/sec: 588.86 - lr: 0.000127 - momentum: 0.000000
2023-10-11 09:26:57,219 epoch 3 - iter 130/136 - loss 0.39461741 - time (sec): 86.05 - samples/sec: 581.03 - lr: 0.000126 - momentum: 0.000000
2023-10-11 09:27:00,933 ----------------------------------------------------------------------------------------------------
2023-10-11 09:27:00,933 EPOCH 3 done: loss 0.3964 - lr: 0.000126
2023-10-11 09:27:07,019 DEV : loss 0.294859379529953 - f1-score (micro avg) 0.2986
2023-10-11 09:27:07,030 saving best model
2023-10-11 09:27:07,991 ----------------------------------------------------------------------------------------------------
2023-10-11 09:27:16,576 epoch 4 - iter 13/136 - loss 0.31357415 - time (sec): 8.58 - samples/sec: 556.39 - lr: 0.000123 - momentum: 0.000000
2023-10-11 09:27:24,618 epoch 4 - iter 26/136 - loss 0.31884578 - time (sec): 16.62 - samples/sec: 550.03 - lr: 0.000121 - momentum: 0.000000
2023-10-11 09:27:33,627 epoch 4 - iter 39/136 - loss 0.30684042 - time (sec): 25.63 - samples/sec: 581.74 - lr: 0.000120 - momentum: 0.000000
2023-10-11 09:27:42,467 epoch 4 - iter 52/136 - loss 0.30096490 - time (sec): 34.47 - samples/sec: 595.44 - lr: 0.000118 - momentum: 0.000000
2023-10-11 09:27:50,400 epoch 4 - iter 65/136 - loss 0.29893478 - time (sec): 42.41 - samples/sec: 591.96 - lr: 0.000116 - momentum: 0.000000
2023-10-11 09:27:59,033 epoch 4 - iter 78/136 - loss 0.29398395 - time (sec): 51.04 - samples/sec: 595.38 - lr: 0.000115 - momentum: 0.000000
2023-10-11 09:28:07,206 epoch 4 - iter 91/136 - loss 0.30030830 - time (sec): 59.21 - samples/sec: 591.36 - lr: 0.000113 - momentum: 0.000000
2023-10-11 09:28:15,497 epoch 4 - iter 104/136 - loss 0.29277071 - time (sec): 67.50 - samples/sec: 592.49 - lr: 0.000111 - momentum: 0.000000
2023-10-11 09:28:23,710 epoch 4 - iter 117/136 - loss 0.30022584 - time (sec): 75.72 - samples/sec: 591.45 - lr: 0.000109 - momentum: 0.000000
2023-10-11 09:28:32,379 epoch 4 - iter 130/136 - loss 0.30199746 - time (sec): 84.39 - samples/sec: 588.99 - lr: 0.000108 - momentum: 0.000000
2023-10-11 09:28:36,190 ----------------------------------------------------------------------------------------------------
2023-10-11 09:28:36,190 EPOCH 4 done: loss 0.2996 - lr: 0.000108
2023-10-11 09:28:41,678 DEV : loss 0.25123921036720276 - f1-score (micro avg) 0.387
2023-10-11 09:28:41,687 saving best model
2023-10-11 09:28:44,236 ----------------------------------------------------------------------------------------------------
2023-10-11 09:28:52,096 epoch 5 - iter 13/136 - loss 0.27743068 - time (sec): 7.86 - samples/sec: 577.12 - lr: 0.000105 - momentum: 0.000000
2023-10-11 09:29:00,409 epoch 5 - iter 26/136 - loss 0.26814576 - time (sec): 16.17 - samples/sec: 599.92 - lr: 0.000104 - momentum: 0.000000
2023-10-11 09:29:09,814 epoch 5 - iter 39/136 - loss 0.25408909 - time (sec): 25.57 - samples/sec: 587.51 - lr: 0.000102 - momentum: 0.000000
2023-10-11 09:29:17,569 epoch 5 - iter 52/136 - loss 0.25896282 - time (sec): 33.33 - samples/sec: 578.63 - lr: 0.000100 - momentum: 0.000000
2023-10-11 09:29:26,113 epoch 5 - iter 65/136 - loss 0.24479651 - time (sec): 41.87 - samples/sec: 576.79 - lr: 0.000099 - momentum: 0.000000
2023-10-11 09:29:35,466 epoch 5 - iter 78/136 - loss 0.23665481 - time (sec): 51.23 - samples/sec: 575.51 - lr: 0.000097 - momentum: 0.000000
2023-10-11 09:29:44,334 epoch 5 - iter 91/136 - loss 0.23896649 - time (sec): 60.09 - samples/sec: 574.14 - lr: 0.000095 - momentum: 0.000000
2023-10-11 09:29:53,316 epoch 5 - iter 104/136 - loss 0.24722321 - time (sec): 69.08 - samples/sec: 574.93 - lr: 0.000093 - momentum: 0.000000
2023-10-11 09:30:02,483 epoch 5 - iter 117/136 - loss 0.24854195 - time (sec): 78.24 - samples/sec: 575.02 - lr: 0.000092 - momentum: 0.000000
2023-10-11 09:30:11,085 epoch 5 - iter 130/136 - loss 0.24845778 - time (sec): 86.84 - samples/sec: 571.43 - lr: 0.000090 - momentum: 0.000000
2023-10-11 09:30:15,093 ----------------------------------------------------------------------------------------------------
2023-10-11 09:30:15,093 EPOCH 5 done: loss 0.2484 - lr: 0.000090
2023-10-11 09:30:21,001 DEV : loss 0.2224646955728531 - f1-score (micro avg) 0.4767
2023-10-11 09:30:21,011 saving best model
2023-10-11 09:30:23,572 ----------------------------------------------------------------------------------------------------
2023-10-11 09:30:32,235 epoch 6 - iter 13/136 - loss 0.24336816 - time (sec): 8.66 - samples/sec: 599.38 - lr: 0.000088 - momentum: 0.000000
2023-10-11 09:30:40,564 epoch 6 - iter 26/136 - loss 0.23662061 - time (sec): 16.99 - samples/sec: 583.20 - lr: 0.000086 - momentum: 0.000000
2023-10-11 09:30:49,284 epoch 6 - iter 39/136 - loss 0.22757752 - time (sec): 25.71 - samples/sec: 581.75 - lr: 0.000084 - momentum: 0.000000
2023-10-11 09:30:58,212 epoch 6 - iter 52/136 - loss 0.23103914 - time (sec): 34.64 - samples/sec: 592.25 - lr: 0.000083 - momentum: 0.000000
2023-10-11 09:31:06,295 epoch 6 - iter 65/136 - loss 0.22567006 - time (sec): 42.72 - samples/sec: 580.96 - lr: 0.000081 - momentum: 0.000000
2023-10-11 09:31:15,137 epoch 6 - iter 78/136 - loss 0.21375693 - time (sec): 51.56 - samples/sec: 583.82 - lr: 0.000079 - momentum: 0.000000
2023-10-11 09:31:24,022 epoch 6 - iter 91/136 - loss 0.20775684 - time (sec): 60.45 - samples/sec: 586.75 - lr: 0.000077 - momentum: 0.000000
2023-10-11 09:31:32,341 epoch 6 - iter 104/136 - loss 0.21174013 - time (sec): 68.76 - samples/sec: 583.41 - lr: 0.000076 - momentum: 0.000000
2023-10-11 09:31:40,953 epoch 6 - iter 117/136 - loss 0.20969673 - time (sec): 77.38 - samples/sec: 580.99 - lr: 0.000074 - momentum: 0.000000
2023-10-11 09:31:49,704 epoch 6 - iter 130/136 - loss 0.20516424 - time (sec): 86.13 - samples/sec: 579.94 - lr: 0.000072 - momentum: 0.000000
2023-10-11 09:31:53,414 ----------------------------------------------------------------------------------------------------
2023-10-11 09:31:53,414 EPOCH 6 done: loss 0.2041 - lr: 0.000072
2023-10-11 09:31:59,207 DEV : loss 0.20685341954231262 - f1-score (micro avg) 0.5169
2023-10-11 09:31:59,224 saving best model
2023-10-11 09:32:01,786 ----------------------------------------------------------------------------------------------------
2023-10-11 09:32:10,439 epoch 7 - iter 13/136 - loss 0.17704560 - time (sec): 8.65 - samples/sec: 532.79 - lr: 0.000070 - momentum: 0.000000
2023-10-11 09:32:19,595 epoch 7 - iter 26/136 - loss 0.16608999 - time (sec): 17.80 - samples/sec: 561.43 - lr: 0.000068 - momentum: 0.000000
2023-10-11 09:32:28,971 epoch 7 - iter 39/136 - loss 0.16113561 - time (sec): 27.18 - samples/sec: 576.92 - lr: 0.000067 - momentum: 0.000000
2023-10-11 09:32:37,322 epoch 7 - iter 52/136 - loss 0.16321262 - time (sec): 35.53 - samples/sec: 568.91 - lr: 0.000065 - momentum: 0.000000
2023-10-11 09:32:46,529 epoch 7 - iter 65/136 - loss 0.16337568 - time (sec): 44.74 - samples/sec: 572.50 - lr: 0.000063 - momentum: 0.000000
2023-10-11 09:32:55,317 epoch 7 - iter 78/136 - loss 0.16471333 - time (sec): 53.52 - samples/sec: 568.85 - lr: 0.000061 - momentum: 0.000000
2023-10-11 09:33:03,307 epoch 7 - iter 91/136 - loss 0.16880727 - time (sec): 61.51 - samples/sec: 557.86 - lr: 0.000060 - momentum: 0.000000
2023-10-11 09:33:12,128 epoch 7 - iter 104/136 - loss 0.16668869 - time (sec): 70.34 - samples/sec: 560.26 - lr: 0.000058 - momentum: 0.000000
2023-10-11 09:33:21,089 epoch 7 - iter 117/136 - loss 0.16868453 - time (sec): 79.30 - samples/sec: 563.62 - lr: 0.000056 - momentum: 0.000000
2023-10-11 09:33:29,353 epoch 7 - iter 130/136 - loss 0.16881225 - time (sec): 87.56 - samples/sec: 562.45 - lr: 0.000055 - momentum: 0.000000
2023-10-11 09:33:33,534 ----------------------------------------------------------------------------------------------------
2023-10-11 09:33:33,534 EPOCH 7 done: loss 0.1688 - lr: 0.000055
2023-10-11 09:33:39,299 DEV : loss 0.18640807271003723 - f1-score (micro avg) 0.5329
2023-10-11 09:33:39,307 saving best model
2023-10-11 09:33:41,873 ----------------------------------------------------------------------------------------------------
2023-10-11 09:33:50,064 epoch 8 - iter 13/136 - loss 0.16538513 - time (sec): 8.19 - samples/sec: 551.98 - lr: 0.000052 - momentum: 0.000000
2023-10-11 09:33:58,121 epoch 8 - iter 26/136 - loss 0.13848192 - time (sec): 16.24 - samples/sec: 552.95 - lr: 0.000051 - momentum: 0.000000
2023-10-11 09:34:07,212 epoch 8 - iter 39/136 - loss 0.14824327 - time (sec): 25.33 - samples/sec: 581.65 - lr: 0.000049 - momentum: 0.000000
2023-10-11 09:34:15,755 epoch 8 - iter 52/136 - loss 0.15156730 - time (sec): 33.88 - samples/sec: 586.35 - lr: 0.000047 - momentum: 0.000000
2023-10-11 09:34:24,020 epoch 8 - iter 65/136 - loss 0.15316354 - time (sec): 42.14 - samples/sec: 583.73 - lr: 0.000045 - momentum: 0.000000
2023-10-11 09:34:32,389 epoch 8 - iter 78/136 - loss 0.15664533 - time (sec): 50.51 - samples/sec: 584.78 - lr: 0.000044 - momentum: 0.000000
2023-10-11 09:34:41,102 epoch 8 - iter 91/136 - loss 0.15321152 - time (sec): 59.22 - samples/sec: 587.00 - lr: 0.000042 - momentum: 0.000000
2023-10-11 09:34:50,395 epoch 8 - iter 104/136 - loss 0.14996935 - time (sec): 68.52 - samples/sec: 589.11 - lr: 0.000040 - momentum: 0.000000
2023-10-11 09:34:59,295 epoch 8 - iter 117/136 - loss 0.14674671 - time (sec): 77.42 - samples/sec: 586.03 - lr: 0.000039 - momentum: 0.000000
2023-10-11 09:35:07,742 epoch 8 - iter 130/136 - loss 0.14590270 - time (sec): 85.86 - samples/sec: 580.16 - lr: 0.000037 - momentum: 0.000000
2023-10-11 09:35:11,635 ----------------------------------------------------------------------------------------------------
2023-10-11 09:35:11,635 EPOCH 8 done: loss 0.1442 - lr: 0.000037
2023-10-11 09:35:17,701 DEV : loss 0.1840282827615738 - f1-score (micro avg) 0.5674
2023-10-11 09:35:17,709 saving best model
2023-10-11 09:35:20,283 ----------------------------------------------------------------------------------------------------
2023-10-11 09:35:28,030 epoch 9 - iter 13/136 - loss 0.16181475 - time (sec): 7.74 - samples/sec: 533.14 - lr: 0.000034 - momentum: 0.000000
2023-10-11 09:35:36,255 epoch 9 - iter 26/136 - loss 0.14681724 - time (sec): 15.97 - samples/sec: 554.11 - lr: 0.000033 - momentum: 0.000000
2023-10-11 09:35:44,795 epoch 9 - iter 39/136 - loss 0.13820279 - time (sec): 24.51 - samples/sec: 571.17 - lr: 0.000031 - momentum: 0.000000
2023-10-11 09:35:53,078 epoch 9 - iter 52/136 - loss 0.14489701 - time (sec): 32.79 - samples/sec: 569.88 - lr: 0.000029 - momentum: 0.000000
2023-10-11 09:36:01,201 epoch 9 - iter 65/136 - loss 0.15027000 - time (sec): 40.91 - samples/sec: 567.62 - lr: 0.000028 - momentum: 0.000000
2023-10-11 09:36:10,032 epoch 9 - iter 78/136 - loss 0.14053266 - time (sec): 49.74 - samples/sec: 575.66 - lr: 0.000026 - momentum: 0.000000
2023-10-11 09:36:18,379 epoch 9 - iter 91/136 - loss 0.14000543 - time (sec): 58.09 - samples/sec: 576.16 - lr: 0.000024 - momentum: 0.000000
2023-10-11 09:36:27,550 epoch 9 - iter 104/136 - loss 0.13747365 - time (sec): 67.26 - samples/sec: 583.59 - lr: 0.000023 - momentum: 0.000000
2023-10-11 09:36:36,206 epoch 9 - iter 117/136 - loss 0.13158513 - time (sec): 75.92 - samples/sec: 586.65 - lr: 0.000021 - momentum: 0.000000
2023-10-11 09:36:44,906 epoch 9 - iter 130/136 - loss 0.12865231 - time (sec): 84.62 - samples/sec: 586.80 - lr: 0.000019 - momentum: 0.000000
2023-10-11 09:36:48,811 ----------------------------------------------------------------------------------------------------
2023-10-11 09:36:48,811 EPOCH 9 done: loss 0.1281 - lr: 0.000019
2023-10-11 09:36:54,438 DEV : loss 0.18203255534172058 - f1-score (micro avg) 0.6004
2023-10-11 09:36:54,447 saving best model
2023-10-11 09:36:57,285 ----------------------------------------------------------------------------------------------------
2023-10-11 09:37:06,059 epoch 10 - iter 13/136 - loss 0.14217771 - time (sec): 8.77 - samples/sec: 584.97 - lr: 0.000017 - momentum: 0.000000
2023-10-11 09:37:14,553 epoch 10 - iter 26/136 - loss 0.13969245 - time (sec): 17.26 - samples/sec: 567.04 - lr: 0.000015 - momentum: 0.000000
2023-10-11 09:37:22,983 epoch 10 - iter 39/136 - loss 0.13303950 - time (sec): 25.69 - samples/sec: 563.53 - lr: 0.000013 - momentum: 0.000000
2023-10-11 09:37:31,331 epoch 10 - iter 52/136 - loss 0.13098438 - time (sec): 34.04 - samples/sec: 556.79 - lr: 0.000012 - momentum: 0.000000
2023-10-11 09:37:40,514 epoch 10 - iter 65/136 - loss 0.12985671 - time (sec): 43.22 - samples/sec: 571.90 - lr: 0.000010 - momentum: 0.000000
2023-10-11 09:37:49,115 epoch 10 - iter 78/136 - loss 0.12227963 - time (sec): 51.83 - samples/sec: 576.27 - lr: 0.000008 - momentum: 0.000000
2023-10-11 09:37:57,502 epoch 10 - iter 91/136 - loss 0.12446202 - time (sec): 60.21 - samples/sec: 576.03 - lr: 0.000007 - momentum: 0.000000
2023-10-11 09:38:06,175 epoch 10 - iter 104/136 - loss 0.12499547 - time (sec): 68.89 - samples/sec: 575.22 - lr: 0.000005 - momentum: 0.000000
2023-10-11 09:38:14,736 epoch 10 - iter 117/136 - loss 0.12096331 - time (sec): 77.45 - samples/sec: 578.51 - lr: 0.000003 - momentum: 0.000000
2023-10-11 09:38:23,127 epoch 10 - iter 130/136 - loss 0.12143672 - time (sec): 85.84 - samples/sec: 578.24 - lr: 0.000002 - momentum: 0.000000
2023-10-11 09:38:26,880 ----------------------------------------------------------------------------------------------------
2023-10-11 09:38:26,880 EPOCH 10 done: loss 0.1213 - lr: 0.000002
2023-10-11 09:38:32,414 DEV : loss 0.17927290499210358 - f1-score (micro avg) 0.5942
2023-10-11 09:38:33,260 ----------------------------------------------------------------------------------------------------
2023-10-11 09:38:33,262 Loading model from best epoch ...
2023-10-11 09:38:36,955 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd, S-ORG, B-ORG, E-ORG, I-ORG
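The 17-entry tag dictionary is the BIOES scheme over the four entity types; it can be reproduced exactly, in the order printed above:

```python
# Rebuild the 17-tag BIOES dictionary: O plus S/B/E/I per entity type.
entity_types = ["LOC", "PER", "HumanProd", "ORG"]
tags = ["O"] + [f"{p}-{t}" for t in entity_types for p in "SBEI"]

print(len(tags))  # 17
print(tags[:5])   # ['O', 'S-LOC', 'B-LOC', 'E-LOC', 'I-LOC']
```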
2023-10-11 09:38:48,596
Results:
- F-score (micro) 0.5031
- F-score (macro) 0.3361
- Accuracy 0.3827

By class:
              precision    recall  f1-score   support

         LOC     0.5573    0.6859    0.6149       312
         PER     0.4034    0.4615    0.4305       208
   HumanProd     0.2000    0.5909    0.2989        22
         ORG     0.0000    0.0000    0.0000        55

   micro avg     0.4702    0.5410    0.5031       597
   macro avg     0.2902    0.4346    0.3361       597
weighted avg     0.4392    0.5410    0.4824       597
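As a consistency check, the micro-averaged F1 is the harmonic mean of the micro precision and recall from the row above:

```python
# Micro precision/recall taken from the "micro avg" row of the report.
precision, recall = 0.4702, 0.5410
f1 = 2 * precision * recall / (precision + recall)
print(round(f1, 4))  # 0.5031
```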
2023-10-11 09:38:48,596 ----------------------------------------------------------------------------------------------------