2023-10-11 08:10:20,580 ----------------------------------------------------------------------------------------------------
2023-10-11 08:10:20,582 Model: "SequenceTagger(
  (embeddings): ByT5Embeddings(
    (model): T5EncoderModel(
      (shared): Embedding(384, 1472)
      (encoder): T5Stack(
        (embed_tokens): Embedding(384, 1472)
        (block): ModuleList(
          (0): T5Block(
            (layer): ModuleList(
              (0): T5LayerSelfAttention(
                (SelfAttention): T5Attention(
                  (q): Linear(in_features=1472, out_features=384, bias=False)
                  (k): Linear(in_features=1472, out_features=384, bias=False)
                  (v): Linear(in_features=1472, out_features=384, bias=False)
                  (o): Linear(in_features=384, out_features=1472, bias=False)
                  (relative_attention_bias): Embedding(32, 6)
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (1): T5LayerFF(
                (DenseReluDense): T5DenseGatedActDense(
                  (wi_0): Linear(in_features=1472, out_features=3584, bias=False)
                  (wi_1): Linear(in_features=1472, out_features=3584, bias=False)
                  (wo): Linear(in_features=3584, out_features=1472, bias=False)
                  (dropout): Dropout(p=0.1, inplace=False)
                  (act): NewGELUActivation()
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
          )
          (1-11): 11 x T5Block(
            (layer): ModuleList(
              (0): T5LayerSelfAttention(
                (SelfAttention): T5Attention(
                  (q): Linear(in_features=1472, out_features=384, bias=False)
                  (k): Linear(in_features=1472, out_features=384, bias=False)
                  (v): Linear(in_features=1472, out_features=384, bias=False)
                  (o): Linear(in_features=384, out_features=1472, bias=False)
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (1): T5LayerFF(
                (DenseReluDense): T5DenseGatedActDense(
                  (wi_0): Linear(in_features=1472, out_features=3584, bias=False)
                  (wi_1): Linear(in_features=1472, out_features=3584, bias=False)
                  (wo): Linear(in_features=3584, out_features=1472, bias=False)
                  (dropout): Dropout(p=0.1, inplace=False)
                  (act): NewGELUActivation()
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
          )
        )
        (final_layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=1472, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
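From the shapes in the dump above (d_model=1472, 384-dim attention projections, d_ff=3584, a 384-entry byte vocabulary, 12 blocks, and a relative_attention_bias only in block 0) the encoder's parameter count can be estimated. This is a rough sketch under the assumption that `shared` and `embed_tokens` are the same tied weight, counting only the modules printed:

```python
# Rough parameter count from the dumped module shapes (a sketch;
# assumes the shared embedding is tied to embed_tokens).
d_model, d_attn, d_ff, vocab, blocks = 1472, 384, 3584, 384, 12

attn = 3 * d_model * d_attn + d_attn * d_model   # q, k, v projections plus o
ff = 2 * d_model * d_ff + d_ff * d_model         # wi_0, wi_1, wo
norms = 2 * d_model                              # two RMSNorm weight vectors per block
per_block = attn + ff + norms

total = (
    blocks * per_block
    + 32 * 6            # relative_attention_bias in block 0
    + vocab * d_model   # tied token embedding
    + d_model           # final_layer_norm
)
print(f"{total / 1e6:.1f}M parameters")  # prints "217.7M parameters"
```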
2023-10-11 08:10:20,582 ----------------------------------------------------------------------------------------------------
2023-10-11 08:10:20,582 MultiCorpus: 1085 train + 148 dev + 364 test sentences
 - NER_HIPE_2022 Corpus: 1085 train + 148 dev + 364 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/sv/with_doc_seperator
2023-10-11 08:10:20,582 ----------------------------------------------------------------------------------------------------
2023-10-11 08:10:20,582 Train:  1085 sentences
2023-10-11 08:10:20,582         (train_with_dev=False, train_with_test=False)
2023-10-11 08:10:20,583 ----------------------------------------------------------------------------------------------------
2023-10-11 08:10:20,583 Training Params:
2023-10-11 08:10:20,583  - learning_rate: "0.00016"
2023-10-11 08:10:20,583  - mini_batch_size: "8"
2023-10-11 08:10:20,583  - max_epochs: "10"
2023-10-11 08:10:20,583  - shuffle: "True"
2023-10-11 08:10:20,583 ----------------------------------------------------------------------------------------------------
2023-10-11 08:10:20,583 Plugins:
2023-10-11 08:10:20,583  - TensorboardLogger
2023-10-11 08:10:20,583  - LinearScheduler | warmup_fraction: '0.1'
2023-10-11 08:10:20,583 ----------------------------------------------------------------------------------------------------
2023-10-11 08:10:20,583 Final evaluation on model from best epoch (best-model.pt)
2023-10-11 08:10:20,583  - metric: "('micro avg', 'f1-score')"
2023-10-11 08:10:20,583 ----------------------------------------------------------------------------------------------------
2023-10-11 08:10:20,584 Computation:
2023-10-11 08:10:20,584  - compute on device: cuda:0
2023-10-11 08:10:20,584  - embedding storage: none
2023-10-11 08:10:20,584 ----------------------------------------------------------------------------------------------------
2023-10-11 08:10:20,584 Model training base path: "hmbench-newseye/sv-hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax-bs8-wsFalse-e10-lr0.00016-poolingfirst-layers-1-crfFalse-1"
2023-10-11 08:10:20,584 ----------------------------------------------------------------------------------------------------
2023-10-11 08:10:20,584 ----------------------------------------------------------------------------------------------------
2023-10-11 08:10:20,584 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-11 08:10:29,004 epoch 1 - iter 13/136 - loss 2.83008951 - time (sec): 8.42 - samples/sec: 585.64 - lr: 0.000014 - momentum: 0.000000
2023-10-11 08:10:37,673 epoch 1 - iter 26/136 - loss 2.82368887 - time (sec): 17.09 - samples/sec: 600.40 - lr: 0.000029 - momentum: 0.000000
2023-10-11 08:10:46,719 epoch 1 - iter 39/136 - loss 2.81340369 - time (sec): 26.13 - samples/sec: 596.28 - lr: 0.000045 - momentum: 0.000000
2023-10-11 08:10:55,351 epoch 1 - iter 52/136 - loss 2.79752564 - time (sec): 34.77 - samples/sec: 594.06 - lr: 0.000060 - momentum: 0.000000
2023-10-11 08:11:03,384 epoch 1 - iter 65/136 - loss 2.77294396 - time (sec): 42.80 - samples/sec: 582.92 - lr: 0.000075 - momentum: 0.000000
2023-10-11 08:11:12,489 epoch 1 - iter 78/136 - loss 2.72088276 - time (sec): 51.90 - samples/sec: 584.10 - lr: 0.000091 - momentum: 0.000000
2023-10-11 08:11:21,313 epoch 1 - iter 91/136 - loss 2.65722387 - time (sec): 60.73 - samples/sec: 580.82 - lr: 0.000106 - momentum: 0.000000
2023-10-11 08:11:30,471 epoch 1 - iter 104/136 - loss 2.57792535 - time (sec): 69.89 - samples/sec: 581.99 - lr: 0.000121 - momentum: 0.000000
2023-10-11 08:11:39,124 epoch 1 - iter 117/136 - loss 2.49490382 - time (sec): 78.54 - samples/sec: 584.23 - lr: 0.000136 - momentum: 0.000000
2023-10-11 08:11:47,132 epoch 1 - iter 130/136 - loss 2.42291945 - time (sec): 86.55 - samples/sec: 580.68 - lr: 0.000152 - momentum: 0.000000
2023-10-11 08:11:50,587 ----------------------------------------------------------------------------------------------------
2023-10-11 08:11:50,588 EPOCH 1 done: loss 2.3906 - lr: 0.000152
2023-10-11 08:11:55,498 DEV : loss 1.359419345855713 - f1-score (micro avg)  0.0
2023-10-11 08:11:55,507 ----------------------------------------------------------------------------------------------------
2023-10-11 08:12:03,838 epoch 2 - iter 13/136 - loss 1.34412946 - time (sec): 8.33 - samples/sec: 563.42 - lr: 0.000158 - momentum: 0.000000
2023-10-11 08:12:12,311 epoch 2 - iter 26/136 - loss 1.24937968 - time (sec): 16.80 - samples/sec: 571.46 - lr: 0.000157 - momentum: 0.000000
2023-10-11 08:12:21,109 epoch 2 - iter 39/136 - loss 1.17010663 - time (sec): 25.60 - samples/sec: 587.53 - lr: 0.000155 - momentum: 0.000000
2023-10-11 08:12:29,655 epoch 2 - iter 52/136 - loss 1.07940129 - time (sec): 34.15 - samples/sec: 591.20 - lr: 0.000153 - momentum: 0.000000
2023-10-11 08:12:37,608 epoch 2 - iter 65/136 - loss 1.02694391 - time (sec): 42.10 - samples/sec: 579.66 - lr: 0.000152 - momentum: 0.000000
2023-10-11 08:12:46,472 epoch 2 - iter 78/136 - loss 0.97987925 - time (sec): 50.96 - samples/sec: 587.74 - lr: 0.000150 - momentum: 0.000000
2023-10-11 08:12:55,008 epoch 2 - iter 91/136 - loss 0.93022685 - time (sec): 59.50 - samples/sec: 585.55 - lr: 0.000148 - momentum: 0.000000
2023-10-11 08:13:03,896 epoch 2 - iter 104/136 - loss 0.88253436 - time (sec): 68.39 - samples/sec: 587.50 - lr: 0.000147 - momentum: 0.000000
2023-10-11 08:13:12,864 epoch 2 - iter 117/136 - loss 0.83798555 - time (sec): 77.36 - samples/sec: 587.97 - lr: 0.000145 - momentum: 0.000000
2023-10-11 08:13:21,277 epoch 2 - iter 130/136 - loss 0.80219015 - time (sec): 85.77 - samples/sec: 585.96 - lr: 0.000143 - momentum: 0.000000
2023-10-11 08:13:24,864 ----------------------------------------------------------------------------------------------------
2023-10-11 08:13:24,864 EPOCH 2 done: loss 0.7897 - lr: 0.000143
2023-10-11 08:13:30,879 DEV : loss 0.3818140923976898 - f1-score (micro avg)  0.0
2023-10-11 08:13:30,887 ----------------------------------------------------------------------------------------------------
2023-10-11 08:13:38,973 epoch 3 - iter 13/136 - loss 0.43805734 - time (sec): 8.08 - samples/sec: 479.35 - lr: 0.000141 - momentum: 0.000000
2023-10-11 08:13:47,955 epoch 3 - iter 26/136 - loss 0.41234292 - time (sec): 17.07 - samples/sec: 518.27 - lr: 0.000139 - momentum: 0.000000
2023-10-11 08:13:56,633 epoch 3 - iter 39/136 - loss 0.40579529 - time (sec): 25.74 - samples/sec: 537.68 - lr: 0.000137 - momentum: 0.000000
2023-10-11 08:14:05,469 epoch 3 - iter 52/136 - loss 0.39245856 - time (sec): 34.58 - samples/sec: 544.04 - lr: 0.000136 - momentum: 0.000000
2023-10-11 08:14:14,347 epoch 3 - iter 65/136 - loss 0.39994977 - time (sec): 43.46 - samples/sec: 552.71 - lr: 0.000134 - momentum: 0.000000
2023-10-11 08:14:22,952 epoch 3 - iter 78/136 - loss 0.38752465 - time (sec): 52.06 - samples/sec: 555.44 - lr: 0.000132 - momentum: 0.000000
2023-10-11 08:14:32,366 epoch 3 - iter 91/136 - loss 0.38301996 - time (sec): 61.48 - samples/sec: 564.22 - lr: 0.000131 - momentum: 0.000000
2023-10-11 08:14:40,970 epoch 3 - iter 104/136 - loss 0.38396675 - time (sec): 70.08 - samples/sec: 564.65 - lr: 0.000129 - momentum: 0.000000
2023-10-11 08:14:50,261 epoch 3 - iter 117/136 - loss 0.37265386 - time (sec): 79.37 - samples/sec: 569.43 - lr: 0.000127 - momentum: 0.000000
2023-10-11 08:14:58,659 epoch 3 - iter 130/136 - loss 0.36462475 - time (sec): 87.77 - samples/sec: 568.86 - lr: 0.000126 - momentum: 0.000000
2023-10-11 08:15:02,285 ----------------------------------------------------------------------------------------------------
2023-10-11 08:15:02,285 EPOCH 3 done: loss 0.3658 - lr: 0.000126
2023-10-11 08:15:08,253 DEV : loss 0.2684977948665619 - f1-score (micro avg)  0.3173
2023-10-11 08:15:08,261 saving best model
2023-10-11 08:15:09,130 ----------------------------------------------------------------------------------------------------
2023-10-11 08:15:17,061 epoch 4 - iter 13/136 - loss 0.33221349 - time (sec): 7.93 - samples/sec: 539.80 - lr: 0.000123 - momentum: 0.000000
2023-10-11 08:15:26,385 epoch 4 - iter 26/136 - loss 0.30977777 - time (sec): 17.25 - samples/sec: 595.15 - lr: 0.000121 - momentum: 0.000000
2023-10-11 08:15:34,738 epoch 4 - iter 39/136 - loss 0.30789013 - time (sec): 25.61 - samples/sec: 588.82 - lr: 0.000120 - momentum: 0.000000
2023-10-11 08:15:43,242 epoch 4 - iter 52/136 - loss 0.29487947 - time (sec): 34.11 - samples/sec: 587.66 - lr: 0.000118 - momentum: 0.000000
2023-10-11 08:15:51,970 epoch 4 - iter 65/136 - loss 0.27112810 - time (sec): 42.84 - samples/sec: 595.38 - lr: 0.000116 - momentum: 0.000000
2023-10-11 08:16:00,574 epoch 4 - iter 78/136 - loss 0.27205837 - time (sec): 51.44 - samples/sec: 590.14 - lr: 0.000115 - momentum: 0.000000
2023-10-11 08:16:09,933 epoch 4 - iter 91/136 - loss 0.26009433 - time (sec): 60.80 - samples/sec: 593.62 - lr: 0.000113 - momentum: 0.000000
2023-10-11 08:16:18,318 epoch 4 - iter 104/136 - loss 0.25642960 - time (sec): 69.19 - samples/sec: 588.68 - lr: 0.000111 - momentum: 0.000000
2023-10-11 08:16:27,314 epoch 4 - iter 117/136 - loss 0.26043586 - time (sec): 78.18 - samples/sec: 587.27 - lr: 0.000109 - momentum: 0.000000
2023-10-11 08:16:35,665 epoch 4 - iter 130/136 - loss 0.26681548 - time (sec): 86.53 - samples/sec: 585.45 - lr: 0.000108 - momentum: 0.000000
2023-10-11 08:16:38,680 ----------------------------------------------------------------------------------------------------
2023-10-11 08:16:38,681 EPOCH 4 done: loss 0.2657 - lr: 0.000108
2023-10-11 08:16:44,299 DEV : loss 0.2195644974708557 - f1-score (micro avg)  0.4686
2023-10-11 08:16:44,307 saving best model
2023-10-11 08:16:46,842 ----------------------------------------------------------------------------------------------------
2023-10-11 08:16:55,497 epoch 5 - iter 13/136 - loss 0.19784218 - time (sec): 8.65 - samples/sec: 614.63 - lr: 0.000105 - momentum: 0.000000
2023-10-11 08:17:03,906 epoch 5 - iter 26/136 - loss 0.20836578 - time (sec): 17.06 - samples/sec: 576.51 - lr: 0.000104 - momentum: 0.000000
2023-10-11 08:17:12,106 epoch 5 - iter 39/136 - loss 0.22776639 - time (sec): 25.26 - samples/sec: 563.39 - lr: 0.000102 - momentum: 0.000000
2023-10-11 08:17:20,882 epoch 5 - iter 52/136 - loss 0.21597829 - time (sec): 34.04 - samples/sec: 571.63 - lr: 0.000100 - momentum: 0.000000
2023-10-11 08:17:29,434 epoch 5 - iter 65/136 - loss 0.21210043 - time (sec): 42.59 - samples/sec: 573.83 - lr: 0.000099 - momentum: 0.000000
2023-10-11 08:17:38,652 epoch 5 - iter 78/136 - loss 0.21109159 - time (sec): 51.81 - samples/sec: 565.16 - lr: 0.000097 - momentum: 0.000000
2023-10-11 08:17:47,133 epoch 5 - iter 91/136 - loss 0.20674644 - time (sec): 60.29 - samples/sec: 559.84 - lr: 0.000095 - momentum: 0.000000
2023-10-11 08:17:56,654 epoch 5 - iter 104/136 - loss 0.20252494 - time (sec): 69.81 - samples/sec: 566.80 - lr: 0.000093 - momentum: 0.000000
2023-10-11 08:18:05,375 epoch 5 - iter 117/136 - loss 0.20014526 - time (sec): 78.53 - samples/sec: 564.73 - lr: 0.000092 - momentum: 0.000000
2023-10-11 08:18:15,479 epoch 5 - iter 130/136 - loss 0.19880601 - time (sec): 88.63 - samples/sec: 560.94 - lr: 0.000090 - momentum: 0.000000
2023-10-11 08:18:19,494 ----------------------------------------------------------------------------------------------------
2023-10-11 08:18:19,494 EPOCH 5 done: loss 0.1981 - lr: 0.000090
2023-10-11 08:18:25,593 DEV : loss 0.18384359776973724 - f1-score (micro avg)  0.6234
2023-10-11 08:18:25,602 saving best model
2023-10-11 08:18:28,201 ----------------------------------------------------------------------------------------------------
2023-10-11 08:18:37,304 epoch 6 - iter 13/136 - loss 0.14650725 - time (sec): 9.10 - samples/sec: 556.15 - lr: 0.000088 - momentum: 0.000000
2023-10-11 08:18:46,013 epoch 6 - iter 26/136 - loss 0.16471832 - time (sec): 17.81 - samples/sec: 533.37 - lr: 0.000086 - momentum: 0.000000
2023-10-11 08:18:55,340 epoch 6 - iter 39/136 - loss 0.17446982 - time (sec): 27.13 - samples/sec: 554.95 - lr: 0.000084 - momentum: 0.000000
2023-10-11 08:19:04,064 epoch 6 - iter 52/136 - loss 0.16576348 - time (sec): 35.86 - samples/sec: 554.79 - lr: 0.000083 - momentum: 0.000000
2023-10-11 08:19:12,449 epoch 6 - iter 65/136 - loss 0.16736478 - time (sec): 44.24 - samples/sec: 546.55 - lr: 0.000081 - momentum: 0.000000
2023-10-11 08:19:21,134 epoch 6 - iter 78/136 - loss 0.16608000 - time (sec): 52.93 - samples/sec: 546.61 - lr: 0.000079 - momentum: 0.000000
2023-10-11 08:19:29,900 epoch 6 - iter 91/136 - loss 0.16395224 - time (sec): 61.69 - samples/sec: 546.16 - lr: 0.000077 - momentum: 0.000000
2023-10-11 08:19:38,795 epoch 6 - iter 104/136 - loss 0.16307175 - time (sec): 70.59 - samples/sec: 548.09 - lr: 0.000076 - momentum: 0.000000
2023-10-11 08:19:48,719 epoch 6 - iter 117/136 - loss 0.15725774 - time (sec): 80.51 - samples/sec: 556.95 - lr: 0.000074 - momentum: 0.000000
2023-10-11 08:19:56,956 epoch 6 - iter 130/136 - loss 0.15389111 - time (sec): 88.75 - samples/sec: 554.58 - lr: 0.000072 - momentum: 0.000000
2023-10-11 08:20:01,203 ----------------------------------------------------------------------------------------------------
2023-10-11 08:20:01,204 EPOCH 6 done: loss 0.1515 - lr: 0.000072
2023-10-11 08:20:07,027 DEV : loss 0.16635040938854218 - f1-score (micro avg)  0.6201
2023-10-11 08:20:07,036 ----------------------------------------------------------------------------------------------------
2023-10-11 08:20:14,907 epoch 7 - iter 13/136 - loss 0.15338606 - time (sec): 7.87 - samples/sec: 471.84 - lr: 0.000070 - momentum: 0.000000
2023-10-11 08:20:24,730 epoch 7 - iter 26/136 - loss 0.14255855 - time (sec): 17.69 - samples/sec: 571.99 - lr: 0.000068 - momentum: 0.000000
2023-10-11 08:20:33,927 epoch 7 - iter 39/136 - loss 0.12679887 - time (sec): 26.89 - samples/sec: 579.04 - lr: 0.000067 - momentum: 0.000000
2023-10-11 08:20:42,728 epoch 7 - iter 52/136 - loss 0.13133836 - time (sec): 35.69 - samples/sec: 577.13 - lr: 0.000065 - momentum: 0.000000
2023-10-11 08:20:51,857 epoch 7 - iter 65/136 - loss 0.13057029 - time (sec): 44.82 - samples/sec: 579.70 - lr: 0.000063 - momentum: 0.000000
2023-10-11 08:21:00,459 epoch 7 - iter 78/136 - loss 0.12851078 - time (sec): 53.42 - samples/sec: 576.71 - lr: 0.000061 - momentum: 0.000000
2023-10-11 08:21:09,029 epoch 7 - iter 91/136 - loss 0.12530179 - time (sec): 61.99 - samples/sec: 575.03 - lr: 0.000060 - momentum: 0.000000
2023-10-11 08:21:17,073 epoch 7 - iter 104/136 - loss 0.12324245 - time (sec): 70.04 - samples/sec: 568.12 - lr: 0.000058 - momentum: 0.000000
2023-10-11 08:21:25,676 epoch 7 - iter 117/136 - loss 0.12249819 - time (sec): 78.64 - samples/sec: 569.42 - lr: 0.000056 - momentum: 0.000000
2023-10-11 08:21:34,239 epoch 7 - iter 130/136 - loss 0.11928789 - time (sec): 87.20 - samples/sec: 568.99 - lr: 0.000055 - momentum: 0.000000
2023-10-11 08:21:38,203 ----------------------------------------------------------------------------------------------------
2023-10-11 08:21:38,203 EPOCH 7 done: loss 0.1190 - lr: 0.000055
2023-10-11 08:21:44,012 DEV : loss 0.15652361512184143 - f1-score (micro avg)  0.6535
2023-10-11 08:21:44,021 saving best model
2023-10-11 08:21:46,574 ----------------------------------------------------------------------------------------------------
2023-10-11 08:21:55,242 epoch 8 - iter 13/136 - loss 0.11592295 - time (sec): 8.66 - samples/sec: 569.40 - lr: 0.000052 - momentum: 0.000000
2023-10-11 08:22:03,447 epoch 8 - iter 26/136 - loss 0.10352624 - time (sec): 16.87 - samples/sec: 566.94 - lr: 0.000051 - momentum: 0.000000
2023-10-11 08:22:12,831 epoch 8 - iter 39/136 - loss 0.11076953 - time (sec): 26.25 - samples/sec: 586.80 - lr: 0.000049 - momentum: 0.000000
2023-10-11 08:22:21,315 epoch 8 - iter 52/136 - loss 0.10546970 - time (sec): 34.74 - samples/sec: 571.88 - lr: 0.000047 - momentum: 0.000000
2023-10-11 08:22:30,565 epoch 8 - iter 65/136 - loss 0.10215561 - time (sec): 43.99 - samples/sec: 572.24 - lr: 0.000045 - momentum: 0.000000
2023-10-11 08:22:39,732 epoch 8 - iter 78/136 - loss 0.10068370 - time (sec): 53.15 - samples/sec: 569.00 - lr: 0.000044 - momentum: 0.000000
2023-10-11 08:22:48,381 epoch 8 - iter 91/136 - loss 0.10082195 - time (sec): 61.80 - samples/sec: 563.30 - lr: 0.000042 - momentum: 0.000000
2023-10-11 08:22:57,403 epoch 8 - iter 104/136 - loss 0.09810977 - time (sec): 70.82 - samples/sec: 564.68 - lr: 0.000040 - momentum: 0.000000
2023-10-11 08:23:05,932 epoch 8 - iter 117/136 - loss 0.09729601 - time (sec): 79.35 - samples/sec: 560.82 - lr: 0.000039 - momentum: 0.000000
2023-10-11 08:23:15,356 epoch 8 - iter 130/136 - loss 0.09850851 - time (sec): 88.78 - samples/sec: 563.38 - lr: 0.000037 - momentum: 0.000000
2023-10-11 08:23:19,046 ----------------------------------------------------------------------------------------------------
2023-10-11 08:23:19,047 EPOCH 8 done: loss 0.0980 - lr: 0.000037
2023-10-11 08:23:25,011 DEV : loss 0.14729855954647064 - f1-score (micro avg)  0.6524
2023-10-11 08:23:25,019 ----------------------------------------------------------------------------------------------------
2023-10-11 08:23:32,916 epoch 9 - iter 13/136 - loss 0.08399411 - time (sec): 7.89 - samples/sec: 522.49 - lr: 0.000034 - momentum: 0.000000
2023-10-11 08:23:42,217 epoch 9 - iter 26/136 - loss 0.07704439 - time (sec): 17.20 - samples/sec: 578.09 - lr: 0.000033 - momentum: 0.000000
2023-10-11 08:23:50,654 epoch 9 - iter 39/136 - loss 0.08056037 - time (sec): 25.63 - samples/sec: 578.79 - lr: 0.000031 - momentum: 0.000000
2023-10-11 08:23:59,442 epoch 9 - iter 52/136 - loss 0.08098512 - time (sec): 34.42 - samples/sec: 576.92 - lr: 0.000029 - momentum: 0.000000
2023-10-11 08:24:08,244 epoch 9 - iter 65/136 - loss 0.08177950 - time (sec): 43.22 - samples/sec: 578.77 - lr: 0.000028 - momentum: 0.000000
2023-10-11 08:24:17,310 epoch 9 - iter 78/136 - loss 0.08167083 - time (sec): 52.29 - samples/sec: 583.52 - lr: 0.000026 - momentum: 0.000000
2023-10-11 08:24:26,105 epoch 9 - iter 91/136 - loss 0.08275593 - time (sec): 61.08 - samples/sec: 578.63 - lr: 0.000024 - momentum: 0.000000
2023-10-11 08:24:34,647 epoch 9 - iter 104/136 - loss 0.08395371 - time (sec): 69.63 - samples/sec: 572.54 - lr: 0.000023 - momentum: 0.000000
2023-10-11 08:24:43,409 epoch 9 - iter 117/136 - loss 0.08520719 - time (sec): 78.39 - samples/sec: 566.16 - lr: 0.000021 - momentum: 0.000000
2023-10-11 08:24:52,367 epoch 9 - iter 130/136 - loss 0.08850575 - time (sec): 87.35 - samples/sec: 568.06 - lr: 0.000019 - momentum: 0.000000
2023-10-11 08:24:56,600 ----------------------------------------------------------------------------------------------------
2023-10-11 08:24:56,601 EPOCH 9 done: loss 0.0874 - lr: 0.000019
2023-10-11 08:25:02,537 DEV : loss 0.14513665437698364 - f1-score (micro avg)  0.6908
2023-10-11 08:25:02,545 saving best model
2023-10-11 08:25:05,116 ----------------------------------------------------------------------------------------------------
2023-10-11 08:25:13,810 epoch 10 - iter 13/136 - loss 0.06488614 - time (sec): 8.69 - samples/sec: 535.05 - lr: 0.000017 - momentum: 0.000000
2023-10-11 08:25:22,179 epoch 10 - iter 26/136 - loss 0.06997266 - time (sec): 17.06 - samples/sec: 511.30 - lr: 0.000015 - momentum: 0.000000
2023-10-11 08:25:30,800 epoch 10 - iter 39/136 - loss 0.07622472 - time (sec): 25.68 - samples/sec: 510.60 - lr: 0.000013 - momentum: 0.000000
2023-10-11 08:25:40,000 epoch 10 - iter 52/136 - loss 0.07291127 - time (sec): 34.88 - samples/sec: 527.10 - lr: 0.000012 - momentum: 0.000000
2023-10-11 08:25:49,353 epoch 10 - iter 65/136 - loss 0.07368018 - time (sec): 44.23 - samples/sec: 546.46 - lr: 0.000010 - momentum: 0.000000
2023-10-11 08:25:59,836 epoch 10 - iter 78/136 - loss 0.07542636 - time (sec): 54.72 - samples/sec: 565.14 - lr: 0.000008 - momentum: 0.000000
2023-10-11 08:26:09,182 epoch 10 - iter 91/136 - loss 0.07656023 - time (sec): 64.06 - samples/sec: 568.36 - lr: 0.000007 - momentum: 0.000000
2023-10-11 08:26:18,254 epoch 10 - iter 104/136 - loss 0.07981575 - time (sec): 73.13 - samples/sec: 558.65 - lr: 0.000005 - momentum: 0.000000
2023-10-11 08:26:26,888 epoch 10 - iter 117/136 - loss 0.07941501 - time (sec): 81.77 - samples/sec: 554.61 - lr: 0.000003 - momentum: 0.000000
2023-10-11 08:26:35,531 epoch 10 - iter 130/136 - loss 0.07955517 - time (sec): 90.41 - samples/sec: 550.38 - lr: 0.000002 - momentum: 0.000000
2023-10-11 08:26:39,290 ----------------------------------------------------------------------------------------------------
2023-10-11 08:26:39,290 EPOCH 10 done: loss 0.0802 - lr: 0.000002
2023-10-11 08:26:45,344 DEV : loss 0.14399686455726624 - f1-score (micro avg)  0.7063
2023-10-11 08:26:45,353 saving best model
2023-10-11 08:26:52,577 ----------------------------------------------------------------------------------------------------
2023-10-11 08:26:52,579 Loading model from best epoch ...
2023-10-11 08:26:57,447 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd, S-ORG, B-ORG, E-ORG, I-ORG
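The 17 tags follow the BIOES scheme (S for a single-token entity, B/I/E for begin/inside/end of a multi-token one, O for outside) over four entity types: LOC, PER, HumanProd, ORG. A minimal decoder from such a tag sequence to entity spans, as a sketch (the function name is ours, not part of Flair):

```python
def bioes_to_spans(tags):
    """Decode a BIOES tag sequence into (start, end_exclusive, label) spans (a sketch)."""
    spans, start = [], None
    for i, tag in enumerate(tags):
        prefix, _, label = tag.partition("-")
        if prefix == "S":          # single-token entity
            spans.append((i, i + 1, label))
            start = None
        elif prefix == "B":        # entity begins
            start = i
        elif prefix == "E" and start is not None:  # entity ends
            spans.append((start, i + 1, label))
            start = None
        elif prefix == "O":        # outside any entity
            start = None
    return spans

print(bioes_to_spans(["O", "S-LOC", "B-PER", "I-PER", "E-PER"]))
# [(1, 2, 'LOC'), (2, 5, 'PER')]
```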
2023-10-11 08:27:09,616
Results:
- F-score (micro) 0.6682
- F-score (macro) 0.4708
- Accuracy 0.5556

By class:
                   precision    recall  f1-score   support

            LOC       0.6383    0.8654    0.7347       312
            PER       0.7249    0.6587    0.6902       208
      HumanProd       0.2931    0.7727    0.4250        22
            ORG       0.2000    0.0182    0.0333        55

      micro avg       0.6296    0.7119    0.6682       597
      macro avg       0.4641    0.5787    0.4708       597
   weighted avg       0.6154    0.7119    0.6432       597
2023-10-11 08:27:09,617 ----------------------------------------------------------------------------------------------------