|
2023-10-11 11:42:52,153 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 11:42:52,155 Model: "SequenceTagger( |
|
(embeddings): ByT5Embeddings( |
|
(model): T5EncoderModel( |
|
(shared): Embedding(384, 1472) |
|
(encoder): T5Stack( |
|
(embed_tokens): Embedding(384, 1472) |
|
(block): ModuleList( |
|
(0): T5Block( |
|
(layer): ModuleList( |
|
(0): T5LayerSelfAttention( |
|
(SelfAttention): T5Attention( |
|
(q): Linear(in_features=1472, out_features=384, bias=False) |
|
(k): Linear(in_features=1472, out_features=384, bias=False) |
|
(v): Linear(in_features=1472, out_features=384, bias=False) |
|
(o): Linear(in_features=384, out_features=1472, bias=False) |
|
(relative_attention_bias): Embedding(32, 6) |
|
) |
|
(layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
(1): T5LayerFF( |
|
(DenseReluDense): T5DenseGatedActDense( |
|
(wi_0): Linear(in_features=1472, out_features=3584, bias=False) |
|
(wi_1): Linear(in_features=1472, out_features=3584, bias=False) |
|
(wo): Linear(in_features=3584, out_features=1472, bias=False) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
(act): NewGELUActivation() |
|
) |
|
(layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
) |
|
(1-11): 11 x T5Block( |
|
(layer): ModuleList( |
|
(0): T5LayerSelfAttention( |
|
(SelfAttention): T5Attention( |
|
(q): Linear(in_features=1472, out_features=384, bias=False) |
|
(k): Linear(in_features=1472, out_features=384, bias=False) |
|
(v): Linear(in_features=1472, out_features=384, bias=False) |
|
(o): Linear(in_features=384, out_features=1472, bias=False) |
|
) |
|
(layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
(1): T5LayerFF( |
|
(DenseReluDense): T5DenseGatedActDense( |
|
(wi_0): Linear(in_features=1472, out_features=3584, bias=False) |
|
(wi_1): Linear(in_features=1472, out_features=3584, bias=False) |
|
(wo): Linear(in_features=3584, out_features=1472, bias=False) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
(act): NewGELUActivation() |
|
) |
|
(layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
) |
|
) |
|
(final_layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
) |
|
(locked_dropout): LockedDropout(p=0.5) |
|
(linear): Linear(in_features=1472, out_features=17, bias=True) |
|
(loss_function): CrossEntropyLoss() |
|
)" |
|
2023-10-11 11:42:52,155 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 11:42:52,156 MultiCorpus: 1085 train + 148 dev + 364 test sentences |
|
- NER_HIPE_2022 Corpus: 1085 train + 148 dev + 364 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/sv/with_doc_seperator |
|
2023-10-11 11:42:52,156 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 11:42:52,156 Train: 1085 sentences |
|
2023-10-11 11:42:52,156 (train_with_dev=False, train_with_test=False) |
|
2023-10-11 11:42:52,156 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 11:42:52,156 Training Params: |
|
2023-10-11 11:42:52,156 - learning_rate: "0.00016" |
|
2023-10-11 11:42:52,156 - mini_batch_size: "8" |
|
2023-10-11 11:42:52,156 - max_epochs: "10" |
|
2023-10-11 11:42:52,156 - shuffle: "True" |
|
2023-10-11 11:42:52,156 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 11:42:52,156 Plugins: |
|
2023-10-11 11:42:52,156 - TensorboardLogger |
|
2023-10-11 11:42:52,157 - LinearScheduler | warmup_fraction: '0.1' |
|
2023-10-11 11:42:52,157 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 11:42:52,157 Final evaluation on model from best epoch (best-model.pt) |
|
2023-10-11 11:42:52,157 - metric: "('micro avg', 'f1-score')" |
|
2023-10-11 11:42:52,157 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 11:42:52,157 Computation: |
|
2023-10-11 11:42:52,157 - compute on device: cuda:0 |
|
2023-10-11 11:42:52,157 - embedding storage: none |
|
2023-10-11 11:42:52,157 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 11:42:52,157 Model training base path: "hmbench-newseye/sv-hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax-bs8-wsFalse-e10-lr0.00016-poolingfirst-layers-1-crfFalse-4" |
|
2023-10-11 11:42:52,157 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 11:42:52,157 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 11:42:52,157 Logging anything other than scalars to TensorBoard is currently not supported. |
|
2023-10-11 11:43:00,927 epoch 1 - iter 13/136 - loss 2.84089892 - time (sec): 8.77 - samples/sec: 615.10 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-11 11:43:10,129 epoch 1 - iter 26/136 - loss 2.83498049 - time (sec): 17.97 - samples/sec: 615.85 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-11 11:43:18,889 epoch 1 - iter 39/136 - loss 2.82406903 - time (sec): 26.73 - samples/sec: 601.04 - lr: 0.000045 - momentum: 0.000000 |
|
2023-10-11 11:43:27,568 epoch 1 - iter 52/136 - loss 2.80264022 - time (sec): 35.41 - samples/sec: 593.72 - lr: 0.000060 - momentum: 0.000000 |
|
2023-10-11 11:43:35,878 epoch 1 - iter 65/136 - loss 2.76592580 - time (sec): 43.72 - samples/sec: 582.84 - lr: 0.000075 - momentum: 0.000000 |
|
2023-10-11 11:43:44,185 epoch 1 - iter 78/136 - loss 2.71169051 - time (sec): 52.03 - samples/sec: 577.85 - lr: 0.000091 - momentum: 0.000000 |
|
2023-10-11 11:43:52,478 epoch 1 - iter 91/136 - loss 2.64797024 - time (sec): 60.32 - samples/sec: 571.71 - lr: 0.000106 - momentum: 0.000000 |
|
2023-10-11 11:44:01,392 epoch 1 - iter 104/136 - loss 2.56715392 - time (sec): 69.23 - samples/sec: 576.23 - lr: 0.000121 - momentum: 0.000000 |
|
2023-10-11 11:44:09,241 epoch 1 - iter 117/136 - loss 2.50180341 - time (sec): 77.08 - samples/sec: 572.72 - lr: 0.000136 - momentum: 0.000000 |
|
2023-10-11 11:44:18,110 epoch 1 - iter 130/136 - loss 2.41593244 - time (sec): 85.95 - samples/sec: 571.04 - lr: 0.000152 - momentum: 0.000000 |
|
2023-10-11 11:44:22,531 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 11:44:22,532 EPOCH 1 done: loss 2.3622 - lr: 0.000152 |
|
2023-10-11 11:44:27,578 DEV : loss 1.331149935722351 - f1-score (micro avg) 0.0 |
|
2023-10-11 11:44:27,587 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 11:44:36,368 epoch 2 - iter 13/136 - loss 1.32887985 - time (sec): 8.78 - samples/sec: 602.57 - lr: 0.000158 - momentum: 0.000000 |
|
2023-10-11 11:44:44,636 epoch 2 - iter 26/136 - loss 1.26733038 - time (sec): 17.05 - samples/sec: 574.22 - lr: 0.000157 - momentum: 0.000000 |
|
2023-10-11 11:44:52,303 epoch 2 - iter 39/136 - loss 1.20119028 - time (sec): 24.71 - samples/sec: 550.00 - lr: 0.000155 - momentum: 0.000000 |
|
2023-10-11 11:45:01,401 epoch 2 - iter 52/136 - loss 1.10723903 - time (sec): 33.81 - samples/sec: 560.91 - lr: 0.000153 - momentum: 0.000000 |
|
2023-10-11 11:45:11,107 epoch 2 - iter 65/136 - loss 1.02852991 - time (sec): 43.52 - samples/sec: 563.78 - lr: 0.000152 - momentum: 0.000000 |
|
2023-10-11 11:45:20,058 epoch 2 - iter 78/136 - loss 0.96147737 - time (sec): 52.47 - samples/sec: 564.12 - lr: 0.000150 - momentum: 0.000000 |
|
2023-10-11 11:45:28,380 epoch 2 - iter 91/136 - loss 0.90827453 - time (sec): 60.79 - samples/sec: 560.43 - lr: 0.000148 - momentum: 0.000000 |
|
2023-10-11 11:45:36,838 epoch 2 - iter 104/136 - loss 0.85911401 - time (sec): 69.25 - samples/sec: 560.82 - lr: 0.000147 - momentum: 0.000000 |
|
2023-10-11 11:45:44,748 epoch 2 - iter 117/136 - loss 0.83120687 - time (sec): 77.16 - samples/sec: 559.18 - lr: 0.000145 - momentum: 0.000000 |
|
2023-10-11 11:45:54,150 epoch 2 - iter 130/136 - loss 0.79282972 - time (sec): 86.56 - samples/sec: 570.10 - lr: 0.000143 - momentum: 0.000000 |
|
2023-10-11 11:45:58,234 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 11:45:58,235 EPOCH 2 done: loss 0.7761 - lr: 0.000143 |
|
2023-10-11 11:46:03,660 DEV : loss 0.3906053304672241 - f1-score (micro avg) 0.0 |
|
2023-10-11 11:46:03,668 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 11:46:12,037 epoch 3 - iter 13/136 - loss 0.34739828 - time (sec): 8.37 - samples/sec: 556.71 - lr: 0.000141 - momentum: 0.000000 |
|
2023-10-11 11:46:21,281 epoch 3 - iter 26/136 - loss 0.39816021 - time (sec): 17.61 - samples/sec: 600.01 - lr: 0.000139 - momentum: 0.000000 |
|
2023-10-11 11:46:29,696 epoch 3 - iter 39/136 - loss 0.40218390 - time (sec): 26.03 - samples/sec: 601.13 - lr: 0.000137 - momentum: 0.000000 |
|
2023-10-11 11:46:38,051 epoch 3 - iter 52/136 - loss 0.40576799 - time (sec): 34.38 - samples/sec: 595.88 - lr: 0.000136 - momentum: 0.000000 |
|
2023-10-11 11:46:46,295 epoch 3 - iter 65/136 - loss 0.39917278 - time (sec): 42.63 - samples/sec: 592.30 - lr: 0.000134 - momentum: 0.000000 |
|
2023-10-11 11:46:54,812 epoch 3 - iter 78/136 - loss 0.39934212 - time (sec): 51.14 - samples/sec: 592.05 - lr: 0.000132 - momentum: 0.000000 |
|
2023-10-11 11:47:03,667 epoch 3 - iter 91/136 - loss 0.39725495 - time (sec): 60.00 - samples/sec: 596.39 - lr: 0.000131 - momentum: 0.000000 |
|
2023-10-11 11:47:11,817 epoch 3 - iter 104/136 - loss 0.39381516 - time (sec): 68.15 - samples/sec: 589.08 - lr: 0.000129 - momentum: 0.000000 |
|
2023-10-11 11:47:20,374 epoch 3 - iter 117/136 - loss 0.38541873 - time (sec): 76.70 - samples/sec: 590.97 - lr: 0.000127 - momentum: 0.000000 |
|
2023-10-11 11:47:28,386 epoch 3 - iter 130/136 - loss 0.38481317 - time (sec): 84.72 - samples/sec: 587.10 - lr: 0.000126 - momentum: 0.000000 |
|
2023-10-11 11:47:32,136 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 11:47:32,136 EPOCH 3 done: loss 0.3826 - lr: 0.000126 |
|
2023-10-11 11:47:37,658 DEV : loss 0.2841358780860901 - f1-score (micro avg) 0.303 |
|
2023-10-11 11:47:37,666 saving best model |
|
2023-10-11 11:47:38,500 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 11:47:46,931 epoch 4 - iter 13/136 - loss 0.30981951 - time (sec): 8.43 - samples/sec: 565.84 - lr: 0.000123 - momentum: 0.000000 |
|
2023-10-11 11:47:55,313 epoch 4 - iter 26/136 - loss 0.26209209 - time (sec): 16.81 - samples/sec: 572.91 - lr: 0.000121 - momentum: 0.000000 |
|
2023-10-11 11:48:04,432 epoch 4 - iter 39/136 - loss 0.29742145 - time (sec): 25.93 - samples/sec: 602.75 - lr: 0.000120 - momentum: 0.000000 |
|
2023-10-11 11:48:12,946 epoch 4 - iter 52/136 - loss 0.30087458 - time (sec): 34.44 - samples/sec: 600.55 - lr: 0.000118 - momentum: 0.000000 |
|
2023-10-11 11:48:21,575 epoch 4 - iter 65/136 - loss 0.29630507 - time (sec): 43.07 - samples/sec: 597.25 - lr: 0.000116 - momentum: 0.000000 |
|
2023-10-11 11:48:30,045 epoch 4 - iter 78/136 - loss 0.29026697 - time (sec): 51.54 - samples/sec: 593.86 - lr: 0.000115 - momentum: 0.000000 |
|
2023-10-11 11:48:38,662 epoch 4 - iter 91/136 - loss 0.29000751 - time (sec): 60.16 - samples/sec: 596.55 - lr: 0.000113 - momentum: 0.000000 |
|
2023-10-11 11:48:46,927 epoch 4 - iter 104/136 - loss 0.28862524 - time (sec): 68.42 - samples/sec: 591.68 - lr: 0.000111 - momentum: 0.000000 |
|
2023-10-11 11:48:55,506 epoch 4 - iter 117/136 - loss 0.29291057 - time (sec): 77.00 - samples/sec: 588.96 - lr: 0.000109 - momentum: 0.000000 |
|
2023-10-11 11:49:03,284 epoch 4 - iter 130/136 - loss 0.29357331 - time (sec): 84.78 - samples/sec: 580.03 - lr: 0.000108 - momentum: 0.000000 |
|
2023-10-11 11:49:07,528 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 11:49:07,528 EPOCH 4 done: loss 0.2924 - lr: 0.000108 |
|
2023-10-11 11:49:13,047 DEV : loss 0.24194754660129547 - f1-score (micro avg) 0.361 |
|
2023-10-11 11:49:13,055 saving best model |
|
2023-10-11 11:49:15,756 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 11:49:24,571 epoch 5 - iter 13/136 - loss 0.25599446 - time (sec): 8.81 - samples/sec: 607.84 - lr: 0.000105 - momentum: 0.000000 |
|
2023-10-11 11:49:33,041 epoch 5 - iter 26/136 - loss 0.26637160 - time (sec): 17.28 - samples/sec: 592.68 - lr: 0.000104 - momentum: 0.000000 |
|
2023-10-11 11:49:41,925 epoch 5 - iter 39/136 - loss 0.26414833 - time (sec): 26.16 - samples/sec: 599.21 - lr: 0.000102 - momentum: 0.000000 |
|
2023-10-11 11:49:51,071 epoch 5 - iter 52/136 - loss 0.25660484 - time (sec): 35.31 - samples/sec: 606.04 - lr: 0.000100 - momentum: 0.000000 |
|
2023-10-11 11:49:59,386 epoch 5 - iter 65/136 - loss 0.25490717 - time (sec): 43.63 - samples/sec: 600.20 - lr: 0.000099 - momentum: 0.000000 |
|
2023-10-11 11:50:07,873 epoch 5 - iter 78/136 - loss 0.25012610 - time (sec): 52.11 - samples/sec: 591.76 - lr: 0.000097 - momentum: 0.000000 |
|
2023-10-11 11:50:16,091 epoch 5 - iter 91/136 - loss 0.24408605 - time (sec): 60.33 - samples/sec: 585.86 - lr: 0.000095 - momentum: 0.000000 |
|
2023-10-11 11:50:24,451 epoch 5 - iter 104/136 - loss 0.24082052 - time (sec): 68.69 - samples/sec: 583.79 - lr: 0.000093 - momentum: 0.000000 |
|
2023-10-11 11:50:32,782 epoch 5 - iter 117/136 - loss 0.23933921 - time (sec): 77.02 - samples/sec: 581.82 - lr: 0.000092 - momentum: 0.000000 |
|
2023-10-11 11:50:41,368 epoch 5 - iter 130/136 - loss 0.23509442 - time (sec): 85.61 - samples/sec: 581.75 - lr: 0.000090 - momentum: 0.000000 |
|
2023-10-11 11:50:45,040 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 11:50:45,041 EPOCH 5 done: loss 0.2334 - lr: 0.000090 |
|
2023-10-11 11:50:50,569 DEV : loss 0.2023283690214157 - f1-score (micro avg) 0.5352 |
|
2023-10-11 11:50:50,577 saving best model |
|
2023-10-11 11:50:53,072 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 11:51:01,826 epoch 6 - iter 13/136 - loss 0.18517967 - time (sec): 8.75 - samples/sec: 571.85 - lr: 0.000088 - momentum: 0.000000 |
|
2023-10-11 11:51:10,387 epoch 6 - iter 26/136 - loss 0.18845994 - time (sec): 17.31 - samples/sec: 580.12 - lr: 0.000086 - momentum: 0.000000 |
|
2023-10-11 11:51:19,918 epoch 6 - iter 39/136 - loss 0.18613737 - time (sec): 26.84 - samples/sec: 598.85 - lr: 0.000084 - momentum: 0.000000 |
|
2023-10-11 11:51:27,768 epoch 6 - iter 52/136 - loss 0.18257001 - time (sec): 34.69 - samples/sec: 587.24 - lr: 0.000083 - momentum: 0.000000 |
|
2023-10-11 11:51:35,920 epoch 6 - iter 65/136 - loss 0.17864499 - time (sec): 42.85 - samples/sec: 585.99 - lr: 0.000081 - momentum: 0.000000 |
|
2023-10-11 11:51:44,330 epoch 6 - iter 78/136 - loss 0.17689256 - time (sec): 51.26 - samples/sec: 584.99 - lr: 0.000079 - momentum: 0.000000 |
|
2023-10-11 11:51:52,541 epoch 6 - iter 91/136 - loss 0.18469652 - time (sec): 59.47 - samples/sec: 586.24 - lr: 0.000077 - momentum: 0.000000 |
|
2023-10-11 11:52:00,605 epoch 6 - iter 104/136 - loss 0.18264529 - time (sec): 67.53 - samples/sec: 582.48 - lr: 0.000076 - momentum: 0.000000 |
|
2023-10-11 11:52:09,270 epoch 6 - iter 117/136 - loss 0.18078056 - time (sec): 76.19 - samples/sec: 585.17 - lr: 0.000074 - momentum: 0.000000 |
|
2023-10-11 11:52:18,305 epoch 6 - iter 130/136 - loss 0.17744831 - time (sec): 85.23 - samples/sec: 589.03 - lr: 0.000072 - momentum: 0.000000 |
|
2023-10-11 11:52:21,741 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 11:52:21,741 EPOCH 6 done: loss 0.1785 - lr: 0.000072 |
|
2023-10-11 11:52:27,298 DEV : loss 0.18196691572666168 - f1-score (micro avg) 0.6184 |
|
2023-10-11 11:52:27,306 saving best model |
|
2023-10-11 11:52:29,802 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 11:52:38,343 epoch 7 - iter 13/136 - loss 0.13573850 - time (sec): 8.54 - samples/sec: 550.59 - lr: 0.000070 - momentum: 0.000000 |
|
2023-10-11 11:52:46,362 epoch 7 - iter 26/136 - loss 0.15154756 - time (sec): 16.56 - samples/sec: 539.09 - lr: 0.000068 - momentum: 0.000000 |
|
2023-10-11 11:52:54,607 epoch 7 - iter 39/136 - loss 0.14147926 - time (sec): 24.80 - samples/sec: 545.99 - lr: 0.000067 - momentum: 0.000000 |
|
2023-10-11 11:53:02,804 epoch 7 - iter 52/136 - loss 0.14239010 - time (sec): 33.00 - samples/sec: 544.29 - lr: 0.000065 - momentum: 0.000000 |
|
2023-10-11 11:53:13,375 epoch 7 - iter 65/136 - loss 0.13702337 - time (sec): 43.57 - samples/sec: 546.13 - lr: 0.000063 - momentum: 0.000000 |
|
2023-10-11 11:53:22,841 epoch 7 - iter 78/136 - loss 0.13307308 - time (sec): 53.03 - samples/sec: 556.81 - lr: 0.000061 - momentum: 0.000000 |
|
2023-10-11 11:53:31,166 epoch 7 - iter 91/136 - loss 0.13621321 - time (sec): 61.36 - samples/sec: 553.56 - lr: 0.000060 - momentum: 0.000000 |
|
2023-10-11 11:53:39,831 epoch 7 - iter 104/136 - loss 0.13813726 - time (sec): 70.02 - samples/sec: 553.89 - lr: 0.000058 - momentum: 0.000000 |
|
2023-10-11 11:53:48,877 epoch 7 - iter 117/136 - loss 0.13858223 - time (sec): 79.07 - samples/sec: 560.08 - lr: 0.000056 - momentum: 0.000000 |
|
2023-10-11 11:53:57,820 epoch 7 - iter 130/136 - loss 0.13942011 - time (sec): 88.01 - samples/sec: 563.86 - lr: 0.000055 - momentum: 0.000000 |
|
2023-10-11 11:54:01,781 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 11:54:01,781 EPOCH 7 done: loss 0.1413 - lr: 0.000055 |
|
2023-10-11 11:54:07,495 DEV : loss 0.1657494753599167 - f1-score (micro avg) 0.6118 |
|
2023-10-11 11:54:07,504 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 11:54:16,368 epoch 8 - iter 13/136 - loss 0.09943176 - time (sec): 8.86 - samples/sec: 569.40 - lr: 0.000052 - momentum: 0.000000 |
|
2023-10-11 11:54:25,369 epoch 8 - iter 26/136 - loss 0.12241690 - time (sec): 17.86 - samples/sec: 583.50 - lr: 0.000051 - momentum: 0.000000 |
|
2023-10-11 11:54:34,298 epoch 8 - iter 39/136 - loss 0.12160070 - time (sec): 26.79 - samples/sec: 591.13 - lr: 0.000049 - momentum: 0.000000 |
|
2023-10-11 11:54:42,211 epoch 8 - iter 52/136 - loss 0.11904427 - time (sec): 34.71 - samples/sec: 571.27 - lr: 0.000047 - momentum: 0.000000 |
|
2023-10-11 11:54:51,653 epoch 8 - iter 65/136 - loss 0.11543585 - time (sec): 44.15 - samples/sec: 581.03 - lr: 0.000045 - momentum: 0.000000 |
|
2023-10-11 11:55:01,140 epoch 8 - iter 78/136 - loss 0.11519079 - time (sec): 53.63 - samples/sec: 590.70 - lr: 0.000044 - momentum: 0.000000 |
|
2023-10-11 11:55:09,607 epoch 8 - iter 91/136 - loss 0.11768833 - time (sec): 62.10 - samples/sec: 583.56 - lr: 0.000042 - momentum: 0.000000 |
|
2023-10-11 11:55:17,593 epoch 8 - iter 104/136 - loss 0.11737793 - time (sec): 70.09 - samples/sec: 577.78 - lr: 0.000040 - momentum: 0.000000 |
|
2023-10-11 11:55:26,457 epoch 8 - iter 117/136 - loss 0.11880016 - time (sec): 78.95 - samples/sec: 580.16 - lr: 0.000039 - momentum: 0.000000 |
|
2023-10-11 11:55:34,585 epoch 8 - iter 130/136 - loss 0.11635025 - time (sec): 87.08 - samples/sec: 577.11 - lr: 0.000037 - momentum: 0.000000 |
|
2023-10-11 11:55:37,961 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 11:55:37,961 EPOCH 8 done: loss 0.1166 - lr: 0.000037 |
|
2023-10-11 11:55:43,939 DEV : loss 0.1629943549633026 - f1-score (micro avg) 0.6403 |
|
2023-10-11 11:55:43,947 saving best model |
|
2023-10-11 11:55:46,470 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 11:55:54,803 epoch 9 - iter 13/136 - loss 0.12409007 - time (sec): 8.33 - samples/sec: 568.60 - lr: 0.000034 - momentum: 0.000000 |
|
2023-10-11 11:56:02,461 epoch 9 - iter 26/136 - loss 0.12338939 - time (sec): 15.99 - samples/sec: 549.75 - lr: 0.000033 - momentum: 0.000000 |
|
2023-10-11 11:56:10,655 epoch 9 - iter 39/136 - loss 0.11349678 - time (sec): 24.18 - samples/sec: 555.76 - lr: 0.000031 - momentum: 0.000000 |
|
2023-10-11 11:56:19,197 epoch 9 - iter 52/136 - loss 0.10693226 - time (sec): 32.72 - samples/sec: 566.11 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-11 11:56:28,242 epoch 9 - iter 65/136 - loss 0.10839615 - time (sec): 41.77 - samples/sec: 576.28 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-11 11:56:36,534 epoch 9 - iter 78/136 - loss 0.10371986 - time (sec): 50.06 - samples/sec: 575.29 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-11 11:56:45,697 epoch 9 - iter 91/136 - loss 0.09994810 - time (sec): 59.22 - samples/sec: 578.63 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-11 11:56:54,262 epoch 9 - iter 104/136 - loss 0.10282509 - time (sec): 67.79 - samples/sec: 577.37 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-11 11:57:03,462 epoch 9 - iter 117/136 - loss 0.10471070 - time (sec): 76.99 - samples/sec: 580.27 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-11 11:57:12,384 epoch 9 - iter 130/136 - loss 0.10371980 - time (sec): 85.91 - samples/sec: 582.84 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-11 11:57:15,903 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 11:57:15,903 EPOCH 9 done: loss 0.1028 - lr: 0.000019 |
|
2023-10-11 11:57:21,970 DEV : loss 0.15900003910064697 - f1-score (micro avg) 0.6486 |
|
2023-10-11 11:57:21,979 saving best model |
|
2023-10-11 11:57:24,526 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 11:57:33,259 epoch 10 - iter 13/136 - loss 0.08673914 - time (sec): 8.73 - samples/sec: 577.96 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-11 11:57:41,971 epoch 10 - iter 26/136 - loss 0.10283299 - time (sec): 17.44 - samples/sec: 581.89 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-11 11:57:50,068 epoch 10 - iter 39/136 - loss 0.10096336 - time (sec): 25.54 - samples/sec: 576.52 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-11 11:57:58,268 epoch 10 - iter 52/136 - loss 0.10508712 - time (sec): 33.74 - samples/sec: 576.09 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-11 11:58:07,497 epoch 10 - iter 65/136 - loss 0.09663908 - time (sec): 42.97 - samples/sec: 582.26 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-11 11:58:15,641 epoch 10 - iter 78/136 - loss 0.09515246 - time (sec): 51.11 - samples/sec: 576.29 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-11 11:58:24,729 epoch 10 - iter 91/136 - loss 0.09381065 - time (sec): 60.20 - samples/sec: 577.73 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-11 11:58:33,569 epoch 10 - iter 104/136 - loss 0.09477746 - time (sec): 69.04 - samples/sec: 575.19 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-11 11:58:42,433 epoch 10 - iter 117/136 - loss 0.09519411 - time (sec): 77.90 - samples/sec: 577.00 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-11 11:58:51,065 epoch 10 - iter 130/136 - loss 0.09571016 - time (sec): 86.54 - samples/sec: 576.79 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-11 11:58:54,747 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 11:58:54,747 EPOCH 10 done: loss 0.0960 - lr: 0.000002 |
|
2023-10-11 11:59:00,673 DEV : loss 0.16201113164424896 - f1-score (micro avg) 0.6306 |
|
2023-10-11 11:59:01,568 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 11:59:01,570 Loading model from best epoch ... |
|
2023-10-11 11:59:05,259 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd, S-ORG, B-ORG, E-ORG, I-ORG |
|
2023-10-11 11:59:17,559 |
|
Results: |
|
- F-score (micro) 0.6285 |
|
- F-score (macro) 0.4257 |
|
- Accuracy 0.5154 |
|
|
|
By class: |
|
              precision    recall  f1-score   support |
|
|
|
         LOC     0.6256    0.8462    0.7193       312 |
|
         PER     0.6557    0.5769    0.6138       208 |
|
   HumanProd     0.2429    0.7727    0.3696        22 |
|
         ORG     0.0000    0.0000    0.0000        55 |
|
|
|
   micro avg     0.5906    0.6717    0.6285       597 |
|
   macro avg     0.3810    0.5490    0.4257       597 |
|
weighted avg     0.5644    0.6717    0.6034       597 |
|
|
|
2023-10-11 11:59:17,559 ---------------------------------------------------------------------------------------------------- |
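The aggregate rows of the report above can be cross-checked against the per-class rows. The following is a minimal sketch, not part of the training run: it assumes the usual definitions (macro F1 as the unweighted mean of per-class F1, weighted F1 as the support-weighted mean, micro F1 as the harmonic mean of micro precision and micro recall). The names `by_class` and `harmonic_mean` are hypothetical helpers introduced only for this illustration, and the inputs are the rounded values printed in the log.

```python
# Per-class rows copied from the "By class" report: (precision, recall, f1, support).
by_class = {
    "LOC": (0.6256, 0.8462, 0.7193, 312),
    "PER": (0.6557, 0.5769, 0.6138, 208),
    "HumanProd": (0.2429, 0.7727, 0.3696, 22),
    "ORG": (0.0000, 0.0000, 0.0000, 55),
}

# Micro-averaged precision and recall copied from the "micro avg" row.
micro_precision, micro_recall = 0.5906, 0.6717


def harmonic_mean(p: float, r: float) -> float:
    """F1 as the harmonic mean of precision and recall (0.0 when both are 0)."""
    return 0.0 if p + r == 0 else 2 * p * r / (p + r)


total_support = sum(row[3] for row in by_class.values())

# Macro F1: every class counts equally, so the ORG score of 0.0 pulls it down.
macro_f1 = sum(row[2] for row in by_class.values()) / len(by_class)

# Weighted F1: each class contributes in proportion to its support.
weighted_f1 = sum(row[2] * row[3] for row in by_class.values()) / total_support

# Micro F1: computed from the pooled precision and recall.
micro_f1 = harmonic_mean(micro_precision, micro_recall)

print(f"support={total_support}")
print(f"micro F1={micro_f1:.4f} macro F1={macro_f1:.4f} weighted F1={weighted_f1:.4f}")
```

Because the inputs are already rounded to four decimals, small last-digit differences from the logged aggregates are possible for other runs. The gap between micro and macro F1 in this run is driven largely by the ORG class, which has 55 test entities and an F1 of 0.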
|
|