2023-10-11 00:50:21,777 ----------------------------------------------------------------------------------------------------
2023-10-11 00:50:21,779 Model: "SequenceTagger(
  (embeddings): ByT5Embeddings(
    (model): T5EncoderModel(
      (shared): Embedding(384, 1472)
      (encoder): T5Stack(
        (embed_tokens): Embedding(384, 1472)
        (block): ModuleList(
          (0): T5Block(
            (layer): ModuleList(
              (0): T5LayerSelfAttention(
                (SelfAttention): T5Attention(
                  (q): Linear(in_features=1472, out_features=384, bias=False)
                  (k): Linear(in_features=1472, out_features=384, bias=False)
                  (v): Linear(in_features=1472, out_features=384, bias=False)
                  (o): Linear(in_features=384, out_features=1472, bias=False)
                  (relative_attention_bias): Embedding(32, 6)
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (1): T5LayerFF(
                (DenseReluDense): T5DenseGatedActDense(
                  (wi_0): Linear(in_features=1472, out_features=3584, bias=False)
                  (wi_1): Linear(in_features=1472, out_features=3584, bias=False)
                  (wo): Linear(in_features=3584, out_features=1472, bias=False)
                  (dropout): Dropout(p=0.1, inplace=False)
                  (act): NewGELUActivation()
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
          )
          (1-11): 11 x T5Block(
            (layer): ModuleList(
              (0): T5LayerSelfAttention(
                (SelfAttention): T5Attention(
                  (q): Linear(in_features=1472, out_features=384, bias=False)
                  (k): Linear(in_features=1472, out_features=384, bias=False)
                  (v): Linear(in_features=1472, out_features=384, bias=False)
                  (o): Linear(in_features=384, out_features=1472, bias=False)
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (1): T5LayerFF(
                (DenseReluDense): T5DenseGatedActDense(
                  (wi_0): Linear(in_features=1472, out_features=3584, bias=False)
                  (wi_1): Linear(in_features=1472, out_features=3584, bias=False)
                  (wo): Linear(in_features=3584, out_features=1472, bias=False)
                  (dropout): Dropout(p=0.1, inplace=False)
                  (act): NewGELUActivation()
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
          )
        )
        (final_layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=1472, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-11 00:50:21,779 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 00:50:21,780 MultiCorpus: 1166 train + 165 dev + 415 test sentences |
|
- NER_HIPE_2022 Corpus: 1166 train + 165 dev + 415 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fi/with_doc_seperator |
|
2023-10-11 00:50:21,780 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 00:50:21,780 Train: 1166 sentences |
|
2023-10-11 00:50:21,780 (train_with_dev=False, train_with_test=False) |
|
2023-10-11 00:50:21,780 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 00:50:21,780 Training Params: |
|
2023-10-11 00:50:21,780 - learning_rate: "0.00015" |
|
2023-10-11 00:50:21,780 - mini_batch_size: "4" |
|
2023-10-11 00:50:21,780 - max_epochs: "10" |
|
2023-10-11 00:50:21,780 - shuffle: "True" |
|
2023-10-11 00:50:21,780 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 00:50:21,780 Plugins: |
|
2023-10-11 00:50:21,780 - TensorboardLogger |
|
2023-10-11 00:50:21,781 - LinearScheduler | warmup_fraction: '0.1' |
|
2023-10-11 00:50:21,781 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 00:50:21,781 Final evaluation on model from best epoch (best-model.pt) |
|
2023-10-11 00:50:21,781 - metric: "('micro avg', 'f1-score')" |
|
2023-10-11 00:50:21,781 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 00:50:21,781 Computation: |
|
2023-10-11 00:50:21,781 - compute on device: cuda:0 |
|
2023-10-11 00:50:21,781 - embedding storage: none |
|
2023-10-11 00:50:21,781 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 00:50:21,781 Model training base path: "hmbench-newseye/fi-hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax-bs4-wsFalse-e10-lr0.00015-poolingfirst-layers-1-crfFalse-3" |
|
2023-10-11 00:50:21,781 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 00:50:21,781 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 00:50:21,781 Logging anything other than scalars to TensorBoard is currently not supported. |
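Note: the parameters and plugins above (peak learning rate 0.00015, mini-batch size 4, 10 epochs, linear schedule with 10% warmup, TensorBoard logging) map onto Flair's fine-tuning API roughly as sketched below; this is a hedged reconstruction, not the exact hmBench invocation.

from flair.trainers import ModelTrainer

trainer = ModelTrainer(tagger, corpus)

# fine_tune() defaults to AdamW (hence the constant "momentum: 0.000000" field in
# the iteration lines) with a linear warmup/decay schedule: 292 iterations/epoch
# x 10 epochs = 2920 steps, so a 10% warmup (~292 steps) means the logged lr
# climbs to ~0.000148 by the end of epoch 1 and then decays linearly to 0.
trainer.fine_tune(
    "hmbench-newseye/fi-hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax-bs4-wsFalse-e10-lr0.00015-poolingfirst-layers-1-crfFalse-3",
    learning_rate=0.00015,
    mini_batch_size=4,
    max_epochs=10,
    warmup_fraction=0.1,  # assumed to be exposed this way; the log only shows the plugin value
)
# The TensorboardLogger plugin seen in the log would be attached via the
# trainer's plugin mechanism (not shown here).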
2023-10-11 00:50:30,994 epoch 1 - iter 29/292 - loss 2.82159671 - time (sec): 9.21 - samples/sec: 420.37 - lr: 0.000014 - momentum: 0.000000
2023-10-11 00:50:41,271 epoch 1 - iter 58/292 - loss 2.81150796 - time (sec): 19.49 - samples/sec: 431.35 - lr: 0.000029 - momentum: 0.000000
2023-10-11 00:50:51,081 epoch 1 - iter 87/292 - loss 2.79154959 - time (sec): 29.30 - samples/sec: 427.89 - lr: 0.000044 - momentum: 0.000000
2023-10-11 00:51:00,476 epoch 1 - iter 116/292 - loss 2.73211083 - time (sec): 38.69 - samples/sec: 434.21 - lr: 0.000059 - momentum: 0.000000
2023-10-11 00:51:10,963 epoch 1 - iter 145/292 - loss 2.63886376 - time (sec): 49.18 - samples/sec: 436.01 - lr: 0.000074 - momentum: 0.000000
2023-10-11 00:51:21,819 epoch 1 - iter 174/292 - loss 2.53457496 - time (sec): 60.04 - samples/sec: 444.40 - lr: 0.000089 - momentum: 0.000000
2023-10-11 00:51:32,024 epoch 1 - iter 203/292 - loss 2.42260030 - time (sec): 70.24 - samples/sec: 447.65 - lr: 0.000104 - momentum: 0.000000
2023-10-11 00:51:41,242 epoch 1 - iter 232/292 - loss 2.32706750 - time (sec): 79.46 - samples/sec: 443.27 - lr: 0.000119 - momentum: 0.000000
2023-10-11 00:51:51,124 epoch 1 - iter 261/292 - loss 2.20506010 - time (sec): 89.34 - samples/sec: 442.90 - lr: 0.000134 - momentum: 0.000000
2023-10-11 00:52:01,243 epoch 1 - iter 290/292 - loss 2.08434498 - time (sec): 99.46 - samples/sec: 442.73 - lr: 0.000148 - momentum: 0.000000
2023-10-11 00:52:01,954 ----------------------------------------------------------------------------------------------------
2023-10-11 00:52:01,954 EPOCH 1 done: loss 2.0728 - lr: 0.000148
2023-10-11 00:52:07,532 DEV : loss 0.7312660813331604 - f1-score (micro avg) 0.0
2023-10-11 00:52:07,542 ----------------------------------------------------------------------------------------------------
2023-10-11 00:52:16,725 epoch 2 - iter 29/292 - loss 0.76128386 - time (sec): 9.18 - samples/sec: 429.57 - lr: 0.000148 - momentum: 0.000000
2023-10-11 00:52:26,150 epoch 2 - iter 58/292 - loss 0.70992578 - time (sec): 18.61 - samples/sec: 428.77 - lr: 0.000147 - momentum: 0.000000
2023-10-11 00:52:35,816 epoch 2 - iter 87/292 - loss 0.67722563 - time (sec): 28.27 - samples/sec: 432.27 - lr: 0.000145 - momentum: 0.000000
2023-10-11 00:52:45,386 epoch 2 - iter 116/292 - loss 0.65637471 - time (sec): 37.84 - samples/sec: 439.59 - lr: 0.000143 - momentum: 0.000000
2023-10-11 00:52:55,475 epoch 2 - iter 145/292 - loss 0.60692602 - time (sec): 47.93 - samples/sec: 444.72 - lr: 0.000142 - momentum: 0.000000
2023-10-11 00:53:05,279 epoch 2 - iter 174/292 - loss 0.60829187 - time (sec): 57.74 - samples/sec: 448.68 - lr: 0.000140 - momentum: 0.000000
2023-10-11 00:53:14,906 epoch 2 - iter 203/292 - loss 0.58745485 - time (sec): 67.36 - samples/sec: 447.07 - lr: 0.000138 - momentum: 0.000000
2023-10-11 00:53:24,801 epoch 2 - iter 232/292 - loss 0.56126617 - time (sec): 77.26 - samples/sec: 449.20 - lr: 0.000137 - momentum: 0.000000
2023-10-11 00:53:34,328 epoch 2 - iter 261/292 - loss 0.54068959 - time (sec): 86.78 - samples/sec: 448.11 - lr: 0.000135 - momentum: 0.000000
2023-10-11 00:53:44,906 epoch 2 - iter 290/292 - loss 0.52073465 - time (sec): 97.36 - samples/sec: 453.12 - lr: 0.000134 - momentum: 0.000000
2023-10-11 00:53:45,475 ----------------------------------------------------------------------------------------------------
2023-10-11 00:53:45,475 EPOCH 2 done: loss 0.5195 - lr: 0.000134
2023-10-11 00:53:51,452 DEV : loss 0.2922310531139374 - f1-score (micro avg) 0.2024
2023-10-11 00:53:51,462 saving best model
2023-10-11 00:53:52,758 ----------------------------------------------------------------------------------------------------
2023-10-11 00:54:03,115 epoch 3 - iter 29/292 - loss 0.37983079 - time (sec): 10.35 - samples/sec: 491.11 - lr: 0.000132 - momentum: 0.000000
2023-10-11 00:54:13,462 epoch 3 - iter 58/292 - loss 0.33791672 - time (sec): 20.70 - samples/sec: 499.93 - lr: 0.000130 - momentum: 0.000000
2023-10-11 00:54:23,057 epoch 3 - iter 87/292 - loss 0.37366879 - time (sec): 30.30 - samples/sec: 491.97 - lr: 0.000128 - momentum: 0.000000
2023-10-11 00:54:32,279 epoch 3 - iter 116/292 - loss 0.35491092 - time (sec): 39.52 - samples/sec: 478.21 - lr: 0.000127 - momentum: 0.000000
2023-10-11 00:54:42,347 epoch 3 - iter 145/292 - loss 0.34147437 - time (sec): 49.59 - samples/sec: 484.65 - lr: 0.000125 - momentum: 0.000000
2023-10-11 00:54:51,471 epoch 3 - iter 174/292 - loss 0.33747745 - time (sec): 58.71 - samples/sec: 476.29 - lr: 0.000123 - momentum: 0.000000
2023-10-11 00:55:00,729 epoch 3 - iter 203/292 - loss 0.32562316 - time (sec): 67.97 - samples/sec: 471.82 - lr: 0.000122 - momentum: 0.000000
2023-10-11 00:55:09,329 epoch 3 - iter 232/292 - loss 0.32501433 - time (sec): 76.57 - samples/sec: 464.71 - lr: 0.000120 - momentum: 0.000000
2023-10-11 00:55:17,972 epoch 3 - iter 261/292 - loss 0.32013085 - time (sec): 85.21 - samples/sec: 458.79 - lr: 0.000119 - momentum: 0.000000
2023-10-11 00:55:28,050 epoch 3 - iter 290/292 - loss 0.31021482 - time (sec): 95.29 - samples/sec: 463.04 - lr: 0.000117 - momentum: 0.000000
2023-10-11 00:55:28,624 ----------------------------------------------------------------------------------------------------
2023-10-11 00:55:28,624 EPOCH 3 done: loss 0.3091 - lr: 0.000117
2023-10-11 00:55:34,180 DEV : loss 0.21407955884933472 - f1-score (micro avg) 0.4866
2023-10-11 00:55:34,189 saving best model
2023-10-11 00:55:42,377 ----------------------------------------------------------------------------------------------------
2023-10-11 00:55:51,695 epoch 4 - iter 29/292 - loss 0.23965969 - time (sec): 9.31 - samples/sec: 427.66 - lr: 0.000115 - momentum: 0.000000
2023-10-11 00:56:01,955 epoch 4 - iter 58/292 - loss 0.22892559 - time (sec): 19.57 - samples/sec: 454.35 - lr: 0.000113 - momentum: 0.000000
2023-10-11 00:56:11,004 epoch 4 - iter 87/292 - loss 0.22086500 - time (sec): 28.62 - samples/sec: 442.91 - lr: 0.000112 - momentum: 0.000000
2023-10-11 00:56:20,715 epoch 4 - iter 116/292 - loss 0.22494750 - time (sec): 38.33 - samples/sec: 446.64 - lr: 0.000110 - momentum: 0.000000
2023-10-11 00:56:30,836 epoch 4 - iter 145/292 - loss 0.22468010 - time (sec): 48.45 - samples/sec: 460.35 - lr: 0.000108 - momentum: 0.000000
2023-10-11 00:56:39,803 epoch 4 - iter 174/292 - loss 0.22000229 - time (sec): 57.42 - samples/sec: 456.40 - lr: 0.000107 - momentum: 0.000000
2023-10-11 00:56:49,312 epoch 4 - iter 203/292 - loss 0.21233482 - time (sec): 66.93 - samples/sec: 457.76 - lr: 0.000105 - momentum: 0.000000
2023-10-11 00:56:58,687 epoch 4 - iter 232/292 - loss 0.21144903 - time (sec): 76.31 - samples/sec: 459.69 - lr: 0.000104 - momentum: 0.000000
2023-10-11 00:57:08,090 epoch 4 - iter 261/292 - loss 0.20979618 - time (sec): 85.71 - samples/sec: 457.83 - lr: 0.000102 - momentum: 0.000000
2023-10-11 00:57:18,747 epoch 4 - iter 290/292 - loss 0.20074265 - time (sec): 96.37 - samples/sec: 460.34 - lr: 0.000100 - momentum: 0.000000
2023-10-11 00:57:19,133 ----------------------------------------------------------------------------------------------------
2023-10-11 00:57:19,133 EPOCH 4 done: loss 0.2005 - lr: 0.000100
2023-10-11 00:57:25,028 DEV : loss 0.16525955498218536 - f1-score (micro avg) 0.6345
2023-10-11 00:57:25,039 saving best model
2023-10-11 00:57:35,183 ----------------------------------------------------------------------------------------------------
2023-10-11 00:57:45,167 epoch 5 - iter 29/292 - loss 0.17002188 - time (sec): 9.98 - samples/sec: 466.46 - lr: 0.000098 - momentum: 0.000000
2023-10-11 00:57:55,039 epoch 5 - iter 58/292 - loss 0.14429493 - time (sec): 19.85 - samples/sec: 451.54 - lr: 0.000097 - momentum: 0.000000
2023-10-11 00:58:04,598 epoch 5 - iter 87/292 - loss 0.15299122 - time (sec): 29.41 - samples/sec: 437.80 - lr: 0.000095 - momentum: 0.000000
2023-10-11 00:58:14,352 epoch 5 - iter 116/292 - loss 0.16624329 - time (sec): 39.16 - samples/sec: 432.72 - lr: 0.000093 - momentum: 0.000000
2023-10-11 00:58:24,628 epoch 5 - iter 145/292 - loss 0.15294436 - time (sec): 49.44 - samples/sec: 436.70 - lr: 0.000092 - momentum: 0.000000
2023-10-11 00:58:35,266 epoch 5 - iter 174/292 - loss 0.14924424 - time (sec): 60.08 - samples/sec: 447.08 - lr: 0.000090 - momentum: 0.000000
2023-10-11 00:58:45,086 epoch 5 - iter 203/292 - loss 0.14572104 - time (sec): 69.90 - samples/sec: 451.00 - lr: 0.000089 - momentum: 0.000000
2023-10-11 00:58:55,405 epoch 5 - iter 232/292 - loss 0.14222555 - time (sec): 80.22 - samples/sec: 447.48 - lr: 0.000087 - momentum: 0.000000
2023-10-11 00:59:05,798 epoch 5 - iter 261/292 - loss 0.13996227 - time (sec): 90.61 - samples/sec: 448.04 - lr: 0.000085 - momentum: 0.000000
2023-10-11 00:59:14,850 epoch 5 - iter 290/292 - loss 0.13738637 - time (sec): 99.66 - samples/sec: 443.93 - lr: 0.000084 - momentum: 0.000000
2023-10-11 00:59:15,341 ----------------------------------------------------------------------------------------------------
2023-10-11 00:59:15,341 EPOCH 5 done: loss 0.1373 - lr: 0.000084
2023-10-11 00:59:21,247 DEV : loss 0.15909573435783386 - f1-score (micro avg) 0.75
2023-10-11 00:59:21,258 saving best model
2023-10-11 00:59:29,578 ----------------------------------------------------------------------------------------------------
2023-10-11 00:59:39,736 epoch 6 - iter 29/292 - loss 0.07533053 - time (sec): 10.15 - samples/sec: 488.63 - lr: 0.000082 - momentum: 0.000000
2023-10-11 00:59:49,011 epoch 6 - iter 58/292 - loss 0.08647009 - time (sec): 19.43 - samples/sec: 465.26 - lr: 0.000080 - momentum: 0.000000
2023-10-11 00:59:58,345 epoch 6 - iter 87/292 - loss 0.08324412 - time (sec): 28.76 - samples/sec: 456.79 - lr: 0.000078 - momentum: 0.000000
2023-10-11 01:00:08,495 epoch 6 - iter 116/292 - loss 0.08045100 - time (sec): 38.91 - samples/sec: 457.03 - lr: 0.000077 - momentum: 0.000000
2023-10-11 01:00:17,971 epoch 6 - iter 145/292 - loss 0.09413655 - time (sec): 48.39 - samples/sec: 447.21 - lr: 0.000075 - momentum: 0.000000
2023-10-11 01:00:30,185 epoch 6 - iter 174/292 - loss 0.10256312 - time (sec): 60.60 - samples/sec: 452.71 - lr: 0.000074 - momentum: 0.000000
2023-10-11 01:00:39,954 epoch 6 - iter 203/292 - loss 0.10433424 - time (sec): 70.37 - samples/sec: 449.83 - lr: 0.000072 - momentum: 0.000000
2023-10-11 01:00:49,717 epoch 6 - iter 232/292 - loss 0.09984966 - time (sec): 80.13 - samples/sec: 451.42 - lr: 0.000070 - momentum: 0.000000
2023-10-11 01:00:58,833 epoch 6 - iter 261/292 - loss 0.09911210 - time (sec): 89.25 - samples/sec: 449.80 - lr: 0.000069 - momentum: 0.000000
2023-10-11 01:01:08,672 epoch 6 - iter 290/292 - loss 0.09772248 - time (sec): 99.09 - samples/sec: 447.11 - lr: 0.000067 - momentum: 0.000000
2023-10-11 01:01:09,105 ----------------------------------------------------------------------------------------------------
2023-10-11 01:01:09,105 EPOCH 6 done: loss 0.0977 - lr: 0.000067
2023-10-11 01:01:15,130 DEV : loss 0.1364792436361313 - f1-score (micro avg) 0.7425
2023-10-11 01:01:15,141 ----------------------------------------------------------------------------------------------------
2023-10-11 01:01:25,276 epoch 7 - iter 29/292 - loss 0.06875395 - time (sec): 10.13 - samples/sec: 470.05 - lr: 0.000065 - momentum: 0.000000
2023-10-11 01:01:35,793 epoch 7 - iter 58/292 - loss 0.07266422 - time (sec): 20.65 - samples/sec: 472.46 - lr: 0.000063 - momentum: 0.000000
2023-10-11 01:01:45,287 epoch 7 - iter 87/292 - loss 0.07403621 - time (sec): 30.14 - samples/sec: 462.39 - lr: 0.000062 - momentum: 0.000000
2023-10-11 01:01:54,612 epoch 7 - iter 116/292 - loss 0.06805350 - time (sec): 39.47 - samples/sec: 458.29 - lr: 0.000060 - momentum: 0.000000
2023-10-11 01:02:03,847 epoch 7 - iter 145/292 - loss 0.07208524 - time (sec): 48.70 - samples/sec: 456.64 - lr: 0.000059 - momentum: 0.000000
2023-10-11 01:02:12,537 epoch 7 - iter 174/292 - loss 0.07485159 - time (sec): 57.39 - samples/sec: 454.56 - lr: 0.000057 - momentum: 0.000000
2023-10-11 01:02:22,221 epoch 7 - iter 203/292 - loss 0.07712604 - time (sec): 67.08 - samples/sec: 458.86 - lr: 0.000055 - momentum: 0.000000
2023-10-11 01:02:31,161 epoch 7 - iter 232/292 - loss 0.07674406 - time (sec): 76.02 - samples/sec: 453.74 - lr: 0.000054 - momentum: 0.000000
2023-10-11 01:02:42,087 epoch 7 - iter 261/292 - loss 0.07805652 - time (sec): 86.94 - samples/sec: 459.18 - lr: 0.000052 - momentum: 0.000000
2023-10-11 01:02:51,824 epoch 7 - iter 290/292 - loss 0.07712795 - time (sec): 96.68 - samples/sec: 456.75 - lr: 0.000050 - momentum: 0.000000
2023-10-11 01:02:52,422 ----------------------------------------------------------------------------------------------------
2023-10-11 01:02:52,423 EPOCH 7 done: loss 0.0767 - lr: 0.000050
2023-10-11 01:02:58,358 DEV : loss 0.13527634739875793 - f1-score (micro avg) 0.7421
2023-10-11 01:02:58,368 ----------------------------------------------------------------------------------------------------
2023-10-11 01:03:09,439 epoch 8 - iter 29/292 - loss 0.05749003 - time (sec): 11.07 - samples/sec: 481.78 - lr: 0.000048 - momentum: 0.000000
2023-10-11 01:03:18,707 epoch 8 - iter 58/292 - loss 0.07179638 - time (sec): 20.34 - samples/sec: 457.10 - lr: 0.000047 - momentum: 0.000000
2023-10-11 01:03:28,519 epoch 8 - iter 87/292 - loss 0.06961649 - time (sec): 30.15 - samples/sec: 439.32 - lr: 0.000045 - momentum: 0.000000
2023-10-11 01:03:38,114 epoch 8 - iter 116/292 - loss 0.07056370 - time (sec): 39.74 - samples/sec: 444.19 - lr: 0.000044 - momentum: 0.000000
2023-10-11 01:03:47,883 epoch 8 - iter 145/292 - loss 0.07275745 - time (sec): 49.51 - samples/sec: 448.83 - lr: 0.000042 - momentum: 0.000000
2023-10-11 01:03:57,135 epoch 8 - iter 174/292 - loss 0.07168536 - time (sec): 58.77 - samples/sec: 443.44 - lr: 0.000040 - momentum: 0.000000
2023-10-11 01:04:07,330 epoch 8 - iter 203/292 - loss 0.06796946 - time (sec): 68.96 - samples/sec: 442.53 - lr: 0.000039 - momentum: 0.000000
2023-10-11 01:04:16,921 epoch 8 - iter 232/292 - loss 0.06456009 - time (sec): 78.55 - samples/sec: 440.82 - lr: 0.000037 - momentum: 0.000000
2023-10-11 01:04:27,689 epoch 8 - iter 261/292 - loss 0.06065561 - time (sec): 89.32 - samples/sec: 445.33 - lr: 0.000035 - momentum: 0.000000
2023-10-11 01:04:37,905 epoch 8 - iter 290/292 - loss 0.06311309 - time (sec): 99.54 - samples/sec: 443.45 - lr: 0.000034 - momentum: 0.000000
2023-10-11 01:04:38,543 ----------------------------------------------------------------------------------------------------
2023-10-11 01:04:38,544 EPOCH 8 done: loss 0.0639 - lr: 0.000034
2023-10-11 01:04:44,596 DEV : loss 0.13573415577411652 - f1-score (micro avg) 0.7526
2023-10-11 01:04:44,606 saving best model
2023-10-11 01:04:53,186 ----------------------------------------------------------------------------------------------------
2023-10-11 01:05:04,113 epoch 9 - iter 29/292 - loss 0.06634260 - time (sec): 10.92 - samples/sec: 442.52 - lr: 0.000032 - momentum: 0.000000
2023-10-11 01:05:15,299 epoch 9 - iter 58/292 - loss 0.05300830 - time (sec): 22.11 - samples/sec: 429.58 - lr: 0.000030 - momentum: 0.000000
2023-10-11 01:05:25,289 epoch 9 - iter 87/292 - loss 0.04974619 - time (sec): 32.10 - samples/sec: 419.03 - lr: 0.000029 - momentum: 0.000000
2023-10-11 01:05:35,229 epoch 9 - iter 116/292 - loss 0.05095022 - time (sec): 42.04 - samples/sec: 431.58 - lr: 0.000027 - momentum: 0.000000
2023-10-11 01:05:45,780 epoch 9 - iter 145/292 - loss 0.05630806 - time (sec): 52.59 - samples/sec: 436.18 - lr: 0.000025 - momentum: 0.000000
2023-10-11 01:05:55,631 epoch 9 - iter 174/292 - loss 0.05258594 - time (sec): 62.44 - samples/sec: 438.89 - lr: 0.000024 - momentum: 0.000000
2023-10-11 01:06:04,960 epoch 9 - iter 203/292 - loss 0.05164650 - time (sec): 71.77 - samples/sec: 438.92 - lr: 0.000022 - momentum: 0.000000
2023-10-11 01:06:14,983 epoch 9 - iter 232/292 - loss 0.04999494 - time (sec): 81.79 - samples/sec: 439.41 - lr: 0.000020 - momentum: 0.000000
2023-10-11 01:06:24,698 epoch 9 - iter 261/292 - loss 0.05561816 - time (sec): 91.51 - samples/sec: 439.53 - lr: 0.000019 - momentum: 0.000000
2023-10-11 01:06:34,087 epoch 9 - iter 290/292 - loss 0.05665320 - time (sec): 100.90 - samples/sec: 438.57 - lr: 0.000017 - momentum: 0.000000
2023-10-11 01:06:34,575 ----------------------------------------------------------------------------------------------------
2023-10-11 01:06:34,575 EPOCH 9 done: loss 0.0565 - lr: 0.000017
2023-10-11 01:06:40,540 DEV : loss 0.13554613292217255 - f1-score (micro avg) 0.7342
2023-10-11 01:06:40,552 ----------------------------------------------------------------------------------------------------
2023-10-11 01:06:50,750 epoch 10 - iter 29/292 - loss 0.04670804 - time (sec): 10.20 - samples/sec: 483.70 - lr: 0.000015 - momentum: 0.000000
2023-10-11 01:07:00,562 epoch 10 - iter 58/292 - loss 0.05131042 - time (sec): 20.01 - samples/sec: 475.56 - lr: 0.000014 - momentum: 0.000000
2023-10-11 01:07:10,586 epoch 10 - iter 87/292 - loss 0.05819764 - time (sec): 30.03 - samples/sec: 487.05 - lr: 0.000012 - momentum: 0.000000
2023-10-11 01:07:20,049 epoch 10 - iter 116/292 - loss 0.05376000 - time (sec): 39.50 - samples/sec: 481.95 - lr: 0.000010 - momentum: 0.000000
2023-10-11 01:07:29,378 epoch 10 - iter 145/292 - loss 0.05463939 - time (sec): 48.82 - samples/sec: 481.20 - lr: 0.000009 - momentum: 0.000000
2023-10-11 01:07:38,497 epoch 10 - iter 174/292 - loss 0.05237695 - time (sec): 57.94 - samples/sec: 475.74 - lr: 0.000007 - momentum: 0.000000
2023-10-11 01:07:47,611 epoch 10 - iter 203/292 - loss 0.05012569 - time (sec): 67.06 - samples/sec: 473.06 - lr: 0.000005 - momentum: 0.000000
2023-10-11 01:07:57,036 epoch 10 - iter 232/292 - loss 0.04789444 - time (sec): 76.48 - samples/sec: 472.60 - lr: 0.000004 - momentum: 0.000000
2023-10-11 01:08:05,894 epoch 10 - iter 261/292 - loss 0.04957126 - time (sec): 85.34 - samples/sec: 467.49 - lr: 0.000002 - momentum: 0.000000
2023-10-11 01:08:15,463 epoch 10 - iter 290/292 - loss 0.04988927 - time (sec): 94.91 - samples/sec: 467.25 - lr: 0.000000 - momentum: 0.000000
2023-10-11 01:08:15,844 ----------------------------------------------------------------------------------------------------
2023-10-11 01:08:15,845 EPOCH 10 done: loss 0.0499 - lr: 0.000000
2023-10-11 01:08:21,444 DEV : loss 0.13735945522785187 - f1-score (micro avg) 0.7489
2023-10-11 01:08:22,663 ----------------------------------------------------------------------------------------------------
2023-10-11 01:08:22,665 Loading model from best epoch ...
2023-10-11 01:08:26,661 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
2023-10-11 01:08:39,358
Results:
- F-score (micro) 0.7334
- F-score (macro) 0.6963
- Accuracy 0.5943

By class:
              precision    recall  f1-score   support

         PER     0.8164    0.8305    0.8234       348
         LOC     0.5777    0.8123    0.6752       261
         ORG     0.4231    0.4231    0.4231        52
   HumanProd     0.8636    0.8636    0.8636        22

   micro avg     0.6818    0.7936    0.7334       683
   macro avg     0.6702    0.7324    0.6963       683
weighted avg     0.6967    0.7936    0.7375       683
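Note: the micro-averaged row pools true positives, false positives and false negatives over all four entity types; the reported F1 is just the harmonic mean of the pooled precision and recall. A small sketch (the absolute counts of 542 true positives and 795 predicted spans are inferred from the rounded scores and the support of 683 gold spans, not printed in the log):

# Sanity check of the "micro avg" row above.
tp, predicted, gold = 542, 795, 683   # inferred counts, consistent with the table

precision = tp / predicted            # 0.6818
recall = tp / gold                    # 0.7936
f1 = 2 * precision * recall / (precision + recall)
print(f"{precision:.4f} {recall:.4f} {f1:.4f}")  # -> 0.6818 0.7936 0.7334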
2023-10-11 01:08:39,358 ----------------------------------------------------------------------------------------------------
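Note: the final scores above come from the checkpoint saved after epoch 8 (best dev micro-F1 of 0.7526). Loading that best-model.pt and tagging new text could look like the sketch below; the path is taken from the log's base path, and the Finnish example sentence is invented.

from flair.data import Sentence
from flair.models import SequenceTagger

# Load the checkpoint selected on dev micro-F1.
tagger = SequenceTagger.load(
    "hmbench-newseye/fi-hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax-bs4-wsFalse-e10-lr0.00015-poolingfirst-layers-1-crfFalse-3/best-model.pt"
)

# Any Finnish sentence works; this one is a made-up example.
sentence = Sentence("Helsingin Sanomat kertoi eilen Mannerheimin vierailusta Turkuun .")
tagger.predict(sentence)

# Spans are decoded from the 17 BIOES tags listed above
# (O plus S-/B-/I-/E- for LOC, PER, ORG and HumanProd).
for span in sentence.get_spans("ner"):
    print(span.text, span.tag, span.score)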