|
2023-10-10 22:38:50,022 ---------------------------------------------------------------------------------------------------- |
|
2023-10-10 22:38:50,024 Model: "SequenceTagger( |
|
(embeddings): ByT5Embeddings( |
|
(model): T5EncoderModel( |
|
(shared): Embedding(384, 1472) |
|
(encoder): T5Stack( |
|
(embed_tokens): Embedding(384, 1472) |
|
(block): ModuleList( |
|
(0): T5Block( |
|
(layer): ModuleList( |
|
(0): T5LayerSelfAttention( |
|
(SelfAttention): T5Attention( |
|
(q): Linear(in_features=1472, out_features=384, bias=False) |
|
(k): Linear(in_features=1472, out_features=384, bias=False) |
|
(v): Linear(in_features=1472, out_features=384, bias=False) |
|
(o): Linear(in_features=384, out_features=1472, bias=False) |
|
(relative_attention_bias): Embedding(32, 6) |
|
) |
|
(layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
(1): T5LayerFF( |
|
(DenseReluDense): T5DenseGatedActDense( |
|
(wi_0): Linear(in_features=1472, out_features=3584, bias=False) |
|
(wi_1): Linear(in_features=1472, out_features=3584, bias=False) |
|
(wo): Linear(in_features=3584, out_features=1472, bias=False) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
(act): NewGELUActivation() |
|
) |
|
(layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
) |
|
(1-11): 11 x T5Block( |
|
(layer): ModuleList( |
|
(0): T5LayerSelfAttention( |
|
(SelfAttention): T5Attention( |
|
(q): Linear(in_features=1472, out_features=384, bias=False) |
|
(k): Linear(in_features=1472, out_features=384, bias=False) |
|
(v): Linear(in_features=1472, out_features=384, bias=False) |
|
(o): Linear(in_features=384, out_features=1472, bias=False) |
|
) |
|
(layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
(1): T5LayerFF( |
|
(DenseReluDense): T5DenseGatedActDense( |
|
(wi_0): Linear(in_features=1472, out_features=3584, bias=False) |
|
(wi_1): Linear(in_features=1472, out_features=3584, bias=False) |
|
(wo): Linear(in_features=3584, out_features=1472, bias=False) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
(act): NewGELUActivation() |
|
) |
|
(layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
) |
|
) |
|
(final_layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
) |
|
(locked_dropout): LockedDropout(p=0.5) |
|
(linear): Linear(in_features=1472, out_features=17, bias=True) |
|
(loss_function): CrossEntropyLoss() |
|
)" |
|
2023-10-10 22:38:50,025 ---------------------------------------------------------------------------------------------------- |
|
2023-10-10 22:38:50,025 MultiCorpus: 1166 train + 165 dev + 415 test sentences |
|
- NER_HIPE_2022 Corpus: 1166 train + 165 dev + 415 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fi/with_doc_seperator |
|
2023-10-10 22:38:50,025 ---------------------------------------------------------------------------------------------------- |
|
2023-10-10 22:38:50,025 Train: 1166 sentences |
|
2023-10-10 22:38:50,025 (train_with_dev=False, train_with_test=False) |
|
2023-10-10 22:38:50,025 ---------------------------------------------------------------------------------------------------- |
|
2023-10-10 22:38:50,025 Training Params: |
|
2023-10-10 22:38:50,025 - learning_rate: "0.00016" |
|
2023-10-10 22:38:50,025 - mini_batch_size: "4" |
|
2023-10-10 22:38:50,025 - max_epochs: "10" |
|
2023-10-10 22:38:50,025 - shuffle: "True" |
|
2023-10-10 22:38:50,025 ---------------------------------------------------------------------------------------------------- |
|
2023-10-10 22:38:50,025 Plugins: |
|
2023-10-10 22:38:50,026 - TensorboardLogger |
|
2023-10-10 22:38:50,026 - LinearScheduler | warmup_fraction: '0.1' |
|
2023-10-10 22:38:50,026 ---------------------------------------------------------------------------------------------------- |
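The `lr` values in the per-iteration lines that follow are consistent with a linear warmup over the first 10% of optimizer steps and a linear decay to zero afterwards. The sketch below illustrates that shape under stated assumptions: 1166 training sentences at `mini_batch_size` 4 gives 292 steps per epoch, and 10 epochs gives 2920 steps in total. `linear_schedule_lr` is a hypothetical helper written for illustration, not Flair's own scheduler code, and the logged values may differ from it by one step of rounding.

```python
import math


def linear_schedule_lr(step, peak_lr, total_steps, warmup_fraction):
    """Linear warmup to peak_lr, then linear decay to zero (hypothetical helper)."""
    warmup_steps = round(total_steps * warmup_fraction)
    if step < warmup_steps:
        # Warmup phase: the rate grows in proportion to the step index.
        return peak_lr * step / max(1, warmup_steps)
    # Decay phase: the rate falls linearly and reaches zero at total_steps.
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


# Values taken from the "Training Params" and "Train" lines of this log.
steps_per_epoch = math.ceil(1166 / 4)
total_steps = steps_per_epoch * 10

peak = linear_schedule_lr(steps_per_epoch, 0.00016, total_steps, 0.1)
final = linear_schedule_lr(total_steps, 0.00016, total_steps, 0.1)
```

With these assumptions the warmup ends exactly at the end of epoch 1, which matches the logged rate peaking near 0.000158 there and reaching 0.000000 at the end of epoch 10.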
|
2023-10-10 22:38:50,026 Final evaluation on model from best epoch (best-model.pt) |
|
2023-10-10 22:38:50,026 - metric: "('micro avg', 'f1-score')" |
|
2023-10-10 22:38:50,026 ---------------------------------------------------------------------------------------------------- |
|
2023-10-10 22:38:50,026 Computation: |
|
2023-10-10 22:38:50,026 - compute on device: cuda:0 |
|
2023-10-10 22:38:50,026 - embedding storage: none |
|
2023-10-10 22:38:50,026 ---------------------------------------------------------------------------------------------------- |
|
2023-10-10 22:38:50,026 Model training base path: "hmbench-newseye/fi-hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax-bs4-wsFalse-e10-lr0.00016-poolingfirst-layers-1-crfFalse-1" |
|
2023-10-10 22:38:50,026 ---------------------------------------------------------------------------------------------------- |
|
2023-10-10 22:38:50,026 ---------------------------------------------------------------------------------------------------- |
|
2023-10-10 22:38:50,026 Logging anything other than scalars to TensorBoard is currently not supported. |
|
2023-10-10 22:38:59,893 epoch 1 - iter 29/292 - loss 2.82904011 - time (sec): 9.86 - samples/sec: 517.82 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-10 22:39:08,951 epoch 1 - iter 58/292 - loss 2.82009083 - time (sec): 18.92 - samples/sec: 482.18 - lr: 0.000031 - momentum: 0.000000 |
|
2023-10-10 22:39:19,259 epoch 1 - iter 87/292 - loss 2.79608502 - time (sec): 29.23 - samples/sec: 501.01 - lr: 0.000047 - momentum: 0.000000 |
|
2023-10-10 22:39:28,698 epoch 1 - iter 116/292 - loss 2.75939629 - time (sec): 38.67 - samples/sec: 482.24 - lr: 0.000063 - momentum: 0.000000 |
|
2023-10-10 22:39:38,409 epoch 1 - iter 145/292 - loss 2.68289368 - time (sec): 48.38 - samples/sec: 465.91 - lr: 0.000079 - momentum: 0.000000 |
|
2023-10-10 22:39:47,654 epoch 1 - iter 174/292 - loss 2.58924155 - time (sec): 57.63 - samples/sec: 455.94 - lr: 0.000095 - momentum: 0.000000 |
|
2023-10-10 22:39:57,761 epoch 1 - iter 203/292 - loss 2.45043910 - time (sec): 67.73 - samples/sec: 458.95 - lr: 0.000111 - momentum: 0.000000 |
|
2023-10-10 22:40:08,012 epoch 1 - iter 232/292 - loss 2.32335868 - time (sec): 77.98 - samples/sec: 458.42 - lr: 0.000127 - momentum: 0.000000 |
|
2023-10-10 22:40:17,693 epoch 1 - iter 261/292 - loss 2.20131381 - time (sec): 87.66 - samples/sec: 455.90 - lr: 0.000142 - momentum: 0.000000 |
|
2023-10-10 22:40:27,577 epoch 1 - iter 290/292 - loss 2.08347326 - time (sec): 97.55 - samples/sec: 453.75 - lr: 0.000158 - momentum: 0.000000 |
|
2023-10-10 22:40:28,055 ---------------------------------------------------------------------------------------------------- |
|
2023-10-10 22:40:28,055 EPOCH 1 done: loss 2.0787 - lr: 0.000158 |
|
2023-10-10 22:40:33,670 DEV : loss 0.6691536903381348 - f1-score (micro avg) 0.0 |
|
2023-10-10 22:40:33,679 ---------------------------------------------------------------------------------------------------- |
|
2023-10-10 22:40:43,569 epoch 2 - iter 29/292 - loss 0.73979818 - time (sec): 9.89 - samples/sec: 473.00 - lr: 0.000158 - momentum: 0.000000 |
|
2023-10-10 22:40:53,455 epoch 2 - iter 58/292 - loss 0.79010563 - time (sec): 19.77 - samples/sec: 464.71 - lr: 0.000157 - momentum: 0.000000 |
|
2023-10-10 22:41:03,434 epoch 2 - iter 87/292 - loss 0.68326802 - time (sec): 29.75 - samples/sec: 460.46 - lr: 0.000155 - momentum: 0.000000 |
|
2023-10-10 22:41:13,798 epoch 2 - iter 116/292 - loss 0.60934981 - time (sec): 40.12 - samples/sec: 460.43 - lr: 0.000153 - momentum: 0.000000 |
|
2023-10-10 22:41:24,470 epoch 2 - iter 145/292 - loss 0.58125380 - time (sec): 50.79 - samples/sec: 463.84 - lr: 0.000151 - momentum: 0.000000 |
|
2023-10-10 22:41:34,361 epoch 2 - iter 174/292 - loss 0.54661045 - time (sec): 60.68 - samples/sec: 456.43 - lr: 0.000149 - momentum: 0.000000 |
|
2023-10-10 22:41:44,174 epoch 2 - iter 203/292 - loss 0.53389626 - time (sec): 70.49 - samples/sec: 447.60 - lr: 0.000148 - momentum: 0.000000 |
|
2023-10-10 22:41:54,010 epoch 2 - iter 232/292 - loss 0.51870783 - time (sec): 80.33 - samples/sec: 443.48 - lr: 0.000146 - momentum: 0.000000 |
|
2023-10-10 22:42:03,637 epoch 2 - iter 261/292 - loss 0.49765927 - time (sec): 89.96 - samples/sec: 445.87 - lr: 0.000144 - momentum: 0.000000 |
|
2023-10-10 22:42:13,850 epoch 2 - iter 290/292 - loss 0.49732765 - time (sec): 100.17 - samples/sec: 442.98 - lr: 0.000142 - momentum: 0.000000 |
|
2023-10-10 22:42:14,221 ---------------------------------------------------------------------------------------------------- |
|
2023-10-10 22:42:14,222 EPOCH 2 done: loss 0.4968 - lr: 0.000142 |
|
2023-10-10 22:42:20,227 DEV : loss 0.28831687569618225 - f1-score (micro avg) 0.1468 |
|
2023-10-10 22:42:20,237 saving best model |
|
2023-10-10 22:42:21,190 ---------------------------------------------------------------------------------------------------- |
|
2023-10-10 22:42:30,312 epoch 3 - iter 29/292 - loss 0.37250554 - time (sec): 9.12 - samples/sec: 407.06 - lr: 0.000141 - momentum: 0.000000 |
|
2023-10-10 22:42:41,368 epoch 3 - iter 58/292 - loss 0.28788479 - time (sec): 20.18 - samples/sec: 440.44 - lr: 0.000139 - momentum: 0.000000 |
|
2023-10-10 22:42:50,894 epoch 3 - iter 87/292 - loss 0.31377648 - time (sec): 29.70 - samples/sec: 440.72 - lr: 0.000137 - momentum: 0.000000 |
|
2023-10-10 22:43:00,160 epoch 3 - iter 116/292 - loss 0.30860177 - time (sec): 38.97 - samples/sec: 439.06 - lr: 0.000135 - momentum: 0.000000 |
|
2023-10-10 22:43:09,738 epoch 3 - iter 145/292 - loss 0.30045066 - time (sec): 48.54 - samples/sec: 444.89 - lr: 0.000133 - momentum: 0.000000 |
|
2023-10-10 22:43:18,762 epoch 3 - iter 174/292 - loss 0.30780945 - time (sec): 57.57 - samples/sec: 444.68 - lr: 0.000132 - momentum: 0.000000 |
|
2023-10-10 22:43:29,441 epoch 3 - iter 203/292 - loss 0.31651504 - time (sec): 68.25 - samples/sec: 455.95 - lr: 0.000130 - momentum: 0.000000 |
|
2023-10-10 22:43:39,300 epoch 3 - iter 232/292 - loss 0.31485026 - time (sec): 78.11 - samples/sec: 456.80 - lr: 0.000128 - momentum: 0.000000 |
|
2023-10-10 22:43:49,319 epoch 3 - iter 261/292 - loss 0.30698124 - time (sec): 88.13 - samples/sec: 459.46 - lr: 0.000126 - momentum: 0.000000 |
|
2023-10-10 22:43:58,539 epoch 3 - iter 290/292 - loss 0.30215031 - time (sec): 97.35 - samples/sec: 455.07 - lr: 0.000125 - momentum: 0.000000 |
|
2023-10-10 22:43:58,987 ---------------------------------------------------------------------------------------------------- |
|
2023-10-10 22:43:58,988 EPOCH 3 done: loss 0.3015 - lr: 0.000125 |
|
2023-10-10 22:44:05,062 DEV : loss 0.21253274381160736 - f1-score (micro avg) 0.4454 |
|
2023-10-10 22:44:05,072 saving best model |
|
2023-10-10 22:44:14,113 ---------------------------------------------------------------------------------------------------- |
|
2023-10-10 22:44:24,977 epoch 4 - iter 29/292 - loss 0.20247520 - time (sec): 10.86 - samples/sec: 460.60 - lr: 0.000123 - momentum: 0.000000 |
|
2023-10-10 22:44:34,949 epoch 4 - iter 58/292 - loss 0.18905862 - time (sec): 20.83 - samples/sec: 437.66 - lr: 0.000121 - momentum: 0.000000 |
|
2023-10-10 22:44:45,648 epoch 4 - iter 87/292 - loss 0.22813786 - time (sec): 31.53 - samples/sec: 444.87 - lr: 0.000119 - momentum: 0.000000 |
|
2023-10-10 22:44:55,775 epoch 4 - iter 116/292 - loss 0.23168511 - time (sec): 41.66 - samples/sec: 441.17 - lr: 0.000117 - momentum: 0.000000 |
|
2023-10-10 22:45:06,379 epoch 4 - iter 145/292 - loss 0.22417524 - time (sec): 52.26 - samples/sec: 437.63 - lr: 0.000116 - momentum: 0.000000 |
|
2023-10-10 22:45:16,848 epoch 4 - iter 174/292 - loss 0.22225430 - time (sec): 62.73 - samples/sec: 431.91 - lr: 0.000114 - momentum: 0.000000 |
|
2023-10-10 22:45:27,869 epoch 4 - iter 203/292 - loss 0.21451680 - time (sec): 73.75 - samples/sec: 430.08 - lr: 0.000112 - momentum: 0.000000 |
|
2023-10-10 22:45:37,709 epoch 4 - iter 232/292 - loss 0.21129060 - time (sec): 83.59 - samples/sec: 431.92 - lr: 0.000110 - momentum: 0.000000 |
|
2023-10-10 22:45:47,385 epoch 4 - iter 261/292 - loss 0.20883694 - time (sec): 93.27 - samples/sec: 425.79 - lr: 0.000109 - momentum: 0.000000 |
|
2023-10-10 22:45:57,486 epoch 4 - iter 290/292 - loss 0.20421114 - time (sec): 103.37 - samples/sec: 427.91 - lr: 0.000107 - momentum: 0.000000 |
|
2023-10-10 22:45:57,953 ---------------------------------------------------------------------------------------------------- |
|
2023-10-10 22:45:57,954 EPOCH 4 done: loss 0.2034 - lr: 0.000107 |
|
2023-10-10 22:46:03,949 DEV : loss 0.16762110590934753 - f1-score (micro avg) 0.636 |
|
2023-10-10 22:46:03,958 saving best model |
|
2023-10-10 22:46:12,656 ---------------------------------------------------------------------------------------------------- |
|
2023-10-10 22:46:22,429 epoch 5 - iter 29/292 - loss 0.15228624 - time (sec): 9.77 - samples/sec: 444.71 - lr: 0.000105 - momentum: 0.000000 |
|
2023-10-10 22:46:33,178 epoch 5 - iter 58/292 - loss 0.16824065 - time (sec): 20.52 - samples/sec: 456.34 - lr: 0.000103 - momentum: 0.000000 |
|
2023-10-10 22:46:42,550 epoch 5 - iter 87/292 - loss 0.16516610 - time (sec): 29.89 - samples/sec: 454.50 - lr: 0.000101 - momentum: 0.000000 |
|
2023-10-10 22:46:52,308 epoch 5 - iter 116/292 - loss 0.14358303 - time (sec): 39.65 - samples/sec: 451.98 - lr: 0.000100 - momentum: 0.000000 |
|
2023-10-10 22:47:02,466 epoch 5 - iter 145/292 - loss 0.14189975 - time (sec): 49.81 - samples/sec: 456.77 - lr: 0.000098 - momentum: 0.000000 |
|
2023-10-10 22:47:11,868 epoch 5 - iter 174/292 - loss 0.13815073 - time (sec): 59.21 - samples/sec: 447.30 - lr: 0.000096 - momentum: 0.000000 |
|
2023-10-10 22:47:22,402 epoch 5 - iter 203/292 - loss 0.13687599 - time (sec): 69.74 - samples/sec: 450.72 - lr: 0.000094 - momentum: 0.000000 |
|
2023-10-10 22:47:32,919 epoch 5 - iter 232/292 - loss 0.13663132 - time (sec): 80.26 - samples/sec: 448.91 - lr: 0.000093 - momentum: 0.000000 |
|
2023-10-10 22:47:42,044 epoch 5 - iter 261/292 - loss 0.13323399 - time (sec): 89.38 - samples/sec: 444.34 - lr: 0.000091 - momentum: 0.000000 |
|
2023-10-10 22:47:52,280 epoch 5 - iter 290/292 - loss 0.13201349 - time (sec): 99.62 - samples/sec: 445.37 - lr: 0.000089 - momentum: 0.000000 |
|
2023-10-10 22:47:52,695 ---------------------------------------------------------------------------------------------------- |
|
2023-10-10 22:47:52,696 EPOCH 5 done: loss 0.1319 - lr: 0.000089 |
|
2023-10-10 22:47:58,790 DEV : loss 0.1449754387140274 - f1-score (micro avg) 0.7834 |
|
2023-10-10 22:47:58,800 saving best model |
|
2023-10-10 22:48:08,053 ---------------------------------------------------------------------------------------------------- |
|
2023-10-10 22:48:18,620 epoch 6 - iter 29/292 - loss 0.09780191 - time (sec): 10.56 - samples/sec: 508.71 - lr: 0.000087 - momentum: 0.000000 |
|
2023-10-10 22:48:27,513 epoch 6 - iter 58/292 - loss 0.10813143 - time (sec): 19.46 - samples/sec: 468.11 - lr: 0.000085 - momentum: 0.000000 |
|
2023-10-10 22:48:38,005 epoch 6 - iter 87/292 - loss 0.09566991 - time (sec): 29.95 - samples/sec: 464.73 - lr: 0.000084 - momentum: 0.000000 |
|
2023-10-10 22:48:47,930 epoch 6 - iter 116/292 - loss 0.09010732 - time (sec): 39.87 - samples/sec: 463.92 - lr: 0.000082 - momentum: 0.000000 |
|
2023-10-10 22:48:57,497 epoch 6 - iter 145/292 - loss 0.09093065 - time (sec): 49.44 - samples/sec: 463.49 - lr: 0.000080 - momentum: 0.000000 |
|
2023-10-10 22:49:06,968 epoch 6 - iter 174/292 - loss 0.09436299 - time (sec): 58.91 - samples/sec: 460.52 - lr: 0.000078 - momentum: 0.000000 |
|
2023-10-10 22:49:16,957 epoch 6 - iter 203/292 - loss 0.09269008 - time (sec): 68.90 - samples/sec: 448.05 - lr: 0.000077 - momentum: 0.000000 |
|
2023-10-10 22:49:27,267 epoch 6 - iter 232/292 - loss 0.09292884 - time (sec): 79.21 - samples/sec: 441.90 - lr: 0.000075 - momentum: 0.000000 |
|
2023-10-10 22:49:38,103 epoch 6 - iter 261/292 - loss 0.09295591 - time (sec): 90.05 - samples/sec: 442.24 - lr: 0.000073 - momentum: 0.000000 |
|
2023-10-10 22:49:48,309 epoch 6 - iter 290/292 - loss 0.09196060 - time (sec): 100.25 - samples/sec: 440.11 - lr: 0.000071 - momentum: 0.000000 |
|
2023-10-10 22:49:48,896 ---------------------------------------------------------------------------------------------------- |
|
2023-10-10 22:49:48,897 EPOCH 6 done: loss 0.0923 - lr: 0.000071 |
|
2023-10-10 22:49:55,279 DEV : loss 0.1280345916748047 - f1-score (micro avg) 0.7843 |
|
2023-10-10 22:49:55,289 saving best model |
|
2023-10-10 22:49:58,465 ---------------------------------------------------------------------------------------------------- |
|
2023-10-10 22:50:09,512 epoch 7 - iter 29/292 - loss 0.07453827 - time (sec): 11.04 - samples/sec: 443.98 - lr: 0.000069 - momentum: 0.000000 |
|
2023-10-10 22:50:19,617 epoch 7 - iter 58/292 - loss 0.06721636 - time (sec): 21.15 - samples/sec: 411.62 - lr: 0.000068 - momentum: 0.000000 |
|
2023-10-10 22:50:29,934 epoch 7 - iter 87/292 - loss 0.07117217 - time (sec): 31.46 - samples/sec: 417.23 - lr: 0.000066 - momentum: 0.000000 |
|
2023-10-10 22:50:40,481 epoch 7 - iter 116/292 - loss 0.06362150 - time (sec): 42.01 - samples/sec: 439.08 - lr: 0.000064 - momentum: 0.000000 |
|
2023-10-10 22:50:50,693 epoch 7 - iter 145/292 - loss 0.06235246 - time (sec): 52.22 - samples/sec: 430.49 - lr: 0.000062 - momentum: 0.000000 |
|
2023-10-10 22:51:00,958 epoch 7 - iter 174/292 - loss 0.06840778 - time (sec): 62.49 - samples/sec: 423.77 - lr: 0.000061 - momentum: 0.000000 |
|
2023-10-10 22:51:10,514 epoch 7 - iter 203/292 - loss 0.06625399 - time (sec): 72.05 - samples/sec: 423.90 - lr: 0.000059 - momentum: 0.000000 |
|
2023-10-10 22:51:21,312 epoch 7 - iter 232/292 - loss 0.06499604 - time (sec): 82.84 - samples/sec: 428.42 - lr: 0.000057 - momentum: 0.000000 |
|
2023-10-10 22:51:30,775 epoch 7 - iter 261/292 - loss 0.06666186 - time (sec): 92.31 - samples/sec: 431.34 - lr: 0.000055 - momentum: 0.000000 |
|
2023-10-10 22:51:40,482 epoch 7 - iter 290/292 - loss 0.06904362 - time (sec): 102.01 - samples/sec: 433.76 - lr: 0.000054 - momentum: 0.000000 |
|
2023-10-10 22:51:40,947 ---------------------------------------------------------------------------------------------------- |
|
2023-10-10 22:51:40,947 EPOCH 7 done: loss 0.0696 - lr: 0.000054 |
|
2023-10-10 22:51:46,921 DEV : loss 0.13766349852085114 - f1-score (micro avg) 0.757 |
|
2023-10-10 22:51:46,929 ---------------------------------------------------------------------------------------------------- |
|
2023-10-10 22:51:56,914 epoch 8 - iter 29/292 - loss 0.04980003 - time (sec): 9.98 - samples/sec: 506.44 - lr: 0.000052 - momentum: 0.000000 |
|
2023-10-10 22:52:06,746 epoch 8 - iter 58/292 - loss 0.05123584 - time (sec): 19.82 - samples/sec: 500.27 - lr: 0.000050 - momentum: 0.000000 |
|
2023-10-10 22:52:16,597 epoch 8 - iter 87/292 - loss 0.05568077 - time (sec): 29.67 - samples/sec: 485.27 - lr: 0.000048 - momentum: 0.000000 |
|
2023-10-10 22:52:25,412 epoch 8 - iter 116/292 - loss 0.05299151 - time (sec): 38.48 - samples/sec: 468.70 - lr: 0.000046 - momentum: 0.000000 |
|
2023-10-10 22:52:35,578 epoch 8 - iter 145/292 - loss 0.05256239 - time (sec): 48.65 - samples/sec: 468.99 - lr: 0.000045 - momentum: 0.000000 |
|
2023-10-10 22:52:44,728 epoch 8 - iter 174/292 - loss 0.05533329 - time (sec): 57.80 - samples/sec: 467.26 - lr: 0.000043 - momentum: 0.000000 |
|
2023-10-10 22:52:53,698 epoch 8 - iter 203/292 - loss 0.05536529 - time (sec): 66.77 - samples/sec: 463.07 - lr: 0.000041 - momentum: 0.000000 |
|
2023-10-10 22:53:03,670 epoch 8 - iter 232/292 - loss 0.05502038 - time (sec): 76.74 - samples/sec: 465.90 - lr: 0.000039 - momentum: 0.000000 |
|
2023-10-10 22:53:13,047 epoch 8 - iter 261/292 - loss 0.05399818 - time (sec): 86.12 - samples/sec: 461.30 - lr: 0.000038 - momentum: 0.000000 |
|
2023-10-10 22:53:23,422 epoch 8 - iter 290/292 - loss 0.05648797 - time (sec): 96.49 - samples/sec: 458.97 - lr: 0.000036 - momentum: 0.000000 |
|
2023-10-10 22:53:23,902 ---------------------------------------------------------------------------------------------------- |
|
2023-10-10 22:53:23,903 EPOCH 8 done: loss 0.0564 - lr: 0.000036 |
|
2023-10-10 22:53:29,955 DEV : loss 0.12616726756095886 - f1-score (micro avg) 0.7716 |
|
2023-10-10 22:53:29,964 ---------------------------------------------------------------------------------------------------- |
|
2023-10-10 22:53:39,918 epoch 9 - iter 29/292 - loss 0.04779883 - time (sec): 9.95 - samples/sec: 468.37 - lr: 0.000034 - momentum: 0.000000 |
|
2023-10-10 22:53:49,660 epoch 9 - iter 58/292 - loss 0.04408728 - time (sec): 19.69 - samples/sec: 475.59 - lr: 0.000032 - momentum: 0.000000 |
|
2023-10-10 22:53:59,507 epoch 9 - iter 87/292 - loss 0.04629535 - time (sec): 29.54 - samples/sec: 467.97 - lr: 0.000030 - momentum: 0.000000 |
|
2023-10-10 22:54:10,142 epoch 9 - iter 116/292 - loss 0.04202133 - time (sec): 40.18 - samples/sec: 457.87 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-10 22:54:19,541 epoch 9 - iter 145/292 - loss 0.04383999 - time (sec): 49.57 - samples/sec: 451.70 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-10 22:54:30,079 epoch 9 - iter 174/292 - loss 0.04672478 - time (sec): 60.11 - samples/sec: 452.65 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-10 22:54:39,735 epoch 9 - iter 203/292 - loss 0.04641879 - time (sec): 69.77 - samples/sec: 444.73 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-10 22:54:50,153 epoch 9 - iter 232/292 - loss 0.04595521 - time (sec): 80.19 - samples/sec: 449.04 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-10 22:55:00,375 epoch 9 - iter 261/292 - loss 0.04464917 - time (sec): 90.41 - samples/sec: 445.85 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-10 22:55:09,855 epoch 9 - iter 290/292 - loss 0.04722935 - time (sec): 99.89 - samples/sec: 442.99 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-10 22:55:10,343 ---------------------------------------------------------------------------------------------------- |
|
2023-10-10 22:55:10,343 EPOCH 9 done: loss 0.0472 - lr: 0.000018 |
|
2023-10-10 22:55:16,251 DEV : loss 0.1275288611650467 - f1-score (micro avg) 0.7846 |
|
2023-10-10 22:55:16,268 saving best model |
|
2023-10-10 22:55:21,311 ---------------------------------------------------------------------------------------------------- |
|
2023-10-10 22:55:31,028 epoch 10 - iter 29/292 - loss 0.03389118 - time (sec): 9.71 - samples/sec: 475.91 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-10 22:55:40,275 epoch 10 - iter 58/292 - loss 0.04001796 - time (sec): 18.96 - samples/sec: 451.56 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-10 22:55:49,745 epoch 10 - iter 87/292 - loss 0.03855770 - time (sec): 28.43 - samples/sec: 451.16 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-10 22:56:00,674 epoch 10 - iter 116/292 - loss 0.03345664 - time (sec): 39.36 - samples/sec: 461.08 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-10 22:56:10,848 epoch 10 - iter 145/292 - loss 0.03410413 - time (sec): 49.53 - samples/sec: 463.14 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-10 22:56:21,318 epoch 10 - iter 174/292 - loss 0.03396462 - time (sec): 60.00 - samples/sec: 455.00 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-10 22:56:32,199 epoch 10 - iter 203/292 - loss 0.03618776 - time (sec): 70.88 - samples/sec: 451.04 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-10 22:56:42,084 epoch 10 - iter 232/292 - loss 0.04153492 - time (sec): 80.77 - samples/sec: 445.45 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-10 22:56:51,564 epoch 10 - iter 261/292 - loss 0.04016965 - time (sec): 90.25 - samples/sec: 442.69 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-10 22:57:01,710 epoch 10 - iter 290/292 - loss 0.04323998 - time (sec): 100.40 - samples/sec: 440.51 - lr: 0.000000 - momentum: 0.000000 |
|
2023-10-10 22:57:02,248 ---------------------------------------------------------------------------------------------------- |
|
2023-10-10 22:57:02,249 EPOCH 10 done: loss 0.0434 - lr: 0.000000 |
|
2023-10-10 22:57:08,238 DEV : loss 0.1296752691268921 - f1-score (micro avg) 0.8017 |
|
2023-10-10 22:57:08,247 saving best model |
|
2023-10-10 22:57:13,206 ---------------------------------------------------------------------------------------------------- |
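The per-epoch `DEV` lines above can be pulled out of a log like this one with a small regular expression. This is a minimal sketch, assuming the line format shown in this log; `DEV_LINE` and `dev_scores` are hypothetical names, and the two sample lines are copied from epochs 9 and 10.

```python
import re

# Matches lines such as:
#   "... DEV : loss 0.1296752691268921 - f1-score (micro avg) 0.8017"
DEV_LINE = re.compile(
    r"DEV : loss (?P<loss>[\d.]+) - f1-score \(micro avg\) (?P<f1>[\d.]+)"
)


def dev_scores(log_text):
    """Return (dev_loss, dev_f1) tuples in the order they appear in the log."""
    return [(float(m["loss"]), float(m["f1"])) for m in DEV_LINE.finditer(log_text)]


sample = (
    "2023-10-10 22:55:16,251 DEV : loss 0.1275288611650467 - f1-score (micro avg) 0.7846\n"
    "2023-10-10 22:57:08,238 DEV : loss 0.1296752691268921 - f1-score (micro avg) 0.8017\n"
)
scores = dev_scores(sample)

# Index of the entry with the highest dev F1 within the sample.
best_index = max(range(len(scores)), key=lambda i: scores[i][1])
```

Selecting on dev F1 rather than dev loss mirrors the metric named in the "Final evaluation" line, `('micro avg', 'f1-score')`.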
|
2023-10-10 22:57:13,209 Loading model from best epoch ... |
|
2023-10-10 22:57:17,160 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd |
|
2023-10-10 22:57:30,053 |
|
Results: |
|
- F-score (micro) 0.7252 |
|
- F-score (macro) 0.6708 |
|
- Accuracy 0.587 |
|
|
|
By class: |
|
              precision    recall  f1-score   support |
|
|
|
         PER     0.7920    0.8534    0.8216       348 |
|
         LOC     0.5819    0.7625    0.6600       261 |
|
         ORG     0.4000    0.3846    0.3922        52 |
|
   HumanProd     0.8500    0.7727    0.8095        22 |
|
|
|
   micro avg     0.6773    0.7804    0.7252       683 |
|
   macro avg     0.6560    0.6933    0.6708       683 |
|
weighted avg     0.6837    0.7804    0.7268       683 |
|
|
|
2023-10-10 22:57:30,054 ---------------------------------------------------------------------------------------------------- |
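The summary rows of the report above can be cross-checked from the per-class rows. The sketch below assumes the usual definitions (macro F1 as the unweighted mean of class F1 scores, weighted F1 as the support-weighted mean, micro F1 as the harmonic mean of micro precision and recall). It works from the rounded values printed in the log, so agreement to four decimals is expected here but is not guaranteed in general.

```python
# Per-class (precision, recall, f1, support), copied from the report in this log.
report = {
    "PER": (0.7920, 0.8534, 0.8216, 348),
    "LOC": (0.5819, 0.7625, 0.6600, 261),
    "ORG": (0.4000, 0.3846, 0.3922, 52),
    "HumanProd": (0.8500, 0.7727, 0.8095, 22),
}

total_support = sum(support for _, _, _, support in report.values())

# Macro average: every class counts equally, regardless of support.
macro_f1 = sum(f1 for _, _, f1, _ in report.values()) / len(report)

# Weighted average: each class F1 is weighted by its support.
weighted_f1 = sum(f1 * support for _, _, f1, support in report.values()) / total_support

# Micro F1 from the logged micro-averaged precision and recall.
micro_precision, micro_recall = 0.6773, 0.7804
micro_f1 = 2 * micro_precision * micro_recall / (micro_precision + micro_recall)
```

The gap between micro and macro F1 comes mainly from ORG, which has low F1 but only 52 of the 683 gold spans.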
|
|