2023-10-11 03:35:43,703 ----------------------------------------------------------------------------------------------------
2023-10-11 03:35:43,706 Model: "SequenceTagger(
  (embeddings): ByT5Embeddings(
    (model): T5EncoderModel(
      (shared): Embedding(384, 1472)
      (encoder): T5Stack(
        (embed_tokens): Embedding(384, 1472)
        (block): ModuleList(
          (0): T5Block(
            (layer): ModuleList(
              (0): T5LayerSelfAttention(
                (SelfAttention): T5Attention(
                  (q): Linear(in_features=1472, out_features=384, bias=False)
                  (k): Linear(in_features=1472, out_features=384, bias=False)
                  (v): Linear(in_features=1472, out_features=384, bias=False)
                  (o): Linear(in_features=384, out_features=1472, bias=False)
                  (relative_attention_bias): Embedding(32, 6)
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (1): T5LayerFF(
                (DenseReluDense): T5DenseGatedActDense(
                  (wi_0): Linear(in_features=1472, out_features=3584, bias=False)
                  (wi_1): Linear(in_features=1472, out_features=3584, bias=False)
                  (wo): Linear(in_features=3584, out_features=1472, bias=False)
                  (dropout): Dropout(p=0.1, inplace=False)
                  (act): NewGELUActivation()
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
          )
          (1-11): 11 x T5Block(
            (layer): ModuleList(
              (0): T5LayerSelfAttention(
                (SelfAttention): T5Attention(
                  (q): Linear(in_features=1472, out_features=384, bias=False)
                  (k): Linear(in_features=1472, out_features=384, bias=False)
                  (v): Linear(in_features=1472, out_features=384, bias=False)
                  (o): Linear(in_features=384, out_features=1472, bias=False)
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (1): T5LayerFF(
                (DenseReluDense): T5DenseGatedActDense(
                  (wi_0): Linear(in_features=1472, out_features=3584, bias=False)
                  (wi_1): Linear(in_features=1472, out_features=3584, bias=False)
                  (wo): Linear(in_features=3584, out_features=1472, bias=False)
                  (dropout): Dropout(p=0.1, inplace=False)
                  (act): NewGELUActivation()
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
          )
        )
        (final_layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=1472, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-11 03:35:43,706 ----------------------------------------------------------------------------------------------------
2023-10-11 03:35:43,707 MultiCorpus: 1166 train + 165 dev + 415 test sentences
 - NER_HIPE_2022 Corpus: 1166 train + 165 dev + 415 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fi/with_doc_seperator
2023-10-11 03:35:43,707 ----------------------------------------------------------------------------------------------------
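The corpus statistics and the architecture printed above can be approximated with the short Flair sketch below. This is a minimal sketch under stated assumptions, not the original training script: the generic TransformerWordEmbeddings class stands in for the ByT5Embeddings wrapper shown in the printout, and the model id is inferred from the base path given further down.

from flair.datasets import NER_HIPE_2022
from flair.embeddings import TransformerWordEmbeddings
from flair.models import SequenceTagger

# NewsEye Finnish split of HIPE-2022: 1166 train / 165 dev / 415 test sentences
corpus = NER_HIPE_2022(dataset_name="newseye", language="fi")

# Span labels PER, LOC, ORG, HumanProd; the tagger expands them to the
# 17 BIOES tags listed later in this log
label_dict = corpus.make_label_dictionary(label_type="ner")

# Assumed stand-in for ByT5Embeddings; model id inferred from the base path below
embeddings = TransformerWordEmbeddings(
    model="hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax",
    layers="-1",               # "layers-1" in the base path
    subtoken_pooling="first",  # "poolingfirst" in the base path
    fine_tune=True,
)

tagger = SequenceTagger(
    hidden_size=256,             # required argument, but unused without an RNN head
    embeddings=embeddings,
    tag_dictionary=label_dict,
    tag_type="ner",
    use_crf=False,               # "crfFalse" in the base path
    use_rnn=False,               # linear classifier directly on the embeddings
    reproject_embeddings=False,
)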
2023-10-11 03:35:43,707 Train: 1166 sentences
2023-10-11 03:35:43,707 (train_with_dev=False, train_with_test=False)
2023-10-11 03:35:43,707 ----------------------------------------------------------------------------------------------------
2023-10-11 03:35:43,707 Training Params:
2023-10-11 03:35:43,707 - learning_rate: "0.00016"
2023-10-11 03:35:43,707 - mini_batch_size: "4"
2023-10-11 03:35:43,707 - max_epochs: "10"
2023-10-11 03:35:43,707 - shuffle: "True"
2023-10-11 03:35:43,707 ----------------------------------------------------------------------------------------------------
2023-10-11 03:35:43,707 Plugins:
2023-10-11 03:35:43,707 - TensorboardLogger
2023-10-11 03:35:43,707 - LinearScheduler | warmup_fraction: '0.1'
2023-10-11 03:35:43,708 ----------------------------------------------------------------------------------------------------
2023-10-11 03:35:43,708 Final evaluation on model from best epoch (best-model.pt)
2023-10-11 03:35:43,708 - metric: "('micro avg', 'f1-score')"
2023-10-11 03:35:43,708 ----------------------------------------------------------------------------------------------------
2023-10-11 03:35:43,708 Computation:
2023-10-11 03:35:43,708 - compute on device: cuda:0
2023-10-11 03:35:43,708 - embedding storage: none
2023-10-11 03:35:43,708 ----------------------------------------------------------------------------------------------------
2023-10-11 03:35:43,708 Model training base path: "hmbench-newseye/fi-hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax-bs4-wsFalse-e10-lr0.00016-poolingfirst-layers-1-crfFalse-5"
2023-10-11 03:35:43,708 ----------------------------------------------------------------------------------------------------
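The training parameters, the linear schedule with 10% warmup, and the base path above map onto Flair's ModelTrainer.fine_tune roughly as sketched below (the TensorboardLogger plugin is left out; this is a hedged sketch, not the original script).

from flair.data import Corpus
from flair.models import SequenceTagger
from flair.trainers import ModelTrainer

def fine_tune_tagger(tagger: SequenceTagger, corpus: Corpus) -> None:
    # fine_tune uses a linear learning-rate schedule with warmup by default,
    # which corresponds to the "LinearScheduler | warmup_fraction: '0.1'" plugin above
    trainer = ModelTrainer(tagger, corpus)
    trainer.fine_tune(
        "hmbench-newseye/fi-hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax-bs4-wsFalse-e10-lr0.00016-poolingfirst-layers-1-crfFalse-5",
        learning_rate=0.00016,
        mini_batch_size=4,
        max_epochs=10,
        shuffle=True,
    )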
2023-10-11 03:35:43,708 ----------------------------------------------------------------------------------------------------
2023-10-11 03:35:43,708 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-11 03:35:54,860 epoch 1 - iter 29/292 - loss 2.81957970 - time (sec): 11.15 - samples/sec: 386.73 - lr: 0.000015 - momentum: 0.000000
2023-10-11 03:36:04,709 epoch 1 - iter 58/292 - loss 2.80589817 - time (sec): 21.00 - samples/sec: 393.07 - lr: 0.000031 - momentum: 0.000000
2023-10-11 03:36:14,324 epoch 1 - iter 87/292 - loss 2.77951816 - time (sec): 30.61 - samples/sec: 399.20 - lr: 0.000047 - momentum: 0.000000
2023-10-11 03:36:26,204 epoch 1 - iter 116/292 - loss 2.70206612 - time (sec): 42.49 - samples/sec: 411.23 - lr: 0.000063 - momentum: 0.000000
2023-10-11 03:36:35,857 epoch 1 - iter 145/292 - loss 2.61368049 - time (sec): 52.15 - samples/sec: 418.15 - lr: 0.000079 - momentum: 0.000000
2023-10-11 03:36:46,142 epoch 1 - iter 174/292 - loss 2.50306929 - time (sec): 62.43 - samples/sec: 428.31 - lr: 0.000095 - momentum: 0.000000
2023-10-11 03:36:57,372 epoch 1 - iter 203/292 - loss 2.37448119 - time (sec): 73.66 - samples/sec: 438.29 - lr: 0.000111 - momentum: 0.000000
2023-10-11 03:37:06,746 epoch 1 - iter 232/292 - loss 2.26594986 - time (sec): 83.04 - samples/sec: 438.73 - lr: 0.000127 - momentum: 0.000000
2023-10-11 03:37:16,335 epoch 1 - iter 261/292 - loss 2.14370781 - time (sec): 92.63 - samples/sec: 439.30 - lr: 0.000142 - momentum: 0.000000
2023-10-11 03:37:25,653 epoch 1 - iter 290/292 - loss 2.03951691 - time (sec): 101.94 - samples/sec: 434.93 - lr: 0.000158 - momentum: 0.000000
2023-10-11 03:37:26,081 ----------------------------------------------------------------------------------------------------
2023-10-11 03:37:26,082 EPOCH 1 done: loss 2.0362 - lr: 0.000158
2023-10-11 03:37:31,472 DEV : loss 0.6394248604774475 - f1-score (micro avg) 0.0
2023-10-11 03:37:31,482 ----------------------------------------------------------------------------------------------------
2023-10-11 03:37:41,271 epoch 2 - iter 29/292 - loss 0.68301608 - time (sec): 9.79 - samples/sec: 447.23 - lr: 0.000158 - momentum: 0.000000
2023-10-11 03:37:50,950 epoch 2 - iter 58/292 - loss 0.63527545 - time (sec): 19.47 - samples/sec: 440.97 - lr: 0.000157 - momentum: 0.000000
2023-10-11 03:38:00,592 epoch 2 - iter 87/292 - loss 0.63132550 - time (sec): 29.11 - samples/sec: 436.41 - lr: 0.000155 - momentum: 0.000000
2023-10-11 03:38:10,670 epoch 2 - iter 116/292 - loss 0.58196696 - time (sec): 39.19 - samples/sec: 443.97 - lr: 0.000153 - momentum: 0.000000
2023-10-11 03:38:20,932 epoch 2 - iter 145/292 - loss 0.60854308 - time (sec): 49.45 - samples/sec: 450.09 - lr: 0.000151 - momentum: 0.000000
2023-10-11 03:38:31,538 epoch 2 - iter 174/292 - loss 0.57661614 - time (sec): 60.05 - samples/sec: 445.20 - lr: 0.000149 - momentum: 0.000000
2023-10-11 03:38:41,306 epoch 2 - iter 203/292 - loss 0.54824891 - time (sec): 69.82 - samples/sec: 447.23 - lr: 0.000148 - momentum: 0.000000
2023-10-11 03:38:50,123 epoch 2 - iter 232/292 - loss 0.52349347 - time (sec): 78.64 - samples/sec: 443.56 - lr: 0.000146 - momentum: 0.000000
2023-10-11 03:38:59,450 epoch 2 - iter 261/292 - loss 0.51475348 - time (sec): 87.97 - samples/sec: 443.06 - lr: 0.000144 - momentum: 0.000000
2023-10-11 03:39:09,649 epoch 2 - iter 290/292 - loss 0.49892401 - time (sec): 98.16 - samples/sec: 450.27 - lr: 0.000142 - momentum: 0.000000
2023-10-11 03:39:10,190 ----------------------------------------------------------------------------------------------------
2023-10-11 03:39:10,190 EPOCH 2 done: loss 0.4983 - lr: 0.000142
2023-10-11 03:39:16,070 DEV : loss 0.29269105195999146 - f1-score (micro avg) 0.0
2023-10-11 03:39:16,079 ----------------------------------------------------------------------------------------------------
2023-10-11 03:39:25,841 epoch 3 - iter 29/292 - loss 0.35386531 - time (sec): 9.76 - samples/sec: 408.81 - lr: 0.000141 - momentum: 0.000000
2023-10-11 03:39:35,418 epoch 3 - iter 58/292 - loss 0.35190101 - time (sec): 19.34 - samples/sec: 411.64 - lr: 0.000139 - momentum: 0.000000
2023-10-11 03:39:45,503 epoch 3 - iter 87/292 - loss 0.33548860 - time (sec): 29.42 - samples/sec: 425.46 - lr: 0.000137 - momentum: 0.000000
2023-10-11 03:39:55,406 epoch 3 - iter 116/292 - loss 0.31617685 - time (sec): 39.32 - samples/sec: 437.00 - lr: 0.000135 - momentum: 0.000000
2023-10-11 03:40:05,107 epoch 3 - iter 145/292 - loss 0.30835528 - time (sec): 49.03 - samples/sec: 434.75 - lr: 0.000133 - momentum: 0.000000
2023-10-11 03:40:15,147 epoch 3 - iter 174/292 - loss 0.29080066 - time (sec): 59.07 - samples/sec: 441.22 - lr: 0.000132 - momentum: 0.000000
2023-10-11 03:40:25,273 epoch 3 - iter 203/292 - loss 0.30465050 - time (sec): 69.19 - samples/sec: 443.61 - lr: 0.000130 - momentum: 0.000000
2023-10-11 03:40:34,384 epoch 3 - iter 232/292 - loss 0.30244144 - time (sec): 78.30 - samples/sec: 442.32 - lr: 0.000128 - momentum: 0.000000
2023-10-11 03:40:45,048 epoch 3 - iter 261/292 - loss 0.29585035 - time (sec): 88.97 - samples/sec: 446.66 - lr: 0.000126 - momentum: 0.000000
2023-10-11 03:40:55,141 epoch 3 - iter 290/292 - loss 0.29281659 - time (sec): 99.06 - samples/sec: 445.33 - lr: 0.000125 - momentum: 0.000000
2023-10-11 03:40:55,768 ----------------------------------------------------------------------------------------------------
2023-10-11 03:40:55,768 EPOCH 3 done: loss 0.2912 - lr: 0.000125
2023-10-11 03:41:01,875 DEV : loss 0.19170591235160828 - f1-score (micro avg) 0.477
2023-10-11 03:41:01,885 saving best model
2023-10-11 03:41:02,794 ----------------------------------------------------------------------------------------------------
2023-10-11 03:41:12,772 epoch 4 - iter 29/292 - loss 0.19259566 - time (sec): 9.98 - samples/sec: 473.64 - lr: 0.000123 - momentum: 0.000000
2023-10-11 03:41:23,316 epoch 4 - iter 58/292 - loss 0.16708108 - time (sec): 20.52 - samples/sec: 490.02 - lr: 0.000121 - momentum: 0.000000
2023-10-11 03:41:33,104 epoch 4 - iter 87/292 - loss 0.19487845 - time (sec): 30.31 - samples/sec: 484.26 - lr: 0.000119 - momentum: 0.000000
2023-10-11 03:41:42,943 epoch 4 - iter 116/292 - loss 0.19940283 - time (sec): 40.15 - samples/sec: 483.38 - lr: 0.000117 - momentum: 0.000000
2023-10-11 03:41:52,477 epoch 4 - iter 145/292 - loss 0.20218637 - time (sec): 49.68 - samples/sec: 479.92 - lr: 0.000116 - momentum: 0.000000
2023-10-11 03:42:01,818 epoch 4 - iter 174/292 - loss 0.19572403 - time (sec): 59.02 - samples/sec: 475.35 - lr: 0.000114 - momentum: 0.000000
2023-10-11 03:42:10,954 epoch 4 - iter 203/292 - loss 0.19603038 - time (sec): 68.16 - samples/sec: 468.68 - lr: 0.000112 - momentum: 0.000000
2023-10-11 03:42:20,751 epoch 4 - iter 232/292 - loss 0.19062971 - time (sec): 77.95 - samples/sec: 463.96 - lr: 0.000110 - momentum: 0.000000
2023-10-11 03:42:29,640 epoch 4 - iter 261/292 - loss 0.18725691 - time (sec): 86.84 - samples/sec: 455.07 - lr: 0.000109 - momentum: 0.000000
2023-10-11 03:42:39,637 epoch 4 - iter 290/292 - loss 0.18640083 - time (sec): 96.84 - samples/sec: 457.68 - lr: 0.000107 - momentum: 0.000000
2023-10-11 03:42:40,049 ----------------------------------------------------------------------------------------------------
2023-10-11 03:42:40,050 EPOCH 4 done: loss 0.1868 - lr: 0.000107
2023-10-11 03:42:45,788 DEV : loss 0.1456957757472992 - f1-score (micro avg) 0.6624
2023-10-11 03:42:45,798 saving best model
2023-10-11 03:42:46,754 ----------------------------------------------------------------------------------------------------
2023-10-11 03:42:56,570 epoch 5 - iter 29/292 - loss 0.13994720 - time (sec): 9.81 - samples/sec: 456.00 - lr: 0.000105 - momentum: 0.000000
2023-10-11 03:43:06,990 epoch 5 - iter 58/292 - loss 0.11776309 - time (sec): 20.23 - samples/sec: 472.13 - lr: 0.000103 - momentum: 0.000000
2023-10-11 03:43:17,064 epoch 5 - iter 87/292 - loss 0.13368301 - time (sec): 30.31 - samples/sec: 472.38 - lr: 0.000101 - momentum: 0.000000
2023-10-11 03:43:27,262 epoch 5 - iter 116/292 - loss 0.13287921 - time (sec): 40.51 - samples/sec: 471.51 - lr: 0.000100 - momentum: 0.000000
2023-10-11 03:43:36,983 epoch 5 - iter 145/292 - loss 0.13471707 - time (sec): 50.23 - samples/sec: 469.55 - lr: 0.000098 - momentum: 0.000000
2023-10-11 03:43:46,317 epoch 5 - iter 174/292 - loss 0.13352509 - time (sec): 59.56 - samples/sec: 467.45 - lr: 0.000096 - momentum: 0.000000
2023-10-11 03:43:56,006 epoch 5 - iter 203/292 - loss 0.13150760 - time (sec): 69.25 - samples/sec: 458.57 - lr: 0.000094 - momentum: 0.000000
2023-10-11 03:44:04,831 epoch 5 - iter 232/292 - loss 0.12973404 - time (sec): 78.07 - samples/sec: 454.17 - lr: 0.000093 - momentum: 0.000000
2023-10-11 03:44:14,924 epoch 5 - iter 261/292 - loss 0.12458178 - time (sec): 88.17 - samples/sec: 456.46 - lr: 0.000091 - momentum: 0.000000
2023-10-11 03:44:24,115 epoch 5 - iter 290/292 - loss 0.12278117 - time (sec): 97.36 - samples/sec: 455.52 - lr: 0.000089 - momentum: 0.000000
2023-10-11 03:44:24,506 ----------------------------------------------------------------------------------------------------
2023-10-11 03:44:24,506 EPOCH 5 done: loss 0.1229 - lr: 0.000089
2023-10-11 03:44:30,384 DEV : loss 0.12754610180854797 - f1-score (micro avg) 0.7325
2023-10-11 03:44:30,395 saving best model
2023-10-11 03:44:33,053 ----------------------------------------------------------------------------------------------------
2023-10-11 03:44:41,909 epoch 6 - iter 29/292 - loss 0.08298524 - time (sec): 8.85 - samples/sec: 421.75 - lr: 0.000087 - momentum: 0.000000
2023-10-11 03:44:51,064 epoch 6 - iter 58/292 - loss 0.11175773 - time (sec): 18.01 - samples/sec: 432.68 - lr: 0.000085 - momentum: 0.000000
2023-10-11 03:45:00,824 epoch 6 - iter 87/292 - loss 0.10165883 - time (sec): 27.77 - samples/sec: 447.56 - lr: 0.000084 - momentum: 0.000000
2023-10-11 03:45:10,403 epoch 6 - iter 116/292 - loss 0.09678991 - time (sec): 37.35 - samples/sec: 445.89 - lr: 0.000082 - momentum: 0.000000
2023-10-11 03:45:20,476 epoch 6 - iter 145/292 - loss 0.09397512 - time (sec): 47.42 - samples/sec: 455.48 - lr: 0.000080 - momentum: 0.000000
2023-10-11 03:45:30,725 epoch 6 - iter 174/292 - loss 0.08697011 - time (sec): 57.67 - samples/sec: 461.02 - lr: 0.000078 - momentum: 0.000000
2023-10-11 03:45:39,853 epoch 6 - iter 203/292 - loss 0.08810664 - time (sec): 66.80 - samples/sec: 455.49 - lr: 0.000077 - momentum: 0.000000
2023-10-11 03:45:48,933 epoch 6 - iter 232/292 - loss 0.08610263 - time (sec): 75.88 - samples/sec: 448.75 - lr: 0.000075 - momentum: 0.000000
2023-10-11 03:46:00,055 epoch 6 - iter 261/292 - loss 0.08625545 - time (sec): 87.00 - samples/sec: 453.30 - lr: 0.000073 - momentum: 0.000000
2023-10-11 03:46:09,886 epoch 6 - iter 290/292 - loss 0.08521640 - time (sec): 96.83 - samples/sec: 455.25 - lr: 0.000071 - momentum: 0.000000
2023-10-11 03:46:10,525 ----------------------------------------------------------------------------------------------------
2023-10-11 03:46:10,525 EPOCH 6 done: loss 0.0846 - lr: 0.000071
2023-10-11 03:46:16,096 DEV : loss 0.13130022585391998 - f1-score (micro avg) 0.7613
2023-10-11 03:46:16,105 saving best model
2023-10-11 03:46:18,785 ----------------------------------------------------------------------------------------------------
2023-10-11 03:46:27,822 epoch 7 - iter 29/292 - loss 0.07101005 - time (sec): 9.03 - samples/sec: 422.44 - lr: 0.000069 - momentum: 0.000000
2023-10-11 03:46:36,891 epoch 7 - iter 58/292 - loss 0.06386797 - time (sec): 18.10 - samples/sec: 437.68 - lr: 0.000068 - momentum: 0.000000
2023-10-11 03:46:45,457 epoch 7 - iter 87/292 - loss 0.06618248 - time (sec): 26.67 - samples/sec: 427.44 - lr: 0.000066 - momentum: 0.000000
2023-10-11 03:46:56,357 epoch 7 - iter 116/292 - loss 0.06714686 - time (sec): 37.57 - samples/sec: 453.34 - lr: 0.000064 - momentum: 0.000000
2023-10-11 03:47:06,080 epoch 7 - iter 145/292 - loss 0.06385049 - time (sec): 47.29 - samples/sec: 446.11 - lr: 0.000062 - momentum: 0.000000
2023-10-11 03:47:15,959 epoch 7 - iter 174/292 - loss 0.06762135 - time (sec): 57.17 - samples/sec: 442.09 - lr: 0.000061 - momentum: 0.000000
2023-10-11 03:47:25,666 epoch 7 - iter 203/292 - loss 0.06778033 - time (sec): 66.88 - samples/sec: 443.83 - lr: 0.000059 - momentum: 0.000000
2023-10-11 03:47:36,174 epoch 7 - iter 232/292 - loss 0.06382795 - time (sec): 77.38 - samples/sec: 445.91 - lr: 0.000057 - momentum: 0.000000
2023-10-11 03:47:46,819 epoch 7 - iter 261/292 - loss 0.06495314 - time (sec): 88.03 - samples/sec: 446.36 - lr: 0.000055 - momentum: 0.000000
2023-10-11 03:47:57,307 epoch 7 - iter 290/292 - loss 0.06399569 - time (sec): 98.52 - samples/sec: 449.10 - lr: 0.000054 - momentum: 0.000000
2023-10-11 03:47:57,797 ----------------------------------------------------------------------------------------------------
2023-10-11 03:47:57,798 EPOCH 7 done: loss 0.0640 - lr: 0.000054
2023-10-11 03:48:03,890 DEV : loss 0.11788583546876907 - f1-score (micro avg) 0.7922
2023-10-11 03:48:03,899 saving best model
2023-10-11 03:48:06,447 ----------------------------------------------------------------------------------------------------
2023-10-11 03:48:15,755 epoch 8 - iter 29/292 - loss 0.06045212 - time (sec): 9.30 - samples/sec: 440.49 - lr: 0.000052 - momentum: 0.000000
2023-10-11 03:48:25,290 epoch 8 - iter 58/292 - loss 0.05668848 - time (sec): 18.84 - samples/sec: 448.24 - lr: 0.000050 - momentum: 0.000000
2023-10-11 03:48:34,932 epoch 8 - iter 87/292 - loss 0.06064471 - time (sec): 28.48 - samples/sec: 448.98 - lr: 0.000048 - momentum: 0.000000
2023-10-11 03:48:45,261 epoch 8 - iter 116/292 - loss 0.06304311 - time (sec): 38.81 - samples/sec: 458.96 - lr: 0.000046 - momentum: 0.000000
2023-10-11 03:48:54,956 epoch 8 - iter 145/292 - loss 0.06183104 - time (sec): 48.50 - samples/sec: 456.93 - lr: 0.000045 - momentum: 0.000000
2023-10-11 03:49:03,980 epoch 8 - iter 174/292 - loss 0.05807061 - time (sec): 57.53 - samples/sec: 450.09 - lr: 0.000043 - momentum: 0.000000
2023-10-11 03:49:14,058 epoch 8 - iter 203/292 - loss 0.05673632 - time (sec): 67.61 - samples/sec: 454.78 - lr: 0.000041 - momentum: 0.000000
2023-10-11 03:49:24,114 epoch 8 - iter 232/292 - loss 0.05801900 - time (sec): 77.66 - samples/sec: 454.79 - lr: 0.000039 - momentum: 0.000000
2023-10-11 03:49:33,999 epoch 8 - iter 261/292 - loss 0.05350185 - time (sec): 87.55 - samples/sec: 447.75 - lr: 0.000038 - momentum: 0.000000
2023-10-11 03:49:44,408 epoch 8 - iter 290/292 - loss 0.05081514 - time (sec): 97.96 - samples/sec: 451.22 - lr: 0.000036 - momentum: 0.000000
2023-10-11 03:49:44,954 ----------------------------------------------------------------------------------------------------
2023-10-11 03:49:44,955 EPOCH 8 done: loss 0.0507 - lr: 0.000036
2023-10-11 03:49:50,893 DEV : loss 0.1216011717915535 - f1-score (micro avg) 0.7983
2023-10-11 03:49:50,903 saving best model
2023-10-11 03:49:53,505 ----------------------------------------------------------------------------------------------------
2023-10-11 03:50:03,543 epoch 9 - iter 29/292 - loss 0.03943985 - time (sec): 10.03 - samples/sec: 458.19 - lr: 0.000034 - momentum: 0.000000
2023-10-11 03:50:12,469 epoch 9 - iter 58/292 - loss 0.04844555 - time (sec): 18.96 - samples/sec: 434.67 - lr: 0.000032 - momentum: 0.000000
2023-10-11 03:50:21,870 epoch 9 - iter 87/292 - loss 0.04471746 - time (sec): 28.36 - samples/sec: 437.51 - lr: 0.000030 - momentum: 0.000000
2023-10-11 03:50:31,201 epoch 9 - iter 116/292 - loss 0.04485485 - time (sec): 37.69 - samples/sec: 436.29 - lr: 0.000029 - momentum: 0.000000
2023-10-11 03:50:41,656 epoch 9 - iter 145/292 - loss 0.04694636 - time (sec): 48.15 - samples/sec: 446.77 - lr: 0.000027 - momentum: 0.000000
2023-10-11 03:50:51,729 epoch 9 - iter 174/292 - loss 0.04399638 - time (sec): 58.22 - samples/sec: 449.01 - lr: 0.000025 - momentum: 0.000000
2023-10-11 03:51:02,209 epoch 9 - iter 203/292 - loss 0.04168589 - time (sec): 68.70 - samples/sec: 447.28 - lr: 0.000023 - momentum: 0.000000
2023-10-11 03:51:12,835 epoch 9 - iter 232/292 - loss 0.04315952 - time (sec): 79.33 - samples/sec: 451.79 - lr: 0.000022 - momentum: 0.000000
2023-10-11 03:51:22,130 epoch 9 - iter 261/292 - loss 0.04201939 - time (sec): 88.62 - samples/sec: 451.08 - lr: 0.000020 - momentum: 0.000000
2023-10-11 03:51:31,943 epoch 9 - iter 290/292 - loss 0.04206582 - time (sec): 98.43 - samples/sec: 449.63 - lr: 0.000018 - momentum: 0.000000
2023-10-11 03:51:32,417 ----------------------------------------------------------------------------------------------------
2023-10-11 03:51:32,417 EPOCH 9 done: loss 0.0420 - lr: 0.000018
2023-10-11 03:51:38,147 DEV : loss 0.12305182963609695 - f1-score (micro avg) 0.794
2023-10-11 03:51:38,156 ----------------------------------------------------------------------------------------------------
2023-10-11 03:51:47,466 epoch 10 - iter 29/292 - loss 0.03708349 - time (sec): 9.31 - samples/sec: 452.73 - lr: 0.000016 - momentum: 0.000000
2023-10-11 03:51:56,648 epoch 10 - iter 58/292 - loss 0.04525890 - time (sec): 18.49 - samples/sec: 451.53 - lr: 0.000014 - momentum: 0.000000
2023-10-11 03:52:06,950 epoch 10 - iter 87/292 - loss 0.03777979 - time (sec): 28.79 - samples/sec: 458.84 - lr: 0.000013 - momentum: 0.000000
2023-10-11 03:52:16,910 epoch 10 - iter 116/292 - loss 0.03392100 - time (sec): 38.75 - samples/sec: 455.20 - lr: 0.000011 - momentum: 0.000000
2023-10-11 03:52:26,359 epoch 10 - iter 145/292 - loss 0.03634942 - time (sec): 48.20 - samples/sec: 457.11 - lr: 0.000009 - momentum: 0.000000
2023-10-11 03:52:35,578 epoch 10 - iter 174/292 - loss 0.03671887 - time (sec): 57.42 - samples/sec: 449.89 - lr: 0.000007 - momentum: 0.000000
2023-10-11 03:52:45,924 epoch 10 - iter 203/292 - loss 0.03799343 - time (sec): 67.77 - samples/sec: 457.28 - lr: 0.000006 - momentum: 0.000000
2023-10-11 03:52:55,496 epoch 10 - iter 232/292 - loss 0.03741065 - time (sec): 77.34 - samples/sec: 455.74 - lr: 0.000004 - momentum: 0.000000
2023-10-11 03:53:05,434 epoch 10 - iter 261/292 - loss 0.03813911 - time (sec): 87.28 - samples/sec: 456.28 - lr: 0.000002 - momentum: 0.000000
2023-10-11 03:53:15,261 epoch 10 - iter 290/292 - loss 0.03854453 - time (sec): 97.10 - samples/sec: 453.32 - lr: 0.000000 - momentum: 0.000000
2023-10-11 03:53:15,951 ----------------------------------------------------------------------------------------------------
2023-10-11 03:53:15,951 EPOCH 10 done: loss 0.0383 - lr: 0.000000
2023-10-11 03:53:21,696 DEV : loss 0.12359649688005447 - f1-score (micro avg) 0.7863
2023-10-11 03:53:22,642 ----------------------------------------------------------------------------------------------------
2023-10-11 03:53:22,644 Loading model from best epoch ...
2023-10-11 03:53:27,154 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
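The saved checkpoint can then be loaded and applied to new text roughly as follows (a hedged sketch; the Finnish example sentence is made up purely for illustration).

from flair.data import Sentence
from flair.models import SequenceTagger

# Load best-model.pt from the training base path (path shortened here)
tagger = SequenceTagger.load("hmbench-newseye/.../best-model.pt")

sentence = Sentence("Helsingin kaupunki sijaitsee Suomessa .")  # made-up example
tagger.predict(sentence)

for span in sentence.get_spans("ner"):
    print(span.text, span.tag, round(span.score, 2))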
2023-10-11 03:53:40,165
Results:
- F-score (micro) 0.7459
- F-score (macro) 0.7006
- Accuracy 0.6127

By class:
             precision    recall  f1-score   support

         PER    0.8209    0.8563    0.8383       348
         LOC    0.5972    0.8123    0.6883       261
         ORG    0.4000    0.3846    0.3922        52
   HumanProd    0.9048    0.8636    0.8837        22

   micro avg    0.6958    0.8038    0.7459       683
   macro avg    0.6807    0.7292    0.7006       683
weighted avg    0.7061    0.8038    0.7485       683

2023-10-11 03:53:40,165 ----------------------------------------------------------------------------------------------------
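As a quick consistency check (not part of the original log output): the micro-average F1 in the table above is the harmonic mean of the micro precision and recall.

p, r = 0.6958, 0.8038
print(round(2 * p * r / (p + r), 4))  # 0.7459, matching "F-score (micro)" above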