|
2023-10-10 22:00:43,304 ---------------------------------------------------------------------------------------------------- |
|
2023-10-10 22:00:43,306 Model: "SequenceTagger( |
|
(embeddings): ByT5Embeddings( |
|
(model): T5EncoderModel( |
|
(shared): Embedding(384, 1472) |
|
(encoder): T5Stack( |
|
(embed_tokens): Embedding(384, 1472) |
|
(block): ModuleList( |
|
(0): T5Block( |
|
(layer): ModuleList( |
|
(0): T5LayerSelfAttention( |
|
(SelfAttention): T5Attention( |
|
(q): Linear(in_features=1472, out_features=384, bias=False) |
|
(k): Linear(in_features=1472, out_features=384, bias=False) |
|
(v): Linear(in_features=1472, out_features=384, bias=False) |
|
(o): Linear(in_features=384, out_features=1472, bias=False) |
|
(relative_attention_bias): Embedding(32, 6) |
|
) |
|
(layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
(1): T5LayerFF( |
|
(DenseReluDense): T5DenseGatedActDense( |
|
(wi_0): Linear(in_features=1472, out_features=3584, bias=False) |
|
(wi_1): Linear(in_features=1472, out_features=3584, bias=False) |
|
(wo): Linear(in_features=3584, out_features=1472, bias=False) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
(act): NewGELUActivation() |
|
) |
|
(layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
) |
|
(1-11): 11 x T5Block( |
|
(layer): ModuleList( |
|
(0): T5LayerSelfAttention( |
|
(SelfAttention): T5Attention( |
|
(q): Linear(in_features=1472, out_features=384, bias=False) |
|
(k): Linear(in_features=1472, out_features=384, bias=False) |
|
(v): Linear(in_features=1472, out_features=384, bias=False) |
|
(o): Linear(in_features=384, out_features=1472, bias=False) |
|
) |
|
(layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
(1): T5LayerFF( |
|
(DenseReluDense): T5DenseGatedActDense( |
|
(wi_0): Linear(in_features=1472, out_features=3584, bias=False) |
|
(wi_1): Linear(in_features=1472, out_features=3584, bias=False) |
|
(wo): Linear(in_features=3584, out_features=1472, bias=False) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
(act): NewGELUActivation() |
|
) |
|
(layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
) |
|
) |
|
(final_layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
) |
|
(locked_dropout): LockedDropout(p=0.5) |
|
(linear): Linear(in_features=1472, out_features=17, bias=True) |
|
(loss_function): CrossEntropyLoss() |
|
)" |
|
2023-10-10 22:00:43,306 ---------------------------------------------------------------------------------------------------- |
|
2023-10-10 22:00:43,307 MultiCorpus: 1166 train + 165 dev + 415 test sentences |
|
- NER_HIPE_2022 Corpus: 1166 train + 165 dev + 415 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fi/with_doc_seperator |
|
2023-10-10 22:00:43,307 ---------------------------------------------------------------------------------------------------- |
|
2023-10-10 22:00:43,307 Train: 1166 sentences |
|
2023-10-10 22:00:43,307 (train_with_dev=False, train_with_test=False) |
|
2023-10-10 22:00:43,307 ---------------------------------------------------------------------------------------------------- |
|
2023-10-10 22:00:43,307 Training Params: |
|
2023-10-10 22:00:43,307 - learning_rate: "0.00016" |
|
2023-10-10 22:00:43,307 - mini_batch_size: "8" |
|
2023-10-10 22:00:43,307 - max_epochs: "10" |
|
2023-10-10 22:00:43,307 - shuffle: "True" |
|
2023-10-10 22:00:43,307 ---------------------------------------------------------------------------------------------------- |
|
2023-10-10 22:00:43,307 Plugins: |
|
2023-10-10 22:00:43,307 - TensorboardLogger |
|
2023-10-10 22:00:43,307 - LinearScheduler | warmup_fraction: '0.1' |
|
2023-10-10 22:00:43,308 ---------------------------------------------------------------------------------------------------- |
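The `lr` values logged in the training iterations of this run are consistent with a linear warmup over the first 10% of steps followed by a linear decay to zero. The following is a minimal sketch under stated assumptions, not Flair's own scheduler code: `linear_schedule_lr` is a hypothetical helper, the step count is derived from the logged 1166 training sentences, batch size 8 and 10 epochs, and the one-step lag between the iteration counter and the logged rate is an assumption inferred from the logged numbers.

```python
import math

def linear_schedule_lr(step, total_steps, peak_lr, warmup_fraction):
    """Linear warmup to peak_lr, then linear decay to zero (hypothetical helper)."""
    warmup_steps = round(total_steps * warmup_fraction)
    if step < warmup_steps:
        # Warmup phase: the rate grows proportionally with the step index.
        return peak_lr * step / warmup_steps
    # Decay phase: the rate falls linearly and reaches zero at total_steps.
    return peak_lr * max(0.0, (total_steps - step) / (total_steps - warmup_steps))

# Values taken from the log: 1166 training sentences, batch size 8, 10 epochs.
steps_per_epoch = math.ceil(1166 / 8)   # 146 iterations per epoch
total_steps = steps_per_epoch * 10      # 1460 optimizer steps

# Assumption: the rate logged at "iter k" reflects k - 1 completed scheduler steps.
lr_epoch1_iter14 = linear_schedule_lr(13, total_steps, 0.00016, 0.1)
lr_epoch2_iter14 = linear_schedule_lr(steps_per_epoch + 13, total_steps, 0.00016, 0.1)
```

Under these assumptions, the two computed rates round to the values logged at iteration 14 of epochs 1 and 2 (0.000014 and 0.000158).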
|
2023-10-10 22:00:43,308 Final evaluation on model from best epoch (best-model.pt) |
|
2023-10-10 22:00:43,308 - metric: "('micro avg', 'f1-score')" |
|
2023-10-10 22:00:43,308 ---------------------------------------------------------------------------------------------------- |
|
2023-10-10 22:00:43,308 Computation: |
|
2023-10-10 22:00:43,308 - compute on device: cuda:0 |
|
2023-10-10 22:00:43,308 - embedding storage: none |
|
2023-10-10 22:00:43,308 ---------------------------------------------------------------------------------------------------- |
|
2023-10-10 22:00:43,308 Model training base path: "hmbench-newseye/fi-hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax-bs8-wsFalse-e10-lr0.00016-poolingfirst-layers-1-crfFalse-1" |
|
2023-10-10 22:00:43,308 ---------------------------------------------------------------------------------------------------- |
|
2023-10-10 22:00:43,308 ---------------------------------------------------------------------------------------------------- |
|
2023-10-10 22:00:43,308 Logging anything other than scalars to TensorBoard is currently not supported. |
|
2023-10-10 22:00:52,394 epoch 1 - iter 14/146 - loss 2.83088289 - time (sec): 9.08 - samples/sec: 530.64 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-10 22:01:00,821 epoch 1 - iter 28/146 - loss 2.82628929 - time (sec): 17.51 - samples/sec: 492.26 - lr: 0.000030 - momentum: 0.000000 |
|
2023-10-10 22:01:10,396 epoch 1 - iter 42/146 - loss 2.81614191 - time (sec): 27.09 - samples/sec: 505.57 - lr: 0.000045 - momentum: 0.000000 |
|
2023-10-10 22:01:19,597 epoch 1 - iter 56/146 - loss 2.80155779 - time (sec): 36.29 - samples/sec: 497.39 - lr: 0.000060 - momentum: 0.000000 |
|
2023-10-10 22:01:28,047 epoch 1 - iter 70/146 - loss 2.77881902 - time (sec): 44.74 - samples/sec: 485.62 - lr: 0.000076 - momentum: 0.000000 |
|
2023-10-10 22:01:36,553 epoch 1 - iter 84/146 - loss 2.73383396 - time (sec): 53.24 - samples/sec: 482.56 - lr: 0.000091 - momentum: 0.000000 |
|
2023-10-10 22:01:44,474 epoch 1 - iter 98/146 - loss 2.67503100 - time (sec): 61.16 - samples/sec: 478.46 - lr: 0.000106 - momentum: 0.000000 |
|
2023-10-10 22:01:54,877 epoch 1 - iter 112/146 - loss 2.57315899 - time (sec): 71.57 - samples/sec: 485.69 - lr: 0.000122 - momentum: 0.000000 |
|
2023-10-10 22:02:03,653 epoch 1 - iter 126/146 - loss 2.49591763 - time (sec): 80.34 - samples/sec: 482.87 - lr: 0.000137 - momentum: 0.000000 |
|
2023-10-10 22:02:12,893 epoch 1 - iter 140/146 - loss 2.41324023 - time (sec): 89.58 - samples/sec: 479.18 - lr: 0.000152 - momentum: 0.000000 |
|
2023-10-10 22:02:16,311 ---------------------------------------------------------------------------------------------------- |
|
2023-10-10 22:02:16,311 EPOCH 1 done: loss 2.3834 - lr: 0.000152 |
|
2023-10-10 22:02:21,884 DEV : loss 1.2905865907669067 - f1-score (micro avg) 0.0 |
|
2023-10-10 22:02:21,893 ---------------------------------------------------------------------------------------------------- |
|
2023-10-10 22:02:31,067 epoch 2 - iter 14/146 - loss 1.32629947 - time (sec): 9.17 - samples/sec: 499.38 - lr: 0.000158 - momentum: 0.000000 |
|
2023-10-10 22:02:40,124 epoch 2 - iter 28/146 - loss 1.29112932 - time (sec): 18.23 - samples/sec: 488.68 - lr: 0.000157 - momentum: 0.000000 |
|
2023-10-10 22:02:49,458 epoch 2 - iter 42/146 - loss 1.15322127 - time (sec): 27.56 - samples/sec: 476.15 - lr: 0.000155 - momentum: 0.000000 |
|
2023-10-10 22:02:59,632 epoch 2 - iter 56/146 - loss 1.05695895 - time (sec): 37.74 - samples/sec: 467.96 - lr: 0.000153 - momentum: 0.000000 |
|
2023-10-10 22:03:10,417 epoch 2 - iter 70/146 - loss 0.98742812 - time (sec): 48.52 - samples/sec: 472.43 - lr: 0.000152 - momentum: 0.000000 |
|
2023-10-10 22:03:19,883 epoch 2 - iter 84/146 - loss 0.93247747 - time (sec): 57.99 - samples/sec: 463.34 - lr: 0.000150 - momentum: 0.000000 |
|
2023-10-10 22:03:29,139 epoch 2 - iter 98/146 - loss 0.88912171 - time (sec): 67.24 - samples/sec: 457.24 - lr: 0.000148 - momentum: 0.000000 |
|
2023-10-10 22:03:38,219 epoch 2 - iter 112/146 - loss 0.85474722 - time (sec): 76.32 - samples/sec: 452.98 - lr: 0.000147 - momentum: 0.000000 |
|
2023-10-10 22:03:46,844 epoch 2 - iter 126/146 - loss 0.82173264 - time (sec): 84.95 - samples/sec: 456.28 - lr: 0.000145 - momentum: 0.000000 |
|
2023-10-10 22:03:55,388 epoch 2 - iter 140/146 - loss 0.79360195 - time (sec): 93.49 - samples/sec: 459.36 - lr: 0.000143 - momentum: 0.000000 |
|
2023-10-10 22:03:58,923 ---------------------------------------------------------------------------------------------------- |
|
2023-10-10 22:03:58,923 EPOCH 2 done: loss 0.7905 - lr: 0.000143 |
|
2023-10-10 22:04:04,751 DEV : loss 0.3894149363040924 - f1-score (micro avg) 0.0 |
|
2023-10-10 22:04:04,760 ---------------------------------------------------------------------------------------------------- |
|
2023-10-10 22:04:13,057 epoch 3 - iter 14/146 - loss 0.51507142 - time (sec): 8.29 - samples/sec: 425.69 - lr: 0.000141 - momentum: 0.000000 |
|
2023-10-10 22:04:22,691 epoch 3 - iter 28/146 - loss 0.41692054 - time (sec): 17.93 - samples/sec: 477.51 - lr: 0.000139 - momentum: 0.000000 |
|
2023-10-10 22:04:31,502 epoch 3 - iter 42/146 - loss 0.43730739 - time (sec): 26.74 - samples/sec: 477.83 - lr: 0.000137 - momentum: 0.000000 |
|
2023-10-10 22:04:40,274 epoch 3 - iter 56/146 - loss 0.42787094 - time (sec): 35.51 - samples/sec: 470.86 - lr: 0.000136 - momentum: 0.000000 |
|
2023-10-10 22:04:49,689 epoch 3 - iter 70/146 - loss 0.41822906 - time (sec): 44.93 - samples/sec: 467.47 - lr: 0.000134 - momentum: 0.000000 |
|
2023-10-10 22:04:58,317 epoch 3 - iter 84/146 - loss 0.41836824 - time (sec): 53.55 - samples/sec: 459.77 - lr: 0.000132 - momentum: 0.000000 |
|
2023-10-10 22:05:07,820 epoch 3 - iter 98/146 - loss 0.44258611 - time (sec): 63.06 - samples/sec: 473.68 - lr: 0.000131 - momentum: 0.000000 |
|
2023-10-10 22:05:17,349 epoch 3 - iter 112/146 - loss 0.42988452 - time (sec): 72.59 - samples/sec: 478.20 - lr: 0.000129 - momentum: 0.000000 |
|
2023-10-10 22:05:26,127 epoch 3 - iter 126/146 - loss 0.41794490 - time (sec): 81.36 - samples/sec: 480.97 - lr: 0.000127 - momentum: 0.000000 |
|
2023-10-10 22:05:34,136 epoch 3 - iter 140/146 - loss 0.41620303 - time (sec): 89.37 - samples/sec: 476.97 - lr: 0.000125 - momentum: 0.000000 |
|
2023-10-10 22:05:37,768 ---------------------------------------------------------------------------------------------------- |
|
2023-10-10 22:05:37,768 EPOCH 3 done: loss 0.4112 - lr: 0.000125 |
|
2023-10-10 22:05:43,626 DEV : loss 0.28306612372398376 - f1-score (micro avg) 0.1609 |
|
2023-10-10 22:05:43,635 saving best model |
|
2023-10-10 22:05:44,623 ---------------------------------------------------------------------------------------------------- |
|
2023-10-10 22:05:54,251 epoch 4 - iter 14/146 - loss 0.29881808 - time (sec): 9.62 - samples/sec: 492.36 - lr: 0.000123 - momentum: 0.000000 |
|
2023-10-10 22:06:02,849 epoch 4 - iter 28/146 - loss 0.27924207 - time (sec): 18.22 - samples/sec: 471.98 - lr: 0.000121 - momentum: 0.000000 |
|
2023-10-10 22:06:12,815 epoch 4 - iter 42/146 - loss 0.34183632 - time (sec): 28.19 - samples/sec: 478.45 - lr: 0.000120 - momentum: 0.000000 |
|
2023-10-10 22:06:21,685 epoch 4 - iter 56/146 - loss 0.34079281 - time (sec): 37.06 - samples/sec: 471.65 - lr: 0.000118 - momentum: 0.000000 |
|
2023-10-10 22:06:30,830 epoch 4 - iter 70/146 - loss 0.33661860 - time (sec): 46.20 - samples/sec: 478.46 - lr: 0.000116 - momentum: 0.000000 |
|
2023-10-10 22:06:39,823 epoch 4 - iter 84/146 - loss 0.33761968 - time (sec): 55.20 - samples/sec: 477.34 - lr: 0.000115 - momentum: 0.000000 |
|
2023-10-10 22:06:50,180 epoch 4 - iter 98/146 - loss 0.32990070 - time (sec): 65.55 - samples/sec: 472.88 - lr: 0.000113 - momentum: 0.000000 |
|
2023-10-10 22:06:59,582 epoch 4 - iter 112/146 - loss 0.32645728 - time (sec): 74.96 - samples/sec: 464.23 - lr: 0.000111 - momentum: 0.000000 |
|
2023-10-10 22:07:08,979 epoch 4 - iter 126/146 - loss 0.32140043 - time (sec): 84.35 - samples/sec: 460.47 - lr: 0.000109 - momentum: 0.000000 |
|
2023-10-10 22:07:17,392 epoch 4 - iter 140/146 - loss 0.32290848 - time (sec): 92.77 - samples/sec: 457.95 - lr: 0.000108 - momentum: 0.000000 |
|
2023-10-10 22:07:21,205 ---------------------------------------------------------------------------------------------------- |
|
2023-10-10 22:07:21,205 EPOCH 4 done: loss 0.3174 - lr: 0.000108 |
|
2023-10-10 22:07:26,848 DEV : loss 0.2454710602760315 - f1-score (micro avg) 0.3563 |
|
2023-10-10 22:07:26,857 saving best model |
|
2023-10-10 22:07:35,506 ---------------------------------------------------------------------------------------------------- |
|
2023-10-10 22:07:44,550 epoch 5 - iter 14/146 - loss 0.27566779 - time (sec): 9.04 - samples/sec: 454.76 - lr: 0.000105 - momentum: 0.000000 |
|
2023-10-10 22:07:54,215 epoch 5 - iter 28/146 - loss 0.33438474 - time (sec): 18.70 - samples/sec: 484.48 - lr: 0.000104 - momentum: 0.000000 |
|
2023-10-10 22:08:02,953 epoch 5 - iter 42/146 - loss 0.32225237 - time (sec): 27.44 - samples/sec: 484.25 - lr: 0.000102 - momentum: 0.000000 |
|
2023-10-10 22:08:11,715 epoch 5 - iter 56/146 - loss 0.29084459 - time (sec): 36.20 - samples/sec: 478.55 - lr: 0.000100 - momentum: 0.000000 |
|
2023-10-10 22:08:21,054 epoch 5 - iter 70/146 - loss 0.27706780 - time (sec): 45.54 - samples/sec: 481.60 - lr: 0.000099 - momentum: 0.000000 |
|
2023-10-10 22:08:30,720 epoch 5 - iter 84/146 - loss 0.26855614 - time (sec): 55.21 - samples/sec: 469.63 - lr: 0.000097 - momentum: 0.000000 |
|
2023-10-10 22:08:40,750 epoch 5 - iter 98/146 - loss 0.26372697 - time (sec): 65.24 - samples/sec: 463.02 - lr: 0.000095 - momentum: 0.000000 |
|
2023-10-10 22:08:51,290 epoch 5 - iter 112/146 - loss 0.26061632 - time (sec): 75.78 - samples/sec: 458.22 - lr: 0.000093 - momentum: 0.000000 |
|
2023-10-10 22:09:00,154 epoch 5 - iter 126/146 - loss 0.25838387 - time (sec): 84.64 - samples/sec: 456.78 - lr: 0.000092 - momentum: 0.000000 |
|
2023-10-10 22:09:08,987 epoch 5 - iter 140/146 - loss 0.25502563 - time (sec): 93.48 - samples/sec: 456.41 - lr: 0.000090 - momentum: 0.000000 |
|
2023-10-10 22:09:12,868 ---------------------------------------------------------------------------------------------------- |
|
2023-10-10 22:09:12,869 EPOCH 5 done: loss 0.2561 - lr: 0.000090 |
|
2023-10-10 22:09:18,945 DEV : loss 0.20666354894638062 - f1-score (micro avg) 0.4602 |
|
2023-10-10 22:09:18,955 saving best model |
|
2023-10-10 22:09:25,879 ---------------------------------------------------------------------------------------------------- |
|
2023-10-10 22:09:35,820 epoch 6 - iter 14/146 - loss 0.21070114 - time (sec): 9.94 - samples/sec: 504.06 - lr: 0.000088 - momentum: 0.000000 |
|
2023-10-10 22:09:44,414 epoch 6 - iter 28/146 - loss 0.22631495 - time (sec): 18.53 - samples/sec: 483.31 - lr: 0.000086 - momentum: 0.000000 |
|
2023-10-10 22:09:53,224 epoch 6 - iter 42/146 - loss 0.21485580 - time (sec): 27.34 - samples/sec: 480.19 - lr: 0.000084 - momentum: 0.000000 |
|
2023-10-10 22:10:02,417 epoch 6 - iter 56/146 - loss 0.20592699 - time (sec): 36.53 - samples/sec: 482.40 - lr: 0.000083 - momentum: 0.000000 |
|
2023-10-10 22:10:11,772 epoch 6 - iter 70/146 - loss 0.20663057 - time (sec): 45.89 - samples/sec: 485.72 - lr: 0.000081 - momentum: 0.000000 |
|
2023-10-10 22:10:20,029 epoch 6 - iter 84/146 - loss 0.20369501 - time (sec): 54.15 - samples/sec: 478.12 - lr: 0.000079 - momentum: 0.000000 |
|
2023-10-10 22:10:28,856 epoch 6 - iter 98/146 - loss 0.20747226 - time (sec): 62.97 - samples/sec: 474.76 - lr: 0.000077 - momentum: 0.000000 |
|
2023-10-10 22:10:37,595 epoch 6 - iter 112/146 - loss 0.20151474 - time (sec): 71.71 - samples/sec: 474.56 - lr: 0.000076 - momentum: 0.000000 |
|
2023-10-10 22:10:46,686 epoch 6 - iter 126/146 - loss 0.20676841 - time (sec): 80.80 - samples/sec: 478.66 - lr: 0.000074 - momentum: 0.000000 |
|
2023-10-10 22:10:54,676 epoch 6 - iter 140/146 - loss 0.20652055 - time (sec): 88.79 - samples/sec: 474.05 - lr: 0.000072 - momentum: 0.000000 |
|
2023-10-10 22:10:58,983 ---------------------------------------------------------------------------------------------------- |
|
2023-10-10 22:10:58,983 EPOCH 6 done: loss 0.2048 - lr: 0.000072 |
|
2023-10-10 22:11:05,038 DEV : loss 0.18372002243995667 - f1-score (micro avg) 0.5195 |
|
2023-10-10 22:11:05,048 saving best model |
|
2023-10-10 22:11:14,179 ---------------------------------------------------------------------------------------------------- |
|
2023-10-10 22:11:23,472 epoch 7 - iter 14/146 - loss 0.16630834 - time (sec): 9.29 - samples/sec: 515.70 - lr: 0.000070 - momentum: 0.000000 |
|
2023-10-10 22:11:31,585 epoch 7 - iter 28/146 - loss 0.15910112 - time (sec): 17.40 - samples/sec: 494.00 - lr: 0.000068 - momentum: 0.000000 |
|
2023-10-10 22:11:39,899 epoch 7 - iter 42/146 - loss 0.18159621 - time (sec): 25.72 - samples/sec: 496.78 - lr: 0.000067 - momentum: 0.000000 |
|
2023-10-10 22:11:48,970 epoch 7 - iter 56/146 - loss 0.16818074 - time (sec): 34.79 - samples/sec: 507.13 - lr: 0.000065 - momentum: 0.000000 |
|
2023-10-10 22:11:57,416 epoch 7 - iter 70/146 - loss 0.16285809 - time (sec): 43.23 - samples/sec: 506.75 - lr: 0.000063 - momentum: 0.000000 |
|
2023-10-10 22:12:05,581 epoch 7 - iter 84/146 - loss 0.16942539 - time (sec): 51.40 - samples/sec: 493.58 - lr: 0.000061 - momentum: 0.000000 |
|
2023-10-10 22:12:14,887 epoch 7 - iter 98/146 - loss 0.16837468 - time (sec): 60.70 - samples/sec: 487.16 - lr: 0.000060 - momentum: 0.000000 |
|
2023-10-10 22:12:24,198 epoch 7 - iter 112/146 - loss 0.16411960 - time (sec): 70.01 - samples/sec: 491.76 - lr: 0.000058 - momentum: 0.000000 |
|
2023-10-10 22:12:33,243 epoch 7 - iter 126/146 - loss 0.16260578 - time (sec): 79.06 - samples/sec: 490.77 - lr: 0.000056 - momentum: 0.000000 |
|
2023-10-10 22:12:42,356 epoch 7 - iter 140/146 - loss 0.16658286 - time (sec): 88.17 - samples/sec: 480.73 - lr: 0.000055 - momentum: 0.000000 |
|
2023-10-10 22:12:46,633 ---------------------------------------------------------------------------------------------------- |
|
2023-10-10 22:12:46,633 EPOCH 7 done: loss 0.1690 - lr: 0.000055 |
|
2023-10-10 22:12:52,591 DEV : loss 0.17367814481258392 - f1-score (micro avg) 0.604 |
|
2023-10-10 22:12:52,600 saving best model |
|
2023-10-10 22:13:00,680 ---------------------------------------------------------------------------------------------------- |
|
2023-10-10 22:13:10,905 epoch 8 - iter 14/146 - loss 0.14405019 - time (sec): 10.21 - samples/sec: 481.65 - lr: 0.000052 - momentum: 0.000000 |
|
2023-10-10 22:13:20,300 epoch 8 - iter 28/146 - loss 0.15973513 - time (sec): 19.61 - samples/sec: 494.79 - lr: 0.000051 - momentum: 0.000000 |
|
2023-10-10 22:13:29,217 epoch 8 - iter 42/146 - loss 0.16052807 - time (sec): 28.53 - samples/sec: 486.59 - lr: 0.000049 - momentum: 0.000000 |
|
2023-10-10 22:13:38,038 epoch 8 - iter 56/146 - loss 0.15210947 - time (sec): 37.35 - samples/sec: 471.97 - lr: 0.000047 - momentum: 0.000000 |
|
2023-10-10 22:13:47,341 epoch 8 - iter 70/146 - loss 0.15143169 - time (sec): 46.65 - samples/sec: 474.18 - lr: 0.000045 - momentum: 0.000000 |
|
2023-10-10 22:13:56,442 epoch 8 - iter 84/146 - loss 0.15707453 - time (sec): 55.75 - samples/sec: 466.59 - lr: 0.000044 - momentum: 0.000000 |
|
2023-10-10 22:14:05,085 epoch 8 - iter 98/146 - loss 0.15161172 - time (sec): 64.39 - samples/sec: 465.47 - lr: 0.000042 - momentum: 0.000000 |
|
2023-10-10 22:14:14,292 epoch 8 - iter 112/146 - loss 0.14901872 - time (sec): 73.60 - samples/sec: 469.22 - lr: 0.000040 - momentum: 0.000000 |
|
2023-10-10 22:14:23,650 epoch 8 - iter 126/146 - loss 0.14657533 - time (sec): 82.96 - samples/sec: 464.50 - lr: 0.000039 - momentum: 0.000000 |
|
2023-10-10 22:14:32,792 epoch 8 - iter 140/146 - loss 0.14114562 - time (sec): 92.10 - samples/sec: 465.03 - lr: 0.000037 - momentum: 0.000000 |
|
2023-10-10 22:14:36,412 ---------------------------------------------------------------------------------------------------- |
|
2023-10-10 22:14:36,412 EPOCH 8 done: loss 0.1447 - lr: 0.000037 |
|
2023-10-10 22:14:42,561 DEV : loss 0.16653960943222046 - f1-score (micro avg) 0.6522 |
|
2023-10-10 22:14:42,570 saving best model |
|
2023-10-10 22:14:52,366 ---------------------------------------------------------------------------------------------------- |
|
2023-10-10 22:15:01,070 epoch 9 - iter 14/146 - loss 0.10564763 - time (sec): 8.70 - samples/sec: 483.88 - lr: 0.000035 - momentum: 0.000000 |
|
2023-10-10 22:15:10,860 epoch 9 - iter 28/146 - loss 0.13546293 - time (sec): 18.49 - samples/sec: 494.59 - lr: 0.000033 - momentum: 0.000000 |
|
2023-10-10 22:15:19,875 epoch 9 - iter 42/146 - loss 0.13424335 - time (sec): 27.50 - samples/sec: 482.35 - lr: 0.000031 - momentum: 0.000000 |
|
2023-10-10 22:15:28,801 epoch 9 - iter 56/146 - loss 0.12517220 - time (sec): 36.43 - samples/sec: 481.07 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-10 22:15:37,873 epoch 9 - iter 70/146 - loss 0.12523915 - time (sec): 45.50 - samples/sec: 480.35 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-10 22:15:46,336 epoch 9 - iter 84/146 - loss 0.12571857 - time (sec): 53.97 - samples/sec: 479.53 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-10 22:15:55,269 epoch 9 - iter 98/146 - loss 0.13040790 - time (sec): 62.90 - samples/sec: 480.15 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-10 22:16:03,942 epoch 9 - iter 112/146 - loss 0.12849922 - time (sec): 71.57 - samples/sec: 479.88 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-10 22:16:13,165 epoch 9 - iter 126/146 - loss 0.12666524 - time (sec): 80.79 - samples/sec: 484.20 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-10 22:16:21,714 epoch 9 - iter 140/146 - loss 0.12857980 - time (sec): 89.34 - samples/sec: 479.99 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-10 22:16:25,263 ---------------------------------------------------------------------------------------------------- |
|
2023-10-10 22:16:25,264 EPOCH 9 done: loss 0.1282 - lr: 0.000019 |
|
2023-10-10 22:16:31,347 DEV : loss 0.1624951958656311 - f1-score (micro avg) 0.6638 |
|
2023-10-10 22:16:31,356 saving best model |
|
2023-10-10 22:16:39,276 ---------------------------------------------------------------------------------------------------- |
|
2023-10-10 22:16:48,522 epoch 10 - iter 14/146 - loss 0.11168896 - time (sec): 9.24 - samples/sec: 485.28 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-10 22:16:57,095 epoch 10 - iter 28/146 - loss 0.12877886 - time (sec): 17.81 - samples/sec: 457.50 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-10 22:17:05,914 epoch 10 - iter 42/146 - loss 0.11898550 - time (sec): 26.63 - samples/sec: 463.39 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-10 22:17:15,949 epoch 10 - iter 56/146 - loss 0.10834056 - time (sec): 36.67 - samples/sec: 479.49 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-10 22:17:25,541 epoch 10 - iter 70/146 - loss 0.10628809 - time (sec): 46.26 - samples/sec: 483.48 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-10 22:17:34,310 epoch 10 - iter 84/146 - loss 0.10553800 - time (sec): 55.03 - samples/sec: 477.65 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-10 22:17:43,365 epoch 10 - iter 98/146 - loss 0.10631791 - time (sec): 64.08 - samples/sec: 474.69 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-10 22:17:52,344 epoch 10 - iter 112/146 - loss 0.10970536 - time (sec): 73.06 - samples/sec: 475.00 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-10 22:18:01,109 epoch 10 - iter 126/146 - loss 0.11345018 - time (sec): 81.83 - samples/sec: 473.11 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-10 22:18:10,282 epoch 10 - iter 140/146 - loss 0.11718872 - time (sec): 91.00 - samples/sec: 472.88 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-10 22:18:13,713 ---------------------------------------------------------------------------------------------------- |
|
2023-10-10 22:18:13,714 EPOCH 10 done: loss 0.1203 - lr: 0.000002 |
|
2023-10-10 22:18:19,724 DEV : loss 0.16049127280712128 - f1-score (micro avg) 0.6566 |
|
2023-10-10 22:18:20,641 ---------------------------------------------------------------------------------------------------- |
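The checkpoint kept as best-model.pt is the one with the highest dev micro F1. In the DEV lines above that is epoch 9 (0.6638); epoch 10 scored 0.6566 and did not trigger a save. A small sketch for recovering this from a log in this format follows. The regular expression and the `best_dev_epoch` helper are assumptions made for illustration and are not part of Flair.

```python
import re

# Matches lines such as "... DEV : loss 0.1624 - f1-score (micro avg) 0.6638".
DEV_LINE = re.compile(r"DEV : loss (\S+) - f1-score \(micro avg\)\s+(\S+)")

def best_dev_epoch(log_lines):
    """Return (epoch, f1) for the highest dev micro F1.

    Epochs are numbered by the order in which DEV lines appear (hypothetical helper).
    """
    scores = []
    for line in log_lines:
        match = DEV_LINE.search(line)
        if match:
            scores.append(float(match.group(2)))
    best_index = max(range(len(scores)), key=scores.__getitem__)
    return best_index + 1, scores[best_index]

# Dev micro F1 per epoch, copied from the DEV lines of this log.
dev_f1 = [0.0, 0.0, 0.1609, 0.3563, 0.4602, 0.5195, 0.604, 0.6522, 0.6638, 0.6566]
sample_lines = [f"DEV : loss 0.0 - f1-score (micro avg) {score}" for score in dev_f1]
best_epoch, best_f1 = best_dev_epoch(sample_lines)
```

The loss value in `sample_lines` is a placeholder; only the F1 column is used for the selection.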
|
2023-10-10 22:18:20,643 Loading model from best epoch ... |
|
2023-10-10 22:18:24,871 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd |
|
2023-10-10 22:18:37,888 |
|
Results: |
|
- F-score (micro) 0.6941 |
|
- F-score (macro) 0.5894 |
|
- Accuracy 0.5673 |
|
|
|
By class: |
|
              precision    recall  f1-score   support |
|

|
         PER     0.7945    0.7443    0.7685       348 |
|
         LOC     0.5994    0.7854    0.6799       261 |
|
         ORG     0.3191    0.2885    0.3030        52 |
|
   HumanProd     0.9091    0.4545    0.6061        22 |
|

|
   micro avg     0.6736    0.7160    0.6941       683 |
|
   macro avg     0.6555    0.5682    0.5894       683 |
|
weighted avg     0.6874    0.7160    0.6940       683 |
|
|
|
2023-10-10 22:18:37,888 ---------------------------------------------------------------------------------------------------- |
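The summary rows of the per-class report can be cross-checked from the per-class numbers. The sketch below is only an illustration and not part of the training run: it recomputes the micro F1 from the reported micro precision and recall, the macro F1 as the unweighted mean of the class F1 scores, and the weighted F1 as the support-weighted mean. The variable names are hypothetical.

```python
# Per-class (precision, recall, f1, support), copied from the report above.
per_class = {
    "PER": (0.7945, 0.7443, 0.7685, 348),
    "LOC": (0.5994, 0.7854, 0.6799, 261),
    "ORG": (0.3191, 0.2885, 0.3030, 52),
    "HumanProd": (0.9091, 0.4545, 0.6061, 22),
}

def f1(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# Micro F1 from the reported micro-averaged precision and recall.
micro_f1 = f1(0.6736, 0.7160)

# Macro F1: every class counts equally, regardless of its support.
macro_f1 = sum(row[2] for row in per_class.values()) / len(per_class)

# Weighted F1: each class F1 is weighted by its number of gold entities.
total_support = sum(row[3] for row in per_class.values())
weighted_f1 = sum(row[2] * row[3] for row in per_class.values()) / total_support
```

Up to rounding of the four-decimal inputs, these reproduce the reported 0.6941 (micro), 0.5894 (macro) and 0.6940 (weighted). The gap between micro and macro F1 comes mainly from the low ORG score, which counts as one quarter of the macro average although ORG has only 52 of the 683 entities.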
|
|