2023-10-14 00:15:26,580 ----------------------------------------------------------------------------------------------------
2023-10-14 00:15:26,582 Model: "SequenceTagger(
(embeddings): ByT5Embeddings(
(model): T5EncoderModel(
(shared): Embedding(384, 1472)
(encoder): T5Stack(
(embed_tokens): Embedding(384, 1472)
(block): ModuleList(
(0): T5Block(
(layer): ModuleList(
(0): T5LayerSelfAttention(
(SelfAttention): T5Attention(
(q): Linear(in_features=1472, out_features=384, bias=False)
(k): Linear(in_features=1472, out_features=384, bias=False)
(v): Linear(in_features=1472, out_features=384, bias=False)
(o): Linear(in_features=384, out_features=1472, bias=False)
(relative_attention_bias): Embedding(32, 6)
)
(layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(1): T5LayerFF(
(DenseReluDense): T5DenseGatedActDense(
(wi_0): Linear(in_features=1472, out_features=3584, bias=False)
(wi_1): Linear(in_features=1472, out_features=3584, bias=False)
(wo): Linear(in_features=3584, out_features=1472, bias=False)
(dropout): Dropout(p=0.1, inplace=False)
(act): NewGELUActivation()
)
(layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
)
(1-11): 11 x T5Block(
(layer): ModuleList(
(0): T5LayerSelfAttention(
(SelfAttention): T5Attention(
(q): Linear(in_features=1472, out_features=384, bias=False)
(k): Linear(in_features=1472, out_features=384, bias=False)
(v): Linear(in_features=1472, out_features=384, bias=False)
(o): Linear(in_features=384, out_features=1472, bias=False)
)
(layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(1): T5LayerFF(
(DenseReluDense): T5DenseGatedActDense(
(wi_0): Linear(in_features=1472, out_features=3584, bias=False)
(wi_1): Linear(in_features=1472, out_features=3584, bias=False)
(wo): Linear(in_features=3584, out_features=1472, bias=False)
(dropout): Dropout(p=0.1, inplace=False)
(act): NewGELUActivation()
)
(layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
)
)
(final_layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
)
(locked_dropout): LockedDropout(p=0.5)
(linear): Linear(in_features=1472, out_features=13, bias=True)
(loss_function): CrossEntropyLoss()
)"
2023-10-14 00:15:26,582 ----------------------------------------------------------------------------------------------------
2023-10-14 00:15:26,582 MultiCorpus: 14465 train + 1392 dev + 2432 test sentences
- NER_HIPE_2022 Corpus: 14465 train + 1392 dev + 2432 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/letemps/fr/with_doc_seperator
2023-10-14 00:15:26,583 ----------------------------------------------------------------------------------------------------
2023-10-14 00:15:26,583 Train: 14465 sentences
2023-10-14 00:15:26,583 (train_with_dev=False, train_with_test=False)
2023-10-14 00:15:26,583 ----------------------------------------------------------------------------------------------------
2023-10-14 00:15:26,583 Training Params:
2023-10-14 00:15:26,583 - learning_rate: "0.00016"
2023-10-14 00:15:26,583 - mini_batch_size: "8"
2023-10-14 00:15:26,583 - max_epochs: "10"
2023-10-14 00:15:26,583 - shuffle: "True"
2023-10-14 00:15:26,583 ----------------------------------------------------------------------------------------------------
2023-10-14 00:15:26,583 Plugins:
2023-10-14 00:15:26,583 - TensorboardLogger
2023-10-14 00:15:26,583 - LinearScheduler | warmup_fraction: '0.1'
2023-10-14 00:15:26,584 ----------------------------------------------------------------------------------------------------
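Note on the learning-rate column: the settings logged above (learning_rate 0.00016, LinearScheduler with warmup_fraction 0.1) are consistent with a linear warmup over the first 10% of all steps followed by a linear decay to zero. The sketch below is only an illustration that reproduces the `lr:` values printed in the iteration lines of this log; it is not Flair's own scheduler code. The function name `linear_warmup_lr` is hypothetical, and the assumption that the step count equals ceil(14465 / 8) = 1809 iterations per epoch times 10 epochs is inferred from the log, not stated by it.

```python
import math

# Values taken from the log: 14465 training sentences, mini_batch_size 8,
# max_epochs 10, learning_rate 0.00016, warmup_fraction 0.1.
ITERS_PER_EPOCH = math.ceil(14465 / 8)   # 1809, matches "iter .../1809"
TOTAL_STEPS = ITERS_PER_EPOCH * 10       # assumed total number of optimizer steps


def linear_warmup_lr(step, peak_lr=0.00016, total_steps=TOTAL_STEPS,
                     warmup_fraction=0.1):
    """Hypothetical helper: linear warmup to peak_lr, then linear decay to 0."""
    warmup_steps = round(total_steps * warmup_fraction)
    if step < warmup_steps:
        # Warmup phase: lr grows proportionally to the step index.
        return peak_lr * step / warmup_steps
    # Decay phase: lr shrinks linearly and reaches 0 at total_steps.
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / (total_steps - warmup_steps)


# Global step for "epoch 2 - iter 180" is one full epoch plus 180 iterations.
print(f"{linear_warmup_lr(180):.6f}")                     # epoch 1, iter 180
print(f"{linear_warmup_lr(ITERS_PER_EPOCH + 180):.6f}")   # epoch 2, iter 180
```

Under these assumptions the peak is reached at the end of epoch 1, which fits the logged values rising during epoch 1 and falling from epoch 2 onward.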
2023-10-14 00:15:26,584 Final evaluation on model from best epoch (best-model.pt)
2023-10-14 00:15:26,584 - metric: "('micro avg', 'f1-score')"
2023-10-14 00:15:26,584 ----------------------------------------------------------------------------------------------------
2023-10-14 00:15:26,584 Computation:
2023-10-14 00:15:26,584 - compute on device: cuda:0
2023-10-14 00:15:26,584 - embedding storage: none
2023-10-14 00:15:26,584 ----------------------------------------------------------------------------------------------------
2023-10-14 00:15:26,584 Model training base path: "hmbench-letemps/fr-hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax-bs8-wsFalse-e10-lr0.00016-poolingfirst-layers-1-crfFalse-3"
2023-10-14 00:15:26,584 ----------------------------------------------------------------------------------------------------
2023-10-14 00:15:26,584 ----------------------------------------------------------------------------------------------------
2023-10-14 00:15:26,584 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-14 00:16:57,553 epoch 1 - iter 180/1809 - loss 2.51944389 - time (sec): 90.97 - samples/sec: 413.65 - lr: 0.000016 - momentum: 0.000000
2023-10-14 00:18:29,092 epoch 1 - iter 360/1809 - loss 2.26637505 - time (sec): 182.51 - samples/sec: 406.16 - lr: 0.000032 - momentum: 0.000000
2023-10-14 00:20:01,815 epoch 1 - iter 540/1809 - loss 1.90816370 - time (sec): 275.23 - samples/sec: 409.55 - lr: 0.000048 - momentum: 0.000000
2023-10-14 00:21:31,625 epoch 1 - iter 720/1809 - loss 1.56194036 - time (sec): 365.04 - samples/sec: 413.28 - lr: 0.000064 - momentum: 0.000000
2023-10-14 00:23:01,817 epoch 1 - iter 900/1809 - loss 1.30444465 - time (sec): 455.23 - samples/sec: 414.24 - lr: 0.000080 - momentum: 0.000000
2023-10-14 00:24:33,627 epoch 1 - iter 1080/1809 - loss 1.11961360 - time (sec): 547.04 - samples/sec: 413.43 - lr: 0.000095 - momentum: 0.000000
2023-10-14 00:26:03,228 epoch 1 - iter 1260/1809 - loss 0.98696147 - time (sec): 636.64 - samples/sec: 412.31 - lr: 0.000111 - momentum: 0.000000
2023-10-14 00:27:35,596 epoch 1 - iter 1440/1809 - loss 0.88259280 - time (sec): 729.01 - samples/sec: 411.99 - lr: 0.000127 - momentum: 0.000000
2023-10-14 00:29:08,136 epoch 1 - iter 1620/1809 - loss 0.79457894 - time (sec): 821.55 - samples/sec: 413.43 - lr: 0.000143 - momentum: 0.000000
2023-10-14 00:30:36,487 epoch 1 - iter 1800/1809 - loss 0.72720679 - time (sec): 909.90 - samples/sec: 415.88 - lr: 0.000159 - momentum: 0.000000
2023-10-14 00:30:40,375 ----------------------------------------------------------------------------------------------------
2023-10-14 00:30:40,375 EPOCH 1 done: loss 0.7248 - lr: 0.000159
2023-10-14 00:31:18,366 DEV : loss 0.13587747514247894 - f1-score (micro avg) 0.5512
2023-10-14 00:31:18,430 saving best model
2023-10-14 00:31:19,337 ----------------------------------------------------------------------------------------------------
2023-10-14 00:32:49,760 epoch 2 - iter 180/1809 - loss 0.10182046 - time (sec): 90.42 - samples/sec: 417.47 - lr: 0.000158 - momentum: 0.000000
2023-10-14 00:34:19,933 epoch 2 - iter 360/1809 - loss 0.10191876 - time (sec): 180.59 - samples/sec: 413.42 - lr: 0.000156 - momentum: 0.000000
2023-10-14 00:35:49,015 epoch 2 - iter 540/1809 - loss 0.09941428 - time (sec): 269.68 - samples/sec: 429.78 - lr: 0.000155 - momentum: 0.000000
2023-10-14 00:37:16,788 epoch 2 - iter 720/1809 - loss 0.09701065 - time (sec): 357.45 - samples/sec: 432.34 - lr: 0.000153 - momentum: 0.000000
2023-10-14 00:38:45,501 epoch 2 - iter 900/1809 - loss 0.09626668 - time (sec): 446.16 - samples/sec: 431.73 - lr: 0.000151 - momentum: 0.000000
2023-10-14 00:40:13,863 epoch 2 - iter 1080/1809 - loss 0.09449862 - time (sec): 534.52 - samples/sec: 428.08 - lr: 0.000149 - momentum: 0.000000
2023-10-14 00:41:41,920 epoch 2 - iter 1260/1809 - loss 0.09221887 - time (sec): 622.58 - samples/sec: 427.87 - lr: 0.000148 - momentum: 0.000000
2023-10-14 00:43:10,709 epoch 2 - iter 1440/1809 - loss 0.09076303 - time (sec): 711.37 - samples/sec: 426.55 - lr: 0.000146 - momentum: 0.000000
2023-10-14 00:44:41,103 epoch 2 - iter 1620/1809 - loss 0.08931886 - time (sec): 801.76 - samples/sec: 425.61 - lr: 0.000144 - momentum: 0.000000
2023-10-14 00:46:10,438 epoch 2 - iter 1800/1809 - loss 0.08787461 - time (sec): 891.10 - samples/sec: 424.53 - lr: 0.000142 - momentum: 0.000000
2023-10-14 00:46:14,383 ----------------------------------------------------------------------------------------------------
2023-10-14 00:46:14,383 EPOCH 2 done: loss 0.0879 - lr: 0.000142
2023-10-14 00:46:52,480 DEV : loss 0.10200479626655579 - f1-score (micro avg) 0.6091
2023-10-14 00:46:52,536 saving best model
2023-10-14 00:46:55,099 ----------------------------------------------------------------------------------------------------
2023-10-14 00:48:24,885 epoch 3 - iter 180/1809 - loss 0.06314980 - time (sec): 89.78 - samples/sec: 416.42 - lr: 0.000140 - momentum: 0.000000
2023-10-14 00:49:56,919 epoch 3 - iter 360/1809 - loss 0.05882268 - time (sec): 181.82 - samples/sec: 422.98 - lr: 0.000139 - momentum: 0.000000
2023-10-14 00:51:27,745 epoch 3 - iter 540/1809 - loss 0.05798093 - time (sec): 272.64 - samples/sec: 420.56 - lr: 0.000137 - momentum: 0.000000
2023-10-14 00:52:57,778 epoch 3 - iter 720/1809 - loss 0.05728226 - time (sec): 362.67 - samples/sec: 419.92 - lr: 0.000135 - momentum: 0.000000
2023-10-14 00:54:27,582 epoch 3 - iter 900/1809 - loss 0.05773776 - time (sec): 452.48 - samples/sec: 423.51 - lr: 0.000133 - momentum: 0.000000
2023-10-14 00:55:56,659 epoch 3 - iter 1080/1809 - loss 0.05718530 - time (sec): 541.56 - samples/sec: 424.38 - lr: 0.000132 - momentum: 0.000000
2023-10-14 00:57:26,791 epoch 3 - iter 1260/1809 - loss 0.05787219 - time (sec): 631.69 - samples/sec: 420.89 - lr: 0.000130 - momentum: 0.000000
2023-10-14 00:58:55,840 epoch 3 - iter 1440/1809 - loss 0.05720897 - time (sec): 720.74 - samples/sec: 420.23 - lr: 0.000128 - momentum: 0.000000
2023-10-14 01:00:24,566 epoch 3 - iter 1620/1809 - loss 0.05641677 - time (sec): 809.46 - samples/sec: 419.21 - lr: 0.000126 - momentum: 0.000000
2023-10-14 01:01:57,412 epoch 3 - iter 1800/1809 - loss 0.05640597 - time (sec): 902.31 - samples/sec: 418.82 - lr: 0.000125 - momentum: 0.000000
2023-10-14 01:02:01,841 ----------------------------------------------------------------------------------------------------
2023-10-14 01:02:01,842 EPOCH 3 done: loss 0.0563 - lr: 0.000125
2023-10-14 01:02:41,767 DEV : loss 0.13950783014297485 - f1-score (micro avg) 0.6198
2023-10-14 01:02:41,831 saving best model
2023-10-14 01:02:44,406 ----------------------------------------------------------------------------------------------------
2023-10-14 01:04:14,083 epoch 4 - iter 180/1809 - loss 0.04170364 - time (sec): 89.67 - samples/sec: 407.77 - lr: 0.000123 - momentum: 0.000000
2023-10-14 01:05:46,581 epoch 4 - iter 360/1809 - loss 0.03858075 - time (sec): 182.17 - samples/sec: 411.86 - lr: 0.000121 - momentum: 0.000000
2023-10-14 01:07:17,093 epoch 4 - iter 540/1809 - loss 0.03981754 - time (sec): 272.68 - samples/sec: 412.97 - lr: 0.000119 - momentum: 0.000000
2023-10-14 01:08:47,525 epoch 4 - iter 720/1809 - loss 0.03847050 - time (sec): 363.11 - samples/sec: 411.25 - lr: 0.000117 - momentum: 0.000000
2023-10-14 01:10:18,340 epoch 4 - iter 900/1809 - loss 0.03892181 - time (sec): 453.93 - samples/sec: 411.93 - lr: 0.000116 - momentum: 0.000000
2023-10-14 01:11:50,274 epoch 4 - iter 1080/1809 - loss 0.04023787 - time (sec): 545.86 - samples/sec: 415.14 - lr: 0.000114 - momentum: 0.000000
2023-10-14 01:13:22,824 epoch 4 - iter 1260/1809 - loss 0.04131278 - time (sec): 638.41 - samples/sec: 415.03 - lr: 0.000112 - momentum: 0.000000
2023-10-14 01:14:51,158 epoch 4 - iter 1440/1809 - loss 0.04134067 - time (sec): 726.75 - samples/sec: 416.20 - lr: 0.000110 - momentum: 0.000000
2023-10-14 01:16:20,419 epoch 4 - iter 1620/1809 - loss 0.04065918 - time (sec): 816.01 - samples/sec: 418.02 - lr: 0.000109 - momentum: 0.000000
2023-10-14 01:17:50,525 epoch 4 - iter 1800/1809 - loss 0.04036183 - time (sec): 906.11 - samples/sec: 417.55 - lr: 0.000107 - momentum: 0.000000
2023-10-14 01:17:54,624 ----------------------------------------------------------------------------------------------------
2023-10-14 01:17:54,624 EPOCH 4 done: loss 0.0403 - lr: 0.000107
2023-10-14 01:18:35,366 DEV : loss 0.19245745241641998 - f1-score (micro avg) 0.6321
2023-10-14 01:18:35,431 saving best model
2023-10-14 01:18:38,017 ----------------------------------------------------------------------------------------------------
2023-10-14 01:20:09,743 epoch 5 - iter 180/1809 - loss 0.02576443 - time (sec): 91.72 - samples/sec: 418.78 - lr: 0.000105 - momentum: 0.000000
2023-10-14 01:21:44,400 epoch 5 - iter 360/1809 - loss 0.02641732 - time (sec): 186.38 - samples/sec: 408.40 - lr: 0.000103 - momentum: 0.000000
2023-10-14 01:23:15,834 epoch 5 - iter 540/1809 - loss 0.02651451 - time (sec): 277.81 - samples/sec: 407.70 - lr: 0.000101 - momentum: 0.000000
2023-10-14 01:24:43,605 epoch 5 - iter 720/1809 - loss 0.02751988 - time (sec): 365.58 - samples/sec: 409.65 - lr: 0.000100 - momentum: 0.000000
2023-10-14 01:26:12,328 epoch 5 - iter 900/1809 - loss 0.02767393 - time (sec): 454.31 - samples/sec: 414.53 - lr: 0.000098 - momentum: 0.000000
2023-10-14 01:27:41,085 epoch 5 - iter 1080/1809 - loss 0.02796036 - time (sec): 543.06 - samples/sec: 416.11 - lr: 0.000096 - momentum: 0.000000
2023-10-14 01:29:10,393 epoch 5 - iter 1260/1809 - loss 0.02785250 - time (sec): 632.37 - samples/sec: 414.60 - lr: 0.000094 - momentum: 0.000000
2023-10-14 01:30:45,505 epoch 5 - iter 1440/1809 - loss 0.02843191 - time (sec): 727.48 - samples/sec: 412.25 - lr: 0.000093 - momentum: 0.000000
2023-10-14 01:32:18,477 epoch 5 - iter 1620/1809 - loss 0.02924215 - time (sec): 820.46 - samples/sec: 412.99 - lr: 0.000091 - momentum: 0.000000
2023-10-14 01:33:51,099 epoch 5 - iter 1800/1809 - loss 0.02925813 - time (sec): 913.08 - samples/sec: 414.13 - lr: 0.000089 - momentum: 0.000000
2023-10-14 01:33:55,154 ----------------------------------------------------------------------------------------------------
2023-10-14 01:33:55,155 EPOCH 5 done: loss 0.0293 - lr: 0.000089
2023-10-14 01:34:33,142 DEV : loss 0.2196071296930313 - f1-score (micro avg) 0.6454
2023-10-14 01:34:33,199 saving best model
2023-10-14 01:34:35,762 ----------------------------------------------------------------------------------------------------
2023-10-14 01:36:06,097 epoch 6 - iter 180/1809 - loss 0.01696868 - time (sec): 90.33 - samples/sec: 434.02 - lr: 0.000087 - momentum: 0.000000
2023-10-14 01:37:36,160 epoch 6 - iter 360/1809 - loss 0.01685237 - time (sec): 180.39 - samples/sec: 426.33 - lr: 0.000085 - momentum: 0.000000
2023-10-14 01:39:05,475 epoch 6 - iter 540/1809 - loss 0.02067304 - time (sec): 269.71 - samples/sec: 422.42 - lr: 0.000084 - momentum: 0.000000
2023-10-14 01:40:35,631 epoch 6 - iter 720/1809 - loss 0.02166279 - time (sec): 359.87 - samples/sec: 418.94 - lr: 0.000082 - momentum: 0.000000
2023-10-14 01:42:08,370 epoch 6 - iter 900/1809 - loss 0.02213544 - time (sec): 452.60 - samples/sec: 412.98 - lr: 0.000080 - momentum: 0.000000
2023-10-14 01:43:40,493 epoch 6 - iter 1080/1809 - loss 0.02224410 - time (sec): 544.73 - samples/sec: 412.51 - lr: 0.000078 - momentum: 0.000000
2023-10-14 01:45:13,554 epoch 6 - iter 1260/1809 - loss 0.02163785 - time (sec): 637.79 - samples/sec: 414.18 - lr: 0.000077 - momentum: 0.000000
2023-10-14 01:46:46,037 epoch 6 - iter 1440/1809 - loss 0.02152121 - time (sec): 730.27 - samples/sec: 415.13 - lr: 0.000075 - momentum: 0.000000
2023-10-14 01:48:18,422 epoch 6 - iter 1620/1809 - loss 0.02245358 - time (sec): 822.66 - samples/sec: 412.61 - lr: 0.000073 - momentum: 0.000000
2023-10-14 01:49:50,889 epoch 6 - iter 1800/1809 - loss 0.02230601 - time (sec): 915.12 - samples/sec: 413.40 - lr: 0.000071 - momentum: 0.000000
2023-10-14 01:49:54,955 ----------------------------------------------------------------------------------------------------
2023-10-14 01:49:54,956 EPOCH 6 done: loss 0.0222 - lr: 0.000071
2023-10-14 01:50:36,500 DEV : loss 0.25768402218818665 - f1-score (micro avg) 0.6629
2023-10-14 01:50:36,559 saving best model
2023-10-14 01:50:39,137 ----------------------------------------------------------------------------------------------------
2023-10-14 01:52:12,077 epoch 7 - iter 180/1809 - loss 0.00921054 - time (sec): 92.94 - samples/sec: 411.47 - lr: 0.000069 - momentum: 0.000000
2023-10-14 01:53:50,733 epoch 7 - iter 360/1809 - loss 0.01168701 - time (sec): 191.59 - samples/sec: 395.89 - lr: 0.000068 - momentum: 0.000000
2023-10-14 01:55:21,611 epoch 7 - iter 540/1809 - loss 0.01305153 - time (sec): 282.47 - samples/sec: 400.63 - lr: 0.000066 - momentum: 0.000000
2023-10-14 01:56:55,925 epoch 7 - iter 720/1809 - loss 0.01260349 - time (sec): 376.78 - samples/sec: 404.80 - lr: 0.000064 - momentum: 0.000000
2023-10-14 01:58:26,714 epoch 7 - iter 900/1809 - loss 0.01297523 - time (sec): 467.57 - samples/sec: 406.34 - lr: 0.000062 - momentum: 0.000000
2023-10-14 02:00:02,490 epoch 7 - iter 1080/1809 - loss 0.01344991 - time (sec): 563.35 - samples/sec: 404.29 - lr: 0.000061 - momentum: 0.000000
2023-10-14 02:01:34,331 epoch 7 - iter 1260/1809 - loss 0.01433938 - time (sec): 655.19 - samples/sec: 406.81 - lr: 0.000059 - momentum: 0.000000
2023-10-14 02:03:06,795 epoch 7 - iter 1440/1809 - loss 0.01477713 - time (sec): 747.65 - samples/sec: 406.47 - lr: 0.000057 - momentum: 0.000000
2023-10-14 02:04:40,603 epoch 7 - iter 1620/1809 - loss 0.01477570 - time (sec): 841.46 - samples/sec: 406.11 - lr: 0.000055 - momentum: 0.000000
2023-10-14 02:06:15,178 epoch 7 - iter 1800/1809 - loss 0.01511790 - time (sec): 936.04 - samples/sec: 404.12 - lr: 0.000053 - momentum: 0.000000
2023-10-14 02:06:19,433 ----------------------------------------------------------------------------------------------------
2023-10-14 02:06:19,433 EPOCH 7 done: loss 0.0151 - lr: 0.000053
2023-10-14 02:06:59,410 DEV : loss 0.2951534390449524 - f1-score (micro avg) 0.6663
2023-10-14 02:06:59,467 saving best model
2023-10-14 02:07:02,034 ----------------------------------------------------------------------------------------------------
2023-10-14 02:08:31,365 epoch 8 - iter 180/1809 - loss 0.00690787 - time (sec): 89.33 - samples/sec: 414.43 - lr: 0.000052 - momentum: 0.000000
2023-10-14 02:10:01,793 epoch 8 - iter 360/1809 - loss 0.01015270 - time (sec): 179.75 - samples/sec: 417.93 - lr: 0.000050 - momentum: 0.000000
2023-10-14 02:11:31,245 epoch 8 - iter 540/1809 - loss 0.01075555 - time (sec): 269.21 - samples/sec: 425.42 - lr: 0.000048 - momentum: 0.000000
2023-10-14 02:12:59,192 epoch 8 - iter 720/1809 - loss 0.01026295 - time (sec): 357.15 - samples/sec: 426.20 - lr: 0.000046 - momentum: 0.000000
2023-10-14 02:14:29,940 epoch 8 - iter 900/1809 - loss 0.00994762 - time (sec): 447.90 - samples/sec: 425.93 - lr: 0.000044 - momentum: 0.000000
2023-10-14 02:16:04,150 epoch 8 - iter 1080/1809 - loss 0.00979174 - time (sec): 542.11 - samples/sec: 420.45 - lr: 0.000043 - momentum: 0.000000
2023-10-14 02:17:34,169 epoch 8 - iter 1260/1809 - loss 0.01043605 - time (sec): 632.13 - samples/sec: 420.83 - lr: 0.000041 - momentum: 0.000000
2023-10-14 02:19:02,279 epoch 8 - iter 1440/1809 - loss 0.01048305 - time (sec): 720.24 - samples/sec: 420.80 - lr: 0.000039 - momentum: 0.000000
2023-10-14 02:20:31,156 epoch 8 - iter 1620/1809 - loss 0.01031134 - time (sec): 809.12 - samples/sec: 421.20 - lr: 0.000037 - momentum: 0.000000
2023-10-14 02:21:59,294 epoch 8 - iter 1800/1809 - loss 0.01018123 - time (sec): 897.26 - samples/sec: 421.63 - lr: 0.000036 - momentum: 0.000000
2023-10-14 02:22:03,202 ----------------------------------------------------------------------------------------------------
2023-10-14 02:22:03,202 EPOCH 8 done: loss 0.0102 - lr: 0.000036
2023-10-14 02:22:42,340 DEV : loss 0.31866809725761414 - f1-score (micro avg) 0.6638
2023-10-14 02:22:42,399 ----------------------------------------------------------------------------------------------------
2023-10-14 02:24:15,293 epoch 9 - iter 180/1809 - loss 0.00817136 - time (sec): 92.89 - samples/sec: 415.41 - lr: 0.000034 - momentum: 0.000000
2023-10-14 02:25:48,900 epoch 9 - iter 360/1809 - loss 0.00980844 - time (sec): 186.50 - samples/sec: 417.93 - lr: 0.000032 - momentum: 0.000000
2023-10-14 02:27:21,286 epoch 9 - iter 540/1809 - loss 0.00968269 - time (sec): 278.88 - samples/sec: 413.52 - lr: 0.000030 - momentum: 0.000000
2023-10-14 02:28:53,866 epoch 9 - iter 720/1809 - loss 0.00851234 - time (sec): 371.46 - samples/sec: 413.77 - lr: 0.000028 - momentum: 0.000000
2023-10-14 02:30:25,601 epoch 9 - iter 900/1809 - loss 0.00876138 - time (sec): 463.20 - samples/sec: 412.00 - lr: 0.000027 - momentum: 0.000000
2023-10-14 02:31:58,342 epoch 9 - iter 1080/1809 - loss 0.00835601 - time (sec): 555.94 - samples/sec: 411.87 - lr: 0.000025 - momentum: 0.000000
2023-10-14 02:33:29,812 epoch 9 - iter 1260/1809 - loss 0.00813399 - time (sec): 647.41 - samples/sec: 411.93 - lr: 0.000023 - momentum: 0.000000
2023-10-14 02:35:00,772 epoch 9 - iter 1440/1809 - loss 0.00791238 - time (sec): 738.37 - samples/sec: 411.24 - lr: 0.000021 - momentum: 0.000000
2023-10-14 02:36:32,257 epoch 9 - iter 1620/1809 - loss 0.00783074 - time (sec): 829.86 - samples/sec: 409.66 - lr: 0.000020 - momentum: 0.000000
2023-10-14 02:38:03,837 epoch 9 - iter 1800/1809 - loss 0.00756760 - time (sec): 921.44 - samples/sec: 410.66 - lr: 0.000018 - momentum: 0.000000
2023-10-14 02:38:07,713 ----------------------------------------------------------------------------------------------------
2023-10-14 02:38:07,713 EPOCH 9 done: loss 0.0076 - lr: 0.000018
2023-10-14 02:38:45,463 DEV : loss 0.3397609293460846 - f1-score (micro avg) 0.6699
2023-10-14 02:38:45,524 saving best model
2023-10-14 02:38:48,090 ----------------------------------------------------------------------------------------------------
2023-10-14 02:40:17,848 epoch 10 - iter 180/1809 - loss 0.00675340 - time (sec): 89.75 - samples/sec: 418.97 - lr: 0.000016 - momentum: 0.000000
2023-10-14 02:41:47,430 epoch 10 - iter 360/1809 - loss 0.00699592 - time (sec): 179.34 - samples/sec: 410.52 - lr: 0.000014 - momentum: 0.000000
2023-10-14 02:43:19,155 epoch 10 - iter 540/1809 - loss 0.00674174 - time (sec): 271.06 - samples/sec: 414.12 - lr: 0.000012 - momentum: 0.000000
2023-10-14 02:44:53,881 epoch 10 - iter 720/1809 - loss 0.00580321 - time (sec): 365.79 - samples/sec: 407.45 - lr: 0.000011 - momentum: 0.000000
2023-10-14 02:46:33,110 epoch 10 - iter 900/1809 - loss 0.00563464 - time (sec): 465.02 - samples/sec: 403.22 - lr: 0.000009 - momentum: 0.000000
2023-10-14 02:48:05,921 epoch 10 - iter 1080/1809 - loss 0.00536878 - time (sec): 557.83 - samples/sec: 403.01 - lr: 0.000007 - momentum: 0.000000
2023-10-14 02:49:38,498 epoch 10 - iter 1260/1809 - loss 0.00519036 - time (sec): 650.40 - samples/sec: 406.07 - lr: 0.000005 - momentum: 0.000000
2023-10-14 02:51:10,923 epoch 10 - iter 1440/1809 - loss 0.00506535 - time (sec): 742.83 - samples/sec: 405.69 - lr: 0.000004 - momentum: 0.000000
2023-10-14 02:52:43,275 epoch 10 - iter 1620/1809 - loss 0.00520096 - time (sec): 835.18 - samples/sec: 407.55 - lr: 0.000002 - momentum: 0.000000
2023-10-14 02:54:18,302 epoch 10 - iter 1800/1809 - loss 0.00505952 - time (sec): 930.21 - samples/sec: 406.29 - lr: 0.000000 - momentum: 0.000000
2023-10-14 02:54:22,857 ----------------------------------------------------------------------------------------------------
2023-10-14 02:54:22,857 EPOCH 10 done: loss 0.0050 - lr: 0.000000
2023-10-14 02:55:02,618 DEV : loss 0.34813550114631653 - f1-score (micro avg) 0.6654
2023-10-14 02:55:03,588 ----------------------------------------------------------------------------------------------------
2023-10-14 02:55:03,590 Loading model from best epoch ...
2023-10-14 02:55:07,452 SequenceTagger predicts: Dictionary with 13 tags: O, S-loc, B-loc, E-loc, I-loc, S-pers, B-pers, E-pers, I-pers, S-org, B-org, E-org, I-org
2023-10-14 02:56:07,325
Results:
- F-score (micro) 0.6407
- F-score (macro) 0.4712
- Accuracy 0.4813
By class:
              precision    recall  f1-score   support

         loc     0.6545    0.7563    0.7017       591
        pers     0.5705    0.7143    0.6343       357
         org     0.1000    0.0633    0.0775        79

   micro avg     0.5992    0.6884    0.6407      1027
   macro avg     0.4416    0.5113    0.4712      1027
weighted avg     0.5826    0.6884    0.6303      1027
2023-10-14 02:56:07,325 ----------------------------------------------------------------------------------------------------
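Note on the summary scores: the three averaged F-scores in the final results can be cross-checked from the per-class rows. The sketch below assumes the usual definitions (micro F1 as the harmonic mean of micro precision and micro recall, macro F1 as the unweighted mean of per-class F1, weighted F1 as the support-weighted mean); the variable and function names are illustrative only, and the inputs are the rounded values printed in the log, so agreement is expected only to about four decimals.

```python
# Per-class (f1-score, support) pairs copied from the "By class" table.
per_class = {
    "loc": (0.7017, 591),
    "pers": (0.6343, 357),
    "org": (0.0775, 79),
}
# Micro-averaged precision and recall copied from the "micro avg" row.
micro_precision, micro_recall = 0.5992, 0.6884


def harmonic_f1(precision, recall):
    """F1 as the harmonic mean of precision and recall (0 if both are 0)."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


micro_f1 = harmonic_f1(micro_precision, micro_recall)

# Macro average: every class counts equally, regardless of its support.
macro_f1 = sum(f1 for f1, _ in per_class.values()) / len(per_class)

# Weighted average: each class counts in proportion to its support.
total_support = sum(support for _, support in per_class.values())
weighted_f1 = sum(f1 * support for f1, support in per_class.values()) / total_support

print(f"micro {micro_f1:.4f}  macro {macro_f1:.4f}  weighted {weighted_f1:.4f}")
```

The gap between the micro and macro scores comes from the `org` class, whose low F1 weighs one third in the macro average but only 79 of 1027 spans in the micro and weighted averages.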