2023-10-14 11:52:28,279 ----------------------------------------------------------------------------------------------------
2023-10-14 11:52:28,281 Model: "SequenceTagger(
  (embeddings): ByT5Embeddings(
    (model): T5EncoderModel(
      (shared): Embedding(384, 1472)
      (encoder): T5Stack(
        (embed_tokens): Embedding(384, 1472)
        (block): ModuleList(
          (0): T5Block(
            (layer): ModuleList(
              (0): T5LayerSelfAttention(
                (SelfAttention): T5Attention(
                  (q): Linear(in_features=1472, out_features=384, bias=False)
                  (k): Linear(in_features=1472, out_features=384, bias=False)
                  (v): Linear(in_features=1472, out_features=384, bias=False)
                  (o): Linear(in_features=384, out_features=1472, bias=False)
                  (relative_attention_bias): Embedding(32, 6)
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (1): T5LayerFF(
                (DenseReluDense): T5DenseGatedActDense(
                  (wi_0): Linear(in_features=1472, out_features=3584, bias=False)
                  (wi_1): Linear(in_features=1472, out_features=3584, bias=False)
                  (wo): Linear(in_features=3584, out_features=1472, bias=False)
                  (dropout): Dropout(p=0.1, inplace=False)
                  (act): NewGELUActivation()
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
          )
          (1-11): 11 x T5Block(
            (layer): ModuleList(
              (0): T5LayerSelfAttention(
                (SelfAttention): T5Attention(
                  (q): Linear(in_features=1472, out_features=384, bias=False)
                  (k): Linear(in_features=1472, out_features=384, bias=False)
                  (v): Linear(in_features=1472, out_features=384, bias=False)
                  (o): Linear(in_features=384, out_features=1472, bias=False)
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (1): T5LayerFF(
                (DenseReluDense): T5DenseGatedActDense(
                  (wi_0): Linear(in_features=1472, out_features=3584, bias=False)
                  (wi_1): Linear(in_features=1472, out_features=3584, bias=False)
                  (wo): Linear(in_features=3584, out_features=1472, bias=False)
                  (dropout): Dropout(p=0.1, inplace=False)
                  (act): NewGELUActivation()
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
          )
        )
        (final_layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=1472, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-14 11:52:28,281 ----------------------------------------------------------------------------------------------------
2023-10-14 11:52:28,281 MultiCorpus: 14465 train + 1392 dev + 2432 test sentences
- NER_HIPE_2022 Corpus: 14465 train + 1392 dev + 2432 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/letemps/fr/with_doc_seperator
2023-10-14 11:52:28,281 ----------------------------------------------------------------------------------------------------
2023-10-14 11:52:28,282 Train: 14465 sentences
2023-10-14 11:52:28,282 (train_with_dev=False, train_with_test=False)
2023-10-14 11:52:28,282 ----------------------------------------------------------------------------------------------------
2023-10-14 11:52:28,282 Training Params:
2023-10-14 11:52:28,282 - learning_rate: "0.00016"
2023-10-14 11:52:28,282 - mini_batch_size: "8"
2023-10-14 11:52:28,282 - max_epochs: "10"
2023-10-14 11:52:28,282 - shuffle: "True"
2023-10-14 11:52:28,282 ----------------------------------------------------------------------------------------------------
2023-10-14 11:52:28,282 Plugins:
2023-10-14 11:52:28,282 - TensorboardLogger
2023-10-14 11:52:28,282 - LinearScheduler | warmup_fraction: '0.1'
2023-10-14 11:52:28,282 ----------------------------------------------------------------------------------------------------
2023-10-14 11:52:28,282 Final evaluation on model from best epoch (best-model.pt)
2023-10-14 11:52:28,282 - metric: "('micro avg', 'f1-score')"
2023-10-14 11:52:28,283 ----------------------------------------------------------------------------------------------------
2023-10-14 11:52:28,283 Computation:
2023-10-14 11:52:28,283 - compute on device: cuda:0
2023-10-14 11:52:28,283 - embedding storage: none
2023-10-14 11:52:28,283 ----------------------------------------------------------------------------------------------------
2023-10-14 11:52:28,283 Model training base path: "hmbench-letemps/fr-hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax-bs8-wsFalse-e10-lr0.00016-poolingfirst-layers-1-crfFalse-4"
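The logged hyperparameters map onto Flair's fine-tuning entry point roughly as follows. This is a sketch continuing the one above, assuming ModelTrainer.fine_tune is used; the TensorboardLogger and LinearScheduler plugins listed above are wired in by the original training script and are not shown here.

from flair.trainers import ModelTrainer

trainer = ModelTrainer(tagger, corpus)

# Hyperparameters as logged: lr 0.00016, mini-batch size 8, 10 epochs,
# shuffling enabled; fine_tune() applies a linear schedule with warmup
# (warmup_fraction 0.1 as logged) and tracks micro-average F1 on dev.
trainer.fine_tune(
    "hmbench-letemps/fr-hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax-bs8-wsFalse-e10-lr0.00016-poolingfirst-layers-1-crfFalse-4",
    learning_rate=0.00016,
    mini_batch_size=8,
    max_epochs=10,
    main_evaluation_metric=("micro avg", "f1-score"),
)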
2023-10-14 11:52:28,283 ----------------------------------------------------------------------------------------------------
2023-10-14 11:52:28,283 ----------------------------------------------------------------------------------------------------
2023-10-14 11:52:28,283 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-14 11:54:01,906 epoch 1 - iter 180/1809 - loss 2.54702501 - time (sec): 93.62 - samples/sec: 389.75 - lr: 0.000016 - momentum: 0.000000
2023-10-14 11:55:34,893 epoch 1 - iter 360/1809 - loss 2.27602894 - time (sec): 186.61 - samples/sec: 400.06 - lr: 0.000032 - momentum: 0.000000
2023-10-14 11:57:10,309 epoch 1 - iter 540/1809 - loss 1.92434741 - time (sec): 282.02 - samples/sec: 402.87 - lr: 0.000048 - momentum: 0.000000
2023-10-14 11:58:44,161 epoch 1 - iter 720/1809 - loss 1.57584407 - time (sec): 375.88 - samples/sec: 404.57 - lr: 0.000064 - momentum: 0.000000
2023-10-14 12:00:18,267 epoch 1 - iter 900/1809 - loss 1.32127986 - time (sec): 469.98 - samples/sec: 403.03 - lr: 0.000080 - momentum: 0.000000
2023-10-14 12:01:56,606 epoch 1 - iter 1080/1809 - loss 1.13001630 - time (sec): 568.32 - samples/sec: 402.37 - lr: 0.000095 - momentum: 0.000000
2023-10-14 12:03:33,121 epoch 1 - iter 1260/1809 - loss 0.99420573 - time (sec): 664.84 - samples/sec: 401.11 - lr: 0.000111 - momentum: 0.000000
2023-10-14 12:05:10,045 epoch 1 - iter 1440/1809 - loss 0.89109878 - time (sec): 761.76 - samples/sec: 400.07 - lr: 0.000127 - momentum: 0.000000
2023-10-14 12:06:47,130 epoch 1 - iter 1620/1809 - loss 0.81100176 - time (sec): 858.85 - samples/sec: 397.02 - lr: 0.000143 - momentum: 0.000000
2023-10-14 12:08:23,566 epoch 1 - iter 1800/1809 - loss 0.74361325 - time (sec): 955.28 - samples/sec: 396.06 - lr: 0.000159 - momentum: 0.000000
2023-10-14 12:08:27,689 ----------------------------------------------------------------------------------------------------
2023-10-14 12:08:27,690 EPOCH 1 done: loss 0.7418 - lr: 0.000159
2023-10-14 12:09:05,444 DEV : loss 0.13058921694755554 - f1-score (micro avg) 0.4901
2023-10-14 12:09:05,501 saving best model
2023-10-14 12:09:06,504 ----------------------------------------------------------------------------------------------------
2023-10-14 12:10:41,018 epoch 2 - iter 180/1809 - loss 0.11763819 - time (sec): 94.51 - samples/sec: 398.13 - lr: 0.000158 - momentum: 0.000000
2023-10-14 12:12:15,894 epoch 2 - iter 360/1809 - loss 0.11288606 - time (sec): 189.39 - samples/sec: 401.69 - lr: 0.000156 - momentum: 0.000000
2023-10-14 12:13:53,929 epoch 2 - iter 540/1809 - loss 0.11069431 - time (sec): 287.42 - samples/sec: 392.41 - lr: 0.000155 - momentum: 0.000000
2023-10-14 12:15:25,447 epoch 2 - iter 720/1809 - loss 0.10746830 - time (sec): 378.94 - samples/sec: 395.96 - lr: 0.000153 - momentum: 0.000000
2023-10-14 12:16:57,242 epoch 2 - iter 900/1809 - loss 0.10386008 - time (sec): 470.74 - samples/sec: 398.06 - lr: 0.000151 - momentum: 0.000000
2023-10-14 12:18:32,290 epoch 2 - iter 1080/1809 - loss 0.10038808 - time (sec): 565.78 - samples/sec: 399.27 - lr: 0.000149 - momentum: 0.000000
2023-10-14 12:20:11,902 epoch 2 - iter 1260/1809 - loss 0.09720591 - time (sec): 665.40 - samples/sec: 398.88 - lr: 0.000148 - momentum: 0.000000
2023-10-14 12:21:47,580 epoch 2 - iter 1440/1809 - loss 0.09543514 - time (sec): 761.07 - samples/sec: 398.87 - lr: 0.000146 - momentum: 0.000000
2023-10-14 12:23:21,787 epoch 2 - iter 1620/1809 - loss 0.09204174 - time (sec): 855.28 - samples/sec: 399.25 - lr: 0.000144 - momentum: 0.000000
2023-10-14 12:24:54,948 epoch 2 - iter 1800/1809 - loss 0.09031627 - time (sec): 948.44 - samples/sec: 399.02 - lr: 0.000142 - momentum: 0.000000
2023-10-14 12:24:59,031 ----------------------------------------------------------------------------------------------------
2023-10-14 12:24:59,031 EPOCH 2 done: loss 0.0905 - lr: 0.000142
2023-10-14 12:25:38,013 DEV : loss 0.09481607377529144 - f1-score (micro avg) 0.6302
2023-10-14 12:25:38,069 saving best model
2023-10-14 12:25:41,560 ----------------------------------------------------------------------------------------------------
2023-10-14 12:27:25,068 epoch 3 - iter 180/1809 - loss 0.05732527 - time (sec): 103.50 - samples/sec: 369.83 - lr: 0.000140 - momentum: 0.000000
2023-10-14 12:28:58,543 epoch 3 - iter 360/1809 - loss 0.05785601 - time (sec): 196.98 - samples/sec: 379.54 - lr: 0.000139 - momentum: 0.000000
2023-10-14 12:30:28,432 epoch 3 - iter 540/1809 - loss 0.05962819 - time (sec): 286.87 - samples/sec: 391.90 - lr: 0.000137 - momentum: 0.000000
2023-10-14 12:31:58,859 epoch 3 - iter 720/1809 - loss 0.05938527 - time (sec): 377.29 - samples/sec: 402.14 - lr: 0.000135 - momentum: 0.000000
2023-10-14 12:33:36,210 epoch 3 - iter 900/1809 - loss 0.05901793 - time (sec): 474.64 - samples/sec: 400.35 - lr: 0.000133 - momentum: 0.000000
2023-10-14 12:35:21,233 epoch 3 - iter 1080/1809 - loss 0.05760677 - time (sec): 579.67 - samples/sec: 394.24 - lr: 0.000132 - momentum: 0.000000
2023-10-14 12:37:02,151 epoch 3 - iter 1260/1809 - loss 0.05767185 - time (sec): 680.59 - samples/sec: 391.76 - lr: 0.000130 - momentum: 0.000000
2023-10-14 12:38:38,632 epoch 3 - iter 1440/1809 - loss 0.05754962 - time (sec): 777.07 - samples/sec: 390.27 - lr: 0.000128 - momentum: 0.000000
2023-10-14 12:40:18,411 epoch 3 - iter 1620/1809 - loss 0.05726013 - time (sec): 876.85 - samples/sec: 388.54 - lr: 0.000126 - momentum: 0.000000
2023-10-14 12:41:58,428 epoch 3 - iter 1800/1809 - loss 0.05737904 - time (sec): 976.86 - samples/sec: 387.15 - lr: 0.000125 - momentum: 0.000000
2023-10-14 12:42:03,646 ----------------------------------------------------------------------------------------------------
2023-10-14 12:42:03,646 EPOCH 3 done: loss 0.0576 - lr: 0.000125
2023-10-14 12:42:52,619 DEV : loss 0.1367848813533783 - f1-score (micro avg) 0.6292
2023-10-14 12:42:52,685 ----------------------------------------------------------------------------------------------------
2023-10-14 12:44:28,197 epoch 4 - iter 180/1809 - loss 0.03871364 - time (sec): 95.51 - samples/sec: 381.69 - lr: 0.000123 - momentum: 0.000000
2023-10-14 12:46:00,663 epoch 4 - iter 360/1809 - loss 0.03856325 - time (sec): 187.98 - samples/sec: 395.23 - lr: 0.000121 - momentum: 0.000000
2023-10-14 12:47:33,114 epoch 4 - iter 540/1809 - loss 0.04060601 - time (sec): 280.43 - samples/sec: 403.91 - lr: 0.000119 - momentum: 0.000000
2023-10-14 12:49:02,712 epoch 4 - iter 720/1809 - loss 0.03906737 - time (sec): 370.02 - samples/sec: 404.57 - lr: 0.000117 - momentum: 0.000000
2023-10-14 12:50:32,763 epoch 4 - iter 900/1809 - loss 0.03994477 - time (sec): 460.08 - samples/sec: 407.04 - lr: 0.000116 - momentum: 0.000000
2023-10-14 12:52:04,771 epoch 4 - iter 1080/1809 - loss 0.03951109 - time (sec): 552.08 - samples/sec: 408.84 - lr: 0.000114 - momentum: 0.000000
2023-10-14 12:53:36,911 epoch 4 - iter 1260/1809 - loss 0.04014654 - time (sec): 644.22 - samples/sec: 409.76 - lr: 0.000112 - momentum: 0.000000
2023-10-14 12:55:12,371 epoch 4 - iter 1440/1809 - loss 0.03986687 - time (sec): 739.68 - samples/sec: 409.89 - lr: 0.000110 - momentum: 0.000000
2023-10-14 12:56:48,117 epoch 4 - iter 1620/1809 - loss 0.03943713 - time (sec): 835.43 - samples/sec: 407.22 - lr: 0.000109 - momentum: 0.000000
2023-10-14 12:58:28,134 epoch 4 - iter 1800/1809 - loss 0.03893369 - time (sec): 935.45 - samples/sec: 404.27 - lr: 0.000107 - momentum: 0.000000
2023-10-14 12:58:32,852 ----------------------------------------------------------------------------------------------------
2023-10-14 12:58:32,853 EPOCH 4 done: loss 0.0389 - lr: 0.000107
2023-10-14 12:59:11,971 DEV : loss 0.19053448736667633 - f1-score (micro avg) 0.6399
2023-10-14 12:59:12,035 saving best model
2023-10-14 12:59:13,027 ----------------------------------------------------------------------------------------------------
2023-10-14 13:00:43,768 epoch 5 - iter 180/1809 - loss 0.02307068 - time (sec): 90.74 - samples/sec: 413.03 - lr: 0.000105 - momentum: 0.000000
2023-10-14 13:02:22,128 epoch 5 - iter 360/1809 - loss 0.02459229 - time (sec): 189.10 - samples/sec: 413.96 - lr: 0.000103 - momentum: 0.000000
2023-10-14 13:03:56,196 epoch 5 - iter 540/1809 - loss 0.02726105 - time (sec): 283.17 - samples/sec: 412.87 - lr: 0.000101 - momentum: 0.000000
2023-10-14 13:05:29,203 epoch 5 - iter 720/1809 - loss 0.02847084 - time (sec): 376.17 - samples/sec: 407.73 - lr: 0.000100 - momentum: 0.000000
2023-10-14 13:07:00,178 epoch 5 - iter 900/1809 - loss 0.02858724 - time (sec): 467.15 - samples/sec: 407.20 - lr: 0.000098 - momentum: 0.000000
2023-10-14 13:08:32,646 epoch 5 - iter 1080/1809 - loss 0.02877847 - time (sec): 559.62 - samples/sec: 407.69 - lr: 0.000096 - momentum: 0.000000
2023-10-14 13:10:07,347 epoch 5 - iter 1260/1809 - loss 0.02796208 - time (sec): 654.32 - samples/sec: 406.44 - lr: 0.000094 - momentum: 0.000000
2023-10-14 13:11:43,274 epoch 5 - iter 1440/1809 - loss 0.02825755 - time (sec): 750.24 - samples/sec: 404.27 - lr: 0.000093 - momentum: 0.000000
2023-10-14 13:13:26,339 epoch 5 - iter 1620/1809 - loss 0.02890447 - time (sec): 853.31 - samples/sec: 398.64 - lr: 0.000091 - momentum: 0.000000
2023-10-14 13:14:59,584 epoch 5 - iter 1800/1809 - loss 0.02958885 - time (sec): 946.55 - samples/sec: 399.42 - lr: 0.000089 - momentum: 0.000000
2023-10-14 13:15:03,940 ----------------------------------------------------------------------------------------------------
2023-10-14 13:15:03,940 EPOCH 5 done: loss 0.0297 - lr: 0.000089
2023-10-14 13:15:45,600 DEV : loss 0.23190708458423615 - f1-score (micro avg) 0.6378
2023-10-14 13:15:45,666 ----------------------------------------------------------------------------------------------------
2023-10-14 13:17:21,707 epoch 6 - iter 180/1809 - loss 0.01812127 - time (sec): 96.04 - samples/sec: 408.69 - lr: 0.000087 - momentum: 0.000000
2023-10-14 13:18:53,389 epoch 6 - iter 360/1809 - loss 0.02039867 - time (sec): 187.72 - samples/sec: 405.76 - lr: 0.000085 - momentum: 0.000000
2023-10-14 13:20:29,251 epoch 6 - iter 540/1809 - loss 0.01933017 - time (sec): 283.58 - samples/sec: 401.10 - lr: 0.000084 - momentum: 0.000000
2023-10-14 13:22:02,753 epoch 6 - iter 720/1809 - loss 0.02027772 - time (sec): 377.08 - samples/sec: 399.93 - lr: 0.000082 - momentum: 0.000000
2023-10-14 13:23:37,511 epoch 6 - iter 900/1809 - loss 0.02122834 - time (sec): 471.84 - samples/sec: 398.25 - lr: 0.000080 - momentum: 0.000000
2023-10-14 13:25:13,342 epoch 6 - iter 1080/1809 - loss 0.02222813 - time (sec): 567.67 - samples/sec: 398.86 - lr: 0.000078 - momentum: 0.000000
2023-10-14 13:26:46,706 epoch 6 - iter 1260/1809 - loss 0.02238050 - time (sec): 661.04 - samples/sec: 399.34 - lr: 0.000077 - momentum: 0.000000
2023-10-14 13:28:21,436 epoch 6 - iter 1440/1809 - loss 0.02161858 - time (sec): 755.77 - samples/sec: 400.35 - lr: 0.000075 - momentum: 0.000000
2023-10-14 13:29:57,051 epoch 6 - iter 1620/1809 - loss 0.02203621 - time (sec): 851.38 - samples/sec: 399.42 - lr: 0.000073 - momentum: 0.000000
2023-10-14 13:31:35,172 epoch 6 - iter 1800/1809 - loss 0.02164417 - time (sec): 949.50 - samples/sec: 398.33 - lr: 0.000071 - momentum: 0.000000
2023-10-14 13:31:39,297 ----------------------------------------------------------------------------------------------------
2023-10-14 13:31:39,298 EPOCH 6 done: loss 0.0216 - lr: 0.000071
2023-10-14 13:32:24,318 DEV : loss 0.272296279668808 - f1-score (micro avg) 0.6524
2023-10-14 13:32:24,397 saving best model
2023-10-14 13:32:31,032 ----------------------------------------------------------------------------------------------------
2023-10-14 13:34:10,803 epoch 7 - iter 180/1809 - loss 0.01105902 - time (sec): 99.77 - samples/sec: 406.73 - lr: 0.000069 - momentum: 0.000000
2023-10-14 13:35:57,197 epoch 7 - iter 360/1809 - loss 0.01259388 - time (sec): 206.16 - samples/sec: 376.08 - lr: 0.000068 - momentum: 0.000000
2023-10-14 13:37:31,991 epoch 7 - iter 540/1809 - loss 0.01335206 - time (sec): 300.95 - samples/sec: 382.63 - lr: 0.000066 - momentum: 0.000000
2023-10-14 13:39:05,951 epoch 7 - iter 720/1809 - loss 0.01459110 - time (sec): 394.91 - samples/sec: 385.81 - lr: 0.000064 - momentum: 0.000000
2023-10-14 13:40:38,429 epoch 7 - iter 900/1809 - loss 0.01490005 - time (sec): 487.39 - samples/sec: 388.72 - lr: 0.000062 - momentum: 0.000000
2023-10-14 13:42:11,096 epoch 7 - iter 1080/1809 - loss 0.01507190 - time (sec): 580.06 - samples/sec: 391.67 - lr: 0.000061 - momentum: 0.000000
2023-10-14 13:43:41,678 epoch 7 - iter 1260/1809 - loss 0.01473060 - time (sec): 670.64 - samples/sec: 395.61 - lr: 0.000059 - momentum: 0.000000
2023-10-14 13:45:12,938 epoch 7 - iter 1440/1809 - loss 0.01465842 - time (sec): 761.90 - samples/sec: 399.60 - lr: 0.000057 - momentum: 0.000000
2023-10-14 13:46:44,587 epoch 7 - iter 1620/1809 - loss 0.01508552 - time (sec): 853.55 - samples/sec: 400.09 - lr: 0.000055 - momentum: 0.000000
2023-10-14 13:48:13,609 epoch 7 - iter 1800/1809 - loss 0.01547761 - time (sec): 942.57 - samples/sec: 400.65 - lr: 0.000053 - momentum: 0.000000
2023-10-14 13:48:18,270 ----------------------------------------------------------------------------------------------------
2023-10-14 13:48:18,270 EPOCH 7 done: loss 0.0154 - lr: 0.000053
2023-10-14 13:48:57,977 DEV : loss 0.29768648743629456 - f1-score (micro avg) 0.6376
2023-10-14 13:48:58,052 ----------------------------------------------------------------------------------------------------
2023-10-14 13:50:33,364 epoch 8 - iter 180/1809 - loss 0.01237417 - time (sec): 95.31 - samples/sec: 401.98 - lr: 0.000052 - momentum: 0.000000
2023-10-14 13:52:16,095 epoch 8 - iter 360/1809 - loss 0.01212823 - time (sec): 198.04 - samples/sec: 390.76 - lr: 0.000050 - momentum: 0.000000
2023-10-14 13:53:49,344 epoch 8 - iter 540/1809 - loss 0.01054509 - time (sec): 291.29 - samples/sec: 395.46 - lr: 0.000048 - momentum: 0.000000
2023-10-14 13:55:23,207 epoch 8 - iter 720/1809 - loss 0.01148564 - time (sec): 385.15 - samples/sec: 393.79 - lr: 0.000046 - momentum: 0.000000
2023-10-14 13:56:59,828 epoch 8 - iter 900/1809 - loss 0.01104383 - time (sec): 481.77 - samples/sec: 394.42 - lr: 0.000044 - momentum: 0.000000
2023-10-14 13:58:32,234 epoch 8 - iter 1080/1809 - loss 0.01171246 - time (sec): 574.18 - samples/sec: 394.37 - lr: 0.000043 - momentum: 0.000000
2023-10-14 14:00:06,597 epoch 8 - iter 1260/1809 - loss 0.01136383 - time (sec): 668.54 - samples/sec: 396.67 - lr: 0.000041 - momentum: 0.000000
2023-10-14 14:01:39,016 epoch 8 - iter 1440/1809 - loss 0.01180198 - time (sec): 760.96 - samples/sec: 397.42 - lr: 0.000039 - momentum: 0.000000
2023-10-14 14:03:17,681 epoch 8 - iter 1620/1809 - loss 0.01178500 - time (sec): 859.63 - samples/sec: 395.94 - lr: 0.000037 - momentum: 0.000000
2023-10-14 14:04:51,444 epoch 8 - iter 1800/1809 - loss 0.01159822 - time (sec): 953.39 - samples/sec: 396.91 - lr: 0.000036 - momentum: 0.000000
2023-10-14 14:04:55,478 ----------------------------------------------------------------------------------------------------
2023-10-14 14:04:55,479 EPOCH 8 done: loss 0.0116 - lr: 0.000036
2023-10-14 14:05:34,550 DEV : loss 0.32400456070899963 - f1-score (micro avg) 0.6441
2023-10-14 14:05:34,614 ----------------------------------------------------------------------------------------------------
2023-10-14 14:07:04,671 epoch 9 - iter 180/1809 - loss 0.00428776 - time (sec): 90.06 - samples/sec: 403.01 - lr: 0.000034 - momentum: 0.000000
2023-10-14 14:08:39,160 epoch 9 - iter 360/1809 - loss 0.00657256 - time (sec): 184.54 - samples/sec: 394.04 - lr: 0.000032 - momentum: 0.000000
2023-10-14 14:10:22,284 epoch 9 - iter 540/1809 - loss 0.00795673 - time (sec): 287.67 - samples/sec: 385.39 - lr: 0.000030 - momentum: 0.000000
2023-10-14 14:11:58,636 epoch 9 - iter 720/1809 - loss 0.00803158 - time (sec): 384.02 - samples/sec: 389.03 - lr: 0.000028 - momentum: 0.000000
2023-10-14 14:13:34,188 epoch 9 - iter 900/1809 - loss 0.00763463 - time (sec): 479.57 - samples/sec: 392.43 - lr: 0.000027 - momentum: 0.000000
2023-10-14 14:15:10,464 epoch 9 - iter 1080/1809 - loss 0.00728638 - time (sec): 575.85 - samples/sec: 391.55 - lr: 0.000025 - momentum: 0.000000
2023-10-14 14:16:47,265 epoch 9 - iter 1260/1809 - loss 0.00750028 - time (sec): 672.65 - samples/sec: 391.08 - lr: 0.000023 - momentum: 0.000000
2023-10-14 14:18:20,876 epoch 9 - iter 1440/1809 - loss 0.00787103 - time (sec): 766.26 - samples/sec: 394.45 - lr: 0.000021 - momentum: 0.000000
2023-10-14 14:19:54,396 epoch 9 - iter 1620/1809 - loss 0.00770686 - time (sec): 859.78 - samples/sec: 396.21 - lr: 0.000020 - momentum: 0.000000
2023-10-14 14:21:40,785 epoch 9 - iter 1800/1809 - loss 0.00782706 - time (sec): 966.17 - samples/sec: 391.34 - lr: 0.000018 - momentum: 0.000000
2023-10-14 14:21:45,616 ----------------------------------------------------------------------------------------------------
2023-10-14 14:21:45,616 EPOCH 9 done: loss 0.0078 - lr: 0.000018
2023-10-14 14:22:26,271 DEV : loss 0.3510294556617737 - f1-score (micro avg) 0.6469
2023-10-14 14:22:26,332 ----------------------------------------------------------------------------------------------------
2023-10-14 14:24:08,212 epoch 10 - iter 180/1809 - loss 0.00465200 - time (sec): 101.88 - samples/sec: 366.12 - lr: 0.000016 - momentum: 0.000000
2023-10-14 14:25:42,942 epoch 10 - iter 360/1809 - loss 0.00473108 - time (sec): 196.61 - samples/sec: 384.51 - lr: 0.000014 - momentum: 0.000000
2023-10-14 14:27:21,238 epoch 10 - iter 540/1809 - loss 0.00620538 - time (sec): 294.90 - samples/sec: 387.50 - lr: 0.000012 - momentum: 0.000000
2023-10-14 14:28:57,390 epoch 10 - iter 720/1809 - loss 0.00594168 - time (sec): 391.06 - samples/sec: 387.91 - lr: 0.000011 - momentum: 0.000000
2023-10-14 14:30:32,580 epoch 10 - iter 900/1809 - loss 0.00587265 - time (sec): 486.25 - samples/sec: 389.34 - lr: 0.000009 - momentum: 0.000000
2023-10-14 14:32:07,017 epoch 10 - iter 1080/1809 - loss 0.00574894 - time (sec): 580.68 - samples/sec: 391.80 - lr: 0.000007 - momentum: 0.000000
2023-10-14 14:33:40,067 epoch 10 - iter 1260/1809 - loss 0.00587125 - time (sec): 673.73 - samples/sec: 393.69 - lr: 0.000005 - momentum: 0.000000
2023-10-14 14:35:15,347 epoch 10 - iter 1440/1809 - loss 0.00584501 - time (sec): 769.01 - samples/sec: 395.67 - lr: 0.000004 - momentum: 0.000000
2023-10-14 14:36:49,543 epoch 10 - iter 1620/1809 - loss 0.00598522 - time (sec): 863.21 - samples/sec: 395.43 - lr: 0.000002 - momentum: 0.000000
2023-10-14 14:38:23,873 epoch 10 - iter 1800/1809 - loss 0.00617272 - time (sec): 957.54 - samples/sec: 394.90 - lr: 0.000000 - momentum: 0.000000
2023-10-14 14:38:28,207 ----------------------------------------------------------------------------------------------------
2023-10-14 14:38:28,208 EPOCH 10 done: loss 0.0062 - lr: 0.000000
2023-10-14 14:39:11,475 DEV : loss 0.35460981726646423 - f1-score (micro avg) 0.6421
2023-10-14 14:39:13,412 ----------------------------------------------------------------------------------------------------
2023-10-14 14:39:13,414 Loading model from best epoch ...
2023-10-14 14:39:17,156 SequenceTagger predicts: Dictionary with 13 tags: O, S-loc, B-loc, E-loc, I-loc, S-pers, B-pers, E-pers, I-pers, S-org, B-org, E-org, I-org
2023-10-14 14:40:16,219
Results:
- F-score (micro) 0.6364
- F-score (macro) 0.4866
- Accuracy 0.4794
By class:
              precision    recall  f1-score   support

         loc     0.6326    0.7547    0.6883       591
        pers     0.5737    0.7087    0.6341       357
         org     0.1731    0.1139    0.1374        79

   micro avg     0.5910    0.6894    0.6364      1027
   macro avg     0.4598    0.5258    0.4866      1027
weighted avg     0.5768    0.6894    0.6271      1027
2023-10-14 14:40:16,219 ----------------------------------------------------------------------------------------------------
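To apply the trained tagger to new text, the best checkpoint saved during training can be loaded and used for prediction. A minimal sketch, assuming the label type is "ner" and using the base path logged above; the example sentence is hypothetical.

from flair.data import Sentence
from flair.models import SequenceTagger

# Load the checkpoint written at the "saving best model" steps above.
tagger = SequenceTagger.load(
    "hmbench-letemps/fr-hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax-bs8-wsFalse-e10-lr0.00016-poolingfirst-layers-1-crfFalse-4/best-model.pt"
)

# Tag a French sentence and print the predicted loc/pers/org spans.
sentence = Sentence("Victor Hugo est né à Besançon .")
tagger.predict(sentence)
for span in sentence.get_spans("ner"):
    print(span)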